Lab 10: Referencing Figures and Tables

Why are we here?

In scientific writing, figures and tables are always referenced by number in your text. The location and number of these figures and tables are likely to change as your articles are edited and revised. This means that hard-coding the numbers of the tables and figures into your writing is an exercise in frustration, since you then have to go back and update every reference whenever a number changes.

Instead, we use cross-references to automatically refer to figures and tables by number, and avoid the error-prone process of manually numbering each one. Once you learn how to reference tables and figures automatically, you will never go back to referencing them manually again!

Lab Goals

The purpose of this lab is to learn how to integrate figures and tables into the narrative of the document using references.

After completing this lab, you should be able to create figures from images you embed or from data graphics that you create, and reference them by number automatically in the text. Similarly, you should be able to create formatted tables from data frames and reference them automatically.

Optional Reading

Quarto’s official documentation on cross-referencing is a great page to bookmark for future reference, and to learn about more advanced features not covered in this lab.

Quarto also supports many advanced features for arranging and positioning images.

Lab instructions

Setting up

There is no Quarto template for this week’s lab. Instead, you should begin by creating a new Quarto document and saving it in your SDS 100 project folder. If you’re unsure about how to do this, look back at the Setting Up section of the “Formatting in Quarto” lab.

Formatting data frames as tables

Consider the R code below, which creates and prints out a small data frame of Pokémon and their types:1 2

library(tidyverse)

gen1_starters <- tibble(
  name = c("Bulbasaur", "Squirtle", "Charmander", "Pikachu"),
  type1 = c("Grass", "Water", "Fire", "Electric"),
  type2 = c("Poison", NA, NA, NA)
)

print(gen1_starters)
# A tibble: 4 × 3
  name       type1    type2 
  <chr>      <chr>    <chr> 
1 Bulbasaur  Grass    Poison
2 Squirtle   Water    <NA>  
3 Charmander Fire     <NA>  
4 Pikachu    Electric <NA>  

The visual appearance of this table is rather lackluster. Despite occupying a medium rich with visual expression (a web page), it looks nearly identical to how it would appear in the plain-text R console. The output also includes lots of information that most readers would never care about (e.g., telling you that the object is a “tibble”, and that each column contains “character” data).

But what if I told you this table could look like a proper table from the modern era, instead of plain text from the 1980s? And that all it would take is one R function? You’d be pretty excited, right?

In that case, meet your new best friend: the kable() function (short for “knitr table”) from the knitr package. If you use the kable() function instead of print(), your data frame will be displayed as a formatted table instead of plain text. Consider Table 1 and Table 2, both of which are created by using the kable() function to display the gen1_starters data frame as a formatted table:

library(knitr)


kable(gen1_starters)
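Table 1: kable() creates formatted tables from data frames

| name       | type1    | type2  |
|:-----------|:---------|:-------|
| Bulbasaur  | Grass    | Poison |
| Squirtle   | Water    | NA     |
| Charmander | Fire     | NA     |
| Pikachu    | Electric | NA     |

The same table can also be produced by placing kable() at the end of a pipeline: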
library(knitr)

gen1_starters |>
  kable()
Table 2: kable() can be used at the end of pipelines

| name       | type1    | type2  |
|:-----------|:---------|:-------|
| Bulbasaur  | Grass    | Poison |
| Squirtle   | Water    | NA     |
| Charmander | Fire     | NA     |
| Pikachu    | Electric | NA     |
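
If you are curious what kable() actually produces, here is a minimal sketch, assuming the knitr package is installed. The data frame is rebuilt with base R's data.frame() so the snippet runs on its own, and the name starters_markup is only illustrative. Asking for format = "pipe" returns the table as lines of Markdown text, which Quarto then styles when the document is rendered:

```r
library(knitr)

# Rebuild the starter data with base R so this snippet is self-contained
starters <- data.frame(
  name  = c("Bulbasaur", "Squirtle", "Charmander", "Pikachu"),
  type1 = c("Grass", "Water", "Fire", "Electric"),
  type2 = c("Poison", NA, NA, NA)
)

# format = "pipe" asks kable() for a Markdown pipe table;
# the result is a character vector with one element per line of the table
starters_markup <- kable(starters, format = "pipe")

# Printing the object shows the raw Markdown that Quarto will later format
starters_markup
```

Inside a Quarto document you rarely need the format argument, because kable() chooses a suitable format on its own.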

Use the examples above, as well as your general data wrangling knowledge, to help you complete Exercises 1 and 2.

Question

Imagine that you are the CEO of a large multinational corporation, and that each week you receive update presentations from the major departments in your corporation. You wish to write an email to the head of each department, letting them know the order of the presentations for next week’s meeting, and you want to include the following table in your email:

| department      | week1_order |
|:----------------|------------:|
| Accounting      |           1 |
| Human Resources |           2 |
| Marketing       |           3 |
| Operations      |           4 |

Recreate this table as a data frame in R, and then display this data frame as a formatted table.

Don’t forget about loading the packages you need!

Question

As shown in the table above, the department heads are scheduled to give their presentations in alphabetical order by department name.

Create a random presentation schedule for the next two weeks by using the sample() function to shuffle the values in the week1_order column. Be sure to create a new column in the data frame for each new presentation order.

When you’re finished creating the presentation schedules, be sure to display this updated data frame as a formatted table.
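
If sample() is new to you, here is a minimal sketch of how it shuffles a vector. The presenters vector and the seed value are made up for illustration. Calling set.seed() first is optional, but it makes the shuffle come out the same way each time the document is rendered:

```r
# Hypothetical values, used only to illustrate shuffling
presenters <- c("Ada", "Grace", "Florence", "Katherine")

# Optional: fix the random seed so the same shuffle comes out on every render
set.seed(100)

# Given a single vector, sample() returns all of its elements in a random order
shuffled <- sample(presenters)
shuffled
```

Each call to sample() draws a fresh random order, so calling it once per new column gives each week its own schedule.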

Cross-referencing a formatted table

In the last section, we called your attention to Table 1 and Table 2 using cross-referencing (and we just did it again in this sentence!). Creating cross-references to a table in a Quarto document requires three components:

  1. A Chunk Label: The code chunk that creates the table with the kable() function needs to be named using the #| label: chunk option, and the label must start with the prefix tbl- (e.g., #| label: tbl-my-cute-table)
  2. A Table Caption: The code chunk that creates the table with the kable() function also needs to use the #| tbl-cap: chunk option to give the table a caption.
  3. An @ mention: Somewhere in your writing, @ mention the table using the label you’ve given to the code chunk, e.g. @tbl-my-cute-table

The example below shows you how to put all three components together. The first part shows the content you would write in the “raw” Quarto document, and the second part shows what the output looks like after rendering the document:

See @tbl-pokemon-starters for an overview of the starter Pokémon from Generation 1.


```{r}
#| label: tbl-pokemon-starters
#| tbl-cap: "Starter Pokémon from Generation 1 and their types."
#| echo: false
kable(gen1_starters)
```

See Table 3 for an overview of the starter Pokémon from Generation 1.

Table 3: Starter Pokémon from Generation 1 and their types.

| name       | type1    | type2  |
|:-----------|:---------|:-------|
| Bulbasaur  | Grass    | Poison |
| Squirtle   | Water    | NA     |
| Charmander | Fire     | NA     |
| Pikachu    | Electric | NA     |

Note that in addition to the #| label: and #| tbl-cap: chunk options, we also used the #| echo: false chunk option to hide the code in the rendered output. This allows the reader to focus on the writing and results, rather than on the code that created the table.

Question

The code below counts the number of four, six, and eight cylinder cars represented in the mtcars data set.

mtcars <- datasets::mtcars

cyl_summary <- mtcars |>
  group_by(cyl) |>
  summarize(n = n())

cyl_summary
# A tibble: 3 × 2
    cyl     n
  <dbl> <int>
1     4    11
2     6     7
3     8    14

Augment this summary table by adding a new column that measures the percentage of four, six, and eight cylinder cars, and by formatting the final table using the kable() function.

Be sure that your summary table uses reader-friendly names (e.g., “Cylinder” instead of “cyl”), and that all values are rounded to two decimal places. There are many ways to do this, but kable() already includes such functionality. Read the help page for the kable() function to find out the best way to do this. Don’t forget that you can open the help page by running ?kable in the R console.

Lastly, write a sentence describing what the reader can learn from this table, and cross-reference this table in your writing.

Cross-referencing figures

In Quarto, figures and images can be cross-referenced in much the same way as tables.

You already know how to create figures from data, having learned how to create data graphics using ggplot2 in Lab 3 and how to get them “publication ready” in Lab 6. To cross-reference a data graphic in a Quarto document, we need the same three components that we needed to reference a table:

  1. A Chunk Label: Just like referencing a table, the code chunk that creates the figure needs to be named using the #| label: chunk option. But for a figure, the label must start with the prefix fig- (e.g., #| label: fig-my-cool-figure)
  2. A Figure Caption: The code chunk that creates the figure also needs to use the #| fig-cap: chunk option to give the figure a caption.
  3. An @ mention: Somewhere in your writing, @ mention the figure using the label you’ve given to the code chunk, e.g. @fig-my-cool-figure
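
The three components above can be sketched in a raw Quarto document as follows. This is only an illustration: the label fig-old-faithful, the caption, the object name faithful_plot, and the choice of R's built-in faithful data set are all assumptions made for this example rather than part of the lab, and the chunk assumes the ggplot2 package is installed.

See @fig-old-faithful for the relationship between eruption length and waiting time.

```{r}
#| label: fig-old-faithful
#| fig-cap: "Longer eruptions of Old Faithful tend to follow longer waits."
#| echo: false
library(ggplot2)

# faithful ships with R: one row per eruption of the Old Faithful geyser
faithful_plot <- ggplot(faithful, aes(x = waiting, y = eruptions)) +
  geom_point() +
  labs(x = "Waiting time (minutes)", y = "Eruption length (minutes)")

# Printing the plot object is what places the figure in the rendered document
faithful_plot
```

After rendering, the @fig-old-faithful mention is replaced with the figure's number, just as @tbl- mentions are replaced for tables.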

Question

Create a scatterplot visualizing the relationship between the cyl and mpg variables in the mtcars data set. Write a sentence summarizing what can be learned from this figure, and cross-reference this figure in your writing.

Cross-referencing images

In Lab 8, you learned how to embed an image in a Quarto document using Markdown. Images, such as this image of the Smith SDS logo, can be embedded with Markdown syntax like this:

![](https://pbs.twimg.com/profile_images/1048189234904010753/RUL5NyvY_400x400.jpg)

To turn this image into something that can be cross-referenced, you need to add:

  1. A caption, written inside the [] square brackets that are currently empty.
  2. A label for the image. Add {} curly braces immediately after the closing parenthesis, and inside them write a name that starts with #fig-

For example, here’s how you could cross-reference the image of the Smith SDS logo. You can cross-reference any other image you embed in the same way.

@fig-sds displays the Smith SDS logo

![Behold, the beautiful Smith SDS Logo](https://pbs.twimg.com/profile_images/1048189234904010753/RUL5NyvY_400x400.jpg){#fig-sds}

Figure 1 displays the Smith SDS logo

Figure 1: Behold, the beautiful Smith SDS Logo

Question

Embed and cross-reference this animation of a horse’s gallop3.

Next Steps

Step 1: Compare solutions

Double check that your Quarto document has all the necessary content and formatting elements by comparing your rendered output against our example solutions.

Step 2: Complete the required reading

Lab 11 has a set of pre-readings to complete to facilitate a fruitful discussion surrounding ethical codes for data science. To lighten the workload, we will divide up the readings among your final project group as follows:

Everyone reads the following two (very short) readings:

The following 3 readings should be divided among your group. Each person will do 1 of the following readings, and present the key points during class.

Step 3: Complete the Moodle Quiz

Complete the Moodle quiz for this lab.

Step 4: Final Project Phase 3

During this week, you will complete Phase 3 of the group project. See the project page for details about what is expected during this phase of the project.

References

Drum, Kevin. 2013. “It’s the Austerity, Stupid: How We Were Sold an Economy-Killing Lie.” Mother Jones. https://www.motherjones.com/politics/2013/09/austerity-reinhart-rogoff-stimulus-debt-ceiling/.
Elliott, Alan C, S Lynne Stokes, and Jing Cao. 2018. “Teaching Ethics in a Statistics Curriculum with a Cross-Cultural Emphasis.” The American Statistician 72 (4): 359–67. https://doi.org/10.1080/00031305.2017.1307140.
“Fact Sheet: President Biden Issues Executive Order on Safe, Secure, and Trustworthy Artificial Intelligence.” 2023. The White House. https://www.whitehouse.gov/briefing-room/statements-releases/2023/10/30/fact-sheet-president-biden-issues-executive-order-on-safe-secure-and-trustworthy-artificial-intelligence/.

Footnotes

  1. Yes, Pikachu is a gen 1 starter. Don’t forget about Pokémon Yellow!

  2. It took another 22 years and six more generations of games before another starter Pokémon with a dual typing was introduced, with Rowlet and its grass/flying typing breaking the drought in Generation 7.

  3. “The Horse In Motion” is the earliest known motion picture based on real photographic images, and the history behind this image is quite interesting!