When you write a scientific paper, you must refer to any figure or table by number in the text. Moreover, you often include multiple tables, and sometimes you have to move those tables around during the revision process. Hard-coding the numbers of the tables and figures into the text is an exercise in frustration: if you change the order of the tables, you then have to go back and change every reference to them.
Instead, we use references to automatically refer to figures and tables by number. Once you learn how to reference tables and figures automatically, you will never go back to referencing them manually again!
Lab Goals
The purpose of this lab is to learn how to integrate figures and tables into the narrative of the document using references.
After completing this lab, you should be able to create figures from static images or data graphics that you create and reference them by number automatically in the text. Similarly, you should be able to create tables from data or by hand and reference them automatically.
You will know that you are done when you have created two tables and two figures with captions and referenced them in the text.
Optional Reading
Quarto’s official documentation of its cross-referencing is a great page to bookmark for future reference, and to learn about more advanced features not covered in this lab.
Start by creating a new .qmd document in the source editor by unchecking the “Use visual markdown editor” button. This document is only for practice in this lab and should be kept separate from your project rotation documents.
In this lab, we will use two functions from the knitr package, which you already have installed. Let’s begin by loading the tidyverse and knitr packages.
Question
Use library() to load the tidyverse and knitr packages.
Table from scratch
There is a syntax for making tables by hand in Markdown. However, since you are using R anyway, and R has powerful functions for manipulating data, it’s almost always a better option to create tables by constructing a data frame in R that holds the data you want to display in the table.
For example, suppose that you want to eat one piece of fruit every weekday. However, you want to randomize which fruit you eat each day.
To do this, we first create a tibble called fruits with two columns, weekday and week1, that lists the fruits in alphabetical order.
Question
Use the tibble() function to create the fruits data frame. Display the fruits data frame.
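As a minimal sketch of one way to approach this, the tibble could be built as follows. The column names and values are taken from the rendered table later in this lab; your own fruits may differ.

```r
library(tidyverse)

# One row per weekday; week1 lists the fruits in alphabetical order.
fruits <- tibble(
  weekday = c("Monday", "Tuesday", "Wednesday", "Thursday", "Friday"),
  week1 = c("apple", "banana", "grapes", "orange", "pear")
)

# Typing the name of the data frame displays it.
fruits
```

Each argument to tibble() becomes one column, so all of the vectors need to have the same length.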
Now that we have this data structure in R, we can modify it. In this case, we want to randomly select the order of fruit for subsequent weeks. We can do this using the mutate() function you learned about in Lab 4 and the sample() function.
Question
Use the mutate() function to add a third column to the fruits data frame that contains a random sample() of the week1 column. Call the resulting column week2.
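A sketch of how the shuffle might look, rebuilding fruits so that the block runs on its own. The call to set.seed() and the seed value are assumptions added here so that the result is reproducible; the lab does not ask for them.

```r
library(tidyverse)

fruits <- tibble(
  weekday = c("Monday", "Tuesday", "Wednesday", "Thursday", "Friday"),
  week1 = c("apple", "banana", "grapes", "orange", "pear")
)

# Assumption: fix the random seed so the shuffle is the same on every render.
set.seed(2024)

# sample() with a single vector argument returns a random permutation of it.
fruits <- fruits |>
  mutate(week2 = sample(week1))

fruits
```

Because sample() only reorders the values, week2 contains the same five fruits as week1, each exactly once.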
Displaying the table
Next, we want to display the contents of the fruits data frame as a table. We do this using the kable() function from the knitr package. Render the document to see the difference between just printing fruits and using the kable() function.
Use the kable() function to display the fruits data frame as a table.
| weekday   | week1  | week2  |
|-----------|--------|--------|
| Monday    | apple  | apple  |
| Tuesday   | banana | grapes |
| Wednesday | grapes | pear   |
| Thursday  | orange | banana |
| Friday    | pear   | orange |
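The comparison between plain printing and kable() can be sketched as follows. The week2 values here are copied from the rendered table above rather than re-sampled, so that the block is self-contained.

```r
library(tidyverse)
library(knitr)

fruits <- tibble(
  weekday = c("Monday", "Tuesday", "Wednesday", "Thursday", "Friday"),
  week1 = c("apple", "banana", "grapes", "orange", "pear"),
  week2 = c("apple", "grapes", "pear", "banana", "orange")
)

# Plain printing shows the tibble as console-style output.
fruits

# kable() returns a formatted table that renders properly in the document.
kable(fruits)
```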
Referencing a table
To reference a table, we need three parts:
The chunk that creates the table (uses the kable() function) has to have a label, and the label has to start with tbl-, e.g. tbl-blah.
The chunk that creates the table has to have a caption created by the tbl-cap chunk option.
The text needs to refer to @tbl-blah. The link below that says “Table 1” comes from using the @ notation in that sentence.
For example, see @tbl-example below:

```{r}
#| label: tbl-example
#| tbl-cap: "This is a table we are referencing in the text."
kable(mtcars)
```
To see this in action, consider Table 1, which shows the fruits.
kable(fruits)
Table 1: These are the first two weeks for our fruits.

| weekday   | week1  | week2  |
|-----------|--------|--------|
| Monday    | apple  | apple  |
| Tuesday   | banana | grapes |
| Wednesday | grapes | pear   |
| Thursday  | orange | banana |
| Friday    | pear   | orange |
Question
Repeat the previous exercise to create fruits for weeks 3 and 4. Refer to the resulting table in the text by reference. Render your document to see the results.
The kable() function can help you create a table from any data frame.
In this section, we’ll illustrate two different techniques for producing commonly-used univariate summaries. This will build on material from Lab 5.
Data summaries from scratch
If you want total control of your data summary, build it from scratch using the dplyr functions you learned about in Lab 4 and Lab 5. This may be time-consuming, but it gives you full control.
See Table 3 for an example of a summary table constructed from scratch.
Use the na.rm argument to remove the missing data from the data summary shown in Table 3. See help(mean) for details.
Question
Use the digits argument to the kable() function to round the numbers in Table 3 to one decimal place.
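To illustrate how na.rm and digits fit together, here is a small sketch. The people data frame and its values are hypothetical stand-ins invented for this example; they are not the data behind Table 3.

```r
library(tidyverse)
library(knitr)

# Hypothetical measurements with some missing values (not the lab's data).
people <- tibble(
  height = c(172, 167, NA, 202, 150),
  mass = c(77, 75, 32, NA, 49)
)

# Without na.rm = TRUE, mean() returns NA whenever a value is missing.
people_summary <- people |>
  summarize(
    mean_height = mean(height, na.rm = TRUE),
    mean_mass = mean(mass, na.rm = TRUE)
  )

# digits = 1 rounds the numeric columns to one decimal place when displayed.
kable(people_summary, digits = 1)
```

Rounding inside kable() changes only how the numbers are displayed; the values stored in people_summary keep their full precision.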
Data summaries from another package
There are a number of packages that provide functions for displaying data summaries.
For example, Table 4 shows the distribution of height and mass as displayed by the skim() function from the skimr package. Note that we include the skimr_include_summary: false chunk option in order to suppress some extraneous output.
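A sketch of what such a chunk might contain, assuming the skimr package is installed. The people data frame is a hypothetical stand-in with height and mass columns; the lab's actual data is not shown here.

```r
#| skimr_include_summary: false
# The line above is a chunk option: it only takes effect when it is the
# first line of a code chunk in a Quarto document.
library(tidyverse)
library(skimr)

# Hypothetical measurements with some missing values (not the lab's data).
people <- tibble(
  height = c(172, 167, NA, 202, 150),
  mass = c(77, 75, 32, NA, 49)
)

# skim() computes a set of univariate summaries for every column.
skimmed <- skim(people)
skimmed
```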
Choose one of the approaches above to create a summary data table of the penguins data frame in the palmerpenguins package you learned about in Lab 3 and reference it in the text.
Figure from data
In Quarto, figures are embedded and referenced in much the same way as tables.
You already know how to create figures from data in a Quarto document, having learned how to construct them in Lab 3 and how to polish them in Lab 6. To reference a figure in a Quarto document, we need the same things that we needed to reference a table:
The chunk that creates the figure has to have a label, and the label has to start with fig-, e.g. fig-blah.
The chunk that creates the figure has to have a caption created by the fig-cap chunk option.
The text needs to refer to @fig-blah.
To embed a static image as a figure that we can reference automatically, we use the include_graphics() function from the knitr package. Its only required argument is the path to the image you want to embed, which can be either a URL or the path to an image file on your computer.
You will still need to set the label and fig-cap chunk options in order to cross-reference this image.
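A minimal sketch of such a chunk is shown below. The label fig-rlogo, the caption, and the image URL are illustrative choices made for this example, not part of the lab; substitute your own image.

```r
#| label: fig-rlogo
#| fig-cap: "The R logo, embedded from a URL."
library(knitr)

# Illustrative URL (an assumption): any image URL or local file path works here.
logo_url <- "https://www.r-project.org/logo/Rlogo.png"

# The call has to be the value of a top-level expression for the image to appear.
include_graphics(logo_url)
```

With this chunk in place, writing @fig-rlogo in the text produces a numbered link to the figure. The same two chunk options, label and fig-cap, apply to a chunk that draws a data graphic.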
Find an image on the Internet, embed it in your Quarto document using include_graphics(), and reference it in the text.
Next Steps
Step 1: Complete the required reading
Lab 11 has a set of pre-readings to complete to facilitate a fruitful discussion surrounding ethical codes for data science. To lighten the workload, we will divide up the readings among your final project group as follows:
Everyone reads the following two (very short) readings:
The following 3 readings should be divided among your group. Each person will do 1 of the following readings, and present the key points during class.
Teaching Ethics in a Statistics Curriculum with a Cross-Cultural Emphasis (Elliott, Stokes, and Cao 2018). This reading is available as a PDF on Moodle.
During this week, you will complete Phase 3 of the group project. See the project page for details about what is expected during this phase of the project.
Elliott, Alan C, S Lynne Stokes, and Jing Cao. 2018. “Teaching Ethics in a Statistics Curriculum with a Cross-Cultural Emphasis.” The American Statistician 72 (4): 359–67. https://doi.org/10.1080/00031305.2017.1307140.