Lab 1: Getting Started

Why are we here?

The practice of science is suffering through a reproducibility crisis. The scientific community is realizing that many “findings”, even many that are published in reputable journals after careful peer review, do not stand up to the scrutiny of replication. The reasons behind this are complex and wide-ranging.

Whatever your interests are, we want your work to be rigorous and replicable. This means that the data you analyze, the models you fit, and the tables and figures you produce all need to be reproducible.

One of the ways that researchers can increase transparency and reproducibility is to use software that removes manual steps from the “data science pipeline”. For example, instead of creating a table of summaries in Excel and then copying and pasting that table into a Word document for reporting, a researcher could use software that creates the table of summaries and embeds it directly into a report.
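To make the idea concrete, here is a minimal sketch of a report whose summary table is computed by code rather than copied by hand. It is written in Python using only the standard library, purely for illustration. The function names (summary_table, build_report) and the measurements are made up for this example and are not part of the lab.

```python
from statistics import mean, median


def summary_table(groups):
    """Return a Markdown table with n, mean, and median for each group."""
    lines = [
        "| group | n | mean | median |",
        "|-------|---|------|--------|",
    ]
    for name, values in groups.items():
        lines.append(
            f"| {name} | {len(values)} | {mean(values):.2f} | {median(values):.2f} |"
        )
    return "\n".join(lines)


def build_report(groups):
    """Embed the computed table in a short Markdown report."""
    return "# Summary report\n\n" + summary_table(groups) + "\n"


# Hypothetical measurements, invented for illustration only.
measurements = {
    "control": [4.1, 3.9, 4.4, 4.0],
    "treatment": [5.2, 4.8, 5.5, 5.1],
}

report = build_report(measurements)
print(report)
```

Because the table is regenerated every time the script runs, changing the data changes the report automatically, with no copy/paste step where a transcription error could creep in.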

Quarto is one such piece of software. Quarto is open-source, cross-platform document publishing software produced by Posit (née RStudio) that allows you to combine the data processing capabilities of programming languages like R, Python, or Julia with versatile document authoring tools like Markdown, Pandoc, and LaTeX.

Figure 1: R is just one of the computing languages that can be used in a Quarto document.

Note

If you are already familiar with R Markdown, Quarto is like R Markdown 2.0, and is made by the same people.

In this class, you will learn how to import, wrangle, and visualize data within Quarto documents. You’ll use Quarto documents to create production-quality reports in HTML and PDF formats for use in your companion SDS class(es).

Lab Goals

The purpose of this lab is to ensure that you have installed and configured all the software you’ll need to create a productive working environment for reproducible scientific computing.

After completing this lab, you should have installed R and RStudio (which comes bundled with Quarto), and you should be able to render this Quarto document!

You will know that you are done when you click on the “Render” button in RStudio and your results show six check marks, just like the example solutions.

Lab instructions

The steps needed to set up a reproducible scientific computing environment depend on the type of computing device you have. Find your computing device in the list below, and click on the link to continue the lab:

If you’re not sure what kind of computing device you have, please ask your instructor for help. If you do not have regular access to a MacBook, Windows PC, or Chromebook, please consult with your instructor. You are encouraged to request an equipment loan from Smith College IT services.