Lab 1: Getting Started

Lab instructions for computers

These instructions apply to Apple MacBook computers only. If you are not using an Apple MacBook, please return to the previous page and choose the correct computing device from the list.

Installing new software

All of your day-to-day work in this class will be done inside a program called RStudio, which is an application that makes it easier to use the R programming language. To use this program, you’ll need to download and install both R and RStudio on your computer. Follow the instructions below to install them, and then verify that your installation is set up correctly.

Step 1: Install R

R is a programming language and computing environment specialized for statistical analysis and data manipulation. It’s commonly used for performing statistical tests, creating data visualizations, and writing data analysis reports. Despite focusing on statistics, it’s a full-fledged programming language, and relatively easy to learn.

To figure out which version of R you need for your computer, click on the small apple icon in the top-left corner of your screen and choose “About This Mac” from the menu.

Look at the “Processor” or “Chip” line in the pop-up, and take note of whether your computer uses an Intel processor or an Apple M processor.

Example of an Apple M processor

Example of an Intel processor

Then, go to https://cloud.r-project.org/bin/macosx/

  • If your computer uses an Intel processor, click on the file titled R-4.4.1-x86_64.pkg
  • If your computer uses an Apple M processor, click on the file titled R-4.4.1-arm64.pkg

This will download the R installer to your computer’s download folder. Go to your download folder, and double-click on the R-4.4.1 installer you just downloaded. Follow the prompts on the screen to finish installing R. You can safely accept all the default settings without changing anything.

Note

If you want to see more visually detailed instructions for how to install R, you can watch this video tutorial.

Note that some of the software version numbers might differ (e.g., R 4.1 instead of 4.4.1), but the steps you should follow will be exactly the same.

What if I cannot install R on my computer?

Sometimes, R cannot be installed on someone’s computer, for example, if their operating system is out of date, or they do not have enough space on their computer’s hard drive to fit the program.

If R cannot be installed on your computer, please notify your instructor. The best solution in the short term is to use the RStudio Server instead of installing R on your computer. Click here to follow the instructions for setting up your account on the RStudio Server.

Step 2: Install RStudio

RStudio is an integrated development environment for reproducible scientific computing that caters to the R programming language. You will use it extensively in all of your SDS classes.

Instructions

  1. Download the latest, free version of RStudio Desktop. Be sure to get the version that is appropriate for your operating system.
  2. Install RStudio Desktop by launching the installer after it downloads. You can accept all the defaults during installation.

After launching the RStudio installer, make sure to drag the RStudio icon into your applications folder!

Note

If you want to see more visually detailed instructions for how to install RStudio, you can watch this video tutorial.

Note that some of the software version numbers might differ, but the steps you should follow will be exactly the same.

Verification

Next, let’s open RStudio to verify that R and RStudio have been installed correctly. Here is what you should do to open RStudio:

  1. Open a new Finder window.
  2. Navigate to the Applications folder (which should be in the list of locations along the left pane).
  3. Scroll down until you find the RStudio icon, and double-click on it. Be sure to click on the icon for RStudio, not the icon for R.

If you can’t locate the icon this way, you can search for the word RStudio in the Spotlight search bar.

When RStudio opens, you should see a window like this:

Note

You may also see a pop-up window that looks like this when you open RStudio:

You can safely choose “Not Now” (we will not use the git command in this class), but the pop-up will continue to appear in the future.

You can also choose “Install” (which will stop this pop-up message from appearing in the future), but the installation process can be very lengthy (likely between 5 and 30 minutes, depending on internet speeds).

What if I cannot install RStudio on my computer?

Sometimes, RStudio cannot be installed on someone’s computer, for example, if their operating system is out of date, or they do not have enough space on their computer’s hard drive to fit the program.

If RStudio cannot be installed on your computer, please notify your instructor. The best solution in the short term is to use the RStudio Server instead of installing RStudio on your computer. Click here to follow the instructions for setting up your account on the RStudio Server.

The RStudio application window is divided into four “panes”, which are labeled and color-coded in the diagram below:

  1. The Console pane
  2. The Editor pane (not shown)
  3. The Environment pane
  4. The “Miscellaneous” pane

Question

Copy and paste the code below into the console pane, and press the Enter key to run the code. If your installation of R is working correctly, you should see the R version printed out, and a message indicating that your version of R is up to date (like the message seen below).

r_version_info <- R.Version()
# getRversion() compares the full version number piece by piece,
# so versions such as 4.4.0 and 5.0.0 are also recognized as up to date
if (getRversion() >= "4.4.0") {
  msg <- "✔ Your version of R is up to date!"
} else {
  msg <- "✖ Your version of R is out of date; you should update to version 4.4 now."
}
cat(r_version_info$version.string, msg, sep = "\n")
R version 4.4.2 (2024-10-31)
✔ Your version of R is up to date!
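
If you are curious why a version check should not treat version numbers as ordinary decimals, here is a minimal sketch using base R's numeric_version(). The version number 4.10 is a made-up example chosen only to show the difference; it is not a release this lab refers to.

```r
# Treated as decimals, "4.10" becomes 4.1, which looks older than 4.4
decimal_says_newer <- as.numeric("4.10") > as.numeric("4.4")
decimal_says_newer   # FALSE

# numeric_version() compares the major, minor, and patch parts in turn,
# so 4.10.0 is correctly ranked as newer than 4.4.2
version_says_newer <- numeric_version("4.10.0") > numeric_version("4.4.2")
version_says_newer   # TRUE

# getRversion() returns the version of R you are running in the same form
getRversion()
```
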

Installing packages

Like many modern programming languages, R is modular, meaning that it relies on packages to provide additional functionality. There are thousands of R packages hosted on CRAN, and many more hosted on GitHub. In this course, we will focus on a handful of popular, well-crafted, useful packages.

In RStudio, the Packages tab displays a searchable list of packages that are installed on your computer.

You should get comfortable checking which packages you have installed, and installing new packages. You only have to install a package once.
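
You can also check your installed packages from the console instead of the Packages tab. The sketch below uses only base R functions; the package name "stats" is used as the example because it ships with every copy of R.

```r
# Build a small table of every installed package and its version
installed <- installed.packages()
pkgs <- data.frame(
  Package = unname(installed[, "Package"]),
  Version = unname(installed[, "Version"])
)
head(pkgs)

# Check whether one specific package is installed
stats_installed <- "stats" %in% pkgs$Package
stats_installed

# Look up the installed version of a single package
packageVersion("stats")
```
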

Install rstudio.prefs and customize your RStudio

RStudio has hundreds of options that can be configured so that each user can customize its behavior to their preferences. Unfortunately, several of its configuration options have default settings that make it more difficult to conduct reproducible data analyses. These settings can be changed using the Global Options GUI, but hunting down each setting and changing them one at a time is slow and tedious.

Instead, we’ll use the rstudio.prefs package, which will allow us to change all the settings at once by executing a few R commands in the console. But first, we need to install the rstudio.prefs package!

Instructions

  1. Install the rstudio.prefs package.

    Open the “Packages” tab in the bottom-right pane of RStudio, and click the “Install” button along the top menu bar (see Figure 1).

    In the pop-up window, type rstudio.prefs into the text box, and press the Install button to start the package installation (see Figure 2).

Figure 1
Figure 2
  2. Configure RStudio

    Copy and paste the code below into the console, and press the Enter key to run the code. This code will change several of RStudio’s default configuration options to make the program more user-friendly.

    Before changing your settings, R will print out your pending changes, and ask if you want to continue. You can indicate “Yes” by typing a y into the console, and pressing the Enter key.

    library(rstudio.prefs)
    
    use_rstudio_prefs(
      save_workspace = "never",
      load_workspace = FALSE,
      restore_last_project = FALSE,
      restore_source_documents = FALSE,
      check_for_updates = FALSE,
      color_preview = FALSE,
      rmd_viewer_type = "pane",
      rmd_chunk_output_inline = FALSE
    )

    If you get an error message in your console saying Error in library(rstudio.prefs) : there is no package called ‘rstudio.prefs’, return to Step 1 in this list and make sure you have finished installing the rstudio.prefs package.

Verification

Question

Copy and paste the R code below into the console and press the Enter key to run the code. If you have configured RStudio correctly by following the instructions above, you should see a message that says “✔ RStudio is correctly configured!”.

rstudio_config <- jsonlite::fromJSON(paste0(rstudio.prefs::rstudio_config_path(),
                                            "/rstudio-prefs.json")
                                     )
options_set <- c(
  rstudio_config$save_workspace == "never",
  rstudio_config$load_workspace == FALSE,
  rstudio_config$restore_last_project == FALSE,
  rstudio_config$restore_source_documents == FALSE,
  rstudio_config$check_for_updates == FALSE,
  rstudio_config$color_preview == FALSE,
  rstudio_config$rmd_viewer_type == "pane",
  rstudio_config$rmd_chunk_output_inline == FALSE
  )

if (all(options_set)) {
  cli::cli_alert_success("RStudio is correctly configured!")
} else {
  cli::cli_alert_danger("RStudio settings have not been correctly configured.")
}

Install tidyverse and usethis

The tidyverse is a meta-package that installs eight other commonly-used packages. The tidyverse is developed by RStudio and has become a popular way to use R (Wickham et al. 2019). In this course, we will be using the tidyverse extensively.

usethis is a package that helps set up projects and other packages. We’ll only be using it in this lab to verify our setup.

Instructions

Install the tidyverse and usethis packages by following the same steps you followed above to install the rstudio.prefs package.
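
If you prefer the console to the Packages tab, packages can also be installed with the base R function install.packages(). The sketch below wraps it in a helper named install_if_missing, which is a hypothetical name made up for this example (it is not part of the lab materials). It installs only the packages you do not already have, since a package only needs to be installed once.

```r
# Hypothetical helper (not part of the lab materials): install only the
# packages in `pkgs` that are missing, and return the names it installed.
install_if_missing <- function(pkgs) {
  installed <- rownames(installed.packages())
  to_install <- setdiff(pkgs, installed)
  if (length(to_install) > 0) {
    install.packages(to_install)  # downloads from CRAN; needs internet
  }
  invisible(to_install)
}

# Usage for this step (remove the leading # to run it):
# install_if_missing(c("tidyverse", "usethis"))
```

Running the helper a second time should do nothing, because both packages will already be in your library.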

Verification

Question

Copy and paste the R code below into the console and press the Enter key to run the code. If an appropriate version of the tidyverse package is properly installed, you should see two messages: “✔ tidyverse is installed and relatively up-to-date” and “✔ usethis is installed.”

has_tidyverse <- suppressPackageStartupMessages(require(tidyverse))
has_usethis <- suppressPackageStartupMessages(require(usethis))

if (has_tidyverse) {
  # packageVersion() compares version numbers piece by piece, which is
  # safer than comparing a version string against a plain number
  if (packageVersion("tidyverse") >= "2.0.0") {
    cli::cli_alert_success("tidyverse is installed and relatively up-to-date.")
  } else {
    cli::cli_alert_danger("tidyverse is installed but it is not up-to-date. Please update your packages.")
  }
} else {
  # cli is used for failures because usethis::ui_oops() is not
  # available when usethis itself could not be loaded
  cli::cli_alert_danger("tidyverse could not be loaded.")
}
if (has_usethis) {
  cli::cli_alert_success("usethis is installed.")
} else {
  cli::cli_alert_danger("usethis could not be loaded.")
}

Create a project environment for SDS 100

Now that you have R and RStudio installed, we are going to set up your working environment to maximize your productivity.

The work that you do in RStudio should be organized into Projects. Working within projects allows you to switch contexts safely, and keep your work organized. We recommend that you have at least two projects:

  • one project for this class (SDS 100)
  • one project for the companion SDS class you are taking

You can switch between Projects in RStudio at any time using the Projects dropdown menu in the upper-right corner of your screen.

Instructions

Follow these instructions to create a new project in RStudio named “SDS 100” inside your Documents folder.

  1. Click the button labelled “Project:” in the top right of RStudio.

    Step 1
    Figure 3
  2. Select “New Project” from the dropdown menu. Then, click “New Directory” in the pop-up window that opens.

    Step 2

    Step 3
    Figure 4
  3. Click “New Project.” A window will appear with two fields. In the first, “Directory name:”, put SDS 100. For the second, click the button to the right that says “Browse.”

    Step 4

    Step 5
    Figure 5
  4. In the pop-up window that appears, navigate to your Documents folder, then select it and click “Open.” This will fill out the second field in the pop-up window. Ensure that both fields are correct, then click “Create Project.”

    Step 6

    Step 7
    Figure 6
Warning

We strongly recommend that you not place your Project in any of the following places:

  • Your Downloads folder
  • Your Desktop
  • A temporary (i.e., “temp”) folder

We also recommend that you not place your Project in a cloud-based storage location (e.g., your OneDrive folder or your iCloud folder).
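
As a rough illustration of the warning above, the sketch below flags a folder path that passes through one of the discouraged locations. The helper name is_discouraged_location, the example paths, and the exact folder names it looks for (especially the cloud storage ones) are assumptions made for this example, so treat it as a starting point rather than a complete check.

```r
# Hypothetical helper (not part of the lab materials): TRUE when `path`
# passes through a folder that the warning above advises against.
is_discouraged_location <- function(path) {
  # Split the path into its individual folder names
  parts <- strsplit(path, "/", fixed = TRUE)[[1]]
  # Folder names to flag; the cloud folder names are assumptions
  pattern <- "^(downloads|desktop|te?mp|onedrive.*|icloud.*|cloudstorage|mobile documents)$"
  any(grepl(pattern, tolower(parts)))
}

is_discouraged_location("/Users/student/Documents/SDS 100")  # FALSE
is_discouraged_location("/Users/student/Downloads/SDS 100")  # TRUE
```

Once your project is open, you could run is_discouraged_location(getwd()) to check the folder you are currently working in.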

Verification

Question

Copy and paste the code below into your R console, and press the Enter key to run the code. If your project is set up correctly, you should see a message saying “Your project is in a good place” (like the one shown in the image below).

project <- tryCatch(
  usethis:::proj_path(),
  error = function(e) {
    if (inherits(e, "usethis_error")) {
      FALSE
    } else {
      stop(e) # rethrow package not found error!
    }
  }
)

if (!isFALSE(project)) {
  cli::cli_alert_success("Project found at: {.path {project}}. Your project is in a good place.")
} else {
  cli::cli_alert_danger(
    "No project environment detected. Make sure you:\n
     1. Create and open an R project for SDS 100.
     2. Move your Quarto file into your project folder before rendering it
    "
  )
}
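
If the check above fails and you want to see what is going on, it can help to look for the project file yourself. When RStudio creates a project, it places a file ending in .Rproj inside the project folder. The sketch below looks for such a file using base R only. The helper name has_rproj_file is made up for this example, and this is a simpler test than the one usethis performs.

```r
# Hypothetical helper (not part of the lab materials): TRUE when `dir`
# contains an RStudio project file (a file whose name ends in .Rproj).
has_rproj_file <- function(dir = getwd()) {
  length(list.files(dir, pattern = "\\.Rproj$")) > 0
}

# Show the folder R is currently working in, then check it
getwd()
has_rproj_file()
```

If the result is FALSE, you are probably not working inside your SDS 100 project; open it from the Projects dropdown menu and try again.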

Your First Quarto Document

Quarto is a software program that can be used inside of RStudio which allows you to write narrative text and R code together within the same document. A Quarto document is a bit like a Microsoft Word file or Google Doc that has an R console built right into it. This allows you to create a “final product” from all your data analysis work that contains all your R code and its output (like tables and figures) and all your written explanations. You can then share this final product with other people, so they can see exactly what you did, and understand what your results mean.

Now, it’s time for you to open and render your very first Quarto document! You’ll be using Quarto documents to complete your lab work throughout this course, so we’ll start getting used to the workflow of rendering and turning in your final products.

Instructions

  1. Download this Quarto file, but don’t open it yet.
  2. Open your computer’s file explorer, and navigate to your Downloads folder. Locate the lab_01_setup.qmd file you downloaded, and move it from your Downloads folder to your SDS 100 project folder.
  3. Double-click on the lab_01_setup.qmd file to open it in RStudio.
  4. Click on the Render button near the top middle of the editor pane.
  5. Inspect the output in the Viewer pane. How many check marks do you see? You should see six check marks.

Finding your rendered document

Open a new window in your computer’s file explorer program (Explorer if you’re on Windows, or Finder if you’re on a Mac), and navigate to the folder where you saved your SDS 100 R Project.

In this folder you should see a file named lab_01_setup.html. This is the output from the Quarto document you just rendered. If you double-click this file, it should open in your web browser, and you should see the same thing you saw in the RStudio Viewer pane.
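
You can also confirm from the console that the rendered file exists. The sketch below builds the expected output name from the name of the Quarto file, assuming the document was rendered to HTML as described above, and then checks whether that file is in your current working folder (which should be your SDS 100 project folder).

```r
# Name of the Quarto file from this lab
qmd_file <- "lab_01_setup.qmd"

# Swap the .qmd extension for .html to get the rendered file's name
html_file <- paste0(tools::file_path_sans_ext(qmd_file), ".html")
html_file   # "lab_01_setup.html"

# TRUE if the rendered file is in the current working folder
file.exists(html_file)
```
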

Next Steps

Step 1: Compare solutions

Each week, you will compare your solutions to ours. Does your lab_01_setup.html have the same outputs (checkmarks/messages) as the example solutions? The styling in the solutions file may look a bit different than yours, but the output should be the same.

Step 2: Complete the Moodle Quiz

Complete the Moodle quiz for this lab (Weekly Quiz 1). You can complete the quiz any time before 8:00 am on the day of the next course meeting.

Optional reading

To prepare for next week’s lab on working in R, we recommend reading Chapter 2: R basics and workflows.

References

Wickham, Hadley, Mara Averick, Jennifer Bryan, Winston Chang, Lucy D’Agostino McGowan, Romain François, Garrett Grolemund, et al. 2019. “Welcome to the Tidyverse.” Journal of Open Source Software 4 (43): 1686. https://doi.org/10.21105/joss.01686.