Lab 9: Ethics

Why are we here?

So far this course has focused on the mechanics of using RStudio and building Quarto documents. These skills are critical for engaging with the computational aspects of both statistical and data sciences. But we—the SDS faculty—do not regard these computational skills as the only introductory skills that one needs to begin a statistics or data science project. The other necessary skills include the ability to identify ethical considerations and to be aware of practices that support work that is more ethical and just.

Recently the examination of ethics and ethical issues has become a central topic of discussion and area of research in both statistics and data science. In preparation for this lab, you have read several articles about ethical issues and considerations within data science and statistics. We want to stress that these materials were selected to be the beginning of your journey considering and examining ethics within statistics and data science; these articles alone are not sufficient for a complete education in ethics within statistics and data science.

Lab Goals

The purpose of this lab is to discuss the potential large-scale impacts and harms of data science and statistics.

After completing this lab, you should be able to identify examples of algorithmic bias, differential harm, and issues concerning data privacy. This lab is different from the previous ones because we will spend half an hour in a structured group discussion, followed by time to work on the final project rotation step for this week.

Required Reading

Leading up to this week’s lab, you engaged with the following materials about algorithmic bias and data privacy:

Lab instructions

Today's class will begin with a structured discussion. After an introduction to the activity, your group will discuss the following three questions for 15–20 minutes.

For each question, write at least three ideas, questions, or discussion points that came up during your conversation:

Question

What surprised you most about these readings?

Question

What frustrated you most about these readings?

Question

Statistics and data science are often regarded as fields free from bias. Do you agree that this is the case? Why or why not?

Reporting back

Then we will come together as a class for a short conversation before transitioning to individual work on the final project. In preparation for the full class discussion, there is a Google slide deck linked on Moodle, and each group will have a slide to document their conversation. Using the slides you created, we’ll spend another 10–15 minutes reporting back to the class about what we discussed in our groups.

One minute essay

Finally, spend one minute writing down an answer to the following question:

Question

What came up during the wrap-up session that was new to you? In other words, what points were discussed in the wrap-up that were not part of your group's discussion?

Next Steps

Step 1: Complete the reflection on Moodle

Write a reflection on the readings and discussion. Save it locally, and then paste it in the Moodle question. Please write at least 150 words. Submissions that are on topic and at least 150 words will receive credit. This counts towards your Weekly Lab Quizzes token.

Questions to consider:

  • What surprised or frustrated you about the readings?

  • Do you have recommendations for how to mitigate potential bias in the practice of statistics and data science?

  • What questions might you ask of yourself, the people you work with, or your data to encourage responsible use of data?

  • How has your perspective changed through the course of the readings and discussion?

Step 2: Begin Phase 3 of the final project

At this point, you should have:

  1. Completed your Phase 2 data wrangling and data description.
  2. Sent the next person in your rotation schedule all the materials they need to replicate your Phase 2 work. This means you should have sent them your:
    • Project Quarto document (which you started during the previous lab)
    • Data file (if needed)
    • Image file (if needed)
  3. Received all the Phase 2 materials (listed above) that you need from your group partner to begin Phase 3.

If any of the items above have not been completed, they should be attended to immediately. If your group partner does not send you all the materials you need to replicate their Phase 2 work in a timely fashion, please contact your instructor.

The next step of the project focuses on computing and reporting summary statistics. Be sure to read the expectations for Phase 3 carefully, and use any class time remaining to start working on Phase 3.

Step 3: Complete the Data Wrangling Coding Fluency Quiz (if necessary)

If you received a passing grade on your first Data Wrangling coding fluency quiz, congratulations! You have earned your Data Wrangling token, and no further action is necessary.

If you have not earned your Data Wrangling token yet, you have another opportunity to earn it by taking the second Data Wrangling coding fluency quiz on Moodle during this week. Be sure to take this opportunity to earn this token!

Step 4: Optional reading

To prepare for the next lab on referencing images, tables, and figures in your Quarto documents, we recommend reading the Cross References page of the official Quarto documentation.