Lab 8: Formatting in Quarto
Question
Add level 1 headings for each of the following project phases to the Quarto document you just created:
- Phase 2: Data Description
- Phase 3: Summarization
- Phase 4: Data Visualization
- Phase 5: Ethics & Impact
Additionally, add the following level 2 headings underneath the Phase 2 heading:
- Topic & Origin
- Data Wrangling
- Description of Variables
Then, render your Quarto document to be sure that your markdown formatting produces the expected output in your HTML document.
After finishing Question 1, your Quarto document should look like this:
---
title: "SDS 100 Data Analysis Project"
author: "Will Hopper"
format: html
---
# Phase 2: Data Description
## Topic & Origin
## Data Wrangling
## Description of Variables
# Phase 3: Summarization
# Phase 4: Data Visualization
# Phase 5: Ethics & Impact
Question
Reproduce the following sentences, including the formatting:
- In the “Topic & Origin” section, write: The main research question my data can be used to address is: _________
- In the “Data Wrangling” section, write: Data wrangling is the process of transforming “raw” data into the format you need for visualization and/or analysis.
- In the “Description of Variables” section, write: The `dplyr::glimpse()` function can be used to get a quick overview of a data set’s contents.
Then, render your Quarto document to be sure that your markdown formatting produces the expected output in your HTML document.
Individual answers will vary, but after finishing Question 2, the ‘Phase 2: Data Description’ section should look like this:
# Phase 2: Data Description
## Topic & Origin
**The main research question** my data can be used to address is: is there a relationship between a mother's smoking during pregnancy and their infant's birth weight?
## Data Wrangling
*Data wrangling* is the process of transforming "raw" data into the format you need for visualization and/or analysis.
## Description of Variables
The `dplyr::glimpse()` function can be used to get a quick overview of a data set's contents.
Question
In the “Topic & Origin” section, write a sentence that begins “These data were retrieved from _________”, and replace the blank space with a link to a web page where the reader can download your data set.
Then, render your Quarto document to be sure that your markdown formatting produces the expected output in your HTML document.
Individual answers will vary, but after finishing Question 3, the ‘Topic & Origin’ section should look like this:
# Phase 2: Data Description
## Topic & Origin
**The main research question** my data can be used to address is: is there a relationship between a mother's smoking during pregnancy and their infant's birth weight?
These data were retrieved from [whopper's secret stash of data](https://bit.ly/ncbirths)
Question
Add an image in your “Topic & Origin” section that is related to the topic of your data set.
Then, render your Quarto document to be sure that your markdown formatting produces the expected output (an image) in your HTML document.
Individual answers will vary, but after finishing Question 4, the ‘Topic & Origin’ section should look like this:
# Phase 2: Data Description
## Topic & Origin
![Infant being weighed by their doctor](https://www.ox.ac.uk/sites/files/oxford/field/field_image_main/Baby.jpg?w=50)
**The main research question** my data can be used to address is: is there a relationship between a mother's smoking during pregnancy and their infant's birth weight?
These data were retrieved from [whopper's secret stash of data](https://bit.ly/ncbirths)
Question
Include a footnote in the “Topic & Origin” section that includes information on the year(s) the data were collected. If the years are unknown, then indicate this in your footnote.
Then, render your Quarto document to be sure that your markdown formatting produces the expected output (a footnote) in your HTML document.
Individual answers will vary, but after finishing Question 5, the ‘Topic & Origin’ section should look like this:
# Phase 2: Data Description
## Topic & Origin
![Infant being weighed by their doctor](https://www.ox.ac.uk/sites/files/oxford/field/field_image_main/Baby.jpg?w=50)
**The main research question** my data can be used to address is: is there a relationship between a mother's smoking during pregnancy and their infant's birth weight?
These data were retrieved from [whopper's secret stash of data](https://bit.ly/ncbirths)^[The data were collected in the year 2004.]
Question
Edit the header of your final project Quarto document to include the `embed-resources: true` option. Feel free to refer back to the end of Lab 3 to see an example of the precise formatting. Remember, all the little spacing and indentation details matter in the YAML header!
Then, render your Quarto document to be sure that your YAML formatting is correct; if your document won’t render, you need to adjust your YAML formatting.
After finishing Question 6, the YAML header should look like this:
---
title: "SDS 100 Data Analysis Project"
author: "Will Hopper"
format:
  html:
    embed-resources: true
---
Question
Insert a new code chunk in your Quarto document’s “Data Wrangling” section. In this code chunk, write code that will import your project data into R, and store it as an object in your environment.
- If your data is stored in a CSV file, you can load the `tidyverse` package and use the `read_csv()` function.
- If your data is stored in an XLSX file (i.e., an Excel file), you can load the `readxl` package and use the `read_xlsx()` function. You may need to install this package, since none of the previous labs in SDS 100 have asked you to install it.
Use chunk options to make sure that the code is hidden when the document is rendered, and that no warnings or messages are displayed either.
Then, render your Quarto document to be sure that your code and chunk options are formatted correctly.
Individual answers will vary, but after finishing Question 7, the ‘Data Wrangling’ section should look like this:
## Data Wrangling
```{r}
#| echo: false
#| message: false
#| warning: false
ncbirths <- read.csv("https://bit.ly/ncbirths")
```
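The solution above uses base R's `read.csv()`. As a minimal sketch of the `read_csv()` route the question suggests, the code below writes a tiny CSV to a temporary file and reads it back. The file contents and the names `csv_path` and `demo_data` are made up for illustration and are not part of the lab; in your own document you would pass the path or URL of your project data instead.

```r
library(readr)  # part of the tidyverse; provides read_csv()

# Hypothetical stand-in for a project data file: a tiny CSV in a temp location
csv_path <- tempfile(fileext = ".csv")
writeLines(
  c("weeks,weight,smoker",
    "39,7.63,nonsmoker",
    "36,6.88,smoker"),
  csv_path
)

# read_csv() returns a tibble; show_col_types = FALSE silences the
# column-specification message that read_csv() would otherwise print
demo_data <- read_csv(csv_path, show_col_types = FALSE)
```

Inside a Quarto code chunk, the `message: false` chunk option would hide that column-specification message as well, so `show_col_types = FALSE` is optional there.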
Question
Write a bulleted list in the “Description of Variables” section that describes each of the variables in your data set.
Remember that you only need between 4 and 10 variables in your data set. If your data set has more than 10 variables, be sure to use the `dplyr::select()` function in the “Data Wrangling” section to choose at most 10 variables.
Individual answers will vary, but after finishing Question 8, the ‘Description of Variables’ section should look like this:
## Description of Variables
- `fage`: Father's age in years.
- `mage`: Mother's age in years.
- `mature`: Maturity status of mother, based on age.
- `weeks`: Duration of pregnancy in weeks.
- `premie`: Whether the birth was classified as premature (premie) or full-term.
- `visits`: Number of doctor's visits during pregnancy.
- `gained`: Weight gained by mother during pregnancy in pounds.
- `weight`: Weight of the baby at birth in pounds.
- `sex`: Sex of the baby, female or male.
- `smoker`: Status of the mother as a non-smoker or a smoker.
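If your data set has more variables than you need, the `dplyr::select()` step mentioned in this question might look like the sketch below. The data frame `births` and its columns are invented for illustration; substitute the object you created in the “Data Wrangling” section and the variable names from your own data.

```r
library(dplyr)

# Hypothetical data frame standing in for an imported data set
births <- data.frame(
  id     = 1:3,
  weeks  = c(39, 36, 41),
  weight = c(7.63, 6.88, 8.12),
  smoker = c("nonsmoker", "smoker", "nonsmoker")
)

# Keep only the variables the project will describe and analyze
births_small <- select(births, weeks, weight, smoker)

# Print a compact overview: one line per variable, with its type
# and the first few values
glimpse(births_small)
```

Running `glimpse()` after `select()` is a quick way to confirm that the variables you describe in the bulleted list are the ones that actually remain in the data set.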