2.3 Lab Activity
2.3.1 Part 1: Understanding R Documentation
Answer the following questions using the documentation for mean():
- What package is mean() in?
- What named arguments does mean() accept?
- Which of the named arguments are required?
- How are these three function calls different?
  - mean(1:10)
  - mean(1:10, trim = 0)
  - mean(1:10, 0, FALSE)
- What classes of object are acceptable values for x?
- How does mean() treat missing values in x?
- What is the class and length of the output of mean(x) if:
  - x is a numeric vector of length 50?
  - x is a logical vector of length 100?
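The three calls in the question above differ only in how their arguments are supplied. As a rough sketch of how R matches arguments, the example below uses scale_shift(), a hypothetical function defined only for this illustration, to compare relying on defaults, naming an argument, and passing arguments by position.

```r
# scale_shift() is a hypothetical function, not part of any package.
scale_shift <- function(x, scale = 1, shift = 0) {
  x * scale + shift
}

by_default  <- scale_shift(1:3)             # scale and shift take their defaults
by_name     <- scale_shift(1:3, scale = 2)  # scale is matched by name
by_position <- scale_shift(1:3, 2, 0)       # scale and shift are matched by position

# Naming an argument and passing it by position give the same result here.
identical(by_name, by_position)

# args() prints a function's argument names and default values.
args(scale_shift)
```

The same kind of comparison can be made for any function once its Usage section shows the order and defaults of its arguments.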
Why are there four different calls to as.data.frame() in the Usage section of its documentation?
Answer the following questions using the documentation for read.csv():
- Why does ?read.csv() open the documentation for read.table()?
- What is the difference between read.table(), read.csv(), and read.csv2()?
2.3.2 Part 2: Data Import
Researchers at the Vancouver and Okanagan campuses of The University of British Columbia have conducted a survey of child sleep. The results of this survey are found in the file child-sleep-data.csv. The table below provides names and descriptions for each variable in the data set.
| Name | Description |
|---|---|
| id | A randomly generated participant ID, unique to each participant. |
| age | The age of the child in the study. |
| campus | The campus from which the participant was recruited (“Point Grey” or “Okanagan”). |
| sleep_location | Where the child sleeps: 1 = co-sleeps with parent(s); 2 = sleeps in own room; 3 = shares a room but does not co-sleep. |
| sleep | The parent’s estimate of the average hours of sleep the child receives in a 24-hour period. |
Download child-sleep-data.csv and then complete the following steps:
2.3.3 Import and Inspect the Data
- Import child-sleep-data.csv into R; assign it a meaningful name.
- Inspect the data.frame using head(), str(), and View().
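A minimal sketch of the import-and-inspect pattern, under stated assumptions: it writes a tiny CSV with made-up rows to a temporary file so that it runs anywhere. The object name toy_sleep and every value in it are hypothetical and are not taken from the survey.

```r
# The rows below are made-up values for illustration, not the survey data.
toy_csv <- tempfile(fileext = ".csv")
writeLines(
  c("id,age,campus,sleep_location,sleep",
    "101,4,Point Grey,1,10.5",
    "102,6,Okanagan,2,NA",
    "103,5,Okanagan,3,9"),
  toy_csv
)

# read.csv() returns a data.frame; give it a descriptive name.
toy_sleep <- read.csv(toy_csv)

head(toy_sleep)  # first rows (up to six)
str(toy_sleep)   # column types and a preview of the values
# View(toy_sleep)  # opens a spreadsheet-style viewer in interactive sessions
```

For the lab itself, the path passed to read.csv() would point to wherever child-sleep-data.csv was saved, for example the working directory.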
2.3.3.1 Convert Categorical Variables to Factors
- Use as.factor() and factor() to convert any categorical variables to factors with the appropriate levels and labels.
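The difference between the two functions can be sketched on a small made-up data frame. The values and the short label strings below are hypothetical; only the 1/2/3 coding follows the variable table above.

```r
# Made-up values for illustration, not the survey data.
toy_sleep <- data.frame(
  campus = c("Point Grey", "Okanagan", "Okanagan"),
  sleep_location = c(1, 2, 3)
)

# as.factor() takes the unique values as levels (sorted for character data).
toy_sleep$campus <- as.factor(toy_sleep$campus)

# factor() lets you pair numeric codes with descriptive labels.
# The label strings are hypothetical short names chosen for this sketch.
toy_sleep$sleep_location <- factor(
  toy_sleep$sleep_location,
  levels = c(1, 2, 3),
  labels = c("co-sleeps", "own room", "shares room")
)

levels(toy_sleep$campus)
levels(toy_sleep$sleep_location)
```

as.factor() is usually enough when the stored values already read well as labels. factor() is the better fit when numeric codes need human-readable names.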
2.3.3.2 Subsetting and Logical Expressions
Use [] to subset the data frame to answer the questions and complete the tasks below. You will also need is.na(), max(), min(), mean(), and logical expressions.
- What campus was the 48th participant recruited at?
- Return the data for participants who are missing responses on the sleep variable.
- What proportion of participants are missing data for the sleep variable?
- On average, how many hours do the children in this sample sleep per day?
- What is the average daily sleep of children in each of the three sleep locations (sleep_location)?
- What is the participant ID of the participant(s) who reported the most sleep? The least sleep?
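The subsetting patterns these tasks rely on can be sketched on a toy data frame. All values below are made up for illustration, so the results say nothing about the real survey. The object name toy_sleep is hypothetical.

```r
# Made-up values for illustration, not the survey data.
toy_sleep <- data.frame(
  id = c(101, 102, 103, 104),
  campus = c("Point Grey", "Okanagan", "Okanagan", "Point Grey"),
  sleep = c(10.5, NA, 9, 12)
)

toy_sleep[2, "campus"]               # one cell: row 2, column "campus"
toy_sleep[is.na(toy_sleep$sleep), ]  # rows where sleep is missing
mean(is.na(toy_sleep$sleep))         # proportion missing (TRUE counts as 1)
mean(toy_sleep$sleep, na.rm = TRUE)  # average, ignoring missing values

# Average within one group: combine a logical expression with [].
mean(toy_sleep$sleep[toy_sleep$campus == "Okanagan"], na.rm = TRUE)

# IDs of the rows holding the largest value; which() drops the NA comparison.
max_sleep <- max(toy_sleep$sleep, na.rm = TRUE)
toy_sleep$id[which(toy_sleep$sleep == max_sleep)]
```

Replacing max() with min() gives the matching pattern for the smallest value.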