2.3 Lab Activity
2.3.1 Part 1: Understanding R Documentation
Answer the following questions using the documentation for mean():
- What package is mean() in?
- What named arguments does mean() accept?
- Which of the named arguments are required?
- How are these three function calls different?
    mean(1:10)
    mean(1:10, trim = 0)
    mean(1:10, 0, FALSE)
- What classes of object are acceptable values for x?
- How does mean() treat missing values in x?
- What is the class and length of the output of mean(x) if:
  - x is a numeric vector of length 50?
  - x is a logical vector of length 100?
Why are there four different calls to as.data.frame() in the Usage section of its documentation?
Answer the following questions using the documentation for read.csv():
- Why does ?read.csv() open the documentation for read.table()?
- What is the difference between read.table(), read.csv(), and read.csv2()?
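The questions in this part turn on two skills: opening a help page, and seeing how R matches the values you supply to a function's arguments. The sketch below illustrates both with round() rather than the functions asked about, so it is a pattern to adapt, not an answer key. The object names (by_name, by_position, default_digits) are illustrative only.

```r
# Ways to open a help page. These display documentation in an
# interactive session, so they are left commented out here:
# ?round
# help("round")

# args() prints a function's arguments and any default values.
args(round)

# Arguments can be matched by name or by position.
by_name        <- round(x = 3.14159, digits = 2)
by_position    <- round(3.14159, 2)
default_digits <- round(3.14159)   # digits is omitted, so its default is used

identical(by_name, by_position)    # TRUE
default_digits                     # 3

# class() and length() describe what a call returns.
class(by_name)                     # "numeric"
length(by_name)                    # 1
```

Running args(), class(), and length() on the calls in the questions is one way to check an answer drawn from the documentation.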
2.3.2 Part 2: Data Import
Researchers at the Vancouver and Okanagan campuses of The University of British Columbia have conducted a survey of child sleep. The results of this survey are found in the file child-sleep-data.csv. The table below provides names and descriptions for each variable in the data set.
Name | Description |
---|---|
id | A randomly generated participant ID, unique to each participant. |
age | The age of the child in the study. |
campus | The campus from which the participant was recruited (“Point Grey” or “Okanagan”). |
sleep_location | Where does the child sleep? 1 = co-sleeps with parent(s); 2 = sleeps in own room; 3 = shares a room but does not co-sleep. |
sleep | The parent’s estimate of the average hours of sleep the child receives in a 24-hour period. |
Download child-sleep-data.csv and then complete the following steps:
2.3.3 Import and Inspect the Data
- Import child-sleep-data.csv into R; assign it a meaningful name.
- Inspect the data.frame using head(), str(), and View().
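As a minimal sketch of the import-and-inspect pattern, the example below writes a small made-up CSV to a temporary file and reads it back. The file contents, the column names (id, group, score), and the object names (toy_path, toy_scores) are hypothetical and are not taken from the survey data.

```r
# Hypothetical toy data, NOT the survey: write a small CSV to a
# temporary file so the example is self-contained.
toy_path <- tempfile(fileext = ".csv")
writeLines(c("id,group,score",
             "a1,1,10.5",
             "a2,2,NA",
             "a3,1,8"),
           toy_path)

# read.csv() returns a data frame; assign it a descriptive name.
toy_scores <- read.csv(toy_path)

head(toy_scores)   # the first rows (up to six)
str(toy_scores)    # each column's type and a preview of its values

# View() opens a spreadsheet-style viewer, so call it only interactively.
if (interactive()) View(toy_scores)
```

For the lab itself, the same pattern should apply with the path to the downloaded file in place of toy_path, assuming the path is correct relative to the working directory.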
2.3.3.1 Convert Categorical Variables to Factors
- Use as.factor() and factor() to convert any categorical variables to factors with the appropriate levels and labels.
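The difference between the two functions is easiest to see on a small example. The sketch below uses made-up variables (group_codes, site) with hypothetical codes and labels, so the levels and labels shown are not the ones the survey data needs.

```r
# Hypothetical coded variable: 1 = "control", 2 = "treatment".
group_codes <- c(1, 2, 2, 1, 2)

# factor() lets you state which codes exist (levels) and what to
# call them (labels), in matching order.
group <- factor(group_codes,
                levels = c(1, 2),
                labels = c("control", "treatment"))
levels(group)   # "control" "treatment"
table(group)    # counts per level

# as.factor() suits a variable already stored as text; its levels
# are the unique values in sorted order.
site <- as.factor(c("north", "south", "north"))
levels(site)    # "north" "south"
```

A numerically coded variable generally calls for factor() with explicit levels and labels, while a variable that already holds readable text can often go through as.factor() unchanged.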
2.3.3.2 Subsetting and Logical Expressions
Use [] to subset the data frame to answer the following questions and complete the tasks described below. You will also need to use is.na(), max(), min(), mean(), and logical expressions.
- What campus was the 48th participant recruited at?
- Return the data for participants who are missing responses on the sleep variable.
- What proportion of participants are missing data for the sleep variable?
- On average, how many hours do the children in this sample sleep per day?
- What is the average daily sleep of children in each of the three sleep locations (sleep_location)?
- What is the participant ID of the participant(s) who reported the most sleep? The least sleep?
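The tasks above combine [] with logical expressions. The sketch below shows the general patterns on a made-up data frame. The object toy and its columns (id, group, score) are hypothetical, so the values printed say nothing about the survey.

```r
# Hypothetical toy data, not the survey.
toy <- data.frame(
  id    = c("a1", "a2", "a3", "a4"),
  group = factor(c("x", "y", "x", "y")),
  score = c(10, NA, 8, 12)
)

toy[3, "group"]                   # one cell: row 3, column "group"
toy[is.na(toy$score), ]           # rows with a missing score
mean(is.na(toy$score))            # proportion missing: 0.25
mean(toy$score, na.rm = TRUE)     # mean of the non-missing scores: 10

# Mean within one group: a logical expression selects the rows.
mean(toy[toy$group == "x", "score"], na.rm = TRUE)   # 9

# IDs of the rows holding the largest and smallest scores.
# !is.na() keeps the missing score out of the comparison.
has_score <- !is.na(toy$score)
toy[has_score & toy$score == max(toy$score, na.rm = TRUE), "id"]   # "a4"
toy[has_score & toy$score == min(toy$score, na.rm = TRUE), "id"]   # "a3"
```

Taking the mean of a logical vector gives a proportion because TRUE counts as 1 and FALSE as 0.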