Title: | Teaching Data for Statistics and Data Science |
---|---|
Description: | Provides data sets for teaching statistics and data science courses. It includes a sample of data from John Edmund Kerrich's famous coinflip experiment. These are data that I used for teaching SOC 4015 / SOC 5050 at Saint Louis University (SLU). The package also contains an R Markdown template with the required formatting for assignments in my courses SOC 4015, SOC 4650, SOC 5050, and SOC 5650 at SLU. |
Authors: | Christopher Prener [aut, cre] |
Maintainer: | Christopher Prener <[email protected]> |
License: | GPL-3 |
Version: | 0.5.2 |
Built: | 2024-11-21 06:21:33 UTC |
Source: | https://github.com/chris-prener/testdriver |
A data set containing model year 2017 vehicles for sale in the United States.
data(auto17)
A data frame with 1216 rows and 21 variables:
DOT vehicle ID number
vehicle manufacturer
vehicle brand
vehicle name
vehicle type, numeric
vehicle type, string
fuel economy, city
fuel economy, highway
fuel economy, combined
poor fuel economy
fuel, abbrev.
fuel, full
estimated fuel cost
engine displacement
transmission, full
transmission, abbrev.
number of gears
number of cylinders
air aspiration method
vehicle drive type, abbrev.
vehicle drive type, full
https://www.fueleconomy.gov/feg/download.shtml
str(auto17)
head(auto17)
A data set containing time series data by country for estimated under-5, infant, and neonatal mortality rates.
data(childMortality)
A data frame with 28982 rows and 6 variables:
two-letter country code
full name of country
name of continent
type of mortality rate - infant_MR, child_MR, or under5_MR
year of estimate
estimated mortality rate
https://childmortality.org
str(childMortality)
A data set containing data on work, salary, and education from the 2014 General Social Survey. Missing data are explicitly identified with NAs and all data are represented as factors when appropriate.
data(gss14)
A data frame with 2538 rows and 19 variables:
GSS year for this respondent
Total family income (2006 version)
Rs family income when 16 yrs old
Region of residence, age 16
Race of respondent
Respondents sex
Spouses highest degree
Mothers highest degree
Fathers highest degree
Rs highest degree
Number of children
Spouse self-emp. or works for somebody
Number of hrs spouse worked last week
Marital status
R self-emp or works for somebody
Number of hours worked last week
Labor force status
Respondent id number
Ballot used for interview
https://gssdataexplorer.norc.org
str(gss14)
head(gss14)
A data set containing data on work, salary, and education from the 2014 General Social Survey. Unlike gss14, missing data are not explicitly declared as NA, and all variables are stored as numeric codes rather than as factors.
data(gss14_simple)
A data frame with 2538 rows and 19 variables:
GSS year for this respondent
Total family income (2006 version)
Rs family income when 16 yrs old
Region of residence, age 16
Race of respondent
Respondents sex
Spouses highest degree
Mothers highest degree
Fathers highest degree
Rs highest degree
Number of children
Spouse self-emp. or works for somebody
Number of hrs spouse worked last week
Marital status
R self-emp or works for somebody
Number of hours worked last week
Labor force status
Respondent id number
Ballot used for interview
https://gssdataexplorer.norc.org
str(gss14_simple)
head(gss14_simple)
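The difference between the two GSS data sets comes down to two cleaning steps: declaring missing-data codes as NA and converting numeric codes to labelled factors. The following is a minimal sketch of those steps on a made-up vector, not the code used to build gss14. The numeric codes, the missing-data code of 9, and the labels are all assumptions for illustration; consult the GSS codebook for the actual values of each variable.

```r
# Hypothetical numeric codes for a marital-status item. The codes and
# the missing-data value (9) are assumptions, not taken from the package.
marital_raw <- c(1, 5, 2, 9, 3, 1, 9)

# Step 1: declare the assumed missing-data code as NA.
marital_num <- replace(marital_raw, marital_raw == 9, NA)

# Step 2: convert the remaining numeric codes to a labelled factor.
marital <- factor(
  marital_num,
  levels = 1:5,
  labels = c("married", "widowed", "divorced", "separated", "never married")
)

# Tabulate the result, showing NA as its own category.
table(marital, useNA = "ifany")
```

Working through these steps by hand on gss14_simple and comparing the result against gss14 is one way to practice data cleaning.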
A data set containing 2,000 coin-flip trials from the experiments that statistician John Edmund Kerrich conducted in the 1940s while imprisoned by the Nazis during World War Two.
data(kerrich)
A data frame with 2000 rows and 3 variables:
trial
outcome of each trial; TRUE = heads, FALSE = tails
cumulative mean of outcomes
https://stats.stackexchange.com/questions/76663/john-kerrich-coin-flip-data/77044#77044
https://books.google.com/books/about/An_experimental_introduction_to_the_theo.html?id=JBTvAAAAMAAJ&hl=en
https://en.wikipedia.org/wiki/John_Edmund_Kerrich
str(kerrich)

if (require("ggplot2")) {
  ggplot(data = kerrich) +
    geom_hline(mapping = aes(yintercept = .5, color = "p(heads)")) +
    geom_line(mapping = aes(x = id, y = average)) +
    ylim(0,1)
}
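The cumulative mean column is the running proportion of heads after each trial. The sketch below shows one way such a column could be computed. It uses simulated fair-coin flips rather than Kerrich's recorded outcomes, and the column name outcome is an assumption (id and average are taken from the plotting example above).

```r
# Simulated fair-coin flips; these are NOT Kerrich's recorded outcomes.
set.seed(42)
n_trials <- 2000

flips <- data.frame(
  id = seq_len(n_trials),  # trial number
  outcome = sample(c(TRUE, FALSE), n_trials, replace = TRUE)  # TRUE = heads
)

# Cumulative mean: running count of heads divided by the number of
# trials completed so far (logical TRUE counts as 1 in cumsum()).
flips$average <- cumsum(flips$outcome) / flips$id

# The last value of the cumulative mean equals the overall share of heads.
tail(flips, 3)
```

Plotting average against id for a data frame like this produces the same kind of convergence plot as the kerrich example.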
The goal of testDriveR is to provide data sets for teaching statistics and data science courses. This package includes a sample of data from John Edmund Kerrich's famous coinflip experiment. These are data that I use for teaching SOC 4015 / SOC 5050 at Saint Louis University.
There are currently five data sets that are included in the package:
auto17 - A data set containing model year 2017 vehicles for sale in the United States
childMortality - A data set containing childhood mortality time series data by country from UNICEF
gss14 - A data set containing a selection of variables related to work and education from the 2014 General Social Survey
gss14_simple - A simple version of gss14 without factors created and without missing data explicitly declared
kerrich - A data set containing 2000 trials of coin flips by John Edmund Kerrich