Document Type

Article

Comments

To cite this article: Daniel Kaplan (2018) Teaching Stats for Data Science, The American Statistician, 72:1, 89-96, DOI: 10.1080/00031305.2017.1398107

This is an Open Access article distributed under the terms of the Creative Commons Attribution-NonCommercial-NoDerivatives License (https://creativecommons.org/licenses/by-nc-nd/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited, and is not altered, transformed, or built upon in any way.

Abstract

“Data science” is a useful catchword for methods and concepts original to the field of statistics, but typically being applied to large, multivariate, observational records. Such datasets call for techniques not often part of an introduction to statistics: modeling, consideration of covariates, sophisticated visualization, and causal reasoning. This article re-imagines introductory statistics as an introduction to data science and proposes a sequence of 10 blocks that together compose a suitable course for extracting information from contemporary data. Recent extensions to the mosaic packages for R together with tools from the “tidyverse” provide a concise and readable notation for wrangling, visualization, model-building, and model interpretation: the fundamental computational tasks of data science.

Recommended Citation

Kaplan, Daniel, "Teaching Stats for Data Science" (2017). Faculty Publications. 7.
https://digitalcommons.macalester.edu/mathfacpub/7

Included in

Educational Methods Commons, Numerical Analysis and Scientific Computing Commons
