OpenIntro Statistics

OpenIntro Statistics is a dynamic take on the traditional curriculum, used successfully everywhere from community colleges to the Ivy League.

OpenIntro Statistics is recommended for college courses and self-study.

Getting Started

Amazon KDP raised book print prices by ≈40% in 2023. These changes will net Amazon over $50,000 per year from sales of OpenIntro books, and we expect that OpenIntro will lose money as a result of reduced sales. We continue to explore alternative printers that provide better-quality books, as well as ways to sell more books outside of Amazon.

All of our website / resource links to Amazon are affiliate links. When you shop on Amazon using these links, we receive a small commission at no extra charge to you.

 FREE -- OpenIntro Statistics PDF

If you want to skip the optional contribution, set the price to $0

 $25 -- B&W paperback

Available on Amazon and in select bookstores

 $40 -- OpenIntro Statistics, color paperback

Color internal pages, while the B&W version is gray-scaled

 FREE -- Book PDF Best for Screen Readers

More detailed table of contents, extra text to aid navigation (e.g. explicitly noting when an example starts and ends), and "alt text" for all images. Note that page numbers do *not* align with the original PDF, so please use section, figure, example, and similar numbers for referencing and navigation. Please send feedback via the openintro.org/contact page.

Learning objectives

What we hope students will learn from these resources

Data sets

List of data sets and the option to download files

Where to find more data sets

An incredible list of data organized by Shonda Kuiper

Send feedback or report a typo

We appreciate feedback, both positive and negative

List of known textbook typos

Review textbook typos and clarifications


Translations + Other International Distribution

For those using a translated version, please send your warm wishes to the team behind these translations! We deeply appreciate their contributions to the community!

A Japanese translation has been created by a team of Japanese faculty! This translation is available below in both PDF (on Dr. Kunitomo's page) and as an affordable paperback (via the Japanese Statistical Association).

 FREE -- Japanese translation of OpenIntro Statistics PDF

Translation by Naoto Kunitomo, Yasushi Yoshida, & Atsuyuki Kogure

 Japanese translation, B&W paperback for ¥1980

Translated by Naoto Kunitomo, Yasushi Yoshida, & Atsuyuki Kogure

A Chinese translation is under development by Shiyao Wang and Xueqi Li! A recent draft of the progress is available below.

 FREE -- Chinese translation of Ch 1-6 (PDF)

Translation by Shiyao Wang & Xueqi Li

Follow the Chinese translation updates on WeChat

Leads to a WeChat page

A Vietnamese translation is currently under development by a volunteer team led by Associate Professor Do Thi Thanh Toan. A recent draft of the progress is available below. If you notice any typos or errors that need correction, please feel free to contact our corresponding member, Mr. Thanh Hai Pham (email: thanh.ph.hmu@gmail.com). We will do our best to respond promptly.

The team members working on the Vietnamese translation are: Associate Professor Do Thi Thanh Toan; Dr. Le Xuan Hung; Dr. Dinh Thai Son; Dr. Luu Ngoc Minh; Mr. Nguyen Trung Kien (BSc); Mrs. Tran Cat Khanh (BSc); Mr. Vu Gia Huan (MD); Mr. Ngo Gia Huy (MD) (email: huygiango3001@gmail.com); and Mr. Thanh Hai Pham (MSc) - Corresponding member (email: thanh.ph.hmu@gmail.com).

FREE -- Vietnamese translation, Ch 1 (PDF)

Translation by a team led by Professor Do Thi Thanh Toan

FREE -- Vietnamese translation, Ch 2 (PDF)

Translation by a team led by Professor Do Thi Thanh Toan

FREE -- Vietnamese translation, Ch 3 (PDF)

Translation by a team led by Professor Do Thi Thanh Toan

FREE -- Vietnamese translation, Ch 4 (PDF)

Translation by a team led by Professor Do Thi Thanh Toan

FREE -- Vietnamese translation, Ch 5 (PDF)

Translation by a team led by Professor Do Thi Thanh Toan

The paperbacks linked below are for the English version, which is also available in several countries via Amazon.

 Click to see all international options

Paperbacks for Canada, UK, India, Germany, and more

 English B&W paperback on Amazon.co.jp

See also the Japanese translation option

 Amazon.ca -- B&W paperback

Hello, northern neighbor

 Amazon.co.uk -- B&W paperback

Hello, from across the Atlantic

 Notion Press (India) -- B&W paperback

Price includes shipping cost

 Amazon.de -- B&W paperback

Book is in English

 Amazon.fr -- B&W paperback

Book is in English

 Amazon.es -- B&W paperback

Book is in English

 Amazon.it -- B&W paperback

Book is in English


Teachers: General Resources

Resources for teachers, some of which are restricted to Verified Teachers only. Slides, labs, and other resources may also be found in the corresponding chapter sections below.

Learn about Teacher Verification

Benefits, options to apply, and the verification process

Request a textbook desk copy (US only)

Available to Verified Teachers, click here to apply for access

OpenIntro Statistics exercise solutions

Available to Verified Teachers, click here to apply for access

Bookstore Ordering (bulk)

Wholesale purchase options

MyOpenMath: online course software

Free course software; OpenIntro course templates are available

MyOpenMath: setting up an OpenIntro course

Course templates exist for some OpenIntro books

OpenIntro Statistics, info on past editions

Content, prices, and availability details

Teachers page with additional resources

Some public resources, others restricted to Verified Teachers


Teachers: Sample Syllabi


Teachers: Sample Exams


What is Statistics?


Companion Notebook


Chapter 1: Intro to Data

 Videos for each section

Introduction to Data: 5 videos

 1.1 - Using stents to prevent strokes

Real case study with a surprising finding

 1.2 - Data basics

Typical data structures and properties

 1.3A - Data collection principles

Thoughtful data collection is critical to learning from data

 1.3B - Sampling principles and strategies

Different ways to sample from a population

 1.4 - Experiments

Basic principles of experimental design

 Slides for each section  

Google Slides & LaTeX variants available

  Slides 1 - Intro to data

LaTeX slides for full chapter on GitHub

  Slides 1.1 - Intro to data, case study

Google Slides version, can export to PowerPoint

  Slides 1.2 - Data Basics

Google Slides version, can export to PowerPoint

  Slides 1.3 - Sampling principles and strategies

Google Slides version, can export to PowerPoint

  Slides 1.4 - Experiments

Google Slides version, can export to PowerPoint

 Lab - Intro to Statistical Software

Software: R (Base), R (Tidyverse), Rguroo, Jamovi, JASP, SAS, Stata


Chapter 2: Summarizing Data

 Videos for each section

Summarizing Data: 3 videos

 2.1 - Examining numerical data

Mean, standard deviation, histograms, box plots, and more

 2.2 - Considering categorical data

Table proportions, bar graphs, mosaic plots, and more

 2.3 - Case study

Early inference ideas: testing using randomization

 Slides for each section  

Google Slides & LaTeX variants available

  Slides 2 - Summarizing data

LaTeX slides for full chapter on GitHub

  Slides 2.1 - Examining numerical data

Google Slides version, can export to PowerPoint

  Slides 2.2 - Considering categorical data

Google Slides version, can export to PowerPoint

  Slides 2.3 - Case study

Google Slides version, can export to PowerPoint

 Lab - Introduction to data

Software: R (Base), R (Tidyverse), Rguroo, Jamovi, JASP, Python, SAS, Stata

Class Activity: Descriptive Measures

Collecting and exploring Instagram data

Weighted mean

Supplemental section: How and when to use weighting


Chapter 3: Probability

 Videos for some sections

Probability: 3 videos

 3.1 - Defining probability

Core concepts, explained in detail

 3.2 - Probability trees

Useful tool for conditional probability

 Would you take this bet?

Thinking through probability and risk

 Slides for each section  

Google Slides & LaTeX variants available

  Slides 3 - Probability

LaTeX slides for full chapter on GitHub

  Slides 3.1 - Defining probability

Google Slides version, can export to PowerPoint

  Slides 3.2 - Conditional probability

Google Slides version, can export to PowerPoint

  Slides 3.3 - Sampling from a small population

Google Slides version, can export to PowerPoint

  Slides 3.4 - Random variables

Google Slides version, can export to PowerPoint

  Slides 3.5 - Continuous distributions

Google Slides version, can export to PowerPoint

 Lab - Probability

Software: R (Base), R (Tidyverse), Rguroo, Jamovi, JASP, Python, SAS, Stata


Chapter 4: Distributions

 Videos for some sections

Distributions: 3 videos

 4.1 - Normal distribution

Core concepts and several examples

 4.3A - Binomial distribution

Introduction to the binomial distribution

 4.3B - Normal approximation to binomial

A useful technique for some binomial situations

 Slides for each section  

Google Slides & LaTeX variants available

  Slides 4 - Distributions

LaTeX slides for full chapter on GitHub

  Slides 4.1 - Normal distributions

Google Slides version, can export to PowerPoint

  Slides 4.2 - Geometric distribution

Google Slides version, can export to PowerPoint

  Slides 4.3 - Binomial distribution

Google Slides version, can export to PowerPoint

  Slides 4.4 - Negative binomial distribution

Google Slides version, can export to PowerPoint

  Slides 4.5 - Poisson distribution

Google Slides version, can export to PowerPoint

 Lab - Normal distribution

Software: R (Base), R (Tidyverse), Rguroo, Jamovi, JASP, Python, SAS, Stata

Class Activity: Sampling Distributions

As presented at Women in Stat and DS Conference

Normal distribution calculator

Online tool for normal distribution calculations


Chapter 5: Foundations for Inference

 Videos for each section

Foundations for Inference: 4 videos

 5.1 - Variability of the sample proportion

Introduces the Central Limit Theorem

 5.2 - Confidence intervals

Reporting a range, not just a point estimate

 5.3 - Hypothesis testing

Introduced using numerical data (means)

 Inference for other estimators

Generalizing the tools of inference

Why do we use 0.05 as a significance level?

Inquiring minds want to know -- let's explore!

 Slides for each section  

Google Slides & LaTeX variants available

  Slides 5 - Foundations for inference

LaTeX slides for full chapter on GitHub

  Slides 5.1 - Point estimates and sampling variability

Google Slides version, can export to PowerPoint

  Slides 5.2 - Confidence intervals

Google Slides version, can export to PowerPoint

  Slides 5.3 - Hypothesis testing

Google Slides version, can export to PowerPoint

 Lab - Intro to inference

Software: R (Base), R (Tidyverse), Rguroo, Jamovi, JASP, Python, SAS, Stata

 Lab - Confidence levels

Software: R (Base), R (Tidyverse), Rguroo, Jamovi, JASP, Python, SAS, Stata

One-page inference guide

Covers one-sample and difference of means and proportions


Chapter 6: Inference for Categorical Data

 Videos for each section

Inference for categorical data: 3 videos

 6.1 + 6.2 - Inference for proportions

Covers both 1 and 2 proportion scenarios

 6.3 - Testing for goodness of fit

Chi-square test for one-way tables

 6.4 - Chi-square for two-way tables

Testing for homogeneity or independence

 Slides for each section  

Google Slides & LaTeX variants available

  Slides 6 - Inference for categorical data

LaTeX slides for full chapter on GitHub

  Slides 6.1 - Inference for a single proportion

Google Slides version, can export to PowerPoint

  Slides 6.2 - Inference for a difference of two props

Google Slides version, can export to PowerPoint

  Slides 6.3 - Testing goodness of fit using chi-square

Google Slides version, can export to PowerPoint

  Slides 6.4 - Testing for independence in 2-way tables

Google Slides version, can export to PowerPoint

 Lab - Inference for categorical data

Software: R (Base), R (Tidyverse), Rguroo, Jamovi, JASP, Python, SAS, Stata

Hypothesis testing for small sample proportions

Supplemental section: When the success-failure condition fails

Online app for Central Limit Theorem for proportions

This is a Shiny app for exploration


Chapter 7: Inference for Numerical Data

 Videos for each section

Inference for numerical data: 8 videos

 7.1A - t-distribution

Useful new distribution for inference for means

 7.1B - Inference for one mean

Covers confidence intervals and hypothesis tests

 7.2 - Paired data

Special case for difference of two means

 7.3 - Difference of two means

When we have two independent samples

 7.4 - Power calculations

Covers the scenario of the difference of two means

 7.5A - Intro to ANOVA

Key concepts and ideas

 7.5B - Conditions for ANOVA

How to check if ANOVA is reasonable

 7.5C - Multiple comparisons

How we determine which groups are different

 Slides for each section  

Google Slides & LaTeX variants available

  Slides 7 - Inference for numerical data

LaTeX slides for full chapter on GitHub

  Slides 7.1 - One-sample means with the t-distribution

Google Slides version, can export to PowerPoint

  Slides 7.2 - Paired data

Google Slides version, can export to PowerPoint

  Slides 7.3 - Difference of two means

Google Slides version, can export to PowerPoint

  Slides 7.4 - Power calculations for difference of means

Google Slides version, can export to PowerPoint

  Slides 7.5 - Comparing many means with ANOVA

Google Slides version, can export to PowerPoint

 Lab - Inference for numerical data

Software: R (Base), R (Tidyverse), Rguroo, Jamovi, JASP, Python, SAS, Stata

Class Activity: Correlation

Students compare and correlate movie ratings

Sample size and power (one-sample)

Supplemental section: on power in the one-sample scenario

Better understand ANOVA calculations

Supplemental section: Details behind ANOVA

Online app for Central Limit Theorem for means

This is a Shiny app for exploration


Chapter 8: Introduction to Linear Regression

 Videos for each section

Intro to linear regression: 5 videos

 8.1 - Ideas of fitting a line

Also covers residuals and correlation

 8.2 - Fitting a least squares regression line

The notion of a "best fitting" line

 8.2 - Detailed Overview: Fitting a least squares regression line

Section 8.2 textbook walkthrough by author

 8.3 - Types of outliers in regression

Points of high leverage and influential points

 8.4 - Inference for linear regression

Using the t-distribution for inference in regression

 Slides for each section  

Google Slides & LaTeX variants available

  Slides 8 - Linear regression

LaTeX slides for full chapter on GitHub

  Slides 8.1 - Line fitting, residuals, and correlation

Google Slides version, can export to PowerPoint

  Slides 8.2 - Fitting a line by least squares regression

Google Slides version, can export to PowerPoint

  Slides 8.3 - Types of outliers in linear regression

Google Slides version, can export to PowerPoint

  Slides 8.4 - Inference for linear regression

Google Slides version, can export to PowerPoint

 Lab - Linear regression

Software: R (Base), R (Tidyverse), Rguroo, Jamovi, JASP, Python, SAS, Stata


Chapter 9: Multiple and Logistic Regression

 Videos for some sections

Multiple & logistic regression: 4 videos

 9.1 - Multiple regression basics

Using many predictors in a single model

 9.2 - Model selection

How to determine which variables to keep in the model

 9.3 - Checking conditions using graphs

Several key graphs for assessing a multiple regression model

 9.5 - Intro to logistic regression

When the outcome is binary (e.g. yes/no)

 Slides for each section  

Google Slides & LaTeX variants available

  Slides 9 - Multiple + logistic regression

LaTeX slides for full chapter on GitHub

  Slides 9.1 - Intro to multiple regression

Google Slides version, can export to PowerPoint

  Slides 9.2 - Model selection

Google Slides version, can export to PowerPoint

  Slides 9.3 - Checking model conditions using graphs

Google Slides version, can export to PowerPoint

  Slides 9.5 - Intro to logistic regression

Google Slides version, can export to PowerPoint

 Lab - Multiple regression

Software: R (Base), R (Tidyverse), Rguroo, Jamovi, JASP, Python, SAS, Stata

More inference for linear regression

Supplemental section: Confidence and prediction intervals

Interaction terms

Supplemental section: When predictors impact outcomes in complex ways

Regression for nonlinear relationships

Supplemental section: When a straight line doesn't make sense

Online app for better understanding regression

This is a Shiny app for exploration


More Resources


Sample Student Projects


More Free Books