What you'll learn

The concepts necessary to define estimates and margins of error, including populations, parameters, estimates, and standard errors, in order to make predictions about data
How to use models to aggregate data from different sources
The very basics of Bayesian statistics and predictive modeling

Course description

Statistical inference and modeling are indispensable for analyzing data affected by chance, and thus essential for data scientists. In this course, you will learn these key concepts through a motivating case study on election forecasting.

This course shows how inference and modeling can be applied to develop the statistical approaches that make polls an effective tool, and how to do this using R. You will learn the concepts necessary to define estimates and margins of error, and how to use them to make reasonably accurate predictions and to estimate the precision of your forecast.

Once you have learned this, you will be able to understand two concepts that are ubiquitous in data science: confidence intervals and p-values. Then, to understand statements about the probability of a candidate winning, you will learn about Bayesian modeling. Finally, at the end of the course, we will put it all together to recreate a simplified version of an election forecast model and apply it to the 2016 election.

Instructors

Rafael Irizarry

Professor of Biostatistics, T.H. Chan School of Public Health

Data Science: Probability

Learn probability theory — essential for a data scientist — using a case study on the financial crisis of 2007–2008.

Free*

Available now

Data Science

Online

Data Science: Capstone

Show what you’ve learned from the Professional Certificate Program in Data Science.

Free*

Available now

Data Science

Online

Principles, Statistical and Computational Tools for Reproducible Data Science

Learn skills and tools that support data science and reproducible research, to ensure you can trust your own research results, reproduce them yourself, and communicate them to others.

Free*

8 weeks long

Available now

Data Science: Inference and Modeling

Associated Schools

Harvard T.H. Chan School of Public Health
