Perhaps the most popular data science methodologies come from machine learning. What distinguishes machine learning from other computer guided decision processes is that it builds prediction algorithms using data. Some of the most popular products that use machine learning include the handwriting readers implemented by the postal service, speech recognition, movie recommendation systems, and spam detectors. 

In this course, you will learn popular machine learning algorithms, principal component analysis, and regularization by building a movie recommendation system.

You will learn about training data, a set of data used to discover potentially predictive relationships and how the data can come in the form of the outcome we want to predict and features that we will use to predict this outcome. As you build the movie recommendation system, you will learn how to train algorithms using training data so you can predict the outcome for future datasets. You will also learn about overtraining and techniques to avoid it such as cross-validation. All of these skills are fundamental to machine learning.

Meet The Faculty

Rafael Irizarry

Rafael Irizarry

Professor of Biostatistics, T.H. Chan School of Public Health

Rafael Irizarry is a Professor of Biostatistics at the Harvard T.H. Chan School of Public Health and a Professor of Biostatistics and Computational Biology at the Dana Farber Cancer Institute. For the past 15 years, Dr. Irizarry’s research has focused on the analysis of genomics data. During this time, he has also has taught several classes, all related to applied statistics. Dr. Irizarry is one of the founders of the Bioconductor Project, an open source and open development software project for the analysis of genomic data. His publications related to these topics have been highly cited and his software implementations widely downloaded.

Course Provided By

Back To Top