Introduction

The demand for skilled data science practitioners in industry, academia, and government is growing rapidly. The HarvardX Data Science Series prepares you with the necessary knowledge base and skills to tackle real-world data analysis challenges. The series covers concepts such as probability, inference, regression, and machine learning, and helps you develop a skill set that includes R programming, data wrangling with dplyr, data visualization with ggplot2, file organization with UNIX/Linux, version control with git and GitHub, and reproducible document preparation with RStudio. In the R Basics course, we learn the basic building blocks of R. As in all our courses, we use motivating case studies, ask specific questions, and learn by answering them through data analysis. Our assessments use code-checking technology that lets you get hands-on practice during the courses.

Throughout the series, we will be using the R software environment. You will learn R, statistical concepts, and data analysis techniques simultaneously. In this course, we will introduce the basic R syntax you need to get going. However, rather than cover every R skill you need, we introduce just enough so you can continue learning in the next courses, which provide more in-depth coverage. We believe that you can better retain R knowledge when you learn it to solve a specific problem. The motivating question in this course relates to crime in the United States, and we provide a relevant dataset. You will learn some basic R skills that allow us to answer specific questions about differences across the states.

HarvardX has partnered with DataCamp for all assignments. This allows students to program directly in a browser-based interface. You will not need to download any special software, but an up-to-date browser is recommended.

What you'll learn:

  • Introduction to basic R syntax
  • Basic R programming concepts such as data types, vector arithmetic, and indexing
  • How to perform operations in R, including sorting, creating or importing data frames, basic data wrangling, and making plots
  • How to perform basic programming with R

Meet The Faculty

Rafael Irizarry

Professor of Biostatistics, T.H. Chan School of Public Health

Rafael Irizarry is a Professor of Biostatistics at the Harvard T.H. Chan School of Public Health and a Professor of Biostatistics and Computational Biology at the Dana-Farber Cancer Institute. For the past 15 years, Dr. Irizarry’s research has focused on the analysis of genomics data. During this time, he has also taught several classes, all related to applied statistics. Dr. Irizarry is one of the founders of the Bioconductor Project, an open-source and open-development software project for the analysis of genomic data. His publications on these topics have been highly cited, and his software implementations have been widely downloaded.
