A broad goal of data science is to examine raw data in order to identify its structure and trends and to derive conclusions and hypotheses from it. In a modern world awash with data, data analytics is more important than ever in fields ranging from biomedical research, space and weather science, finance, and business operations and production to marketing and social media applications. This course introduces a range of statistical learning methods and their applications. The R programming language, a popular and powerful platform for scientific and statistical analysis and visualization, is introduced and used throughout the course. We discuss the fundamentals of statistical testing and learning, and cover linear and non-linear regression, clustering and classification, support vector machines, and decision trees. The datasets used in the examples are drawn from diverse domains such as finance, genomics, and customer sales and survey data.
Harvard Extension School
Harvard Division of Continuing Education