Prerequisite: Because the class includes only a short introduction to R, students should work through one or more R tutorials before or during the first weeks of class. Some resources can be found at the following: https://www.rstudio.com/online-learning/#R, https://cran.r-project.org/manuals.html, or http://www.statmethods.net.
Provides an introduction to supervised statistical learning techniques such as decision trees, random forests, and boosting, and discusses their potential application in the social sciences. These methods focus on predicting an outcome Y based on some data-driven function f(X) and therefore facilitate new research perspectives in comparison with traditional regression models, which primarily focus on causation. Predictive methods also provide a valuable extension to the empirical social scientist's toolkit as new data sources become more prominent. In addition to introducing supervised learning methods, the course will include practical sessions that demonstrate how to tune and evaluate prediction models using the statistical programming language R.