Data Science

Faculty Chairs: Matthew Blackwell, Kosuke Imai, & Gary King.  
 

Overview: Data science is a new field that emerged in the late 2000s as new technology made gathering and analyzing “big data” possible (Davenport & Patil 2012). Combining skills in computer programming, structuring data, and statistical analysis, data science has grown rapidly, with new academic journals, graduate degrees, and research networks. Harvard now has data science programs in multiple concentrations. The Government Department’s data science program stands out for its focus on teaching students to use these skills to solve real-world problems.

A number of alumni from the Government Department’s undergraduate and graduate programs are going into the data science industry, working at top tech companies (Facebook, Google, Netflix) and in data journalism, data-driven political consultancies, and impact evaluation non-profits.
 

Requirements:

In response to student feedback, we have revised our sequence of undergraduate data science courses. In place of the 3-course sequence we offered in previous years (Gov 50, Gov 51 & Gov 52), we have streamlined the overlapping content into a 2-course sequence that provides a solid foundation for work in data science: Gov 50 and Gov 51. This sequence will be comparable to the introductory data science courses offered by other departments, but with a specific focus on political and policy questions.

Gov 50 covers the fundamentals of data science as applied to the social sciences: visualization, wrangling, causal inference, prediction, and inference. Along the way, you will learn how to communicate your findings to a broad audience and how to use the professional tools of the trade, such as R, the tidyverse, and GitHub. Each student will complete a final project to showcase their acquired skills. No previous experience with statistics or statistical computing is required.

The new Gov 51 is designed to highlight the types of methods students are most likely to encounter in both research and industry. This course will go deeper into prediction and modeling with linear and nonlinear models and into causal inference approaches (instrumental variables, regression discontinuity designs, difference-in-differences). This will give students a suite of statistical and computational tools to use in their research projects, senior theses, or industry jobs.

Students then choose the third and fourth courses in the track from our data science electives, applying and extending the analytical techniques learned in the foundation courses.

Data Science Electives Offered in 2022-23:

  • Gov 1008: Introduction to Geographical Information Systems
  • Gov 1009: Advanced Geographical Information Systems Workshop
  • Gov 1347: Election Analytics
  • Gov 2001: Quantitative Social Science Methods I
  • Gov 2002: Quantitative Social Science Methods II
  • Gov 2003: Causal Inference
  • API 211: Program Evaluation and Education Policy (at the Harvard Kennedy School)
  • API 222: Machine Learning and Big Data Analytics (at the Harvard Kennedy School)

The Government Department will entertain petitions to count courses outside the list above for Data Science credit. Please fill out this form and make sure to note in the comments section that you are requesting to count the course as one of the 4 required classes for the Data Science program. 

Advising: Prof. Matthew Blackwell is the adviser for the data science program. Please contact him with any questions or to discuss your interests in joining the program.

Applying: Concentrators should complete the Data Science plan of study supplement, review it with their concentration adviser, and submit the completed form to the Government Department Undergraduate Manager, Karen Kaletka.