Data Science

Faculty Chairs: Matthew Blackwell, Kosuke Imai, & Gary King.  

 

Overview: Data science is a new field that emerged in the late 2000s as new technology made gathering and analyzing “big data” possible (Davenport & Patil 2012). Combining skills in computer programming, structuring data, and statistical analysis, data science has grown rapidly, with new academic journals, graduate degrees, and research networks. Harvard now has data science programs in multiple concentrations. The Government Department’s data science program stands out for its focus on teaching students to use these skills to solve real-world problems.

 

A number of alumni from the Government Department’s undergraduate and graduate programs have gone into the data science industry. They work at top tech companies (Facebook, Google, Netflix) and in data journalism, data-driven political consultancies, and impact evaluation non-profits.

Requirements: In addition to the usual requirements for the Government concentration, students in the data science track take four methods courses. (One fulfills the research methods requirement for all concentrators, while the others are usually concentration electives.) All four courses must be taken for a letter grade (except for courses taken during Spring 2020, when emergency satisfactory/unsatisfactory grading was in effect).

Three courses are recommended:

  • Gov 50: Data
  • Gov 51: Data Analysis and Politics
  • Gov 52: Models

Most students should start with Gov 50 and then move to Gov 51 and Gov 52, in that order. Students who feel comfortable with math and/or computing may skip Gov 50 and begin with Gov 51. Any of these three courses will satisfy the Government Department’s methods requirement and the Harvard College Quantitative Reasoning with Data requirement. Undergraduates who have taken Gov 51 and/or Gov 52 regularly go on to take the introductory course in our graduate methods sequence, Gov 2001.

For more information on these courses, see https://projects.iq.harvard.edu/government-methods/undergraduate-courses

The fourth and final course for the data science program is an elective chosen from the list below. Students with sufficient training to skip any of the courses in the sequence above can take additional electives from this list to complete the data science program.

Elective Courses:

  • Gov 1003: Data Science for Politics
  • Gov 1005: Big Data
  • Gov 1008: Introduction to Geographical Information Systems
  • Gov 1009: Advanced Geographical Information Systems Workshop
  • Gov 1347: Election Analytics
  • Gov 1372: Political Psychology
  • Gov 2001: Quantitative Social Science Methods I (graduate seminar)
  • Gov 2002: Quantitative Social Science Methods II (graduate seminar)
  • Gov 2003: Causal Inference (graduate seminar)
  • Gov 2017: Applied Bayesian Statistics for the Social Sciences (graduate seminar)
  • Gov 2018: Applied Machine Learning for the Social Sciences (graduate seminar)

For most students, we recommend completing the program with Gov 2001, a graduate-level course. Our three core undergraduate methods classes are designed to prepare any student for Gov 2001, and the methods faculty can provide additional resources for this transition upon request.

Advising: Prof. Matthew Blackwell is the adviser for the data science program. Please contact him with any questions or to discuss your interest in joining the program.

Applying: Concentrators should complete the Data Science plan of study supplement, review it with their concentration adviser, and submit the completed form to the Government Department Undergraduate Coordinator, Karen Kaletka.