edX: Data Analysis for Social Scientists

with Esther Duflo and Sara Fisher Ellison

This course is part of the MITx MicroMasters program in Data, Economics, and Development Policy (DEDP). To audit this course, click the green “Enroll Now” button at the top of this page.

To enroll in the MicroMasters track or to learn more about this program and how it integrates with MIT’s new blended Master’s degree, go to MITx’s MicroMasters portal.

This statistics and data analysis course will introduce you to the essential notions of probability and statistics. We will cover techniques in modern data analysis: estimation, regression and econometrics, prediction, experimental design, randomized controlled trials (and A/B testing), machine learning, and data visualization. We will illustrate these concepts with applications drawn from real-world examples and frontier research. Finally, we will provide instruction on how to use the statistical package R, along with opportunities for students to perform self-directed empirical analyses.

This course is designed for anyone who wants to learn how to work with data and communicate data-driven findings effectively. It is challenging, however: students who are uncomfortable with basic calculus and algebra might struggle with the pace of the class.

Syllabus

MODULE 0: THE BASICS OF R

  • Introduction to the software R with exercises. Suggested resources for learning more on the web.

MODULE 1: INTRODUCTION

  • Introduction to the power of data and data analysis, overview of what will be covered in the course.

MODULE 2: FUNDAMENTALS OF PROBABILITY, RANDOM VARIABLES, DISTRIBUTIONS AND JOINT DISTRIBUTIONS

  • Basics of probability and introduction to random variables.
  • Discussion of distributions and joint distributions.

MODULE 3: GATHERING AND COLLECTING DATA, ETHICS, AND KERNEL DENSITY ESTIMATES

  • Introduction to collecting data through surveys, web scraping, and other data collection methods.
  • Principles and practical steps for protection of human subjects in research.
  • Discussion of kernel density estimates.

MODULE 4: JOINT, MARGINAL, AND CONDITIONAL DISTRIBUTIONS & FUNCTIONS OF RANDOM VARIABLES

  • Builds on the basics from module 2 to cover joint, marginal, and conditional distributions.
  • Similarly builds on the basics from module 2 to cover functions of random variables.

MODULE 5: MOMENTS OF A RANDOM VARIABLE, APPLICATIONS TO AUCTIONS, & INTRO TO REGRESSION

  • Discussion of moments of a distribution, expectation, and variance.
  • Application of some principles of probability to the analysis of auctions.
  • Basics of regression analysis.

MODULE 6: SPECIAL DISTRIBUTIONS, THE SAMPLE MEAN, CENTRAL LIMIT THEOREM, AND ESTIMATION

  • Discussion of properties of special distributions with several examples.
  • Statistics: Introduction to the sample mean, central limit theorem, and estimation.

MODULE 7: ASSESSING AND DERIVING ESTIMATORS, CONFIDENCE INTERVALS, AND HYPOTHESIS TESTING

  • Deriving and assessing estimators.
  • Constructing and interpreting confidence intervals.
  • Introduction to hypothesis testing.

MODULE 8: CAUSALITY, ANALYSING RANDOMIZED EXPERIMENTS, & NONPARAMETRIC REGRESSION

  • Understanding randomization in the context of experimentation.
  • Introduction to nonparametric regression techniques.

MODULE 9: SINGLE AND MULTIVARIATE LINEAR MODELS

  • In-depth discussion of the linear model and the multivariate linear model.

MODULE 10: PRACTICAL ISSUES IN RUNNING REGRESSIONS, AND OMITTED VARIABLE BIAS

  • Covariates, fixed effects, and other functional forms.
  • Introduction to regression discontinuity design.

MODULE 11: INTRO TO MACHINE LEARNING AND DATA VISUALIZATION

  • Introduction to the use of machine learning for prediction. Covers tuning and training.
  • Principles of data visualization with examples of well-crafted visual presentations of data.

MODULE 12: ENDOGENEITY, INSTRUMENTAL VARIABLES, AND EXPERIMENTAL DESIGN

  • Understanding the problem of endogeneity. Introduction to instrumental variables and two stage least squares, with a discussion of how to assess the validity of an instrument.
  • Discussion of how to design an effective experiment, followed by an example from Indonesia.
Cost: Free Online Course
Pace: Upcoming
Subject: Data Analysis
Provider: edX
Language: English
Hours: 8-12 hours a week
Calendar: 4 weeks long


FAQ
What are MOOCs?
MOOC stands for Massive Open Online Course. These are free online courses from universities around the world (e.g., Stanford, Harvard, MIT), offered to anyone with an internet connection.
How do I register?
To register for a course, click the "Go to Class" button on the course page. This will take you to the provider's website, where you can register for the course.
How do these MOOCs or free online courses work?
MOOCs are designed for an online audience and teach primarily through short (5-20 minute) prerecorded video lectures that you watch on a weekly schedule, whenever is convenient for you. They also include student discussion forums, homework assignments, and online quizzes or exams.

3 reviews for edX's Data Analysis for Social Scientists

7 months ago
Mariana Marcondes dropped this course.
I was very excited about this course - its scope and the fact that it did not require any knowledge in statistics. That is not true: you should know some probability and statistics, otherwise you will not be able to keep up with the workload (or the classes, to be honest) and will drop out - like I did.

Will try again later, when I have gained some statistics knowledge.
a week ago
Paul F. Groepler Sr. is taking this course right now, spending 16 hours a week on it and found the course difficulty to be hard.
To say this class is thorough is an understatement. The lectures are extremely detailed, sometimes with additional detailed references(!), and it occasionally warrants going back and replaying one or two of the lectures before moving on. There is a good deal of statistics and probability review and training prior to getting to the "methods" of this class (around week 8). I recommend this course as I cannot imagine a better, more thorough treatment for the topic, taught by some of the "best" there are out there today in Economics and Statistics.
a year ago
Harunpehlivan audited this course.