Data Analysis for Social Scientists

Massachusetts Institute of Technology via edX

  • Provider edX
  • Subject Data Analysis
  • $ Cost Free Online Course
  • Session Upcoming
  • Language English
  • Effort 12-14 hours a week
  • Start Date
  • Duration 11 weeks long

Overview

This course is now part of two independent MITx MicroMasters programs. For both MicroMasters programs, learners will need to first enroll in and pass this course. However, each program will then require different final assessments for a course certificate toward the full MicroMasters credential:

1. MicroMasters in Data, Economics, and Development Policy (DEDP).

To pursue the DEDP MicroMasters credential, pass this course, create a MicroMasters in DEDP profile, and pass an additional in-person proctored exam.

To learn more about the DEDP program and how it integrates with MIT’s new blended Master’s degree, please visit https://micromasters.mit.edu/dedp/.

 
2. MicroMasters in Statistics and Data Science (SDS).

To pursue the SDS MicroMasters credential, pass this course, then enroll in and pass the final assessment at 14.310Fx Data Analysis in Social Sciences-Assessment on edX.

Complete all 4 courses and the capstone exam in the SDS program to accelerate your path towards graduate studies at MIT or other universities. To learn more, please visit https://micromasters.mit.edu/ds/.

This statistics and data analysis course will introduce you to the essential notions of probability and statistics. We will cover techniques in modern data analysis: estimation, regression and econometrics, prediction, experimental design, randomized control trials (and A/B testing), machine learning, and data visualization. We will illustrate these concepts with applications drawn from real world examples and frontier research. Finally, we will provide instruction for how to use the statistical package R and opportunities for students to perform self-directed empirical analyses.

This course is designed for anyone who wants to learn how to work with data and communicate data-driven findings effectively.

Course Previews:

Our course previews are meant to give prospective learners the opportunity to get a taste of the content and exercises that will be covered in each course. If you are new to these subjects, or eager to refresh your memory, each course preview also includes some available resources. These resources may also be useful to refer to over the course of the semester. 

A score of 60% or above in the course previews indicates that you are ready to take the course, while a score below 60% indicates that you should further review the concepts covered before beginning the course.

Please use this link to access the course preview.

Syllabus

MODULE 0: THE BASICS OF R
  • Introduction to the software R with suggested resources.
MODULE 1: INTRODUCTION
  • Introduction to the power of data and data analysis, and course overview
MODULE 2: FUNDAMENTALS OF PROBABILITY, RANDOM VARIABLES, DISTRIBUTIONS AND JOINT DISTRIBUTIONS
  • Basics of probability and introduction to random variables
  • Distributions and joint distributions
MODULE 3: GATHERING AND COLLECTING DATA, ETHICS, AND KERNEL DENSITY ESTIMATES
  • Collecting data through surveys, web scraping, and other data collection methods
  • Principles and practical steps for protection of human subjects in research
  • Discussion of kernel density estimates
MODULE 4: JOINT, MARGINAL, AND CONDITIONAL DISTRIBUTIONS & FUNCTIONS OF RANDOM VARIABLES
  • Further exploration on joint, marginal, and conditional distributions
  • Deep dive into functions of random variables
MODULE 5: MOMENTS OF A RANDOM VARIABLE, APPLICATIONS TO AUCTIONS, & INTRO TO REGRESSION
  • Moments of a distribution, expectation, and variance
  • Applying principles of probability to the analysis of auctions
  • Basics of regression analysis
MODULE 6: SPECIAL DISTRIBUTIONS, THE SAMPLE MEAN, CENTRAL LIMIT THEOREM, AND ESTIMATION
  • The properties of special distributions with several examples
  • Statistics: Introduction to the sample mean, central limit theorem, and estimation
MODULE 7: ASSESSING AND DERIVING ESTIMATORS, CONFIDENCE INTERVALS, AND HYPOTHESIS TESTING
  • Deriving and assessing estimators
  • Constructing and interpreting confidence intervals
  • Introduction to hypothesis testing
MODULE 8: CAUSALITY, ANALYSING RANDOMIZED EXPERIMENTS, & NONPARAMETRIC REGRESSION
  • Understanding randomization in the context of experimentation
  • Introduction to nonparametric regression techniques
MODULE 9: SINGLE AND MULTIVARIATE LINEAR MODELS
  • In-depth discussion of the linear model and the multivariate linear model
MODULE 10: PRACTICAL ISSUES IN RUNNING REGRESSIONS, AND OMITTED VARIABLE BIAS
  • Covariates, fixed effects, and other functional forms
  • Introduction to regression discontinuity design
MODULE 11: INTRO TO MACHINE LEARNING AND DATA VISUALIZATION
  • Use of machine learning for prediction, including tuning and training
  • Principles of data visualization
MODULE 12: ENDOGENEITY, INSTRUMENTAL VARIABLES, AND EXPERIMENTAL DESIGN
  • Understanding endogeneity problems, an introduction to instrumental variables and two-stage least squares, and assessing the validity of an instrument
  • Designing an effective experiment with a case study from Indonesia

Taught by

Esther Duflo and Sara Fisher Ellison

Reviews for edX's Data Analysis for Social Scientists
2.9 Based on 8 reviews

  • 5 stars 25%
  • 4 stars 0%
  • 3 stars 25%
  • 2 stars 38%
  • 1 star 13%

James S
3.0 4 weeks ago
James completed this course, spending 12 hours a week on it, and found the course difficulty to be medium.
Writing a review for this course is hard. The content of the course is ambitious and the promise is considerable. I am grateful that the Professors and MIT have made this course available online. That being said, I find it hard to recommend this course.

As an overview, each week contains 2-3 lectures, mostly probability mixed with some stats, with 'finger exercises (FEs)' at the end of each lecture segment to test knowledge. At the end of each week there is a more in-depth set of questions covering all the material and some more practical aspects with R. Here is a quick sum…
Seylan N
2.0 2 weeks ago
Seylan completed this course.
I did not enjoy this course at all. Here are the main reasons why:

1) Lecture videos. The lecturers themselves might be masters in their respective fields, but the lecture videos are not suitable for an online course. The videos are literally from the actual MIT course. The most annoying thing about the videos is that the lecturers sometimes make reference to something shown on the board or the screen, but the video just shows one thing at a time: either the view of the lecturer or what is being projected on the screen. It feels weird and I don't think they really thought how stud…
Dileep N
2.0 2 months ago
Dileep completed this course, spending 12 hours a week on it and found the course difficulty to be medium.
tl;dr - poorly put together MOOC trying to cram too many things into one course. Doesn't leave you with a lot of confidence that you can analyse data independently on big projects. Confusing approach without concrete examples and demos of how to run full tests in practice. I found it a frustrating experience.

This course tries to cover a lot of ground in very little time, so, the treatment is very superficial. In the end, what you're left with is a hodge-podge of techniques and methods with no real intuitive understanding of what they mean and without a solid understanding of how…
Paul S
5.0 a year ago
Paul is taking this course right now, spending 16 hours a week on it, and found the course difficulty to be hard.
To say this class is thorough is an understatement. The lectures are extremely detailed, sometimes with additional detailed references(!), and it occasionally warrants going back and replaying one or two of the lectures before moving on. There is a good deal of statistics and probability review and training prior to getting to the "methods" of this class (around week 8). I recommend this course as I cannot imagine a better, more thorough treatment for the topic, taught by some of the "best" there are out there today in Economics and Statistics.
Mariana M
3.0 2 years ago
Mariana is taking this course right now.
I was very excited about this course - its scope and the fact that it did not require any knowledge in statistics. That is not true: you should know some probability and statistics, otherwise you will not be able to keep up with the workload (or the classes, to be honest) and will drop out - like I did.

Will try again later, when I have gained some statistics knowledge.
Antonello L
2.0 3 months ago
Antonello is taking this course right now, spending 15 hours a week on it, and found the course difficulty to be easy.
It is a strange "mixed beast" course.

At the end I don't know what this course is good for. Too many different things (prob theory, programming in R, statistics) are approached superficially and, most importantly, without even giving the "intuitions" behind what is used.

Not much added value, and honestly I don't know why it has been added to the list of required courses for the new MIT MicroMaster program in data science.
Ayse N
1.0 6 months ago
Ayse is taking this course right now, spending 15 hours a week on it and found the course difficulty to be very hard.
There is just too much theory in the course. I only wanted to learn some Data Analysis and possibly Machine Learning. But no, it just doesn't happen. I was excited about this course, but there is little practical value compared to the effort you need to spend on the coursework. Unless you are already good at multivariable calculus and probability theory at the level of this course (https://www.edx.org/course/probability-the-science-of-uncertainty-and-data), you will probably feel the same frustration as me.
Harunpehlivan H
5.0 2 years ago
Harunpehlivan completed this course.