Mining Massive Datasets

Stanford University via Stanford OpenEdx

  • Provider: Stanford OpenEdx
  • Subject: Data Mining
  • Cost: Free Online Course
  • Session: Self Paced
  • Language: English
  • Certificate: Certificate Available
  • Effort: 8-10 hours a week
  • Duration: 7 weeks long

Overview

We introduce the participant to modern distributed file systems and MapReduce, including what distinguishes good MapReduce algorithms from good algorithms in general. The rest of the course is devoted to algorithms for extracting models and information from large datasets. Participants will learn how Google's PageRank algorithm models the importance of Web pages, along with some of the many extensions that have been used for a variety of purposes. We'll cover locality-sensitive hashing, a bit of magic that allows you to find similar items in a set of items so large you cannot possibly compare each pair. When data is stored as a very large, sparse matrix, dimensionality reduction is often a good way to model the data, but standard approaches do not scale well; we'll talk about efficient approaches. Many other large-scale algorithms are covered as well, as outlined in the course syllabus.
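
To give a flavor of how PageRank models the importance of Web pages, here is a minimal sketch of the textbook power-iteration formulation. It is not taken from the course materials: the function name `pagerank`, the damping factor of 0.85, the iteration count, and the three-page `toy_web` graph are all illustrative assumptions.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Power-iteration PageRank over a dict {page: [pages it links to]}.

    Assumes every link target also appears as a key in `links`.
    """
    pages = list(links)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}  # start from a uniform distribution
    for _ in range(iterations):
        # Every page receives the "teleport" share (1 - damping) / n.
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page, outlinks in links.items():
            if outlinks:
                # Split this page's damped rank evenly among its out-links.
                share = damping * rank[page] / len(outlinks)
                for target in outlinks:
                    new_rank[target] += share
            else:
                # Dangling page (no out-links): spread its rank over all pages.
                for target in pages:
                    new_rank[target] += damping * rank[page] / n
        rank = new_rank
    return rank


# Hypothetical three-page web: A links to B and C, B links to C, C links to A.
toy_web = {"A": ["B", "C"], "B": ["C"], "C": ["A"]}
ranks = pagerank(toy_web)
```

In this toy graph, C is linked from both A and B and ends up with the highest score; A, which is linked only from C, comes next, and B comes last. The scores always sum to 1, so they can be read as the share of time a random surfer spends on each page.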

Syllabus

Week 1:
MapReduce
Link Analysis -- PageRank

Week 2:
Locality-Sensitive Hashing -- Basics + Applications
Distance Measures
Nearest Neighbors
Frequent Itemsets

Week 3:
Data Stream Mining
Analysis of Large Graphs

Week 4:
Recommender Systems
Dimensionality Reduction

Week 5:
Clustering
Computational Advertising

Week 6:
Support-Vector Machines
Decision Trees
MapReduce Algorithms

Week 7:
More About Link Analysis -- Topic-Specific PageRank, Link Spam
More About Locality-Sensitive Hashing

Taught by

Jure Leskovec, Anand Rajaraman, and Jeff Ullman

Reviews for Stanford OpenEdx's Mining Massive Datasets
4.5 stars, based on 24 reviews

  • 5 stars 58%
  • 4 stars 38%
  • 3 stars 4%
  • 2 stars 0%
  • 1 star 0%

Anonymous
4.0 4 years ago
Anonymous partially completed this course.
This course has interesting content but is somewhat lacking in pedagogy.

The course has a lot of good content, notably from J. Ullman, but the sessions are very long and the pedagogy is not optimal.

The course is a huge time investment, with dense content throughout its 7 weeks or so. If you can get over this it will be very rewarding, but not everyone has that kind of time available.

The course would probably be better off cut into smaller chunks or offered as a self-paced course.

Also, the fact that the course doesn't offer a verified certificate will make one think twice before investing so much time in it.
12 people found this review helpful
Anonymous
4.0 3 years ago
Anonymous completed this course.
I found the course to be of medium difficulty for a post-grad student, and I would expect it to be rather hard for an undergrad.

The content is delivered at two paces: Prof. Ullman's lectures are hard to follow, as he moves quickly through many of the course's concepts and does not use enough examples or explain them in enough detail. Jure, on the other hand, uses a lot of examples and is easy to follow even for an undergrad.

Overall it is a time-consuming course; expect to need around 6-8 hours per week. In the end, you do learn quite a lot and it is a good course to take. I am in favor of the instructors' choice of offering it as it is taught at Stanford.

Something that could help is to split the content over 10 weeks instead of 7 and add mandatory programming exercises. They help a lot in learning things and remembering them for a long time.
3 people found this review helpful
Hchan H
5.0 3 years ago
Hchan completed this course, spending 10 hours a week on it, and found the course difficulty to be hard.
Excellent course by the authors, covering the content of the book of the same name: http://www.amazon.com/gp/product/1107077230. It is the MOOC version of http://cs246.stanford.edu. Many useful topics in large-scale data processing algorithms are covered, including MapReduce, PageRank, network and graph analysis, and streaming algorithms, just to mention a few. The level is advanced undergrad or postgrad, with some chapters covering topics from research papers published within the last decade.

Pacing is faster than most other MOOCs (I estimate about 2x the workload of a typical MOOC). But the material is very useful and rewarding. Exercises are comprehensive and the forums are very useful for checking your understanding.

1 person found this review helpful
Aliaksandr B
4.0 3 years ago
Aliaksandr completed this course, spending 7 hours a week on it, and found the course difficulty to be hard.
A very interesting course that covers a lot of topics. It is rather difficult and takes a lot of time (the lectures alone usually take around 3 hours/week, and it's hard to watch them faster than 1.25x). The only disappointment for me was the lectures taught by Prof. Ullman: it was very hard to follow his monotone reading. The other two lecturers have strong accents but were much more lively and understandable.
3 people found this review helpful
You-cyuan J
4.0 4 years ago
You-cyuan completed this course, spending 6 hours a week on it, and found the course difficulty to be medium.
5 people found this review helpful
Klaas N
5.0 3 years ago
Klaas completed this course.
0 people found this review helpful
César A
5.0 3 years ago
César completed this course.
0 people found this review helpful
Adam H
4.0 a year ago
Adam completed this course.
Ashlynn P
5.0 a year ago
Ashlynn completed this course.
Lars A
4.0 2 years ago
Lars audited this course.
Mark B
5.0 3 years ago
Mark completed this course.
Vlad P
5.0 3 years ago
Vlad completed this course.
Paweł K
4.0 2 years ago
Paweł completed this course.
Mike R
5.0 2 years ago
Mike completed this course.
Colin K
4.0 3 years ago
Colin completed this course.
José S
5.0 3 years ago
José completed this course.
Cristina C
5.0 3 years ago
Cristina partially completed this course.
Alex I
3.0 2 years ago
Alex audited this course.
Niklas L
5.0 3 years ago
Niklas completed this course.
Valentin K
5.0 2 years ago
Valentin completed this course.