edX: CS190.1x: Scalable Machine Learning

with Ameet Talwalkar
Class Central Course Rank
#3 in Subjects > Computer Science > Machine Learning

Machine learning aims to extract knowledge from data, relying on fundamental concepts in computer science, statistics, probability and optimization. Learning algorithms enable a wide range of applications, from everyday tasks such as product recommendations and spam filtering to bleeding-edge applications like self-driving cars and personalized medicine. In the age of ‘Big Data,’ with datasets rapidly growing in size and complexity and cloud computing becoming more pervasive, machine learning techniques are fast becoming a core component of large-scale data processing pipelines.
 
This course introduces the underlying statistical and algorithmic principles required to develop scalable real-world machine learning pipelines. We present an integrated view of data processing by highlighting the various components of these pipelines, including exploratory data analysis, feature extraction, supervised learning, and model evaluation. You will gain hands-on experience applying these principles using Apache Spark, a cluster computing system well-suited for large-scale machine learning tasks. You will implement scalable algorithms for fundamental statistical models (linear regression, logistic regression, matrix factorization, principal component analysis) while tackling key problems from domains such as online advertising and cognitive neuroscience.
 
This self-assessment document provides a short quiz, as well as online resources that review the relevant background material. 
31 student reviews
Cost: Free Online Course
Pace: Finished
Provider: edX
Language: English
Certificates: Certificate Available
Calendar: 5 weeks long
In-Depth Review
If you’ve never done any machine learning before, then this course will surely whet your appetite and encourage you to dive deeper into other popular algorithms.

31 reviews for edX's CS190.1x: Scalable Machine Learning

1 out of 1 people found the following review useful
2 years ago
Gregory J Hamel (Life Is Study) completed this course and found the course difficulty to be medium.
Scalable Machine Learning is a 5-week distributed machine learning course offered by UC Berkeley through the edX platform. It is a follow-up to another UC Berkeley course, Introduction to Big Data with Apache Spark. Although the first course is not a strict prerequisite, Scalable Machine Learning uses the same virtual machine and even has some overlap with the homework labs, so it is beneficial to take Introduction to Big Data first. Scalable Machine Learning teaches distributed machine learning basics using PySpark, Apache Spark’s Python API. Basic proficiency with Python is necessary to pass the course, and some exposure to algorithms and machine learning concepts is helpful. Course evaluation is based primarily on 5 labs distributed as IPython notebooks.

The first two weeks of the course cover machine learning basics and introduce Apache Spark. For students already familiar with machine learning basics who took Introduction to Big Data, there’s not much new to learn during the first two weeks. Week 2 is essentially an exact clone of week 2 of the Introduction to Big Data course, including the lab assignment. The final 3 weeks have meatier lecture content and longer labs, each covering a different machine learning technique: linear regression, logistic regression and principal component analysis.

The lecture content is clean and the lecturer speaks clearly. His delivery isn’t perfect, but the only real purpose of the lectures is to serve as background information for the meat of the course: the labs. Each lab is a lengthy IPython notebook with several sections leading you through the process of creating a pipeline for running a machine learning algorithm with PySpark. Much of the code you need is provided for you, but writing the key functions and data transformations necessary to complete the labs can still be time-consuming. Little things like an ambiguous instruction or an uncaught error you made earlier in the assignment can result in bugs that take a while to squash. Despite occasional frustrations, the labs do a good job of interspersing instruction with practical, hands-on learning.

Scalable Machine Learning is a quality introduction to machine learning with PySpark that focuses on labs over lectures. The lectures could be better and some of the instructions and error checks in the labs could be more comprehensive, but this is a great course for those looking to learn by doing.

I give Scalable Machine Learning 4 out of 5 stars: Very Good.
2 years ago
Martin Strandbygaard completed this course, spending 4 hours a week on it and found the course difficulty to be medium.
Overall a good course that is worth spending the time on if you want a basic introduction to solving machine learning problems using Apache Spark.

As with the precursor, CS100.1x, the lecture videos and quizzes are pretty light on actual content and nothing spectacular. However, as with the precursor I found the assignments really well structured, interesting, and informative. They use IPython notebook which I found to be a really awesome format for this kind of course and assignments.

The course is not heavy on the mathematics of machine learning algorithms, and its introductions to the algorithms it uses are very basic. For this, something like Machine Learning on Coursera is a much better course.

What this course does is give you a good introduction to solving some actual problems using a selection of machine learning algorithms with Apache Spark.

I found some of the assignments for this course to be easier than some of the later assignments for the introductory course, CS100.1x.

I had a hard time deciding whether this course should get 3 or 4 stars, but ended up with 3 stars. The assignments definitely rate 4 stars, and I think that is the most important aspect of the course. I think the lecture videos only rate 3 stars. For comparison, watch the lectures from Machine Learning on Coursera, which I believe rate 5 stars.
2 years ago
Gaurabh completed this course, spending 5 hours a week on it and found the course difficulty to be medium.
Machine learning using Spark is explained very well from scratch, which makes this a good introductory course. Not many details are covered, probably due to time limitations. I hope they make a sequel.
2 years ago
Anonymous is taking this course right now.
The machine learning algorithms are explained at a reasonably granular level and are easy to follow. The labs are the highlight. I learnt a lot from doing. Thanks for putting this course together.
2 years ago
V M completed this course.
2 years ago
Shuang Wu completed this course.
2 years ago
Tabish Sada is taking this course right now.
a year ago
Mark Henry Butler completed this course.
2 years ago
Lena S completed this course.
9 months ago
Davide Madrisan completed this course.
a year ago
Colin Khein completed this course.
a year ago
Jevgeni Martjushev completed this course.
a year ago
Jevgeni Martjushev completed this course.
2 years ago
Rogier Werschkull audited this course.
2 years ago
Lace Lofranco is taking this course right now.
a year ago
Shayan Fahimi completed this course.
2 years ago
C M Chan completed this course.
2 years ago
Sauro Grandi completed this course.
2 months ago
Atila Romero completed this course.
2 years ago
Prakhar Srivastav completed this course.
2 years ago
Vlad Podgurschi completed this course, spending 6 hours a week on it and found the course difficulty to be medium.
2 years ago
Anonymous completed this course.
2 years ago
Maurits Doorn is taking this course right now.
2 years ago
Chema Cortés completed this course.
2 years ago
Peter Mosoni completed this course.
2 years ago
Dmitry Nikulin completed this course.
2 years ago
Gerhard Gasseling completed this course.
2 years ago
Igor Subbotin is taking this course right now.
2 years ago
Sergiy Matusevych completed this course.
a year ago
Gregory Deangelis completed this course.
2 years ago
Liang Lu is taking this course right now.