
CS190.1x: Scalable Machine Learning

University of California, Berkeley via edX

This course may be unavailable.

Overview

Machine learning aims to extract knowledge from data, relying on fundamental concepts in computer science, statistics, probability and optimization. Learning algorithms enable a wide range of applications, from everyday tasks such as product recommendations and spam filtering to bleeding-edge applications like self-driving cars and personalized medicine. In the age of ‘Big Data,’ with datasets rapidly growing in size and complexity and cloud computing becoming more pervasive, machine learning techniques are fast becoming a core component of large-scale data processing pipelines.
 
This course introduces the underlying statistical and algorithmic principles required to develop scalable real-world machine learning pipelines. We present an integrated view of data processing by highlighting the various components of these pipelines, including exploratory data analysis, feature extraction, supervised learning, and model evaluation. You will gain hands-on experience applying these principles using Apache Spark, a cluster computing system well suited to large-scale machine learning tasks. You will implement scalable algorithms for fundamental statistical models (linear regression, logistic regression, matrix factorization, principal component analysis) while tackling key problems from domains such as online advertising and cognitive neuroscience.
 
The course's self-assessment document provides a short quiz, as well as online resources that review the relevant background material.

Taught by

Ameet Talwalkar

Reviews

4.5 rating, based on 31 Class Central reviews


  • Jevgeni Martjushev
  • Scalable Machine Learning is a 5-week distributed machine learning course offered by UC Berkeley through the edX platform. It is a follow-up to another UC Berkeley course: Introduction to Big Data with Apache Spark. Although the first course is not…
  • Overall a good course that is worth spending the time on if you want a basic introduction to solving machine learning problems using Apache Spark. As with the precursor, CS100.1x, the lecture videos and quizzes are pretty light on act…
  • Anonymous
    The machine learning algorithms are explained at a reasonably granular level and are easy to follow. The labs are the highlight. I learnt a lot from doing. Thanks for putting this course together.
  • Very well explained machine learning using Spark from scratch. Therefore a good introductory course. Not too many details covered, probably due to time limitation. Hope they make a sequel.
