University of California, Berkeley

CS125x: Advanced Distributed Machine Learning with Apache Spark

University of California, Berkeley via edX

This course may be unavailable.

Overview

Building on the core ideas presented in Distributed Machine Learning with Spark, this course covers advanced topics in training and deploying large-scale learning pipelines. You will study state-of-the-art distributed algorithms for collaborative filtering, ensemble methods (e.g., random forests), clustering, and topic modeling, with a focus on model parallelism and the tradeoffs between computation and communication.

After completing this course, you will have a thorough understanding of the statistical and algorithmic principles required to develop and deploy distributed machine learning pipelines. You will also be able to write efficient, scalable code in Spark, particularly with MLlib and the spark.ml package.

Taught by

Ameet Talwalkar and Jon Bates
