
Big Data Analysis with Apache Spark

University of California, Berkeley via edX

  • Provider edX
  • Subject Big Data
  • Cost Free Online Course
  • Session Finished
  • Language English
  • Effort 5-10 hours a week
  • Duration 4 weeks long


Overview

Organizations use their data to support and influence decisions and build data-intensive products and services, such as recommendation, prediction, and diagnostic systems. The collection of skills required by organizations to support these functions has been grouped under the term ‘data science’.

This statistics and data analysis course will attempt to articulate the expected output of data scientists and then teach students how to use PySpark (part of Spark) to deliver against these expectations. The course assignments include log mining, textual entity recognition, and collaborative filtering exercises that teach students how to manipulate data sets using parallel processing with PySpark.

This course covers advanced undergraduate-level material. It requires a programming background and experience with Python (or the ability to learn it quickly). All exercises use PySpark (the Python API for Spark), and previous experience with Spark, equivalent to the Introduction to Apache Spark course, is required.

Taught by

Anthony D. Joseph

Reviews for edX's Big Data Analysis with Apache Spark
4.3 Based on 43 reviews

  • 5 stars 44%
  • 4 stars 44%
  • 3 stars 12%
  • 2 stars 0%
  • 1 star 0%

Gregory S
4.0 3 years ago
Gregory completed this course.
CS100.1x Introduction to Big Data with Apache Spark is a 5-week intro to distributed computing offered by UC Berkeley through the edX MOOC platform focused on teaching students how to perform large-scale computation using Apache Spark. The assignments use PySpark, Spark’s Python API, so some familiarity with Python programming is necessary. You don’t need prior exposure to big data or distributed computing to take the course. Grades are based on four programming labs (80%), easy comprehension questions that allow unlimited attempts (12%) and setup of the course virtual machine us…
8 people found this review helpful
Martin S
4.0 3 years ago
Martin completed this course, spending 4 hours a week on it, and found the course difficulty to be medium.
Overall a good course that is worth spending the time on if you want to get familiar with Spark and the map-reduce programming model.

The lecture videos and quizzes are pretty lightweight, and nothing spectacular. However, I found the assignments really well structured, interesting, and informative. They use IPython notebooks, which I found to be a really awesome format for this kind of course and assignments.

The course is not heavy on mathematics and statistics, but the assignments will challenge you to really understand the stated problems and the map-reduce programming model in order to complete them successfully.

2 people found this review helpful
Anoop T
5.0 3 years ago
Anoop is taking this course right now and found the course difficulty to be medium.
It was a nice course. I loved it.

Good intro to the PySpark API.

Nice set of problem sets.

As part of it, if you are lucky, you will get access to Databricks Cloud.
1 person found this review helpful
Wendao L
3.0 3 years ago
Wendao is taking this course right now.
Slightly disappointed by the content; it is not very informative. If you want to learn more about Spark, you definitely need to explore more material.
1 person found this review helpful
Gaurav S
3.0 3 years ago
Gaurav is taking this course right now, spending 4 hours a week on it, and found the course difficulty to be medium.
Lectures are very light on content and disappointing, but the labs are good and do require students to investigate and complete them.
1 person found this review helpful
Charlie S
4.0 3 years ago
Charlie completed this course, spending 5 hours a week on it, and found the course difficulty to be hard.
This is an excellent course for beginners to the world of Spark, but it would be a good idea to have some programming knowledge in Python as well as a basic understanding of what big data means. The problem sets are organized methodically with plenty of explanation, so even if you don't know much statistics you can still follow along with the programming. I'm no statistician but managed to go through all problem sets with few mistakes. It certainly was fun on top of being educational and informative.
Shuang W
5.0 3 years ago
Shuang completed this course.
Tabish S
4.0 3 years ago
Tabish audited this course.
Gabriel T
5.0 3 years ago
Gabriel is taking this course right now.
V M
4.0 3 years ago
V completed this course.
Klaas N
3.0 3 years ago
Klaas audited this course.
Prakhar S
5.0 3 years ago
Prakhar completed this course.
Rogier W
4.0 3 years ago
Rogier completed this course.
Karri S
4.0 3 years ago
Karri completed this course.
Hamza R
5.0 3 years ago
Hamza is taking this course right now.
Kuronosuke K
5.0 3 years ago
Kuronosuke is taking this course right now.
Guilherme S
5.0 3 years ago
Guilherme completed this course.
Vlad P
4.0 3 years ago
Vlad completed this course.
Chema C
5.0 3 years ago
Chema completed this course.
Rakesh R
4.0 3 years ago
Rakesh is taking this course right now.