Big Data

University of California, San Diego via Coursera Specialization


Details


Provider: Coursera Specialization
Pricing: Paid Course
Languages: English
Certificate: Certificate Available
Duration & workload: 35 weeks, 3 hours a week
Level: Beginner


Overview


Drive better business decisions with an overview of how big data is organized, analyzed, and interpreted. Apply your insights to real-world problems and questions.

Do you need to understand big data and how it will impact your business? This Specialization is for you. You will gain an understanding of what insights big data can provide through hands-on experience with the tools and systems used by big data scientists and engineers. Previous programming experience is not required! You will be guided through the basics of using Hadoop with MapReduce, Spark, Pig and Hive. By following along with provided code, you will experience how one can perform predictive modeling and leverage graph analytics to model problems. This specialization will prepare you to ask the right questions about data, communicate effectively with data scientists, and do basic exploration of large, complex datasets. In the final Capstone Project, developed in partnership with data software company Splunk, you’ll apply the skills you learned to do basic analyses of big data.

Syllabus

Course 1: Introduction to Big Data
- Offered by University of California San Diego. Interested in increasing your knowledge of the Big Data landscape? This course is for those ...

Course 2: Big Data Modeling and Management Systems
- Offered by University of California San Diego. Once you’ve identified a big data issue to analyze, how do you collect, store and organize ...

Course 3: Big Data Integration and Processing
- Offered by University of California San Diego. At the end of the course, you will be able to: Retrieve data from example database and big ...

Course 4: Machine Learning With Big Data
- Offered by University of California San Diego. Want to make sense of the volumes of data you have collected? Need to incorporate ...

Course 5: Graph Analytics for Big Data
- Offered by University of California San Diego. Want to understand your data network structure and how it changes under different conditions? ...

Course 6: Big Data - Capstone Project
- Offered by University of California San Diego. Welcome to the Capstone Project for Big Data! In this culminating project, you will build a ...

Courses

Introduction to Big Data
34 reviews
17 hours 28 minutes

Interested in increasing your knowledge of the Big Data landscape? This course is for those new to data science and interested in understanding why the Big Data Era has come to be. It is for those who want to become conversant with the terminology and the core concepts behind big data problems, applications, and systems. It is for those who want to start thinking about how Big Data might be useful in their business or career. It provides an introduction to one of the most common frameworks, Hadoop, that has made big data analysis easier and more accessible, increasing the potential for data to transform our world!

At the end of this course, you will be able to:
- Describe the Big Data landscape, including examples of real-world big data problems and the three key sources of Big Data: people, organizations, and sensors.
- Explain the V’s of Big Data (volume, velocity, variety, veracity, valence, and value) and why each impacts data collection, monitoring, storage, analysis and reporting.
- Get value out of Big Data by using a 5-step process to structure your analysis.
- Identify what are and what are not big data problems and be able to recast big data problems as data science questions.
- Provide an explanation of the architectural components and programming models used for scalable big data analysis.
- Summarize the features and value of core Hadoop stack components including the YARN resource and job management system, the HDFS file system and the MapReduce programming model.
- Install and run a program using Hadoop!

This course is for those new to data science. No prior programming experience is needed, although the ability to install applications and utilize a virtual machine is necessary to complete the hands-on assignments.

Hardware Requirements: (A) Quad Core Processor (VT-x or AMD-V support recommended), 64-bit; (B) 8 GB RAM; (C) 20 GB disk free.
How to find your hardware information: (Windows): Open System by clicking the Start button, right-clicking Computer, and then clicking Properties; (Mac): Open Overview by clicking on the Apple menu and clicking “About This Mac.” Most computers with 8 GB RAM purchased in the last 3 years will meet the minimum requirements. You will need a high-speed internet connection because you will be downloading files up to 4 GB in size.
Software Requirements: This course relies on several open-source software tools, including Apache Hadoop. All required software can be downloaded and installed free of charge. Software requirements include: Windows 7+, Mac OS X 10.10+, Ubuntu 14.04+ or CentOS 6+; VirtualBox 5+.

Machine Learning With Big Data
14 reviews
21 hours 34 minutes

Want to make sense of the volumes of data you have collected? Need to incorporate data-driven decisions into your process? This course provides an overview of machine learning techniques to explore, analyze, and leverage data. You will be introduced to tools and algorithms you can use to create machine learning models that learn from data, and to scale those models up to big data problems.

At the end of the course, you will be able to:
- Design an approach to leverage data using the steps in the machine learning process.
- Apply machine learning techniques to explore and prepare data for modeling.
- Identify the type of machine learning problem in order to apply the appropriate set of techniques.
- Construct models that learn from data using widely available open source tools.
- Analyze big data problems using scalable machine learning algorithms on Spark.

Software Requirements: Cloudera VM, KNIME, Spark

Graph Analytics for Big Data
6 reviews
13 hours 5 minutes

Want to understand your data network structure and how it changes under different conditions? Curious to know how to identify closely interacting clusters within a graph? Have you heard of the fast-growing area of graph analytics and want to learn more? This course gives you a broad overview of the field of graph analytics so you can learn new ways to model, store, retrieve and analyze graph-structured data. After completing this course, you will be able to model a problem into a graph database and perform analytical tasks over the graph in a scalable manner. Better yet, you will be able to apply these techniques to understand the significance of your data sets for your own projects.

Big Data - Capstone Project
0 reviews
20 hours 32 minutes

Welcome to the Capstone Project for Big Data! In this culminating project, you will build a big data ecosystem using tools and methods from the earlier courses in this specialization. You will analyze a data set simulating big data generated from a large number of users who are playing our imaginary game "Catch the Pink Flamingo". During the five-week Capstone Project, you will walk through the typical big data science steps for acquiring, exploring, preparing, analyzing, and reporting. In the first two weeks, we will introduce you to the data set and guide you through some exploratory analysis using tools such as Splunk and Open Office. Then we will move into more challenging big data problems requiring the more advanced tools you have learned, including KNIME, Spark's MLlib and Gephi. Finally, during the fifth and final week, we will show you how to bring it all together to create engaging and compelling reports and slide presentations. As a result of our collaboration with Splunk, a software company focused on analyzing machine-generated big data, learners with the top projects will be eligible to present to Splunk and meet Splunk recruiters and engineering leadership.

Big Data Modeling and Management Systems
3 reviews
13 hours 26 minutes

Once you’ve identified a big data issue to analyze, how do you collect, store and organize your data using Big Data solutions? In this course, you will experience various data genres and management tools appropriate for each. You will be able to describe the reasons behind the evolving plethora of new big data platforms from the perspective of big data management systems and analytical tools. Through guided hands-on tutorials, you will become familiar with techniques using real-time and semi-structured data examples. Systems and tools discussed include: AsterixDB, HP Vertica, Impala, Neo4j, Redis, SparkSQL. This course provides techniques to extract value from existing untapped data sources and to discover new data sources.

At the end of this course, you will be able to:
- Recognize different data elements in your own work and in everyday life problems
- Explain why your team needs to design a Big Data Infrastructure Plan and Information System Design
- Identify the frequent data operations required for various types of data
- Select a data model to suit the characteristics of your data
- Apply techniques to handle streaming data
- Differentiate between a traditional Database Management System and a Big Data Management System
- Appreciate why there are so many data management systems
- Design a big data information system for an online game company

This course is for those new to data science. Completion of Intro to Big Data is recommended. No prior programming experience is needed, although the ability to install applications and utilize a virtual machine is necessary to complete the hands-on assignments. Refer to the specialization technical requirements for complete hardware and software specifications.

Hardware Requirements: (A) Quad Core Processor (VT-x or AMD-V support recommended), 64-bit; (B) 8 GB RAM; (C) 20 GB disk free.
How to find your hardware information: (Windows): Open System by clicking the Start button, right-clicking Computer, and then clicking Properties; (Mac): Open Overview by clicking on the Apple menu and clicking “About This Mac.” Most computers with 8 GB RAM purchased in the last 3 years will meet the minimum requirements. You will need a high-speed internet connection because you will be downloading files up to 4 GB in size.
Software Requirements: This course relies on several open-source software tools, including Apache Hadoop. All required software can be downloaded and installed free of charge (except for data charges from your internet provider). Software requirements include: Windows 7+, Mac OS X 10.10+, Ubuntu 14.04+ or CentOS 6+; VirtualBox 5+.

Big Data Integration and Processing
3 reviews
17 hours 32 minutes

At the end of the course, you will be able to:
- Retrieve data from example database and big data management systems
- Describe the connections between data management operations and the big data processing patterns needed to utilize them in large-scale analytical applications
- Identify when a big data problem needs data integration
- Execute simple big data integration and processing on Hadoop and Spark platforms

This course is for those new to data science. Completion of Intro to Big Data is recommended. No prior programming experience is needed, although the ability to install applications and utilize a virtual machine is necessary to complete the hands-on assignments. Refer to the specialization technical requirements for complete hardware and software specifications.

Hardware Requirements: (A) Quad Core Processor (VT-x or AMD-V support recommended), 64-bit; (B) 8 GB RAM; (C) 20 GB disk free.
How to find your hardware information: (Windows): Open System by clicking the Start button, right-clicking Computer, and then clicking Properties; (Mac): Open Overview by clicking on the Apple menu and clicking “About This Mac.” Most computers with 8 GB RAM purchased in the last 3 years will meet the minimum requirements. You will need a high-speed internet connection because you will be downloading files up to 4 GB in size.
Software Requirements: This course relies on several open-source software tools, including Apache Hadoop. All required software can be downloaded and installed free of charge (except for data charges from your internet provider). Software requirements include: Windows 7+, Mac OS X 10.10+, Ubuntu 14.04+ or CentOS 6+; VirtualBox 5+.

Taught by

Amarnath Gupta, Ilkay Altintas and Mai Nguyen

Reviews

1.3 rating, based on 16 Class Central reviews


Anonymous

Good content - very poor execution, evaluation, and practical exercises
This course has all the right stuff, but fails in the execution phase.

Content is appropriate, but after course one, the level of instruction suffers greatly. Basically sets of slides with a person in the corner reading to you from a script. Course one used appropriate examples to ensure students can grasp concepts. Course 2 examples were simply snippets of code that the instructor read to students in the video.

Access to faculty - nonexistent - all help is through peer message boards. Completely unsatisfactory.

Evaluations - quizzes were hit or miss in terms of content and quality control. Most became a guessing game due to the multiple answers on multiple choice questions. In many cases the questions did not match the content offered in the lectures, which resulted in a "print the quiz and go through the transcript looking for phrases that match" exercise. Again - unsatisfactory.

Practical exercises - good in course one. Appropriate in course two, but they require programming skills in Python. No big deal if advertised up front, but the course description leads one to believe that coding skills are not a must. Again - all help is through peer message boards, and the coding came down to a "let's try this" drill. The PE instructions even point out that you may run into errors and will have to "experiment with the code". That's great if you have access to faculty or staff to assist, but disappointing if you are left on your own.

Advice to potential students - if you are comfortable with Python and don't mind being read to, go for it. If you are new or not current on programming skills, make sure you have a friend to help you with the coding concepts, because you won't get any help from the course faculty.
Marat Bakiev

Finished first 3 courses out of 6.
I absolutely agree with the others. The specialization is bad.

Yes, I learned something at a very (very, very) high level. I can download the Cloudera VM and use Hue to perform some computations on a file that is larger than the available RAM. But, of course, that is not the level most people interested in Hadoop were looking for.

They try to teach you Hadoop at a level that is suitable maybe for Excel, but nothing more than that.

Don't pay for it. Just check courses first (you can save your answers and take a certificate after that very fast).
Ericdo1810

Terrible Offering for a highly-technical subject
Completed the 4th course, Machine Learning with Big Data, in a mere 2 hours. Storming through the videos at 2.0x speed, I could still grasp everything. Why? Because the contents are so generic there's nothing to really capture. Most of the stuff you can read on Wikipedia about Machine Learning and Big Data; well, this course is an audio version of that info.

The hands-on assignments are practically just there for the sake of having some practical exercises. There are no explanations of their purpose or meaning, or even of the significance of the design of the assignments.

Worst thing now, the quizzes. Can you believe a reputable university offering courses in this topic actually has 50% of the quiz questions as TRUE/FALSE????? Note that Coursera has unlimited trials, hence quizzes that are TRUE/FALSE are trivial beyond imagination. Worse still, the True/False questions are not the kind that require you to think. They are the kind that say A is B, true or false? Of course False. Or A is A, True or False.

I can't say how much this entire specialization needs to be reworked. It's really frustrating to complete this course. I'm sure many others find the experience similar too.
Anonymous

Waste of time and money
Sad to say, but this specialization (at least the first 3 courses at the moment) is the worst I have seen on Coursera. The lecturers show no evidence of an ability to teach, have no materials of their own, and ask obscure quiz questions about data covered neither in the videos/slides nor in the official documentation of the software used.
Issues with assignments or software are not addressed by staff members, and you can only rely on help from other students.
The material covering topics from the 1st part of the specialization is just words about how cool it is to work with Hadoop and Hadoop-based products.
The 2nd course talks about basics you can grab from the official documentation.
The 3rd course includes parts not used in modern Big Data installations (try to find current info on Pig usage).
Anonymous

Worst. Specialization. Ever.
I completed the first course of the series and started the second. Every other comment is spot on. There is ZERO effective instruction.

One of the "lectures" consisted of the instructor reading the online tutorial (but not admitting that she was just doing this). Worse yet, in the video, she overlooked some of the steps in the tutorial, so that any student who tried to follow the video step-by-step would be unable to replicate the instructor's results.

All of the videos were set up as PowerPoint presentations (even when they were trying to walk through some practical programming exercises). The quizzes were jokes, and the assignments were completely unrelated to the instruction videos.
Victor Pillac

Poor value
Dropped after the second course.
The introduction looked interesting, but the "Hadoop Platform and Application Framework" was extremely disappointing.
The lectures are not structured and presented in an interesting way (and copying illustrations from the web does not help). The lecturer reads the bullet points or a script, and repeats a lot of what was said in the previous course from the specialization. The coding exercises basically consist of copy/pasting code and executing it, while the quizzes test for simple facts.
I was considering doing the full specialization, but I am glad I had a look at the content first; for the price, one would expect much higher quality.
Anonymous

Bugs in scripts, poor audio, glossing over important details lead to a poor learning experience
The title says it all. I've been in IT for over 35 years. I consider myself technically competent. I wanted exposure to Hadoop. I'm enrolled in the specialization. I spend the majority of my time trying to decipher the lecture (ironically, even subtitles flash [inaudible] on a regular basis) and understand the concepts presented at lightning speed. Thank goodness I'm proficient with UNIX and shell scripting. Non-technical people should NOT take this course. Finally, zero support from UC.
Anonymous

Don't start here. Take more Python instead.
I took the first 3 of 6, and after reading these reviews decided not to proceed to 4. I put in the recommended effort and a bit extra. My grades were fine, but the courses don't seem fully tested or cooked. Get your Python skills in order before tackling course 2. There are good resources on the internet to support you in the course outside of the course (manuals, tutorials, etc.). I took them out of curiosity. The other comments are pretty much on point. I did use Excel a few times to check/validate my results.
Anonymous

Interesting and good, even if not fully satisfying for a software engineer
As a software developer with poor knowledge of the big data environment, I would say that this course is useful. It will open your mind to the infrastructure and algorithms used, but it does not go very deep into technical details. At least it scratches the surface of the Hadoop framework, so you end up with some more material to research and you get in touch with the basic concepts of the Big Data universe.
Anonymous

I would give no stars if I had a choice
Poor quality material for the cost of the course.
Grading and peer assignments are a joke. Despite having correct assignments, there is a chance you will fail those assignments since peers who did not understand the course will evaluate you. There is no way to guarantee that experts will review your work.
Better to take free courses on big data and learn by yourself.
Anonymous

Bad delivery of an otherwise very interesting subject.
If you want to learn Big Data, go somewhere else. UCSD's Coursera specialization really screwed up. The materials are okay, but they are badly organized and delivered, in a scatter-brained way. They also messed up royally on the final capstone, not only delaying it 5-6 months, but also putting out a really pitiful idea for the project.
Juri

Terrible
Don't buy this specialization. I'm on the 3rd course now, and if I had known this before, I wouldn't have taken it even for free. Absolutely useless assignments and quizzes. The videos just read through intro tutorials. Practical assignments are not explained correctly. Tutors are absent from discussions.
Anonymous

They changed the specialization without notice and I lost all my work & money
Coursera changed the specialization without prior notice. I had finished 4 courses out of 5, and now I have lost it all because they decided to change the specialization without prior notice. Quite a scam in my opinion.
Anonymous

The worst Coursera course ever. I wish I could give negative stars.
Steer away from this course. Waste of time and money.
They should return the money to everybody attending it with a letter of apology.
The only reason I finished it is because I paid for it.
Gerrit Klaschke

complete waste of money
No content, very slow and absolutely not worth the money. What was promised was 3 weeks of 3-5 hours a week of content/work. It was 1.5 hours ALTOGETHER. False advertisement and no assistance from Coursera staff or UCSD.
Anonymous

Awful.
Course is poorly designed. Material is covered extremely superficially. Lectures have factual errors. Quizzes are poorly worded and don't really test understanding. Stay away from this course. It is a waste of time and money.
