
Udacity: Intro to Data Science

with Dave Holtz
The Introduction to Data Science class will survey the foundational topics in data science, namely:

* Data Manipulation
* Data Analysis with Statistics and Machine Learning
* Data Communication with Information Visualization
* Data at Scale -- Working with Big Data

The class will focus on breadth and present the topics briefly instead of focusing on a single topic in depth. This will give you the opportunity to sample and apply the basic techniques of data science.

This course is also a part of our Data Analyst Nanodegree.

Why Take This Course?
You will have an opportunity to work through a data science project end to end, from analyzing a dataset to visualizing and communicating your data analysis.

Working on the class project will expose you to the skills you need to become a data scientist yourself.

Syllabus

### Lesson 1: Introduction to Data Science

- Introduction to Data Science
- What is a Data Scientist
- Pi-Chuan (Data Scientist @ Google): What is Data Science?
- Gabor (Data Scientist @ Twitter): What is Data Science?
- Problems Solved by Data Science
- Pandas
- Dataframes
- Create a New Dataframe

### Lesson 2: Data Wrangling

- What is Data Wrangling?
- Acquiring Data
- Common Data Formats
- What are Relational Databases?
- Aadhaar Data
- Aadhaar Data and Relational Databases
- Introduction to Database Schemas
- APIs
- Data in JSON Format
- How to Access an API Efficiently
- Missing Values
- Easy Imputation
- Impute using Linear Regression
- Tip of the Imputation Iceberg
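
The "Easy Imputation" topic above is commonly understood as filling missing values with a simple summary statistic such as the mean of the observed values. The following is a minimal sketch under that assumption. The course itself works in pandas; this version uses only the standard library, and the function name `impute_mean` and the sample values are hypothetical.

```python
from statistics import mean


def impute_mean(values):
    """Replace None entries with the mean of the observed entries."""
    observed = [v for v in values if v is not None]
    if not observed:
        raise ValueError("cannot impute: no observed values")
    fill = mean(observed)
    return [fill if v is None else v for v in values]


# Made-up sample column with two missing entries.
weights = [150.0, None, 170.0, 160.0, None]
print(impute_mean(weights))
```

Mean imputation leaves the column mean unchanged but shrinks its variance, which is one reason regression-based imputation is listed as the next step.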

### Lesson 3: Data Analysis

- Statistical Rigor
- Kurt (Data Scientist @ Twitter) - Why is Stats Useful?
- Introduction to Normal Distribution
- T Test
- Welch T Test
- Non-Parametric Tests
- Non-Normal Data
- Stats vs. Machine Learning
- Different Types of Machine Learning
- Prediction with Regression
- Cost Function
- How to Minimize Cost Function
- Coefficients of Determination
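
The cost function, its minimization, and the coefficient of determination listed above can be illustrated with a small batch gradient descent for one-variable linear regression. This is a sketch, not the course's own code: it assumes the cost is half the mean squared error (a common convention), and the function names, step size `alpha`, iteration count, and toy data are all hypothetical.

```python
def cost(xs, ys, theta0, theta1):
    """Half the mean squared error of the line theta0 + theta1 * x."""
    m = len(xs)
    return sum((theta0 + theta1 * x - y) ** 2 for x, y in zip(xs, ys)) / (2 * m)


def gradient_descent(xs, ys, alpha=0.05, iterations=5000):
    """Fit y ~ theta0 + theta1 * x by repeatedly stepping against the gradient."""
    m = len(xs)
    theta0, theta1 = 0.0, 0.0
    for _ in range(iterations):
        errors = [theta0 + theta1 * x - y for x, y in zip(xs, ys)]
        grad0 = sum(errors) / m                               # d(cost)/d(theta0)
        grad1 = sum(e * x for e, x in zip(errors, xs)) / m    # d(cost)/d(theta1)
        theta0 -= alpha * grad0
        theta1 -= alpha * grad1
    return theta0, theta1


def r_squared(xs, ys, theta0, theta1):
    """Coefficient of determination: 1 - residual SS / total SS."""
    y_mean = sum(ys) / len(ys)
    ss_res = sum((y - (theta0 + theta1 * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - y_mean) ** 2 for y in ys)
    return 1 - ss_res / ss_tot


# Toy data lying exactly on y = 1 + 2x.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]
theta0, theta1 = gradient_descent(xs, ys)
print(theta0, theta1, cost(xs, ys, theta0, theta1), r_squared(xs, ys, theta0, theta1))
```

Each iteration moves both parameters a small step in the direction that lowers the cost. The step size here suits this toy data only; a step that is too large makes the updates diverge instead of converge.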

### Lesson 4: Data Visualization

- Effective Information Visualization
- Napoleon's March on Russia
- Don (Principal Data Scientist @ AT&T): Communicating Findings
- Rishiraj (Principal Data Scientist @ AT&T): Communicating Findings Well
- Visual Encodings
- Perception of Visual Cues
- Plotting in Python
- Data Scales
- Visualizing Time Series Data

### Lesson 5: MapReduce

- Big Data and MapReduce
- Basics of MapReduce
- Mapper
- Reducer
- MapReduce with Aadhaar Data
- MapReduce with Subway Data
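
The mapper and reducer roles listed above can be sketched in a single process. This is an illustration of the pattern only, not the course's assignment code: the `station,entries` line format, the function names, and the sample rows are hypothetical, and a real MapReduce framework would distribute the map, shuffle, and reduce phases across machines.

```python
from itertools import groupby
from operator import itemgetter


def mapper(lines):
    """Emit one (station, entries) pair per well-formed input line."""
    for line in lines:
        parts = line.strip().split(",")
        if len(parts) != 2:
            continue  # skip malformed rows
        station, entries = parts
        yield station, int(entries)


def reducer(pairs):
    """Sum the values for each key; pairs must arrive sorted by key."""
    for key, group in groupby(pairs, key=itemgetter(0)):
        yield key, sum(value for _, value in group)


def run_job(lines):
    """Map, shuffle (sort by key), then reduce."""
    shuffled = sorted(mapper(lines), key=itemgetter(0))
    return dict(reducer(shuffled))


# Made-up rows in the assumed "station,entries" format.
sample = ["R001,10", "R002,5", "R001,7", "bad row", "R002,1"]
print(run_job(sample))
```

The sort between the two phases stands in for the shuffle step: it guarantees that all pairs sharing a key reach the reducer together, which is what lets the reducer process one key at a time.
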
Cost: Free Online Course
Pace: Self-Paced
Subject: Data Science
Provider: Udacity
Language: English
Hours: 6 hours a week
Calendar: 8 weeks long

12 reviews for Udacity's Intro to Data Science

10 out of 10 people found the following review useful
3 years ago
Life is Study completed this course.
Intro to data science is an intermediate level course that assumes basic Python programming skills and knowledge of statistics. The course focuses on gathering, manipulating, analyzing and visualizing data using Python and various Python packages such as numpy, scipy and pandas. One of the best parts about this course is getting some exposure to some Python packages in the scipy stack, although I wish more time was devoted to explaining what the various modules in the scipy stack do, how to set them up at home and when to use them.

The first lesson was a fairly gentle introduction with an interesting homework project dealing with data from the Titanic disaster. Lesson 2 goes into more detail about gathering and cleaning data using Pandas and an additional module that lets you make SQL queries to extract data from Pandas data frames. Lesson 3 jumps into data analysis with a T test and linear regression using gradient descent. Going from basic data manipulation into these topics was a bit jarring in terms of difficulty, and more time could have been spent explaining how the functions worked. I left without a great appreciation of what gradient descent is really doing. Lesson 4 is focused on making visualizations using a module that attempts to port the functionality of the R language’s ggplot2 plotting package. Finally, lesson 5 introduces the concept of big data and MapReduce as a solution to deal with large data sets. Each homework assignment after the first has students dealing with New York subway turnstile data, which allows students to get some level of familiarity with the data throughout the course. This was a very good decision, since it lets students focus on learning new concepts rather than spending time familiarizing themselves with new data sets over and over again.
4 out of 5 people found the following review useful
3 years ago
Lukas Tencer completed this course and found the course difficulty to be medium.
It introduces many areas but does not go into depth in any of them. For more advanced classes, look for other courses on Udacity. Good as an introduction.
1 out of 2 people found the following review useful
2 years ago
Shahrukh Ahmed partially completed this course, spending 5 hours a week on it and found the course difficulty to be easy.
Though the course uses interesting examples for teaching data science concepts, its overreliance on the online grader for practice often makes learning redundant. A big part of learning programming is experimentation, which the grader does not allow for.
0 out of 2 people found the following review useful
2 years ago
Rafael Prados completed this course.
0 out of 2 people found the following review useful
2 years ago
Manohar Balineni is taking this course right now.
0 out of 1 people found the following review useful
2 years ago
Tracy is taking this course right now.
2 years ago
Sérgio Den Boer is taking this course right now.
2 years ago
Robert Pop is taking this course right now.
a year ago
Rog Josep partially completed this course.
4 months ago
Fais Alqorni partially completed this course.
2 years ago
Anonymous completed this course.
a year ago
Caio Taniguchi audited this course.