

Overview

Exploratory data analysis is an approach for summarizing and visualizing the important characteristics of a data set. Promoted by John Tukey, exploratory data analysis focuses on exploring data to understand the data’s underlying structure and variables, to develop intuition about the data set, to consider how that data set came into existence, and to decide how it can be investigated with more formal statistical methods.

If you're interested in supplemental reading material for the course, check out the Exploratory Data Analysis book (not required).

This course is also a part of our Data Analyst Nanodegree.



Why Take This Course?

You will...

  • Understand data analysis via EDA as a journey and a way to explore data
  • Explore data at multiple levels using appropriate visualizations
  • Acquire statistical knowledge for summarizing data
  • Demonstrate curiosity and skepticism when performing data analysis
  • Develop intuition around a data set and understand how the data was generated

Syllabus

Lesson 1: What is EDA? (1 hour)

We'll start by learning what exploratory data analysis (EDA) is and why it is important. You'll meet the instructors for the course and find out about the course structure and final project.

Lesson 2: R Basics (3 hours)

EDA, which comes before formal hypothesis testing and modeling, makes use of visual methods to analyze and summarize data sets. R will be our tool for generating those visuals and conducting analyses. In this lesson, we will install RStudio and packages, learn the layout and basic commands of R, practice writing basic R scripts, and inspect data sets.

Lesson 3: Explore One Variable (4 hours)

We perform EDA to understand the distribution of a variable and to check for anomalies and outliers. Learn how to quantify and visualize individual variables within a data set as we begin to make sense of a pseudo-data set of Facebook users. While the data set does not contain real user data, it does contain a wealth of information. Through the lesson, we will create histograms and boxplots, transform variables, and examine tradeoffs in visualizations.
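The one-variable checks described in this lesson can be sketched in a few lines. The course itself works in R; the snippet below is a Python stand-in using only the standard library, not course material. The friend counts are invented for illustration, and the function names are hypothetical. It computes a five-number summary (the numbers a boxplot draws), flags outliers with the common 1.5 × IQR rule, and applies a log transform to a long-tailed variable.

```python
import math
import statistics


def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) for a list of numbers."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return min(values), q1, median, q3, max(values)


def iqr_outliers(values, k=1.5):
    """Return values lying more than k * IQR outside the quartiles."""
    _, q1, _, q3, _ = five_number_summary(values)
    spread = q3 - q1
    low, high = q1 - k * spread, q3 + k * spread
    return [v for v in values if v < low or v > high]


# Hypothetical friend counts, invented for illustration only.
friend_counts = [0, 3, 5, 8, 12, 15, 20, 31, 45, 900]

summary = five_number_summary(friend_counts)
outliers = iqr_outliers(friend_counts)

# log10(x + 1) compresses the long right tail and tolerates zero counts.
log_counts = [math.log10(v + 1) for v in friend_counts]

print("summary:", summary)
print("outliers:", outliers)
```

Quartile conventions differ between tools, so the exact Q1 and Q3 values may not match what another package reports for the same data.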

Problem Set 3 (2 hours)

Lesson 4: Explore Two Variables (4 hours)

EDA allows us to identify the most important variables and relationships within a data set before building predictive models. In this lesson, we will learn techniques for exploring the relationship between any two variables in a data set. We'll create scatter plots, calculate correlations, and investigate conditional means.
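As a rough illustration of the two-variable techniques named here, the sketch below computes a Pearson correlation and conditional means (the mean of one variable at each value of another). It is written in Python with the standard library rather than the R used in the course, the ages and friend counts are invented, and the function names are hypothetical.

```python
import math
from collections import defaultdict


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def conditional_means(xs, ys):
    """Mean of y for each distinct value of x, ordered by x."""
    groups = defaultdict(list)
    for x, y in zip(xs, ys):
        groups[x].append(y)
    return {x: sum(v) / len(v) for x, v in sorted(groups.items())}


# Hypothetical (age, friend count) pairs, invented for illustration only.
ages = [18, 18, 25, 25, 40, 40]
friend_counts = [300, 340, 200, 220, 100, 120]

r = pearson_r(ages, friend_counts)
means_by_age = conditional_means(ages, friend_counts)

print("correlation:", round(r, 3))
print("mean friend count by age:", means_by_age)
```

Plotting the conditional means against age gives a smoothed view of the same relationship a scatter plot shows point by point.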

Problem Set 4 (2 hours)

Lesson 5: Explore Many Variables (4 hours)

Data sets can be complex. In this lesson, we will learn powerful methods and visualizations for examining relationships among multiple variables. We'll learn how to reshape data frames and how to use aesthetics like color and shape to uncover more information. Extending our knowledge of previous plots, we'll continue to build intuition around the Facebook data set and explore some new data sets as well.

Problem Set 5 (2 hours)

Lesson 6: Diamonds and Price Predictions (2 hours)

Investigate the diamonds data set alongside Facebook data scientist Solomon Messing. He'll recap many of the strategies covered in the course and show how predictive modeling can help us determine a good price for a diamond. As a final project, you will create your own exploratory data analysis on a data set of your choice.

Final Project (10+ hours)

You've explored simulated Facebook user data and the diamonds data set. Now it's your turn to conduct your own exploratory data analysis. Choose one data set to explore (one provided by Udacity or your own) and create an RMD file that uncovers the patterns, anomalies, and relationships in the data set.

Taught by

Moira Burke and Dean Eckles

Reviews for Udacity's Data Analysis with R
4.6, based on 18 reviews

  • 5 stars 72%
  • 4 stars 11%
  • 3 stars 17%
  • 2 stars 0%
  • 1 star 0%

Life S
5.0 4 years ago
Life completed this course.
The course provides an overview of using R to explore data and focuses heavily on the use of the ggplot2 package in R to create data visualizations. Although the course touches briefly on high-level theory and concepts like summary statistics, transforming data, correlation and linear regression, almost all of the quizzes and homework questions have to do with creating plots and making observations based on plots. This is not necessarily a bad thing--learning to plot in R is a valuable skill and an important part of exploratory data analysis--but it seems like the course should have spent a bit more time covering high-level concepts and numeric methods for exploring data like using tables and summaries. Despite that quibble, this is a good course with a lot of high-quality and practical content. It moves slowly enough for you to get comfortable with basic plotting syntax before building up to more complex visualizations, but fast enough to keep you engaged.
11 people found this review helpful
Pravin M
5.0 9 months ago
Pravin completed this course, spending 10 hours a week on it and found the course difficulty to be medium.
This was the first course I took after I started thinking about analytics and R. A fellow data scientist recommended it to me. I was a bit surprised when I saw the level listed as Intermediate, but I still decided to pursue it. The course is 2 months long, and that's what it took me to complete it at 2-3 hours a day.

About the course -

If you are new to R, it will not teach you the ABCs of it, but believe me, I never felt the need for that, even though the only programming language I knew was COBOL. It is primarily an analytics course and gets one well versed with the Data Analytics con…
Anonymous
5.0 4 years ago
Anonymous completed this course.
Very enjoyable class and I learned a lot. If you are new to R and are intimidated by the ggplot2 package, this is for you.
6 people found this review helpful
Joe F
5.0 8 months ago
Joe is taking this course right now, spending 8 hours a week on it and found the course difficulty to be medium.
I was skeptical when I enrolled in Udacity's Data Analyst Nanodegree program, but not only have they provided the experience they said they would, they have steadily made improvements since I enrolled. How many times in your life have you had that experience? Here are SOME of the improvements they have made while I have been enrolled. Initially, one could get one-on-one help; it was usually 1 to 2 days out, but at least it was a video chat.

This was great. I had tried a competitor's course, and sometimes one just cannot figure out why something is not working. But not w…
Anonymous
3.0 4 years ago
Anonymous completed this course.
OK course, very short.

Mostly goes over how to plot in R, except for the final week which is very interesting.
3 people found this review helpful
Armand O
5.0 2 years ago
Armand completed this course.
If you are looking for a good start on data visualization in R, this is the best choice on the web.
1 person found this review helpful
Ben H
4.0 3 years ago
Ben is taking this course right now.
0 people found this review helpful
Prateek A
5.0 2 years ago
Prateek partially completed this course.
0 people found this review helpful
Rishil A
3.0 3 years ago
Rishil is taking this course right now.
0 people found this review helpful
Francesca G
5.0 2 years ago
Francesca completed this course, spending 40 hours a week on it and found the course difficulty to be medium.
1 person found this review helpful
Lukas T
5.0 4 years ago
Lukas completed this course and found the course difficulty to be medium.
Chandra.j C
5.0 4 years ago
Chandra.j is taking this course right now, spending 2 hours a week on it and found the course difficulty to be very easy.
Sérgio B
5.0 3 years ago
Sérgio is taking this course right now.
Vinay S
3.0 2 years ago
Vinay completed this course.
Ryosuke K
5.0 3 years ago
Ryosuke is taking this course right now.
Clover B
4.0 2 years ago
Clover dropped this course.
François A
5.0 3 years ago
François is taking this course right now.
William H
5.0 3 years ago
William is taking this course right now.