
Coursera: Genomic Data Science with Galaxy

with James Taylor
Learn to use the tools that are available from the Galaxy Project. This is the second course in the Genomic Big Data Science Specialization.

Syllabus

Introduction
This week, we will present some of the research challenges that motivated the development of the Galaxy framework. We will then introduce Galaxy, describe what the Galaxy framework is, and look at different ways you can use it.

Galaxy 101
In this module and the following modules we will start to use Galaxy to perform different types of analysis.

Working with sequence data
In this module we will be studying sequence data quality control as well as ChIP-Sequence Analysis with MACS.

RNA-seq & Running your own Galaxy
In these final modules, we'll take a look at working with sequence data and RNA-seq and at installing and running your own Galaxy.

12 student reviews
Cost: Free Online Course (Audit)
Subject: Bioinformatics
Provider: Coursera
Language: English
Certificates: Paid Certificate Available
Hours: 3-5 hours a week
Calendar: 4 weeks long
FAQ
What are MOOCs?
MOOC stands for Massive Open Online Course. These are free online courses from universities around the world (e.g., Stanford, Harvard, MIT) offered to anyone with an internet connection.
How do I register?
To register for a course, click the "Go to Class" button on the course page. This will take you to the provider's website, where you can register for the course.
How do these MOOCs or free online courses work?
MOOCs are designed for an online audience, teaching primarily through short (5-20 min.) prerecorded video lectures that you watch on a weekly schedule when convenient for you. They also have student discussion forums, homework/assignments, and online quizzes or exams.

12 reviews for Coursera's Genomic Data Science with Galaxy

5 out of 5 people found the following review useful
2 years ago
Brandt Pence completed this course, spending 3 hours a week on it and found the course difficulty to be easy.
I took the first offering of this, the second course in the Genomic Data Science specialization, and there are a number of issues that I hope the specialization team can work out. The video lectures are short (20-30 minutes total per module), and the introduction to working with Galaxy is reasonably interesting. The explanations given by Dr. Taylor are fairly good, but as with other courses in this specialization and in the Data Science specialization, the depth of the instruction is not quite enough to prepare students for the final project.

A fair amount of time is spent on demonstrating how to run the Galaxy software system through the cloud or locally on your own machine. Some of this is problematic for several reasons. First, Galaxy does not play well with Windows, and the only reliable way to install Galaxy on a Windows machine is to run an instance of Linux (e.g. Ubuntu) either as a second OS or as a virtual machine. The instructors also suggest Amazon AWS as a cloud provider for those wanting to run Galaxy on the cloud.

You do not need to do any of this! By all means watch the videos so you know how it's done (it's required to take the associated quiz anyway). One person posted in the forums that he was charged $60 after he left several instances running in his Amazon AWS account overnight, even though they weren't actually doing anything. Others later reported similar charges of $100-$300. The Galaxy website allows 250GB storage per registered user and has plenty of processor time to allow students to run the tools to complete the demonstrations and final project in a reasonably timely fashion, especially if you do it at night when most of the researchers using the platform have finished their work for the day.

The final project requires you to determine the number and type of variants from sequence data from a father/mother/daughter trio. The tools are all available in Galaxy main, but there is not enough background information given to make the process intuitive, nor were certain essential questions answered (ex: should I analyze each sample individually or pool all the subjects into the same sample before analysis?). One issue was that, this being the first instance of this course, there were no community TAs available to answer questions, so students had to rely on the instructor for guidance, and he was of course not often involved on the discussion forums.
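
As a rough illustration of what "number and type of variants" can mean in practice, here is a minimal Python sketch that tallies variant types from VCF-formatted lines. It is not the course's method or a Galaxy tool: the function names, the classification rules (based only on REF/ALT allele lengths), and the example records are all hypothetical, and it does not address the per-sample versus pooled question raised above.

```python
from collections import Counter


def classify_variant(ref: str, alt: str) -> str:
    """Classify one REF/ALT pair by comparing allele lengths (a simplification)."""
    if len(ref) == 1 and len(alt) == 1:
        return "SNV"
    if len(ref) < len(alt):
        return "insertion"
    if len(ref) > len(alt):
        return "deletion"
    return "MNV"  # same length, more than one base


def count_variants(vcf_lines):
    """Count variant types over an iterable of VCF text lines."""
    counts = Counter()
    for line in vcf_lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue  # skip blank lines, meta-information, and the header row
        fields = line.split("\t")
        ref, alts = fields[3], fields[4]  # VCF columns 4 (REF) and 5 (ALT)
        for alt in alts.split(","):
            # Skip missing, spanning-deletion, and symbolic alleles.
            if alt in (".", "*") or alt.startswith("<"):
                continue
            counts[classify_variant(ref, alt)] += 1
    return counts


# Hypothetical records, invented for illustration only (not course data).
EXAMPLE_VCF = [
    "##fileformat=VCFv4.2",
    "\t".join(["#CHROM", "POS", "ID", "REF", "ALT", "QUAL", "FILTER", "INFO"]),
    "\t".join(["chr1", "100", ".", "A", "G", "50", "PASS", "."]),
    "\t".join(["chr1", "200", ".", "AT", "A", "50", "PASS", "."]),
    "\t".join(["chr1", "300", ".", "C", "CTT,T", "50", "PASS", "."]),
]

if __name__ == "__main__":
    for kind, n in sorted(count_variants(EXAMPLE_VCF).items()):
        print(kind, n)
```

A multi-allelic record (such as the third example line) is counted once per alternate allele, which is one possible convention; a real analysis would need to state which convention it uses.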

Some internet resources are available to help with this (see this guide on variant calling and this Nature Genetics article). As a note, though, I could not get the listed workflow from the first resource to work for the data we were given and ended up trying to work through it by essentially picking tools based on their names/descriptions. I was able to get an answer, but there was no way to tell prior to submission if my answer was correct, close, or completely wrong. Unfortunately, it ended up that there was no way to tell when evaluating other students' projects whether their answers were right or wrong either, despite the fact that the rubric asks you to do just that. Additionally, the rubric asks evaluators to assess whether a particular variant was present in the .vcf file, and this variant was not called using the hg19 reference genome that was used for the Galaxy demonstrations (and confirmed by the instructor in one of his rare forum posts to be appropriate for the project). To his credit, the instructor resolved this (after someone sent him a direct email).

Overall, two stars. There is some value here, but the expectations aren't particularly clear, and the course project (if done correctly) is well beyond what is taught in the lectures. This would be fine (and indeed it's characteristic of the Data Science Specialization and this specialization), but the resources online that might help with actually completing the project are confusing, contradictory, or deprecated.
8 out of 8 people found the following review useful
2 years ago
Adelyne Chan completed this course, spending 10 hours a week on it and found the course difficulty to be very hard.
This course is so inconsistent I don't even know how to write a short review on it. Content-wise, from the videos, there is nothing too difficult and most of the topics taught are fairly easy to follow. The same goes for the quizzes where most of the content is from the lectures, or simple analyses to be performed on provided datasets on the Galaxy web platform.

Cut to the course project and it is a completely different story. The execution of the project is extremely poor, there is insufficient background information provided in the lectures and the course project instructions themselves are vague. I understand the nature of genomic data science which involves a lot of collaboration and searching on the web for the appropriate tools (as pointed out by the instructor on the forums, who also encouraged free discussion on the forums wrt the project), but some level of background and/or forum support by staff needs to be provided.

On the bright side I have never seen such a collaborative group of students on a MOOC, when anyone had an issue almost immediately someone would jump in and clarify a concept or try to explain how to do a task (well, we had to since the staff weren't very helpful). The evaluations were nearly as difficult as doing the course project to start with, and I get the vibe that most people were just giving everyone full marks just because they did not know how to or it was just too time consuming to perform a proper evaluation which is a pity as peer evaluation is a wonderful opportunity to learn.

The other major problem faced was a topic covered in the videos which was aimed at more advanced students where the instructor demonstrated how to set up a local instance of Galaxy supported by Amazon Web Services. I watched the video but did not attempt to do so myself, but because terminating the instance was not covered in the lectures many people who did try out what the instructor was demonstrating found themselves being charged hefty sums by Amazon because of the continued use of cloud services (I think the highest charge I saw was over 700 dollars). Naturally students weren't too happy about that!

Galaxy is a tremendously useful platform but this course probably isn't the way to go to learn how to use it. There are many tutorials on Galaxy itself which are probably as informative as doing this course.
12 out of 13 people found the following review useful
2 years ago
Anonymous completed this course.
I'm a biologist who has some familiarity with Galaxy and I just about gave up during the course project (I believe the gamer term is "rage quit"). The project instructions were completely unclear and the material needed was difficult to find. I spent probably 25 hours on the project (in addition to the several hours per week doing all the other lectures etc.). The forums were helpful but it was also annoying to see people come in at the last min and get told how to set everything up correctly. Grading was a complete farce because no one understood the project instructions to begin with.

The other problem was that this class wasted a ton of time telling us about Amazon cloud services and pricing structures etc. Why on earth would I care about running huge projects in the cloud if I don't know how to use Galaxy in the first place!? Also, people who followed along with the Amazon instructions ended up with huge charges because they weren't told how to shut the service down... Anything you need to do in this class can be done on the Galaxy website for free!

I signed up for the entire specialization but I don't think I will bother taking the other courses after my experience with this one.
2 out of 2 people found the following review useful
2 years ago
Juan Reza partially completed this course.
This course, Genomic Data Science with Galaxy, is the worst online science course in this Johns Hopkins specialization. Some observations:

* Instructor claims that Galaxy is capable and good for reproducibility. Wrong on many points: when you start it up, it pulls in numerous pieces of software from all over the web which are not under its control so you get different versions or work in progress.

* no control over input sources at all.

Instructor acknowledged that you can't get the same results, just approximate, but he still doesn't get that this makes it not-reproducible. This is a deep architectural flaw that won't just settle down. Galaxy is stuck using Python 2.7 and can't upgrade to 3.x.

* Some students ended up paying for an online version that the Instructor said was free. He later issued a warning. (ouch!)

* versions of tools have manually entered version ids, some repeated for different versions so you can't tell what you got.

* on both Mac and PC/virtual box installation and startup reveals numerous debug errors, some fatal; different from time to time, indicating that dependencies are being updated without control of Galaxy. Instructor doesn't get that this means Galaxy is unstable.

* Instructor unable to diagnose and offer a fix when one of the essential buttons "upload" does not even appear (TA verified this malfunction and couldn't fix it).

* Instructor claims that you can install and run Galaxy locally via virtual box. However, it doesn't work. period. I tried in 3 different enrollments in this course. He claims that it's for "advanced students" (I'm a software developer, over 25 years and TA for the Command-Line course). The instructor could not provide a step-by-step installation or interpret the error messages, and I found him rather terse or uninformed in responding to some questions.

* content was "so so" compared to other courses covering similar bio analytic techniques (other schools have brilliant, mature material).

In short, I was never able to even start doing the programming assignments due to instability of the Galaxy tool itself and incompetence of the staff (they don't know what they're doing).
4 out of 4 people found the following review useful
2 years ago
Bruno Lehouque completed this course, spending 6 hours a week on it and found the course difficulty to be hard.
This course teaches you how to use Galaxy, an interface with a bunch of tools to perform genomics analysis. But be warned, this course assumes that you are a seasoned biologist.

There are some examples of how to use the tools to do basic analysis in the lectures but not a word about why you'd do it and what the results mean.

The project is a nightmare. There's absolutely no guidance and no support from the staff. There's nothing in the lectures to explain what you need to do and how.

Basically, your job is to search the web for a workflow corresponding to the problem and try to reproduce it if you're lucky enough to get the tools working without bugs and unexpected errors.

Grading your peers is almost as difficult as completing the project itself since no two students end up with the same answers and we're not even told what the expected results are.

The timing is also bad. You're expected to finish this project on week 3 (there are 4), which is much too early.

All in all, that has been my worst MOOC experience (by far). This was the first session though so let's hope that the staff will work to improve the course because as it is now, I wouldn't recommend it at all.
3 out of 3 people found the following review useful
2 years ago
Anonymous partially completed this course.
A sort of a con operation by Johns Hopkins: create a bunch of watery courses, charge 49 dollars for each, run each one once per month. This one is totally useless: four lectures of 20-25 minutes and a final project that is impossible to complete based on those lectures. The staff was mostly absent from the course -- completely absent in the second half when the students realized they could not complete the final project.
3 out of 3 people found the following review useful
2 years ago
Jianyi Ren is taking this course right now.
I was struggling hopelessly on the final project, originally expecting it would be as easy as the quiz, only to find it's surprisingly clueless and extremely demanding. And after reading the reviews from others I found I am not the only one.

Galaxy is a cool platform, I am grateful to the Professor and staff of JHU who share the teaching resources. Some more clear instruction will be very helpful to us. Many thanks.

1 out of 1 people found the following review useful
2 years ago
Ben M. completed this course, spending 8 hours a week on it and found the course difficulty to be medium.
I found going through the Galaxy tutorials very helpful. I recommend doing that as well. If things get hard, the community TA on Coursera and the other participants really helped me out. Someone recommended doing the Command Line Tools for Genomic Data Science course, which really helped them. Definitely a worthwhile course.
0 out of 1 people found the following review useful
a year ago
Anonymous completed this course.
Avoid this course at all cost. You will not learn anything. Instructor just keeps clicking on galaxy platform, he doesn't explain anything. Final project is nightmare.

This is by far the worst course on Coursera.com I've finished (I've done plenty of courses).
2 months ago
Kiran R is taking this course right now, spending 6 hours a week on it and found the course difficulty to be very hard.
Worst course ever on Coursera. The assignment has no relationship to what is taught in class. The examples considered are worthless.

People take this assuming JHU is a good brand, but it seems the JHU name is being spoilt by the course teachers.

AVOID THIS COURSE
0 out of 1 people found the following review useful
2 years ago
Colin Khein partially completed this course.
0 out of 5 people found the following review useful
2 years ago
Ali Mohamed Ali Kishk partially completed this course.