Coursera: Text Retrieval and Search Engines

Recent years have seen a dramatic growth of natural language text data, including web pages, news articles, scientific literature, emails, enterprise documents, and social media such as blog articles, forum posts, product reviews, and tweets. Text data are unique in that they are usually generated directly by humans rather than a computer system or sensors, and are thus especially valuable for discovering knowledge about people’s opinions and preferences, in addition to many other kinds of knowledge that we encode in text.

This course will cover search engine technologies, which play an important role in any data mining application involving text data, for two reasons. First, while the raw data may be large for any particular problem, often only a relatively small subset of the data is relevant, and a search engine is an essential tool for quickly discovering that small subset of relevant text in a large text collection. Second, search engines help analysts interpret any patterns discovered in the data by allowing them to examine the relevant original text data. You will learn the basic concepts, principles, and major techniques in text retrieval, which is the underlying science of search engines.

Syllabus

Orientation
You will become familiar with the course, your classmates, and our learning environment. The orientation will also help you obtain the technical skills required for the course.

Week 1
During this week's lessons, you will learn about natural language processing techniques, which are the foundation for all kinds of text-processing applications, as well as the concept of a retrieval model and the basic idea of the vector space model.

Week 2
In this week's lessons, you will learn how the vector space model works in detail, the major heuristics used in designing a retrieval function for ranking documents with respect to a query, and how to implement an information retrieval system (i.e., a search engine), including how to build an inverted index and how to score documents quickly for a query.
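As a rough illustration of the inverted index and fast scoring mentioned above, here is a minimal sketch. It is not the course's implementation: the whitespace tokenizer, the function names `build_inverted_index` and `score`, and the particular TF-IDF weighting are all assumptions chosen for brevity.

```python
import math
from collections import Counter, defaultdict


def build_inverted_index(docs):
    """Map each term to a list of (doc_id, term_frequency) postings."""
    index = defaultdict(list)
    for doc_id, text in enumerate(docs):
        # Assumption: lowercase + whitespace split stands in for real tokenization.
        for term, tf in Counter(text.lower().split()).items():
            index[term].append((doc_id, tf))
    return index


def score(query, index, num_docs):
    """Accumulate a simple TF-IDF score per document.

    Only the postings of the query terms are visited, which is what
    makes scoring fast compared with scanning every document.
    """
    scores = defaultdict(float)
    for term in query.lower().split():
        postings = index.get(term, [])
        if not postings:
            continue
        # Assumption: one of many possible IDF variants.
        idf = math.log((num_docs + 1) / len(postings))
        for doc_id, tf in postings:
            scores[doc_id] += tf * idf
    # Highest score first.
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


docs = [
    "text retrieval and search engines",
    "web search engines crawl the web",
    "recommender systems filter items",
]
index = build_inverted_index(docs)
ranking = score("text retrieval", index, len(docs))
```

Here `ranking` contains only document 0, because it is the only document whose postings are touched by the query terms.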

Week 3
In this week's lessons, you will learn how to evaluate an information retrieval system (a search engine). Topics include the basic measures for evaluating a set of retrieved results; the major measures for evaluating a ranked list, such as average precision (AP) and normalized discounted cumulative gain (nDCG); and practical issues in evaluation, such as statistical significance testing and pooling.
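The two ranked-list measures named above can be sketched as follows. This is a minimal sketch under stated assumptions: relevance judgments are given in rank order, the DCG discount is the common `log2(rank + 1)` variant (the course may use a different one), and the ideal ranking for nDCG is built only from the gains passed in rather than from every judged document.

```python
import math


def average_precision(ranked_rel, total_relevant):
    """AP for one query.

    ranked_rel: list of 0/1 relevance flags in rank order.
    total_relevant: number of relevant documents in the collection.
    """
    hits = 0
    precision_sum = 0.0
    for rank, rel in enumerate(ranked_rel, start=1):
        if rel:
            hits += 1
            precision_sum += hits / rank  # precision at this relevant rank
    return precision_sum / total_relevant if total_relevant else 0.0


def dcg(gains):
    """Discounted cumulative gain with a log2(rank + 1) discount (assumed variant)."""
    return sum(g / math.log2(rank + 1) for rank, g in enumerate(gains, start=1))


def ndcg(gains):
    """DCG divided by the DCG of the same gains sorted best-first."""
    ideal = dcg(sorted(gains, reverse=True))
    return dcg(gains) / ideal if ideal else 0.0


ap = average_precision([1, 0, 1, 0], total_relevant=2)
perfect = ndcg([3, 2, 1])
reversed_order = ndcg([1, 2, 3])
```

For the ranked list `[1, 0, 1, 0]` with two relevant documents, AP is the mean of the precisions at ranks 1 and 3, that is (1/1 + 2/3) / 2.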

Week 4
In this week's lessons, you will learn probabilistic retrieval models and statistical language models, in particular the details of the query likelihood retrieval function with two specific smoothing methods, and how the query likelihood retrieval function is connected with the retrieval heuristics used in the vector space model.
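To make the query likelihood idea concrete, here is a minimal sketch that scores a document by the log-probability of the query under a smoothed document language model. The syllabus does not name its two smoothing methods; Jelinek-Mercer (linear interpolation with the collection model) is used here as one common choice, and the function name, the token-list inputs, and the default `lam=0.1` are assumptions.

```python
import math
from collections import Counter


def query_log_likelihood(query, doc, collection, lam=0.1):
    """log p(query | doc) with Jelinek-Mercer smoothing (assumed choice).

    query, doc, collection: lists of tokens.
    lam: weight given to the collection (background) language model.
    """
    doc_tf = Counter(doc)
    coll_tf = Counter(collection)
    doc_len = len(doc)
    coll_len = len(collection)
    total = 0.0
    for w in query:
        p_doc = doc_tf[w] / doc_len if doc_len else 0.0
        p_coll = coll_tf[w] / coll_len if coll_len else 0.0
        p = (1 - lam) * p_doc + lam * p_coll
        if p == 0.0:
            # Word unseen even in the collection: likelihood is zero.
            return float("-inf")
        total += math.log(p)
    return total


d1 = ["text", "retrieval", "models"]
d2 = ["web", "search", "engines"]
collection = d1 + d2
s1 = query_log_likelihood(["text"], d1, collection)
s2 = query_log_likelihood(["text"], d2, collection)
```

Smoothing is what keeps `s2` finite: `d2` never mentions "text", but the collection model still assigns the word a small probability, so `d2` is ranked below `d1` rather than being excluded outright.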

Week 5
In this week's lessons, you will learn feedback techniques in information retrieval, including the Rocchio feedback method for the vector space model, and a mixture model for feedback with language models. You will also learn how web search engines work, including web crawling, web indexing, and how links between web pages can be leveraged to score web pages.
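The Rocchio feedback method mentioned above moves the query vector toward the centroid of relevant documents and away from the centroid of non-relevant ones. The sketch below assumes sparse term-weight dictionaries as vectors; the function name and the default values of `alpha`, `beta`, and `gamma` are illustrative assumptions, not values taken from the course.

```python
from collections import defaultdict


def rocchio(query_vec, relevant, non_relevant, alpha=1.0, beta=0.75, gamma=0.15):
    """Return an updated query vector.

    query_vec: dict mapping term -> weight.
    relevant, non_relevant: lists of such dicts (judged documents).
    """
    new_q = defaultdict(float)
    for term, w in query_vec.items():
        new_q[term] += alpha * w
    # Add the relevant centroid, subtract the non-relevant centroid.
    for docs, coeff in ((relevant, beta), (non_relevant, -gamma)):
        if not docs:
            continue
        for d in docs:
            for term, w in d.items():
                new_q[term] += coeff * w / len(docs)
    # Assumption: terms whose weight is not positive are dropped.
    return {t: w for t, w in new_q.items() if w > 0}


updated = rocchio(
    {"search": 1.0},
    relevant=[{"search": 1.0, "engine": 1.0}],
    non_relevant=[{"shopping": 1.0}],
)
```

In this example the query gains the new term "engine" from the relevant document, while "shopping" ends up with a negative weight and is removed.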

Week 6
In this week's lessons, you will learn how machine learning can be used to combine multiple scoring factors to optimize ranking of documents in web search (i.e., learning to rank), and learn techniques used in recommender systems (also called filtering systems), including content-based recommendation/filtering and collaborative filtering. You will also have a chance to review the entire course.

13 student reviews
Cost: Free Online Course (Audit)
Pace: Upcoming
Provider: Coursera
Language: English
Certificates: Paid Certificate Available
Hours: 4-6 hours a week
Calendar: 6 weeks long

Reviews for Coursera's Text Retrieval and Search Engines
3.2 Based on 13 reviews

  • 5 stars 15%
  • 4 stars 38%
  • 3 stars 15%
  • 2 stars 15%
  • 1 star 15%

3.0 3 years ago
by Gregory J Hamel (Life Is Study) audited this course and found the course difficulty to be medium.
Text Retrieval and Search Engines is the second course in Coursera's new data mining specialization offered by the University of Illinois at Urbana-Champaign. The course covers a variety of topics in text data mining and natural language processing, including text retrieval, query ranking and evaluation methods, and the basics of recommender systems. Grading is based entirely on 4 weekly quizzes composed of 10 multiple-choice questions each. You only get 1 attempt on the quizzes.

The weekly content in Text Retrieval and Search Engines consists of around 10 video lectures that range from 5 to 20 minutes followed by a short 10 question quiz. If that sounds like a lot of lecture per question, it is, and there are no in-lecture quizzes to reinforce concepts as you go along. The lectures themselves are definitely a step up from the first course in the specialization, Pattern Discovery in Data Mining. The professor isn't hard to understand this time around and he explains concepts well enough to grasp them without having to re-watch videos. As with many of Coursera's other 4-week specializations, however, lectures sometimes turn into information dumps where the professor ends up reading off slides. The course does have a C++ programming assignment which was nice to see.

Text Retrieval and Search Engines is a decent course that is worth a look if you are interested in text data mining and search engines. Although the lectures are lackluster, they have some good information. If you're planning on getting a verified certificate, it is a good idea to try the practice quizzes before submitting the real one.

I give this course 2.75 out of 5 stars: Fair.
5 people found this review helpful
2.0 12 months ago
by Marianne Cardwell completed this course.
I've taken a number of courses on Coursera and have thoroughly enjoyed some of them, but it's clear that the quality varies. I was very disappointed in this course. Having applied to the University of Illinois' Master of Computer Science - Data Science, I thought it'd be a good idea to take some of their Coursera courses to get a sense of the quality of their education. I probably should have taken their classes first and then applied, saving me the trouble. If this is the type of instruction I can expect in the Masters program, I think I'll save myself the $19k in tuition.

The problems I have with this course are as follows:

- The quizzes for weeks 1 and 4 do not cover the material learned during those weeks. I've pointed it out on their forum and others have pointed it out in their reviews of the course on Coursera. This has not been fixed so they're not maintaining the course.

- I wanted to do the programming exercises to *really* learn something, not just go through the motions. I tried installing the required software, MeTA, on two different Windows computers (W7 & W10) and it wouldn't install on either one. I was not the only one with this problem, but I never found a solution. I then got a Linux VM and installed it successfully there, only to be unable to install the UofI code required for the first assignment. Again, I was not the only one with this problem. For both issues, I posted on the forum. None of the "moderators" or instructors ever responded.

- The instructor was difficult to understand at first. Once you've listened to him for a bit, it gets easier though, so it was only a problem for me during the first couple of videos.

- I thought the instructor took too much time explaining some of the obvious things and too little time explaining the more complex things. More examples would have been very helpful.

I would not recommend paying to take this course as the quizzes aren't particularly useful and you most likely won't be able to get the programming assignments to work. I think this course is emblematic of one of the issues I see on Coursera: too much of a reliance on "peers" to help you.
1.0 a year ago
Anonymous is taking this course right now.
I was initially excited for this course as it seemed a good dive into unstructured text data. But now I'd say: *skip this course*. I think the instructor is okay and presents the material in a sufficient enough manner to get a decent grasp of it.

The reason I'd say skip this course is that the exercises are pretty bad. The class is only graded on quizzes, and the optional programming assignments use an obscure text mining/analysis tool called MeTA, which is time-consuming to set up unless you're experienced in navigating the mess that open source C++ libraries are. Once you've set it up, it basically just runs you through a set of contrived steps that don't require much programming or critical thinking.

To ACTUALLY learn document ranking and text retrieval, you really should have to get your hands dirty in constructing code that will do this, preferably on real world data or a very interesting test data set. And this course does not offer anything near this. I will complete this course, but only because I paid for it.

Thus I don't think most students will get more than a surface-level glance of text retrieval and search-engine-construction & document retrieval from this course.

And thus I'd say skip this course and find a better one. And to the instructors I'd say, add new programming assignments that require students to implement their own systems. Step-by-step hand-holding is no way to learn. Also, use tools that are more universal to the data science world.
5.0 3 years ago
Anonymous completed this course.
Great class with a nice mix of theoretical and practical lessons. There was a competition at the end of the course which pushed us to come up with new ideas.
1 person found this review helpful
4.0 6 months ago
Anonymous audited this course.
Precise and clear explanation of the concepts. This course focuses completely on text retrieval concepts, with a strong intro on what text retrieval is and what challenges it faces, and further gives insight into various models and improvements in this field. Therefore, this course is mostly for people interested in the area of information retrieval.
4.0 11 months ago
Anonymous audited this course.
Pretty good.
2.0 12 months ago
Lien Block completed this course, spending 2 hours a week on it and found the course difficulty to be medium.
The course is not very organised and even though they share a lot of information, it's not really very useful for someone who wants to get his/her hands dirty and really learn NLP/Text retrieval.

(+ Instructor is sometimes very hard to understand)
5.0 6 months ago
Anonymous completed this course.
It's not complete, but it's a good starting point for those who want to learn more about information retrieval. Great course. I recommend it.
4.0 2 years ago
by Colin Khein completed this course.
0 people found this review helpful
4.0 a year ago
Basil Rormose completed this course.
4.0 a year ago
Mike Rocke completed this course.
1.0 3 years ago
by Deepak Jois dropped this course and found the course difficulty to be medium.
1 person found this review helpful
3.0 3 years ago
Rafael Prados completed this course.
0 people found this review helpful