CSCE 410/810 Information Retrieval

 

Have you googled lately?  Do you want to find out how Google searches?  What about other search engines in general? Have you used WebMD before? Ask Jeeves? How are documents indexed so that they can be retrieved accurately and quickly for users?  What about information that is visual or graphic?  How do we index those items?  How do we search enterprise data?  How do we search genomics data? How do we use information retrieval to filter out spam?

Information retrieval is a topic that has many important application areas: knowledge discovery, data mining, text understanding, ontology representation, bioinformatics, image and video indexing, question-answering, spam filtering, clustering, digital libraries, etc.

The objective of this class is to introduce students to the fundamentals of information retrieval systems.  The course is organized into four stages.  In the first stage, the class studies basic concepts in retrieval evaluation, inverted files, lexical operations, and indexing and searching.  In the second stage, the class explores more advanced topics such as thesauri, query modification, ranking algorithms, and clustering.  The third stage focuses on interdisciplinary research issues such as data mining, knowledge discovery, digital libraries, and visual information retrieval.  Finally, the fourth stage will be seminar-oriented, with presentations in the areas of query manipulation, routing and retrieval algorithms (question answering, spam filtering, enterprise data search, retrieval in genomics), and machine learning for improving information retrieval.

Grading is based on class participation, four homework assignments, one examination, a group presentation, and a group final project.  The group final project will be based on the 2005 TREC Tracks (Enterprise, Genomics, HARD, Question Answering, Robust Retrieval, and SPAM).  The actual syllabus will depend on the number of students enrolled in the class.

Readings will be based on the textbook: Baeza-Yates, R. and B. Ribeiro-Neto (1999).  Modern Information Retrieval, New York: Addison-Wesley.

 

Key Info

 

·         This class counts as one of the Application track topics. 

·         The class meets 2:00 p.m.-3:15 p.m. on Tuesdays and Thursdays (TR) in AVH 109.

·         Students enrolling in this class should have taken CSCE 310.  The instructor may grant special permission to students who have significant programming experience.

·         Last offering:  http://www.cse.unl.edu/~lksoh/Classes/CSCE410_810_Fall02/index.html

 

Contact Info

 

Name:    Prof. Leen-Kiat Soh

E-mail:  lksoh@cse.unl.edu

Phone:   (402) 472-6738

Office:  256 Avery Hall