Course Title: CS 5604: Information Storage and Retrieval (Fall 2021)

Instructor: Ismini Lourentzou

Teaching Assistant: Makanjuola Ogunleye

Meeting time: Mondays and Wednesdays 5:30-6:45 PM EST, Torgersen Hall 1020

Instructor Office hours: Mondays 4:00-5:00 PM EST, on Zoom

TA Office Hours: Mondays 10:00-11:00 AM and Wednesdays 3:00-4:00 PM EST on Zoom

Course Description:

Welcome to CS 5604! Have you ever wondered how Google or other search engines work? How can we retrieve relevant information from large-scale collections of documents, videos, news articles, tweets, and forum posts in just a few seconds? In this course, students will learn the basics of information storage and retrieval and explore cutting-edge research trends. The expected outcome is for students to gain an understanding of, and hands-on experience with, the underlying technologies used in modern information retrieval systems. We will cover the algorithms and design of search engines and implement our own retrieval models. Other topics include text mining and analysis, indexing, query understanding and expansion, retrieval models (vector space, probabilistic, learning-to-rank, etc.), evaluation and feedback, recommender systems, and personalization. We will also explore applications and recent research trends in information storage and retrieval systems.

Prerequisites:

Programming experience with at least one programming language (Python is recommended and will most likely be used for programming assignments), bash scripting, and Linux usage will be helpful. Check out these video lectures with recommended tools and tips to know. Familiarity with basic math concepts (linear algebra, statistics, and probability) will be needed. Any prior experience with machine learning, data analytics, and natural language processing is a plus; however, all necessary concepts will be re-introduced in this course. The most important components for successful completion are extracting key concepts and ideas from conference papers, being curious when implementing your IR systems, asking questions, and participating in class discussions.

Textbooks:

The official textbook for this course is Introduction to Information Retrieval by C. Manning, P. Raghavan, and H. Schütze (Cambridge University Press, 2008). There exist several other good textbooks, for example:

To cover cutting-edge research trends, we will also look at recent publications from IR conferences, e.g., SIGIR, The Web Conference, ICTIR, ECIR, CIKM, and WSDM.

Topics (tentative and subject to change):

Assignments:

There are no exams for this course. Grading is based on hands-on assignments and projects. The grading policy is tentative and subject to change, as students will have the opportunity to provide feedback on the grading components and their respective percentages.

Notes:

Honor Code Statement:

All assignments submitted shall be considered “graded work” and all aspects of your coursework are covered by the Honor Code. Students enrolled in this course are responsible for abiding by the Honor Code. The Academic Integrity expectations for Hokies are the same in an online class as they are in an in-person class. Hokies are expected to meet the academic integrity standards at Virginia Tech at all times. For additional information about the Honor Code, please visit https://www.honorsystem.vt.edu/ and read the Graduate Honor System Constitution. Ignorance of the rules does not exclude any member of the University community from the requirements and expectations of the Honor Code. In this class, you must attribute appropriate credit to existing ideas, facts, methods, and external sources of code by citing the source. At all times, you should avoid claiming someone else’s work as your own. Whenever I learn that a student has violated the Honor Code, I am obligated to report the violation. A student who has doubts about how the Honor Code applies to this course should obtain specific guidance from the course instructor well in advance of homework submission.

COVID-19 Classroom Conduct:

Virginia Tech is committed to protecting the health and safety of all members of its community. By participating in this class, all students agree to abide by the Virginia Tech Wellness principles and the guidance stated in the Fall 2021 plans. To adhere to these, you must do the following in this class:

Masks may be reusable or homemade cloth masks, dust masks, or surgical masks and should fit close to the face to provide thorough filtration of breathed air. Face shields that are open around the sides do not satisfy this requirement and are currently not accepted as a viable alternative by the university. If a student feels that they cannot wear a mask because of health concerns and must use an alternative form of face covering such as a face shield, they should contact Services for Students with Disabilities (SSD) to request an accommodation. No exceptions for masks will be provided unless there is an official accommodation notice provided by SSD to the instructor. These requirements will not be waived. The instructor has the authority to terminate the class session early if the health and safety requirements are not maintained. Students who fail to follow the requirements will be reported to the Office of Student Conduct. If a student will miss significant class activities because of the need to self-isolate, then the Dean of Students Office should be contacted for an official absence verification. Prolonged absences may be difficult to make up. Students should consult with their advisor about possible options if too much coursework is missed to feasibly make up. As pandemic conditions continue to evolve through the semester, these requirements may need to change. The guidance posted by the university at VT Ready should represent the most up-to-date requirements of the university and should be checked periodically for changes.