CPEG 457/657: Search and Data Mining (Spring 2020)
Time & Place: MWF 9:05-9:55am, ISE 207
Announcements
- Welcome to the class!
- Future announcements, assignments, and lecture notes will be available on Canvas!
Basic Information
- Time: MWF 9:05-9:55am
- Place: ISE 207
- Instructor: Hui Fang (hfang AT udel.edu)
  - Office: 201B Evans Hall
  - Office hours: by appointment
- TA: Kuang Lu (lukuang AT udel.edu)
  - Office: 209 Evans Hall
  - Office hours: 4:00pm-5:00pm on Wednesdays or by appointment
Course Description
With the increasing amount of textual information, it is important to develop effective search engines, such as Google, to help users manage and exploit that information. The course is designed to give students a broad view of information retrieval and hands-on experience solving real-world problems in information retrieval and text mining.
Prerequisites
Students should come with GOOD programming skills.
Textbooks
- Preferred: Text Data Management and Analysis: A Practical Introduction to Information Retrieval and Text Mining, by ChengXiang Zhai and Sean Massung.
- Optional: Introduction to Information Retrieval, by Christopher D. Manning, Prabhakar Raghavan and Hinrich Schutze. Available free online (link).
Grading Policy
- Class participation: 15%
- Assignments: 25%
- Midterm exams: 20%
- Course Project: 40%
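As a rough illustration of how these weights combine, the following Python sketch computes a weighted course score. It assumes every component is scored on a 0-100 scale; the names `WEIGHTS` and `final_score` are hypothetical and not part of any course tooling, and the sketch does not reflect how grades are actually recorded or curved.

```python
# Weights copied from the grading policy above.
# Assumption: each component score is on a 0-100 scale.
WEIGHTS = {
    "participation": 0.15,
    "assignments": 0.25,
    "midterm": 0.20,
    "project": 0.40,
}


def final_score(scores):
    """Return the weighted course score for a dict of component scores."""
    missing = set(WEIGHTS) - set(scores)
    if missing:
        raise ValueError(f"missing components: {sorted(missing)}")
    # Weighted sum over the four graded components.
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


if __name__ == "__main__":
    example = {"participation": 100, "assignments": 80, "midterm": 70, "project": 90}
    print(final_score(example))
```

Because the course project carries 40% of the grade, a 10-point change in the project score moves the total by 4 points, twice the effect of the same change on the midterm.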
Regrade requests should be submitted in writing within one week after the assignment or exam in question is returned.
Late Assignment Policy
Late submissions will be penalized on an hourly scheme.
Up to 6 hours late, -15%
Up to 12 hours late, -40%
Up to 18 hours late, -70%
Zero grade after 24 hours
Topics Covered
- Overview and Background
  - Motivation
  - Basic probability and statistics
  - Basic information theory
  - Basic natural language processing
- Search
  - Indexing
  - Implementation
  - Evaluation
  - Retrieval models
  - Web search
  - Link analysis
  - Learning to rank
- Mining
  - Word association mining
  - Text categorization
  - Text clustering
  - Probabilistic topic modeling