CISC 483-683 Data Mining
Home Page Fall 2017


Data mining attempts to identify interesting structural patterns in large data sets that can be used to make predictions. For example, in security, one might analyze a database of past credit card transactions to learn which transaction sequences indicate fraudulent card use, and then reject transactions that match those patterns. In medical diagnosis, one might analyze patient histories to determine which patients are most likely to benefit from an expensive procedure. In the life sciences, molecular biologists might analyze large sets of biological data to predict protein structure. In consumer marketing, one might analyze supermarket data to determine which items are typically purchased together, and then display those items together to encourage more customers to buy both. And in investment and finance, one might analyze economic data to identify stock market trends. Data mining is becoming increasingly important in many fields, including bioinformatics, advertising, banking, business, finance, security, medicine, and web page design, among many others.

This course introduces fundamental strategies and methodologies for data mining, along with the concepts underlying them, and provides hands-on experience with a variety of techniques. Students will learn to use the Weka workbench, a set of data mining tools. The undergraduate version, CISC 483, has been approved as a technical elective for undergraduate computer science majors.

Instructor: Sandra Carberry
Office: 439 Smith
Office Hours: Mon. 1:45pm-3:30pm; Wed. 8:30am-9:30am

TA: Matt Saponaro
Office: 201 Smith
Office Hours: Tues. 10:30am-11:30am; Thurs. 1:00pm-2:00pm