CS 535 Introduction to Data Mining     Fall 2008


This course introduces the concepts, algorithms, techniques, and applications of data mining. Topics include the background of data mining, data preprocessing, classification, clustering, and association rule mining. The course is designed for CS graduate students.

Class Schedule: T TH 4:25 PM - 5:50 PM

Classroom:  EB G7

Instructor: Dr. Lei Yu
Telephone: (607) 777-6250
Email: lyu AT cs DOT binghamton DOT edu
Office Location: G16, Engineering Building
Office Hours: T TH 12:30 PM - 1:30 PM or by appointment

TA: Yue Han
Email: yhan1 AT binghamton DOT edu
Office Location: N1, Engineering Building
Office Hours: T TH 2:00 PM - 3:00 PM


  • Required courses: CS 333 (Algorithms) and MATH 327 (Probability with Statistical Methods), or equivalents
  • Programming: course projects can be implemented in any popular programming language, such as C, C++, or Java. No language-specific programming issues will be covered in this course.


Topics:

  • Background of knowledge discovery and data mining
  • Data preprocessing  (e.g., data cleaning, transformation, dimensionality reduction, instance selection)
  • Classification (e.g., decision trees, rule-based classifiers, Bayesian classifiers, instance-based classifiers, support vector machines)
  • Clustering (e.g., K-means, hierarchical clustering, density-based clustering)
  • Mining association rules (e.g., Apriori, FP-growth)


Textbook:

  • Introduction to Data Mining, Pang-Ning Tan, Michael Steinbach, and Vipin Kumar, Addison-Wesley, April 2005.


There will be 4 written assignments.


There will be a group project (two or three students per group) involving the implementation of a decision tree algorithm and a standard model selection procedure.


There will be several quizzes and two exams in class.


Final grades will be based on quizzes (10%), homework (4 assignments, 20%), Exam I (25%), Exam II (25%), and the project (20%).
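As an illustration of how the weights above combine, here is a minimal Python sketch. It assumes each component is already expressed as a percentage from 0 to 100; the function name, the dictionary keys, and the sample scores are hypothetical and are not part of the course materials.

```python
# Component weights in percent, taken from the grading breakdown above.
WEIGHTS = {
    "quizzes": 10,
    "homework": 20,
    "exam1": 25,
    "exam2": 25,
    "project": 20,
}


def final_grade(scores):
    """Return the weighted final grade as a percentage.

    `scores` maps each component name in WEIGHTS to a percentage (0-100).
    This input format is an assumption made for illustration.
    """
    missing = set(WEIGHTS) - set(scores)
    if missing:
        raise ValueError(f"missing components: {sorted(missing)}")
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS) / 100


# Hypothetical scores, for illustration only.
example = {"quizzes": 80, "homework": 90, "exam1": 70, "exam2": 85, "project": 95}
print(final_grade(example))
```

The weights sum to 100, so a student with 100% on every component receives a final grade of 100.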

Academic Integrity:

Discussion of general concepts and questions concerning the homework assignments among students is encouraged. However, each of you is expected to work out the homework solutions on your own. Sharing any part of a solution is prohibited. If you are unclear about the policy, please consult the instructor before you act. Suspected cases of academic misconduct will be pursued fully in accordance with the Student Academic Honesty Code of Binghamton University.

Late Policy:

Each assignment is due at the beginning of class on the due date. Any assignment received within 24 hours after the deadline will be penalized by 20% of the full credit; any assignment received between 24 and 48 hours past the deadline will be penalized by 50% of the full credit; no assignment will be accepted more than 48 hours past the deadline. Rare exceptions to this policy may be made at the discretion of the instructor under demonstrably extenuating circumstances.
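The penalty schedule above can be read as a simple rule. The Python sketch below is one possible interpretation, not an official calculator: it assumes the penalty is deducted as a share of the assignment's full credit (not of the earned score), that submissions at exactly 24 or 48 hours fall into the earlier bracket, and that a score cannot drop below zero. The function and parameter names are hypothetical.

```python
def late_adjusted_score(earned, full_credit, hours_late):
    """Return the score after the late penalty, or None if not accepted.

    Assumptions (not stated in the policy): the boundary cases at exactly
    24 and 48 hours use the smaller penalty, and scores are floored at 0.
    """
    if hours_late <= 0:
        return earned  # on time: no penalty
    if hours_late <= 24:
        penalty = full_credit * 20 / 100  # 20% of the full credit
    elif hours_late <= 48:
        penalty = full_credit * 50 / 100  # 50% of the full credit
    else:
        return None  # more than 48 hours late: not accepted
    return max(0.0, earned - penalty)


# Hypothetical example: 90 points earned on a 100-point assignment.
for hours in (0, 10, 30, 50):
    print(hours, late_adjusted_score(90, 100, hours))
```

Under these assumptions, an assignment that earns 90 out of 100 points is recorded as 70 if it arrives 10 hours late and as 40 if it arrives 30 hours late.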

Last updated on 09/02/2008