CS 535 Introduction to Data Mining    Fall 2015


This course introduces the concepts, algorithms, techniques, and applications of data mining. Topics include the background of data mining, data preprocessing, classification, clustering, and association-rule mining. The course is designed for CS graduate students.

Class Schedule: T R 2:50 PM - 4:15 PM

Classroom:  LH 012

Instructor: Dr. Lei Yu
Telephone: (607) 777-6250
Email: lyu AT cs DOT binghamton DOT edu
Office Location: Q5, Engineering Building
Office Hours: T R 1:30 PM - 2:30 PM or by appointment

TA: Elliot Way
Email: ellioteway AT gmail DOT com
Office Location: N21
Office Hours: M W Noon - 1:00 PM or by appointment


Prerequisites:

  • CS 333 Algorithms
  • MATH 327 Probability with Statistical Methods


Topics:

  • Background of knowledge discovery and data mining
  • Data preprocessing (e.g., data cleaning, transformation, dimensionality reduction, instance selection)
  • Classification (e.g., decision trees, Bayesian classifiers, instance-based classifiers, rule-based classifiers, artificial neural networks, support vector machines, ensembles)
  • Clustering (e.g., K-means, hierarchical clustering, density-based clustering)
  • Mining association rules (e.g., Apriori, FP-growth)

Textbook: (Recommended)

  • Introduction to Data Mining, Pang-Ning Tan, Michael Steinbach, and Vipin Kumar, Addison-Wesley, April 2005.


Homework:

There will be four assignments, each consisting of written exercises on key concepts and algorithms.


Project:

Each student will be required to conduct a term project on a topic approved by the instructor. A student can conduct a project alone or team up with another student. A project proposal will be presented in class, and a final project report is due at the end of the semester.


Presentation:

Each student will be required to give one presentation on a topic selected from a list provided by the instructor. A student can present individually or team up with another student on one topic. The presenters will also be responsible for leading group discussion and answering questions.


Quizzes and Exams:

There will be several in-class quizzes and two in-class exams.


Grading:

Final grades will be based on quizzes (10%), homework (4 assignments, 20%), project (15%), presentation (15%), Exam I (20%), and Exam II (20%).
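As an illustration of how the weights above combine, the following minimal Python sketch computes a weighted final percentage. It assumes every component is scored on a 0-100 scale; the function name final_grade, the dictionary keys, and the sample scores are hypothetical and are not part of the official grading procedure.

```python
# Component weights in percent, taken from the grading breakdown above.
WEIGHTS = {
    "quizzes": 10,
    "homework": 20,
    "project": 15,
    "presentation": 15,
    "exam1": 20,
    "exam2": 20,
}


def final_grade(scores):
    """Return the weighted final percentage.

    `scores` maps each component name to a percentage in [0, 100].
    This is an illustrative helper, not an official calculator.
    """
    missing = set(WEIGHTS) - set(scores)
    if missing:
        raise ValueError(f"missing components: {sorted(missing)}")
    # Weights sum to 100, so dividing by 100 yields a percentage.
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS) / 100


# Hypothetical scores for one student.
example = {
    "quizzes": 90,
    "homework": 85,
    "project": 80,
    "presentation": 95,
    "exam1": 70,
    "exam2": 88,
}
print(final_grade(example))  # 83.85
```

Because the two exams together carry 40% of the grade, a 10-point change on one exam moves the final percentage by 2 points under this calculation.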

Academic Integrity:

Discussion of general concepts and questions concerning the homework assignments among students is encouraged. However, each of you is expected to work on the homework solutions on your own. Sharing any part of a solution is prohibited. If you are unclear about the policy, please consult the instructor before you act. Suspected cases of academic misconduct will be pursued fully in accordance with the Student Academic Honesty Code of the Thomas J. Watson School of Engineering and Applied Science, Binghamton University.

Late Policy:

Each assignment is due at the beginning of class on the due date. Any assignment received within 24 hours after the deadline will be penalized by 20% of the full credit; any assignment received between 24 and 48 hours past the deadline will be penalized by 50% of the full credit; no assignment will be accepted more than 48 hours past the deadline. Rare exceptions to this policy may be made at the discretion of the instructor under demonstrably extenuating circumstances.
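The late penalty can be sketched as a small Python function. This is one reading of the policy, not an official rule: it assumes the penalty is subtracted from the earned score, that the result is never negative, and that a submission exactly 24 or 48 hours late falls into the earlier bracket. The name late_score and its parameters are hypothetical.

```python
def late_score(raw_score, full_credit, hours_late):
    """Return the score after applying the late penalty.

    Assumptions (not stated in the policy): the penalty is deducted
    from the earned score, the result is floored at zero, and the
    24-hour and 48-hour boundaries belong to the earlier bracket.
    """
    if hours_late <= 0:
        return float(raw_score)          # on time, no penalty
    if hours_late <= 24:
        penalty = 0.20 * full_credit     # up to 24 hours late
    elif hours_late <= 48:
        penalty = 0.50 * full_credit     # between 24 and 48 hours late
    else:
        return 0.0                       # not accepted after 48 hours
    return max(0.0, raw_score - penalty)


# An assignment worth 100 points that earned 90 before penalties.
print(late_score(90, 100, 10))   # 70.0
print(late_score(90, 100, 30))   # 40.0
print(late_score(90, 100, 49))   # 0.0
```

Because the penalty is a fraction of the full credit rather than of the earned score, a weak submission that is more than 24 hours late can end up with no credit at all under this reading.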