SAK 5609 DATA MINING
Assoc. Prof. Dr. Md. Nasir bin Sulaiman
nasir@fsktm.upm.edu.my
03-89466514
Synopsis
• Credits: 3(3+0)
• Contact hours: 3 x 1 hour per week
• Semester: I
• The course emphasises the concepts of data mining. It covers principles of data mining, data mining functions, data mining processes, and data mining techniques such as K-nearest neighbour and clustering algorithms, rule induction, decision tree algorithms, association rule mining, neural networks and genetic algorithms, together with data mining examples. Industrial and scientific applications will be given.
Assessment & References
• Assessment:
  • Exercises (10%)
  • Project I (15%) + presentation I (5%), Week 7
  • Project II (15%) + presentation II (5%), Week 14
  • Mid-semester exam (20%), 1 hour, Week 6
  • Final exam (30%), 1.5 hours, Weeks 15-17
• References:
  • Jiawei Han & Micheline Kamber (2006), "Data Mining: Concepts and Techniques", 2nd ed., Morgan Kaufmann.
  • Michael J. A. Berry & Gordon S. Linoff (2004), "Data Mining Techniques", 2nd ed., Wiley.
  • Other related articles
Course Contents
Chapter 1 Introduction
• Motivation
• Origin of data mining
• What it is/isn't
• The KDD process
• Types of data
Chapter 2 Data mining tasks
• Classification
• Association rule mining
• Sequential pattern mining
• Clustering
• Anomaly detection
Chapter 3 Data issues
• What is a data set?
• Types of attributes
• Transformation for different types
• Types of data
  • Structured data, record data, data matrix, document data, transaction data, graph data, ordered data
• Data quality
  • Noise and outliers, missing values, inconsistent/duplicate data
Chapter 4 Data preprocessing
• Why data preprocessing?
• Why is data preprocessing important?
• Major tasks in data preprocessing
• Data cleaning
• Data integration
• Data transformation
• Data reduction
• Data discretization
Chapter 5 Association rule mining
• Introduction
• The model
• Goal and key features
• Mining algorithms
• Problems with the association rule model
• Issues of association rules
• Other main works on association rules
Chapter 6 Sequential pattern mining
• Sequence databases and pattern analysis
• Mining algorithms
• Challenges in sequential mining
• Studies on sequential mining
Chapter 7 Classification and prediction
• Classification model
• General approach
• Classification: a two-step process
• Classification techniques
• Evaluating classification methods
• Decision tree based classification, rule-based classifiers, nearest-neighbor classifiers, etc.
Chapter 8 Clustering and anomaly detection
• What is/is not cluster analysis?
• Examples of clustering applications
• Types of data in clustering analysis
• Types of clustering: hierarchical, partitional
• Major clustering techniques
• Approaches to anomaly detection
• Issues in dealing with anomalies