
SAK 5609 DATA MINING

Prof. Madya Dr. Md. Nasir bin Sulaiman, nasir@fsktm.upm.edu.my, 03-89466514. Synopsis: Credits: 3(3+0); contact hours: 3 x 1 hour per week; semester: I.


Presentation Transcript


  1. SAK 5609 DATA MINING • Prof. Madya Dr. Md. Nasir bin Sulaiman • nasir@fsktm.upm.edu.my • 03-89466514

  2. Synopsis • Credits: 3(3+0) • Contact hours: 3 x 1 hour per week • Semester: I • The course emphasises the concepts of data mining. It covers the principles of data mining; data mining functions; data mining processes; data mining techniques such as K-nearest neighbour and clustering algorithms, rule induction, decision tree algorithms, association rule mining, neural networks and genetic algorithms; and data mining examples. Industrial and scientific applications will be given.

  3. Assessment & References
     • Assessment:
       • Exercises (10%)
       • Project I (15%) + presentation I (5%), Week 7
       • Project II (15%) + presentation II (5%), Week 14
       • Mid-exam (20%, 1 hour), Week 6
       • Final exam (30%, 1.5 hours), Weeks 15-17
     • References:
       • Jiawei Han & Micheline Kamber (2006), “Data Mining: Concepts and Techniques”, 2nd ed., Morgan Kaufmann.
       • Michael J. A. Berry & Gordon S. Linoff (2004), “Data Mining Techniques”, 2nd ed., Wiley.
       • Other related articles

  4. Course Contents • Chapter 1 Introduction • Motivation • Origin of data mining • What it is/isn’t • The KDD process • Types of data

  5. Chapter 2 Data mining tasks • Classification • Association rule mining • Sequential pattern mining • Clustering • Anomaly detection

  6. Chapter 3 Data issues
     • What is a data set?
     • Types of attributes
     • Transformation for different types
     • Types of data
       • Structured data, record data, data matrix, document data, transaction data, graph data, ordered data
     • Data quality
       • Noise and outliers, missing values, inconsistent/duplicate data

  7. Chapter 4 Data preprocessing
     • Why Data Preprocessing?
     • Why Is Data Preprocessing Important?
     • Major Tasks in Data Preprocessing
       • Data cleaning
       • Data integration
       • Data transformation
       • Data reduction
       • Data discretization

  8. Chapter 5 Association rule mining • Introduction • The Model • Goal and Key Features • Mining Algorithms • Problems with the Association Rule Model • Issues of association rules • Other Main Works on Association Rules

  9. Chapter 6 Sequential Pattern Mining • Sequence databases and pattern analysis • Mining algorithms • Challenges on sequential mining • Studies on sequential mining

  10. Chapter 7 Classification and Prediction • Classification Model • General Approach • Classification: A Two-Step Process • Classification Techniques • Evaluating classification methods • Decision tree based classification, rule-based classifiers, nearest neighbor classifiers, etc.

  11. Chapter 8 Clustering and Anomaly • What is/is not cluster analysis? • Examples of clustering applications • Types of data in clustering analysis • Types of clustering – hierarchical, partitional • Major Clustering Techniques • Approaches to anomaly detection • Issues dealing with anomalies

  12. Chapter 9 Data Mining Applications
