
Data Mining

Data Mining. dr Iwona Schab. Semester timetable. Organizational issues, introduction to data mining. 1 Sources of data in business, administration, science and technology. 2 The process of discovering knowledge in data; the role of data mining in this process.


Presentation Transcript


  1. Data Mining dr Iwona Schab 27-18 September 2012

  2. Semester timetable • ORGANIZATIONAL ISSUES, • INTRODUCTION TO DATA MINING • 1 Sources of data in business, administration, science and technology. • 2 The process of discovering knowledge in data; the role of data mining in this process. • 3 Data mining and Business Intelligence. • 4 SEMMA methodology. • 5 Data preparation: sampling, cleaning, normalization and standardization. • 6 Association rules discovery. • 7 Classification problems: case studies.

  3. Semester timetable • 8 Rule induction systems: algorithms, knowledge representation. • 9 Decision trees: partition rules and pruning. • 10 Classification based on probability distributions: naive Bayes estimation and Bayesian networks. • 11 Grouping problems: case studies. • 12 Cluster analysis: combinatorial and hierarchical methods. • 13 Modeling response to direct mail marketing. • 14 Churn analysis. • 15 Text mining. • 16 Web mining. • 17 Data mining in Life Science. • 18 Comparative analysis of algorithms implemented in SAS Enterprise Miner and WEKA software.
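Topic 5 of the timetable names normalization and standardization as data preparation steps. The slides do not define them, so the following is only a minimal sketch under the common textbook reading: min-max normalization rescales a variable to the range [0, 1], and standardization (z-score) rescales it to mean 0 and standard deviation 1. The function names and the `incomes` sample are hypothetical and chosen for illustration; they do not come from the course material.

```python
from statistics import mean, pstdev


def min_max_normalize(values):
    """Rescale values linearly so the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant variable carries no spread; map every value to 0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def standardize(values):
    """Rescale values to mean 0 and (population) standard deviation 1."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # A constant variable cannot be scaled; map every value to 0.
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


# Hypothetical sample: monthly incomes of five customers.
incomes = [2000.0, 3000.0, 4000.0, 5000.0, 6000.0]
normalized = min_max_normalize(incomes)
standardized = standardize(incomes)
print(normalized)
print(standardized)
```

Whether the population or the sample standard deviation is used is a design choice; the sketch assumes the population version (`pstdev`).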

  4. Literature Basic • Paolo Giudici, Applied Data Mining. Statistical Methods for Business and Industry, Wiley, New York 2011 Supplementary • Selected papers to be circulated • Daniel T. Larose, Discovering Knowledge in Data: An Introduction to Data Mining, Wiley, New York 2005 • Daniel T. Larose, Data Mining Methods and Models, Wiley, New York 2006

  5. Statistical Analysis?

  6. Data Mining • to mine = to extract (e.g. precious, hidden resources from the Earth) • Different definitions and understandings depending on the user • New discipline developed from computing and statistics • In-depth search to find additional information (previously unnoticed in the mass of data available) • Data preparation and "structuring the unstructured" needed • Machine learning = finding relations and regularities in data • Generalisation from the observed data to new, unobserved cases
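The last two bullets of slide 6 (finding regularities in data, then generalising to a new unobserved case) can be illustrated with a very small example. This is a sketch only, assuming an ordinary least-squares line as the "regularity"; the slides do not prescribe any particular method, and the data and function names below are hypothetical.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept to the observed points by least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(model, x):
    """Apply the fitted regularity to a case that was not in the observed data."""
    slope, intercept = model
    return slope * x + intercept


# Hypothetical observed data that happens to follow y = 2x + 1.
observed_x = [1.0, 2.0, 3.0, 4.0]
observed_y = [3.0, 5.0, 7.0, 9.0]

model = fit_line(observed_x, observed_y)  # learning: find the regularity
new_case = predict(model, 10.0)           # generalisation: x = 10 was never observed
print(model, new_case)
```

The prediction for the new case is only as good as the assumption that the regularity found in the observed data also holds outside it.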

  7. KDD Process (Knowledge Discovery in Databases)

  8. Software: www.sgh.waw.pl/ogolnouczelniane/ci/aplikacje/oprogramowanie/ • SAS/STAT • SAS Enterprise Miner • Other: Statistica, SPSS • WEKA
