
Understanding Data Mining

Understanding Data Mining. Craig A. Stevens, PMP, CC craigastevens@westbrookstevens.com www.westbrookstevens.com. Examples of Classical Statistical Methods. Latitude 36.19N and Longitude 86.78W. Nashville, TN, USA. $Y_i = a + b x_i + e$. Multiple Regression.


Presentation Transcript


  1. Understanding Data Mining Craig A. Stevens, PMP, CC craigastevens@westbrookstevens.com www.westbrookstevens.com

  2. Examples of Classical Statistical Methods

  3. Latitude 36.19N and Longitude 86.78W Nashville, TN, USA

  4. $Y_i = a + b x_i + e$
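
The simple linear regression model on this slide can be fitted by ordinary least squares. The following is a minimal sketch, not taken from the presentation: the function name `fit_simple_regression` and the data values are made up for illustration, and it assumes the usual closed-form estimates for the intercept a and the slope b.

```python
def fit_simple_regression(xs, ys):
    """Least-squares estimates of a (intercept) and b (slope) in Y_i = a + b*x_i + e."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equally long samples with at least 2 points")
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    # Sum of squared deviations of x, and sum of cross-deviations of x and y.
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    if sxx == 0:
        raise ValueError("all x values are identical; the slope is undefined")
    b = sxy / sxx
    a = y_mean - b * x_mean
    return a, b


# Hypothetical data, invented for this example only.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [3.1, 4.9, 7.2, 8.8, 11.0]

a, b = fit_simple_regression(xs, ys)
# The residuals play the role of e in the model.
residuals = [y - (a + b * x) for x, y in zip(xs, ys)]
print(f"a = {a:.2f}, b = {b:.2f}")
```

With an intercept in the model, the least-squares residuals sum to zero (up to rounding), which is a quick sanity check on any fit of this form.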

  5. Multiple Regression http://www.ats.ucla.edu/stat/sas/faq/spplot/reg_int_cont.htm

  6. Multiple Regression http://www.ats.ucla.edu/stat/sas/faq/spplot/reg_int_cont.htm

  7. Multiple Regression http://www.ats.ucla.edu/stat/sas/faq/spplot/reg_int_cont.htm

  8. Multiple Regression http://www.ats.ucla.edu/stat/sas/faq/spplot/reg_int_cont.htm

  9. Multiple Regression http://www.ats.ucla.edu/stat/sas/faq/spplot/reg_int_cont.htm
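
The slides above show only plots, so the sketch below is an illustration rather than the presenter's method. It assumes a model with two continuous predictors and their interaction, which the name of the linked page (`reg_int_cont`) suggests but the transcript does not confirm. The helper names and the data are hypothetical, and the coefficients are obtained by solving the normal equations with plain Gaussian elimination.

```python
def solve_linear_system(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular or nearly singular")
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def fit_multiple_regression(rows, ys):
    """Return [intercept, b1, b2, ...] from the normal equations (X'X) beta = X'y."""
    X = [[1.0] + list(r) for r in rows]  # prepend the intercept column
    p = len(X[0])
    XtX = [[sum(xi[j] * xi[k] for xi in X) for k in range(p)] for j in range(p)]
    Xty = [sum(xi[j] * y for xi, y in zip(X, ys)) for j in range(p)]
    return solve_linear_system(XtX, Xty)


# Hypothetical data generated exactly from y = 2 + 3*x1 - x2 + 0.5*x1*x2.
points = [(1, 1), (1, 2), (2, 1), (2, 3), (3, 2), (4, 5), (5, 1)]
rows = [[x1, x2, x1 * x2] for x1, x2 in points]  # third column is the interaction
ys = [2 + 3 * x1 - x2 + 0.5 * x1 * x2 for x1, x2 in points]

coefs = fit_multiple_regression(rows, ys)
print([round(c, 3) for c in coefs])
```

Because the example data contain no noise, the fit should recover the generating coefficients. For real data, a numerically more stable solver (for example a QR-based routine from a statistics library) would be the safer choice.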

  10. Data Mining

  11. http://datamining.typepad.com/photos/uncategorized/livejournal.png

  12. What is Data Mining?
      • The process of identifying hidden patterns, trends, and relationships in large quantities of data.
      Why Do Data Mining?
      • To discover useful information for making decisions.
      • Classical statistical methods break down when there are too many variables.
      • Large number of records: $10^8$ to $10^{12}$
      • Gigabyte to terabyte
      • High-dimensional data
      • Lots of variables (10 to $10^4$ attributes)

  13. The Huber-Wegman Taxonomy of Data Set Sizes

  14. SAS Enterprise Miner Objects

  15. Shows the Cut off Point is 6 Variables

  16. Small Number of Useful Variables

  17. Comparing Methods and Profit vs Marketing Cost

  18. Decision Trees for Predictive Modeling Padraic G. Neville SAS Institute Inc. 4 August 1999

  19. Clustering As in Different Brands
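
The slide names clustering without saying which algorithm was used. As one possible illustration, here is a minimal k-means sketch. The choice of k-means, the function name `k_means`, the starting centroids, and the (price, quality) brand scores are all assumptions made for this example, not details from the presentation.

```python
def k_means(points, centroids, max_iter=100):
    """Basic k-means: assign each point to its nearest centroid, then recompute centroids."""
    centroids = [tuple(c) for c in centroids]
    clusters = [[] for _ in centroids]
    for _ in range(max_iter):
        # Assignment step: nearest centroid by squared Euclidean distance.
        clusters = [[] for _ in centroids]
        for p in points:
            idx = min(
                range(len(centroids)),
                key=lambda i: sum((a - b) ** 2 for a, b in zip(p, centroids[i])),
            )
            clusters[idx].append(p)
        # Update step: mean of each cluster; an empty cluster keeps its old centroid.
        new_centroids = [
            tuple(sum(coord) / len(cluster) for coord in zip(*cluster))
            if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
        if new_centroids == centroids:
            break  # assignments are stable
        centroids = new_centroids
    return centroids, clusters


# Hypothetical (price, quality) scores for six brands.
brands = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5), (8.0, 8.0), (9.0, 9.5), (8.5, 9.0)]
# Starting centroids are picked by hand here; real use would pick them at random
# or with a seeding scheme such as k-means++.
centroids, clusters = k_means(brands, centroids=[brands[0], brands[3]])
print(centroids)
print(clusters)
```

The result depends on the starting centroids, so in practice the algorithm is usually run several times and the run with the lowest within-cluster distance is kept.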

  20. Data Mining Art found at http://datamining.typepad.com/data_mining/dataviz/page/2/

  21. Data Mining Art found at http://datamining.typepad.com/data_mining/dataviz/page/2/

  22. National Energy Research Scientific Computing Center

  23. SurfStat A Matlab toolbox for the statistical analysis of univariate and multivariate surface and volumetric data using linear mixed effects models and random field theory Keith J. Worsley

  24. Latitude 36.19N and Longitude 86.78W Nashville, TN, USA

  25. Genealogical Tree on YouTube http://www.youtube.com/watch?v=CnniJR5Ah7g
