
Overview of Data Mining Technology

Learn about data mining, the process of discovering patterns and rules from vast amounts of data. Explore the goals of data mining, including prediction, identification, classification, and optimization. Discover association rules, classification algorithms, clustering, and various approaches to other data mining problems. Understand the applications of data mining in marketing, finance, manufacturing, healthcare, and more. Explore commercial data mining tools and the potential of this field.

edithmiller


Presentation Transcript


  1. Chapter 27 Data Mining Concepts

  2. Overview of Data Mining Technology • Data Mining aka Knowledge Discovery in Databases (KDD) • Discovery of new information in terms of patterns or rules from vast amounts of data • Must be carried out efficiently on large files and databases

  3. Goals of Data Mining • Prediction • Show how certain attributes will behave in the future • Identification • Identify the existence of an item • Classification • Partition data into different categories • Optimization • Optimize the use of limited resources such as time, space, money

  4. FIGURE 27.1 Example transactions in market-basket model.

  5. FIGURE 27.2 FP-tree and item header table.

  6. FIGURE 27.3 Taxonomy of items in a supermarket.

  7. FIGURE 27.4 Simple hierarchy of soft drinks and chips.

  8. Association Rules • Market-Basket Model, Support, and Confidence • Apriori Algorithm • Sampling Algorithm • Frequent-Pattern Tree Algorithm • Partition Algorithm • Other Types of Association Rules • Additional Considerations for Association Rules
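
The support and confidence measures named on this slide can be illustrated with a short sketch. The transactions below are hypothetical illustration data, and the function names (`support`, `confidence`, `frequent_itemsets`) are assumptions made for this example, not taken from the slides. The level-wise search is a simplified Apriori-style pass, not a tuned implementation.

```python
from itertools import combinations

# Hypothetical market-basket transactions (illustrative data only).
transactions = [
    {"milk", "bread", "cookies", "juice"},
    {"milk", "juice"},
    {"milk", "eggs"},
    {"bread", "cookies", "coffee"},
]

def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)

def confidence(lhs, rhs, transactions):
    """Confidence of the rule lhs => rhs: support(lhs U rhs) / support(lhs)."""
    both = set(lhs) | set(rhs)
    return support(both, transactions) / support(lhs, transactions)

def frequent_itemsets(transactions, min_support):
    """Apriori-style level-wise search for itemsets with support >= min_support."""
    items = sorted(set().union(*transactions))
    frequent = {}
    current = [frozenset([item]) for item in items]
    k = 1
    while current:
        # Keep only the candidates of size k that meet the support threshold.
        level = {}
        for candidate in current:
            s = support(candidate, transactions)
            if s >= min_support:
                level[candidate] = s
        frequent.update(level)
        # Build size k+1 candidates; prune any with an infrequent k-subset.
        k += 1
        candidates = {a | b for a in level for b in level if len(a | b) == k}
        current = [
            c for c in candidates
            if all(frozenset(sub) in level for sub in combinations(c, k - 1))
        ]
    return frequent

milk_juice_conf = confidence({"milk"}, {"juice"}, transactions)
found = frequent_itemsets(transactions, min_support=0.5)
```

With this data, milk appears in three of four transactions and milk together with juice in two, so the rule milk => juice has support 0.5 and confidence 2/3. The pruning step relies on the property that every subset of a frequent itemset must itself be frequent.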

  9. Classification • The process of learning a model that describes different classes of data. • The classes are known in advance – the rules that describe them are not. • Mining can help determine past influential characteristics that can be used to predict future behavior.
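
One common way to learn such a model is to build a decision tree by repeatedly choosing the attribute that best separates the known classes. The sketch below shows only that selection step, using entropy and information gain as the criterion. This is an assumption for illustration, since the slide does not name a criterion, and the records and attribute names are hypothetical.

```python
import math
from collections import Counter

# Hypothetical training records with a known class label ("loan").
records = [
    {"married": "yes", "salary": "high", "loan": "yes"},
    {"married": "no", "salary": "high", "loan": "yes"},
    {"married": "yes", "salary": "low", "loan": "no"},
    {"married": "no", "salary": "low", "loan": "no"},
]

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum(
        (n / total) * math.log2(n / total) for n in Counter(labels).values()
    )

def information_gain(records, attribute, target):
    """Reduction in entropy of target after partitioning on attribute."""
    base = entropy([r[target] for r in records])
    remainder = 0.0
    for value in {r[attribute] for r in records}:
        subset = [r[target] for r in records if r[attribute] == value]
        remainder += len(subset) / len(records) * entropy(subset)
    return base - remainder

def best_attribute(records, attributes, target):
    """Attribute with the highest information gain (candidate root split)."""
    return max(attributes, key=lambda a: information_gain(records, a, target))

root = best_attribute(records, ["married", "salary"], "loan")
```

In this toy data the salary attribute separates the two classes perfectly (gain of 1 bit), while marital status tells us nothing (gain of 0), so salary would be chosen as the first split.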

  10. FIGURE 27.5 Example decision tree for credit card applications.

  11. FIGURE 27.6 Sample training data for classification algorithm.

  12. FIGURE 27.7 Decision tree based on sample training data where the leaf nodes are represented by a set of RIDs of the partitioned records.

  13. Clustering • Another way of learning • Puts “similar” records into groups • Reaction to medication • Similarity function is key
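
To make the role of the similarity function concrete, here is a minimal k-means sketch that uses Euclidean distance as the (dis)similarity measure. Both the choice of k-means and the distance function are assumptions for illustration, since the slide names neither. The points and starting centroids are hypothetical.

```python
import math

def euclidean(p, q):
    """Euclidean distance between two equal-length numeric tuples."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def k_means(points, centroids, max_iter=100):
    """Assign points to the nearest centroid, recompute means, repeat until stable."""
    centroids = [tuple(c) for c in centroids]
    clusters = [[] for _ in centroids]
    for _ in range(max_iter):
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(
                range(len(centroids)), key=lambda i: euclidean(p, centroids[i])
            )
            clusters[nearest].append(p)
        # New centroid = coordinate-wise mean; an empty cluster keeps its centroid.
        new_centroids = [
            tuple(sum(coord) / len(cluster) for coord in zip(*cluster))
            if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return centroids, clusters

# Hypothetical 2-dimensional records and starting centroids.
points = [(1, 1), (1, 2), (2, 1), (8, 8), (9, 8), (8, 9)]
centroids, clusters = k_means(points, centroids=[(1, 1), (8, 8)])
```

Swapping `euclidean` for another distance changes which records count as "similar", which is why the similarity function largely determines the groups that come out.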

  14. FIGURE 27.8 Sample 2-dimensional records for clustering example (the RID column is not considered).

  15. Approaches to Other Data Mining Problems • Discovery of Sequential Patterns • Discovery of Patterns in Time Series • Regression • Neural Networks • Genetic Algorithm
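
Of the approaches listed, regression is the simplest to show in a few lines. The sketch below fits a straight line y = a + b*x by ordinary least squares. The function name `fit_line` and the data are hypothetical, chosen only to illustrate the idea.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope = covariance(x, y) / variance(x), both as unnormalised sums.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

# Hypothetical observations that lie exactly on y = 1 + 2x.
xs = [1, 2, 3, 4]
ys = [3, 5, 7, 9]
intercept, slope = fit_line(xs, ys)
predicted = intercept + slope * 5  # predict y for an unseen x
```

Once fitted, the line can be used to predict the dependent variable for values of x that were not in the data, which ties regression back to the prediction goal listed earlier.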

  16. Applications of Data Mining • Marketing • Finance • Manufacturing • Health Care • Probably many other decision-making contexts

  17. Commercial Data Mining Tools • Text lists several packages and their strengths • Huge field as databases multiply • Big potential if you can come up with a way of protecting privacy as well as correcting data.

  18. Summary • Lots of potential in this field • Seems complex, but only because of the sheer amount of data. • See Wikipedia at http://en.wikipedia.org/wiki/Data_mining
