Data Mining – A First View Roiger & Geatz
Definition • Data mining is the process of employing one or more computer learning techniques to automatically analyze and extract knowledge contained within a database. • Knowledge Discovery in Databases (KDD) is a term often used interchangeably with data mining. • The knowledge gained from a data mining session is a model or generalization of the data. • Induction-based learning – forming general concept definitions by observing specific examples.
What Can Computers Learn? • Facts • Concepts • Procedures • Principles • Computers are good at learning concepts – concepts are the output of a data mining session.
Three Concept Views • Classical view – all concepts have definite defining properties. • Probabilistic view – concepts are represented by properties that concept members are likely, but not certain, to have. • Exemplar view – a given instance is determined to be an example of a particular concept if the instance is similar enough to a set of one or more known examples of that concept.
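The exemplar view can be sketched in a few lines of code. This is only an illustration: the stored examples, the Euclidean distance measure, the `threshold` value, and the function names are all assumptions made for this sketch, not something taken from the slides.

```python
# Hypothetical known examples: (attribute vector, concept label).
# The values and labels are invented for illustration only.
KNOWN_EXAMPLES = [
    ((1.0, 1.0), "good credit risk"),
    ((1.2, 0.8), "good credit risk"),
    ((4.0, 4.5), "poor credit risk"),
]


def distance(a, b):
    """Euclidean distance between two attribute vectors (an assumed similarity measure)."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def classify_by_exemplar(instance, examples=KNOWN_EXAMPLES, threshold=1.0):
    """Return the concept of the closest known example,
    or None when no known example is similar enough."""
    nearest_vector, nearest_label = min(
        examples, key=lambda example: distance(instance, example[0])
    )
    if distance(instance, nearest_vector) <= threshold:
        return nearest_label
    return None
```

Calling `classify_by_exemplar((1.1, 0.9))` places the new instance with its nearby stored examples, while an instance far from every stored example is left unclassified.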
Supervised Learning • Also known as induction-based supervised concept learning. • Attribute-value matrix – Table 1.1. • Decision tree.
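To make the link between an attribute-value matrix and a decision tree concrete, here is a minimal sketch that learns a one-level decision tree (a single test on one attribute). The rows are hypothetical and are not Table 1.1; the function names and the "most training instances classified correctly" selection rule are assumptions chosen for brevity, and real decision tree algorithms use more refined criteria and build deeper trees.

```python
from collections import Counter, defaultdict

# Hypothetical attribute-value matrix: one dict per instance.
# "diagnosis" is the output attribute; the data is invented for illustration.
ROWS = [
    {"fever": "yes", "swollen_glands": "yes", "diagnosis": "strep"},
    {"fever": "no",  "swollen_glands": "yes", "diagnosis": "strep"},
    {"fever": "yes", "swollen_glands": "no",  "diagnosis": "cold"},
    {"fever": "no",  "swollen_glands": "no",  "diagnosis": "cold"},
    {"fever": "yes", "swollen_glands": "yes", "diagnosis": "strep"},
]


def learn_one_level_tree(rows, target):
    """Pick the input attribute whose values best predict the target on the training rows."""
    best = None
    for attribute in rows[0]:
        if attribute == target:
            continue
        # Count target classes for each value of this attribute.
        by_value = defaultdict(Counter)
        for row in rows:
            by_value[row[attribute]][row[target]] += 1
        # Each attribute value predicts its most frequent class.
        rules = {value: counts.most_common(1)[0][0] for value, counts in by_value.items()}
        correct = sum(counts[rules[value]] for value, counts in by_value.items())
        if best is None or correct > best[2]:
            best = (attribute, rules, correct)
    return best[0], best[1]


def predict(tree, instance):
    """Classify an instance; returns None for an attribute value never seen in training."""
    attribute, rules = tree
    return rules.get(instance[attribute])
```

With this data, `learn_one_level_tree(ROWS, "diagnosis")` selects `swollen_glands` as the test attribute, because it separates the two classes perfectly while `fever` does not.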
Unsupervised Clustering • Builds models without predefined classes. • Table 1.3. • Example questions.
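A small sketch of clustering without predefined classes follows. It uses K-means, one common clustering algorithm, chosen here only for illustration; the slides do not name a specific algorithm. The data points, the function name, and the deterministic choice of starting centroids are all assumptions.

```python
# Hypothetical two-attribute instances with no class labels attached.
POINTS = [(1.0, 1.0), (8.0, 8.0), (1.5, 2.0), (9.0, 9.5), (1.0, 0.5), (8.5, 9.0)]


def k_means(points, k, iterations=10):
    """Group points into k clusters by repeatedly assigning each point
    to its nearest centroid and then recomputing the centroids."""
    centroids = list(points[:k])  # assumption: start from the first k points
    clusters = []
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for point in points:
            nearest = min(
                range(k),
                key=lambda i: sum((a - b) ** 2 for a, b in zip(point, centroids[i])),
            )
            clusters[nearest].append(point)
        # New centroid = mean of the cluster; keep the old one if the cluster is empty.
        centroids = [
            tuple(sum(coords) / len(cluster) for coords in zip(*cluster))
            if cluster
            else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
    return centroids, clusters
```

Running `k_means(POINTS, 2)` separates the three points near the origin from the three points near (8.5, 9), even though no class labels were supplied.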
Is Data Mining Appropriate? • Can we clearly define the problem? • Does potentially meaningful data exist? • Does the data contain hidden knowledge, or is it factual and useful for reporting purposes only?
Data Mining or Data Query • Shallow knowledge – factual, easily stored and manipulated. SQL is a good tool. • Multidimensional knowledge – also factual, but stored in a multidimensional format. OLAP tools are a good fit. • Hidden knowledge – patterns and regularities in data that cannot be found with SQL queries and call for data mining algorithms. • Deep knowledge – knowledge in a database that can be found only with some direction about what to look for. Current data mining tools are ineffective here.
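As an illustration of shallow knowledge, the sketch below answers a purely factual question with a plain SQL query. The `customers` table, its columns, its rows, and the helper `customers_over` are hypothetical; the example uses Python's standard `sqlite3` module with an in-memory database.

```python
import sqlite3

# Hypothetical in-memory table; names and values are invented for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (name TEXT, age INTEGER, balance REAL)")
conn.executemany(
    "INSERT INTO customers VALUES (?, ?, ?)",
    [("Ana", 34, 2500.0), ("Bo", 51, 300.0), ("Chen", 27, 1200.0)],
)


def customers_over(connection, min_balance):
    """Shallow knowledge: a fact that is stored directly and retrieved by a query."""
    rows = connection.execute(
        "SELECT name FROM customers WHERE balance > ? ORDER BY name",
        (min_balance,),
    ).fetchall()
    return [name for (name,) in rows]
```

A query like this retrieves facts that are already stored. A question such as "which attributes distinguish customers who are likely to default?" is not answered by a single lookup, which is where data mining algorithms come in.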
Expert Systems or Data Mining • Data mining: data → data mining tool → knowledge • Expert systems: human expert → knowledge engineer → expert system building tool → knowledge
Data Mining Applications • Fraud detection • Health care • Business and finance • Scientific applications • Sports and gaming