240 likes | 360 Views
Data Mining 2 (ex Análisis Inteligente de Datos y Data Mining ) Lluís A. Belanche. www.lsi.upc.edu/... /~belanche/docencia/aiddm/aiddm.html /~avellido/teaching/data_mining.htm. Contents of the course (hopefully). 1. Introduction & methodologies 2. Exploratory DM through visualization
E N D
Data Mining 2 (ex Análisis Inteligente de Datos y Data Mining) Lluís A. Belanche
www.lsi.upc.edu/... /~belanche/docencia/aiddm/aiddm.html /~avellido/teaching/data_mining.htm
Contents of the course (hopefully) • 1. Introduction & methodologies • 2. Exploratory DM through visualization • 3. Pattern recognition: introduction • 4. Pattern recognition: the Gaussian case • 5. Feature extraction • 6. Feature selection & weighing • 7. Error estimation • 8. Linear methods are nice! • 9. Probability in Data Mining • 10. Latency, generativity, manifolds and all that • 11. Application of GTM: from medicine to ecology • 12. DM Case studies Sorry guys! … no fuzzy systems …
Error estimation
Linear classifiers are nice! (II) F Transformation F (x) = [ F1(x), F2(x), … Fm(x) ] with x = [ x1, x2, …, xn ] Useful for “ascending” (m>n) or “descending” (m>n) with 0 < m,n < oo (integers) … an example?
Linear classifiers are nice! (III) F Nets F (x) = [ F1(x), F2(x), … Fm(x) ] with x = [ x1, x2, …, xn ] F(x) x
Utility • This is a very powerful setting • Let us suppose: r>s increase in dimension increase in expressive power, ease the task for almost any learning machine r<s decrease in dimension visualization, compactation, noise reduction, removal of useless information Contradictory !?
On intelligence … • What is Intelligence? • What is the function of Intelligence? to ensure survival in nature • What are the ingredients of intelligence? • Perceive in a changing world • Reason under partial truth • Plan & prioritize under uncertainty • Coordinate different simultaneous tasks • Learn under noisy experiences
Parking a Car (difficult or easy?) “Generally, a car can be parked rather easily because the final position of the car is not specified exactly. It it were specified to within, say, a fraction of a millimeter and a few seconds of arc, it would take hours of maneuvering and precise measurements of distance and angular position to solve the problem.” High precision carries a high cost.
The primordial soup Fuzzy Logic • Belief • Networks Neural Networks Soft Computing Chaos & Fractals Evolutionary Algorithms Rough Sets
What could MACHINE LEARNING possibly be? • In the beginning, there was a set of examples … • To exploit imprecision, uncertainty, robustness, data dependencies, learning and/or optimization ability, to achieve a working solution to a problem which is hard to solve. • To find an exact (approximate) solution to an imprecisely (precisely) formulated problem.
So what is the aim? • The challenge is to put these capabilities into use by devising methods of computation which lead to an acceptable solution at the lowest possible cost. • This should be the guiding principle
Rough Sets RS Different methods = different roles Fuzzy Logic : the algorithms for dealing with imprecision and uncertainty Neural Networks : the machinery for learning and function approximation with noise Evolutionary Algorithms : the algorithms for adaptive search and optimization uncertainty arising from the granularity in the domain of discourse
Examples of soft computing • TSP: 105 cities, • accuracy within 0.75%, 7 months • accuracy within 1%, 2 days • Compare • “absoulute best for sure” with “very good with very high probability”
Are you one of the top guns? • Consider … • Search space of size s • Draw N random samples • What is the probability p that at least one of them is in the top t ? • Answer: p = 1 – (1-t/s)N • Example: s= 1012, N=100.000, t=1.000 1 in 10.000 !
On Algorithms • what is worth? Specialized algorithms: best performance for special problems Generic algorithms: good performance over a wide range of problems Generic Algorithms Efficiency Specialized Algo. P Problems
Words are important ! • What is a theory ? • What is an algorithm ? • What is an implementation ? • What is a model ? • What does “non-linear” mean ? • What does “non-parametric” mean ?
The problem of induction • Classical problem in Philosophy • Example: 1,2,3,4,5,? • A more through example: JT
What are the conditions for successful learning? • Training data (sufficiently) representative • Principle of similarity • Target function within capacity of the learner • Non-dull learning algorithm • Enough computational resources • A correct (or close to) learning bias
And the Oscar goes to … The real problem is not whether machines think, but whether men do. B.F. Skinner, Contingencies of Reinforcement