Learn how to design computer systems that improve performance by learning from experience through machine learning. Explore various applications such as spam filtering, face/person recognition, recommendation systems, robotics, natural language processing, biometrics, object recognition, DNA sequencing, financial data mining/prediction, and process mining and optimization.
Introduction Machine Learning 14/02/2017
Machine Learning How can we design a computer system whose performance improves by learning from experience?
Other application areas
• Biometrics
• Object recognition in images
• DNA sequencing
• Financial data mining/prediction
• Process mining and optimisation
Pattern Classification, Chapter 1
Rule-based systems vs. Machine learning
• A domain expert is needed for
  • writing rules OR
  • giving training samples
• Which one is better?
  • Can the expert design rule-based systems?
  • Is the problem specific or general?
Most of the materials in these slides were taken from Pattern Classification (2nd ed.) by R. O. Duda, P. E. Hart and D. G. Stork, John Wiley & Sons, 2000, with the permission of the authors and the publisher.
Definition
Machine Learning (Mitchell): "A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P, if its performance at tasks in T, as measured by P, improves with experience E."
Example: classify fish
Classes: sea bass, salmon
Goal: to learn a model from training data which can categorise fish (e.g. salmon are shorter)
Classification (T)
Supervised learning: based on training examples (E), learn a model which works well on previously unseen examples.
Classification: a supervised learning task of categorising entities into a predefined set of classes.
Basic definitions
• Feature (or attribute)
• Instance (or entity, sample)
• Class label
Example - Preprocessing
Image processing steps
• E.g. segmentation of the fish contour from the background
Feature extraction
• Extraction of features/attributes from images, which are atomic variables
• Typically numerical or categorical
Example features
• length
• lightness
• width
• number of fins
• position of the mouth
Length is a weak discriminator of fish types.
Lightness is better.
Performance evaluation (P)
• Simplest measure: accuracy (rate of correct classifications)
• False positive/negative errors
• E.g. if the threshold is decreased, the number of sea bass falsely classified as salmon decreases
• Decision theory
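The effect of moving the threshold can be sketched in a few lines of Python. This is only an illustration: the lightness values, the rule "below the threshold means salmon" and the helper names `classify` and `evaluate` are assumptions made for the example, not taken from the slides.

```python
# Toy data: (lightness, true label). All values are invented for illustration.
samples = [
    (2.0, "salmon"), (3.1, "salmon"), (4.2, "salmon"), (5.5, "salmon"),
    (4.8, "sea bass"), (6.0, "sea bass"), (7.3, "sea bass"), (8.1, "sea bass"),
]


def classify(lightness, threshold):
    """Assumed decision rule: 'salmon' below the threshold, 'sea bass' otherwise."""
    return "salmon" if lightness < threshold else "sea bass"


def evaluate(samples, threshold):
    """Return accuracy and the two kinds of error for a given threshold."""
    correct = 0
    bass_as_salmon = 0  # sea bass wrongly classified as salmon
    salmon_as_bass = 0  # salmon wrongly classified as sea bass
    for lightness, label in samples:
        if classify(lightness, threshold) == label:
            correct += 1
        elif label == "sea bass":
            bass_as_salmon += 1
        else:
            salmon_as_bass += 1
    return {
        "accuracy": correct / len(samples),
        "bass_as_salmon": bass_as_salmon,
        "salmon_as_bass": salmon_as_bass,
    }


# Lowering the threshold trades one kind of error for the other.
for threshold in (6.5, 5.0, 4.5):
    print(threshold, evaluate(samples, threshold))
```

On this toy data the count of sea bass classified as salmon falls as the threshold is lowered, while the count of salmon classified as sea bass rises. Which trade-off is acceptable depends on the cost of each error, which is where decision theory comes in.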
Feature vector
A vector of features describing a particular instance.
Instance A: x^T = [x1, x2], where x1 is lightness and x2 is width
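As a rough illustration, a feature vector can be stored as a plain list of numbers and compared with other vectors by distance. The sketch below uses a nearest-centroid rule, which is a stand-in chosen for brevity and not the classifier used in the slides; all numbers and the names `centroid` and `predict` are invented for the example.

```python
from math import dist

# Each instance is a feature vector x = [x1, x2] = [lightness, width].
# All numbers are invented for illustration.
training = {
    "salmon": [[2.0, 10.0], [3.0, 11.0], [4.0, 12.0]],
    "sea bass": [[6.0, 16.0], [7.0, 17.0], [8.0, 18.0]],
}


def centroid(vectors):
    """Component-wise mean of a list of equally long feature vectors."""
    n = len(vectors)
    return [sum(v[i] for v in vectors) / n for i in range(len(vectors[0]))]


# One centroid per class, computed from the training vectors.
centroids = {label: centroid(vectors) for label, vectors in training.items()}


def predict(x):
    """Assign x to the class whose centroid is closest (Euclidean distance)."""
    return min(centroids, key=lambda label: dist(x, centroids[label]))


instance_a = [3.5, 11.5]  # hypothetical instance A
print(predict(instance_a))
```

Once every instance is a point in the same feature space, classification becomes a question of which region of that space the point falls into.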
Feature space
Be careful about adding too many features:
• noisy features (e.g. measurement errors)
• unnecessary features (e.g. information content is similar to that of another feature)
We need features which might have discriminative power.
Feature set engineering is highly task-specific!
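One simple heuristic for spotting an unnecessary feature is to check how strongly it correlates with a feature that is already in the set. The slides do not prescribe this check. The sketch below is an assumption-laden illustration with invented values, and `pearson` is a hand-written helper.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Invented measurements for five fish.
length_cm = [30.0, 35.0, 40.0, 45.0, 50.0]
length_inch = [x / 2.54 for x in length_cm]  # same information, other unit
lightness = [4.0, 2.5, 6.0, 3.0, 5.5]

# A correlation close to 1 suggests the second feature adds little information.
print("length_cm vs length_inch:", pearson(length_cm, length_inch))
print("length_cm vs lightness:  ", pearson(length_cm, lightness))
```

A high correlation is only a hint: two correlated features can still be jointly useful, and correlation says nothing about whether a feature is noisy.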
This is not ideal. Remember the supervised learning principle: the model must work well on previously unseen examples!
Model selection
• Number of features?
• Complexity of the task?
• Classifier speed?
Task- and data-dependent!
The machine learning lifecycle
• Data preparation
• Feature engineering
• Model selection
• Model training
• Performance evaluation
Data preparation
Do we know whether we have collected a large enough and representative sample for training a system?
Model selection and training
These topics are the foci of this course.
Investigate the data for model selection! No free lunch!
Performance evaluation
There are various evaluation metrics.
Simulation of supervised learning:
• split your data into two parts
• train your model on the training set
• predict and evaluate your model on the test set (unknown during training)
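The three steps above can be sketched with the standard library alone. Everything specific here is an assumption made for the illustration: the synthetic lightness data, the 80/20 split ratio, and the deliberately simple "model" (a threshold halfway between the two class means).

```python
import random


def make_toy_data(n_per_class, rng):
    """Generate invented (lightness, label) pairs for the two fish classes."""
    data = [(rng.gauss(3.0, 1.0), "salmon") for _ in range(n_per_class)]
    data += [(rng.gauss(7.0, 1.0), "sea bass") for _ in range(n_per_class)]
    return data


def train(train_set):
    """Learn a threshold halfway between the two class means."""
    salmon = [x for x, y in train_set if y == "salmon"]
    bass = [x for x, y in train_set if y == "sea bass"]
    return (sum(salmon) / len(salmon) + sum(bass) / len(bass)) / 2


def predict(threshold, lightness):
    """Assumed decision rule: 'salmon' below the threshold, 'sea bass' otherwise."""
    return "salmon" if lightness < threshold else "sea bass"


def accuracy(threshold, dataset):
    """Fraction of instances in dataset that are classified correctly."""
    hits = sum(predict(threshold, x) == y for x, y in dataset)
    return hits / len(dataset)


rng = random.Random(0)  # fixed seed so the sketch is repeatable
data = make_toy_data(100, rng)
rng.shuffle(data)  # shuffle before splitting so both parts mix the classes

# 1. Split the data into two parts (80% training, 20% test).
split = int(0.8 * len(data))
train_set, test_set = data[:split], data[split:]

# 2. Train the model on the training set only.
threshold = train(train_set)

# 3. Predict and evaluate on the test set, which was unseen during training.
print("learned threshold:", threshold)
print("test accuracy:", accuracy(threshold, test_set))
```

Reporting accuracy on the held-out test set, and not on the training set, is what makes this a simulation of performance on previously unseen examples.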
Topics of the course
• Classification
• Regression
• Clustering
• Recommendation systems
• Learning to rank
• Structure prediction
• Reinforcement learning
https://www.kaggle.com/competitions http://www195.pair.com/mik3hall/weka_kaggle.html