Pattern Recognition & Machine Learning Debrup Chakraborty debrup@delta.cs.cinvestav.mx
The Initials
Time: Wednesday 16 hrs to 18 hrs, Friday 16 hrs to 18 hrs
Course website: http://delta.cs.cinvestav.mx/~debrup/Machine_Learning.html
Books:
• Pattern Classification: Duda, Hart and Stork
• Machine Learning: Mitchell
• Neural Networks: Haykin
The Initials (contd.)
Grading policy:
• 4 homeworks (20%)
• 2 exams (30%)
• 1 project/term paper (50%)
The lectures will be in English; I am really sorry about that.
We are good at recognizing patterns
• We can recognize faces with ease
• We can understand spoken words
• We can read handwriting
• ... and many more
It would be nice to make machines perform these tasks. Informally, pattern recognition deals with techniques and methods to make machines recognize patterns.
Satellite image of Kolkata in 4 channels: red, green, blue and infrared.
Which parts are land and which are water? Where is the airport?
Protein Fold Prediction
• Proteins are sequences of amino acid molecules. There are 20 distinct types of amino acids, and their sequences can form a great many protein molecules.
• An amino acid chain must fold in a certain manner, which helps it in its specified activity.
• The activity and function of a protein depend on the way it folds. Thus the fold information is necessary for determining the function of a protein.
• Problem: Given an amino acid sequence, predict the fold it will undergo.
Text Categorization
Given a large corpus of text documents (say, news items), determine the categories of the documents.
Similar problems:
• Classify music in a music corpus
• Classify images in an image corpus
• Retrieve documents/music/images from a database which are similar to x
Other Problems
• Predict whether a patient hospitalized due to a heart attack will have a second heart attack. The prediction is to be based on demographic, diet and clinical measurements of the patient.
• Predict the price of a stock 6 months from now on the basis of company performance measures and economic data.
• Identify the numbers in a handwritten zip code (CP) from digitized images.
• Estimate the amount of glucose in the blood of a diabetic patient from the infrared absorption spectrum of the blood.
• Estimate the risk factors for prostate cancer based on clinical and demographic variables.
Pattern Recognition (What do the experts say?)
Duda and Hart, 1973: “A field concerned with machine recognition of meaningful regularities”.
Bezdek, 1981: “Pattern recognition is the search for structure in data”.
Types of Data
• Object Data
• Relational Data
[Figure: an object described by the features Wt, Ht, Co and legs, with the example values 20, 4, b and 4]
Object Data – Numeric Features
The characteristics of an object are encoded in a vector called the feature vector. Each component of the vector represents some attribute of the object. These components are called features.
Examples:
• Iris data: a data set in R^4 representing iris flowers. The features are the sepal length, sepal width, petal length and petal width of 150 iris flowers of 3 different types.
• Multichannel satellite image: images captured by different sensors, which capture the frequency information of the electromagnetic radiation from the Earth's surface.
Relational Data
A relational matrix whose values are assigned by humans or computed from features.
[Table: relational matrix with rows and columns labelled Russian, German, Chinese and Japanese; the entries are not preserved]
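When the relation is computed from features, one simple choice is the pairwise Euclidean distance between feature vectors. The sketch below assumes that choice (the slides do not say which relation is used), and the three feature vectors and the helper names `euclidean` and `relational_matrix` are illustrative only.

```python
import math

# Hypothetical object data: each row is a feature vector
# (sepal length, sepal width, petal length, petal width), in cm.
# The values are illustrative and are not taken from the slides.
objects = [
    [5.1, 3.5, 1.4, 0.2],
    [7.0, 3.2, 4.7, 1.4],
    [6.3, 3.3, 6.0, 2.5],
]


def euclidean(u, v):
    """Euclidean distance between two feature vectors of equal length."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def relational_matrix(data):
    """Return the n x n matrix of pairwise distances between n objects."""
    return [[euclidean(u, v) for v in data] for u in data]


# R[i][j] is the dissimilarity between object i and object j.
R = relational_matrix(objects)
for row in R:
    print(" ".join(f"{d:6.3f}" for d in row))
```

The resulting matrix is symmetric with zeros on the diagonal. A relation assigned by humans, such as a judged similarity between languages, would instead be entered directly and need not come from any feature vectors.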
Pattern Recognition Systems
• Preprocessing
• Feature extraction
• Feature analysis
• The main recognition task
Recognition Involves Learning
Learning in this context means:
• Extracting knowledge from past experience
• Representing the knowledge efficiently
• Using the knowledge for future predictions/recognitions
Types of Learning
• Supervised learning: learning with a teacher
• Unsupervised learning: learning without a teacher
• Reinforcement learning: learning with a critic
Supervised Learning
[Diagram: Training Set → Learning Algorithm → h; New example → h → Prediction]
Supervised Learning (Contd.)
Supervised learning systems vary according to the type of function learned. Basic types:
• Function approximation systems (outputs are numeric)
• Classifier systems (outputs are classes)
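The two basic types can be sketched with one learning algorithm each. The slides do not name specific algorithms, so the choices here are assumptions made for illustration: a 1-nearest-neighbour rule for the classifier and a least-squares straight line for function approximation. The training sets and the function names are hypothetical.

```python
def learn_nearest_neighbour(training_set):
    """Learning algorithm for a classifier: returns a hypothesis h."""
    def h(x):
        # Predict the class of the closest training point
        # (squared Euclidean distance).
        nearest = min(
            training_set,
            key=lambda pair: sum((a - b) ** 2 for a, b in zip(pair[0], x)),
        )
        return nearest[1]
    return h


def learn_line(training_set):
    """Learning algorithm for function approximation: h(x) = a*x + b."""
    n = len(training_set)
    mean_x = sum(x for x, _ in training_set) / n
    mean_y = sum(y for _, y in training_set) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in training_set)
    sxx = sum((x - mean_x) ** 2 for x, _ in training_set)
    a = sxy / sxx
    b = mean_y - a * mean_x
    return lambda x: a * x + b


# Hypothetical training sets: (feature vector, class) and (x, y) pairs.
class_data = [((1.0, 1.0), "water"), ((1.2, 0.8), "water"),
              ((4.0, 4.2), "land"), ((3.8, 4.0), "land")]
reg_data = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0), (3.0, 7.0)]

h_class = learn_nearest_neighbour(class_data)  # outputs are classes
h_reg = learn_line(reg_data)                   # outputs are numeric

print(h_class((3.9, 3.9)))
print(h_reg(4.0))
```

In both cases the learning algorithm takes a training set and returns h, which is then applied to a new example to obtain a prediction.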
Supervised Learning (Contd.)
It is assumed that x and y bear an (unknown) functional relationship, say $y = S(x)$.
It is assumed that each pair $(x_i, y_i)$ is generated from a fixed (but unknown), time-invariant probability distribution.
The training set: $L = \{(x_i, y_i) : i = 1, \dots, n\}$
Goal: To find h which closely resembles S, given L.
Supervised Learning (Contd.)
How do we measure whether h resembles S?
• Training error: the error on the training data points
• Test error (generalization error): the error on points not in the training set (difficult to measure)
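One common way to make the two errors precise is sketched below. It assumes a loss function $\ell$ (not named in the slides; squared error and 0/1 loss are typical choices), a training set of $n$ pairs $(x_i, y_i)$, and the unknown distribution $P$ from which the pairs are generated. The symbols $\ell$, $n$, $P$, $E_{\text{train}}$ and $E_{\text{gen}}$ are notation introduced here.

```latex
% Training error: average loss of h over the n training pairs.
E_{\text{train}}(h) = \frac{1}{n} \sum_{i=1}^{n} \ell\bigl(h(x_i),\, y_i\bigr)

% Generalization error: expected loss of h over the unknown distribution P.
E_{\text{gen}}(h) = \mathbb{E}_{(x, y) \sim P}\bigl[\, \ell\bigl(h(x),\, y\bigr) \bigr]
```

Because $P$ is unknown, $E_{\text{gen}}$ cannot be computed directly, which is why the test error is difficult to measure. In practice it is usually estimated by the average loss on held-out points that were not used for training.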
Bad Generalization – An example
[Figure: four plots of y against x]
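A minimal sketch of bad generalization, under stated assumptions: the true relationship is taken to be S(x) = x, the training targets carry an alternating error of ±0.5, and the test targets are noise-free. The data, the two learners (a least-squares line and a Lagrange interpolating polynomial) and all function names are hypothetical choices for illustration, not taken from the slides.

```python
def fit_line(points):
    """Least-squares straight line; returns h(x) = a*x + b."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    a = sxy / sxx
    b = mean_y - a * mean_x
    return lambda x: a * x + b


def fit_interpolant(points):
    """Lagrange polynomial that passes through every training point."""
    def h(x):
        total = 0.0
        for j, (xj, yj) in enumerate(points):
            basis = 1.0
            for k, (xk, _) in enumerate(points):
                if k != j:
                    basis *= (x - xk) / (xj - xk)
            total += yj * basis
        return total
    return h


def mean_squared_error(h, points):
    """Average squared difference between h(x) and y over the points."""
    return sum((h(x) - y) ** 2 for x, y in points) / len(points)


# Hypothetical data: assumed true relationship S(x) = x.
train = [(0.0, 0.5), (1.0, 0.5), (2.0, 2.5), (3.0, 2.5), (4.0, 4.5)]
test = [(0.5, 0.5), (1.5, 1.5), (2.5, 2.5), (3.5, 3.5)]

# results[name] = (training error, test error)
results = {}
for name, learner in [("line", fit_line), ("interpolant", fit_interpolant)]:
    h = learner(train)
    results[name] = (mean_squared_error(h, train), mean_squared_error(h, test))
    print(name, results[name])
```

On this data the interpolant fits the training points exactly, yet its error on the test points is larger than that of the straight line. A low training error alone therefore does not show that h resembles S.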