Machine Learning BITS F464 Navneet Goyal Department of Computer Science, BITS-Pilani, Pilani Campus, India
Introduction Let’s look at these incredible things that humans can do: • Identifying a song by just listening to a very small part of it • Identifying a movie by looking at a very short clip • Identifying a person • Identifying a person even when you see them again after many years • Recollecting memories • Identifying a person from a distance • Identifying a person by just listening to his/her voice • Identifying a person by his/her chat/message signature • Our own GPS!
Introduction Ever wondered how we could do all this? • Pattern recognition • Information retrieval Human Brain!! Neurons!! Ever wondered how we can make Machines learn to do all such tasks and that too with the efficiency of Humans?
Machine Learning Humour Source – http://www.kdnuggets.com/2012/12/machine-learning-data-mining-humor.html
Introduction Related Fields • Artificial Intelligence • Statistics • Data Mining
Machine Learning Humour • What is the difference between statistics, machine learning, AI and data mining? • If there are up to 3 variables, it is statistics. • If the problem is NP-complete, it is machine learning. • If the problem is PSPACE-complete, it is AI. • If you don't know what is PSPACE-complete, it is data mining. Source – http://www.kdnuggets.com/2012/12/machine-learning-data-mining-humor.html
What is Machine Learning? • From machines that DO to machines that LEARN • Shift in paradigm! • Machines can be made to learn! • How and for what purpose? • How? By writing algorithms! • Purpose: Mainly to predict and to make decisions!
Types of Learning • Supervised • Unsupervised • Semi-supervised • Reinforcement • Active • Deep
Introduction • Zoologists study learning in animals • Psychologists study learning in humans • In this course, we focus on “Learning in Machines” • Course Objective • Study of approaches and algorithms that can make a machine learn
Introduction • Machine Learning • Subarea of AI that is concerned with algorithms/programs that can make a machine learn • Improve automatically with experience • For example- doctors learning from experience • Faculty learning how to control the class and be effective • We all learn from experience Imagine computers learning from medical records and suggesting treatment (automated diagnosis & prescription)
Machine Learning • A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P, if its performance at tasks in T, as measured by P, improves with experience E.
Interesting Problems • Speech and Handwriting Recognition • Robotics (training moving robots) • Search Engines (context-aware) • Learning to drive an autonomous vehicle • Medical Diagnosis • Detecting credit card fraud • Computational Bioinformatics • Game Playing
What is Machine Learning? • To solve a problem, we need an algorithm! • For example: sorting a list of numbers • Input: list of numbers • Output: sorted list of numbers • For some tasks, like filtering spam mails • Input: an email • Output: Y/N • We do not know how to transform the input into the output • The definition of spam changes with time and from one individual to another • What to DO? Reference: E Alpaydin’s Machine Learning Book, 2010 (MIT Press)
What is Machine Learning? • Collect lots of emails (both genuine and spam) • “Learn” what constitutes a spam mail (or for that matter a genuine mail) • Learn from DATA!! • For many similar problems, we may not have algorithm(s), but we do have example data (called Training Data) • Ability to process training data has been made possible by advances in computer technology
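The idea of learning what constitutes spam from example emails can be sketched as follows. The slides do not name a method; this sketch assumes a simple word-count (naive Bayes style) classifier with add-one smoothing, and the four training emails and the names `train` and `classify` are made up for illustration.

```python
import math
from collections import Counter

def train(emails):
    """Count words per class from (text, label) pairs; label is 'spam' or 'genuine'."""
    word_counts = {"spam": Counter(), "genuine": Counter()}
    class_counts = Counter()
    for text, label in emails:
        class_counts[label] += 1
        word_counts[label].update(text.lower().split())
    return word_counts, class_counts

def classify(text, model):
    """Return the label with the higher log-probability score."""
    word_counts, class_counts = model
    vocab = set(word_counts["spam"]) | set(word_counts["genuine"])
    best_label, best_score = None, -math.inf
    for label in ("spam", "genuine"):
        total_words = sum(word_counts[label].values())
        # log prior of the class
        score = math.log(class_counts[label] / sum(class_counts.values()))
        # add log likelihood of each word, with add-one smoothing
        for word in text.lower().split():
            score += math.log((word_counts[label][word] + 1) / (total_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Hypothetical training data: the "experience" the program learns from.
training_data = [
    ("win money now", "spam"),
    ("cheap money offer", "spam"),
    ("meeting agenda for monday", "genuine"),
    ("lecture notes for monday", "genuine"),
]
model = train(training_data)
print(classify("win cheap money", model))
print(classify("monday lecture agenda", model))
```

No rule for spam is written by hand here: changing the training emails changes the classifier's behaviour, which is the point of learning from data.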
What is Machine Learning? • Face Recognition!!! • We humans are so good at it!!! • Ever thought how we do it, despite • Different light conditions, pose, hair style, make-up, glasses, ageing etc. • Since we do not know how we do it, we cannot write a program to do it • ML is about making inference from a sample
Machine Learning Applications • What kind of data would I require for learning? • Credit card transactions • Face Recognition • Spam filter • Handwriting/Character Recognition
Handwriting Recognition • Task T • recognizing and classifying handwritten words within images • Performance measure P • percent of words correctly classified • Training experience E • a database of handwritten words with given classifications
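The performance measure P above (percent of words correctly classified) can be written down directly. This is a minimal sketch; the function name `percent_correct` and the example word lists are hypothetical.

```python
def percent_correct(predicted, actual):
    """Performance measure P: percentage of items whose predicted label matches the true label."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if not actual:
        raise ValueError("need at least one example")
    correct = sum(p == a for p, a in zip(predicted, actual))
    return 100.0 * correct / len(actual)

# Hypothetical classifier output versus the given classifications.
predicted = ["cat", "dog", "sun", "map"]
actual = ["cat", "dog", "son", "map"]
score = percent_correct(predicted, actual)
print(score)
```

A program "learns" in the sense of the definition if this score, measured on task T, goes up as the training experience E grows.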
Pattern Recognition Example • Handwritten Digit Recognition Reference: Christopher M Bishop: Pattern Recognition & Machine Learning, 2006 Springer
Pattern Recognition Example • Handwritten Digit Recognition • Non-trivial problem due to variability in handwriting • What about using handcrafted rules or heuristics for distinguishing the digits based on shapes of strokes? • Not such a good idea!! • Proliferation of rules • Exceptions to rules and so on… • Adopt an ML approach!! Reference: Christopher M Bishop: Pattern Recognition & Machine Learning, 2006 Springer
Pattern Recognition Example • Handwritten Digit Recognition • Each digit is represented by a 28x28 pixel image • Each image can be represented by a vector of 784 real numbers • Objective: to have an algorithm that takes such a vector as input and identifies the digit it represents • Take images of a large number of digits (N) as the training set • Use the training set to tune the parameters of an adaptive model • Each digit in the training set is labelled with a target vector t, which represents the identity of the corresponding digit • The result of running an ML algorithm can be expressed as a function y(x), which takes a new digit image x as input and outputs a vector y, encoded in the same way as t • The form of y(x) is determined during the learning (training) phase Reference: Christopher M Bishop: Pattern Recognition & Machine Learning, 2006 Springer
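The representation described above can be sketched in a few lines. The slide does not say how the target vector t is encoded; this sketch assumes a one-hot encoding over the ten digit classes. The helper names `flatten`, `one_hot` and `decode` are hypothetical, and the learned function y(x) itself is not shown because its form comes out of training.

```python
def flatten(image):
    """Turn a 28x28 grid of pixel values into a single vector of 784 real numbers."""
    return [float(pixel) for row in image for pixel in row]

def one_hot(digit, num_classes=10):
    """Assumed encoding of the target vector t: 1.0 at the digit's index, 0.0 elsewhere."""
    t = [0.0] * num_classes
    t[digit] = 1.0
    return t

def decode(y):
    """Read a digit back from an output vector y encoded the same way as t."""
    return max(range(len(y)), key=lambda k: y[k])

# A blank 28x28 image stands in for a real scanned digit.
image = [[0] * 28 for _ in range(28)]
x = flatten(image)
t = one_hot(3)
print(len(x), t, decode(t))
```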
Pattern Recognition Example • Generalization • The ability to categorize correctly new examples that differ from those in training • Generalization is a central goal in pattern recognition • Preprocessing • Input variables are preprocessed to transform them into some new space of variables where it is hoped that the problem will be easier to solve (see fig.) • Images of digits are translated and scaled so that each digit is contained within a box of fixed size. This reduces variability. • Preprocessing stage is referred to as feature extraction • New test data must be preprocessed using the same steps as training data Reference: Christopher M Bishop: Pattern Recognition & Machine Learning, 2006 Springer
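The translate-and-scale step can be sketched as below. This is one possible way to do it, not necessarily the method behind the slide: it crops each image to the bounding box of its ink and resamples it to a fixed box by nearest-neighbour lookup, without preserving aspect ratio. The names `make_image`, `bounding_box` and `normalize` and the 4x4 output size are assumptions.

```python
def make_image(pixels, size=10):
    """Build a size x size binary image with 1s at the given (row, col) positions."""
    image = [[0] * size for _ in range(size)]
    for r, c in pixels:
        image[r][c] = 1
    return image

def bounding_box(image):
    """Return (top, bottom, left, right) of the non-zero pixels, inclusive."""
    rows = [r for r, row in enumerate(image) if any(row)]
    cols = [c for c in range(len(image[0])) if any(row[c] for row in image)]
    if not rows:
        raise ValueError("image contains no ink")
    return rows[0], rows[-1], cols[0], cols[-1]

def normalize(image, size=4):
    """Crop to the bounding box, then stretch to a size x size box (nearest neighbour)."""
    top, bottom, left, right = bounding_box(image)
    height = bottom - top + 1
    width = right - left + 1
    return [
        [image[top + (r * height) // size][left + (c * width) // size] for c in range(size)]
        for r in range(size)
    ]

# The same "L" shape drawn small near one corner and large near the other.
small = make_image([(1, 1), (2, 1), (2, 2)])
big = make_image([(5, 5), (5, 6), (6, 5), (6, 6),
                  (7, 5), (7, 6), (7, 7), (7, 8),
                  (8, 5), (8, 6), (8, 7), (8, 8)])
print(normalize(small) == normalize(big))
```

The same `normalize` function has to be applied to training and test images alike, otherwise the model sees test inputs in a different space from the one it was trained on.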
Linear Classifiers in High-Dimensional Spaces [Figure: data plotted on the original axes Var1 and Var2, and on the axes Constructed Feature 1 and Constructed Feature 2] • Find a function of x that maps the data to a different space
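A small sketch of why such a mapping helps, using made-up data: points inside and outside a circle cannot be split by a straight line in the original variables, but after squaring each coordinate a single linear rule separates them. The feature map, the weights and the threshold are hand-picked assumptions for illustration, not learned and not taken from the slide.

```python
def feature_map(point):
    """Map (var1, var2) to two constructed features: the squares of each variable."""
    x1, x2 = point
    return (x1 * x1, x2 * x2)

def linear_classifier(z, weights=(1.0, 1.0), threshold=1.0):
    """A linear rule in the constructed space: w1*z1 + w2*z2 compared with a threshold."""
    score = weights[0] * z[0] + weights[1] * z[1]
    return "inner" if score < threshold else "outer"

# Hypothetical data: "inner" points lie inside the unit circle, "outer" points outside it.
inner = [(0.1, 0.2), (-0.3, 0.4), (0.5, -0.5)]
outer = [(1.5, 0.0), (-1.2, 1.1), (0.0, -2.0)]

labels = [linear_classifier(feature_map(p)) for p in inner + outer]
print(labels)
```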
A word about Preprocessing!! • Preprocessing • Can also speed up computations • For example: face detection in a high-resolution video stream • Find useful features that are fast to compute and yet preserve the discriminatory information needed to distinguish faces from non-faces • The average image intensity over a rectangular sub-region can be evaluated extremely efficiently, and a set of such features is very effective for fast face detection • Because such features are fewer in number than the pixels, this is a form of Dimensionality Reduction • Care must be taken so that important information is not discarded during preprocessing Reference: Christopher M Bishop: Pattern Recognition & Machine Learning, 2006 Springer
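The slide does not say how the rectangular averages are computed efficiently. One common way, assumed here, is a summed-area table (integral image): after a single pass over the image, the sum over any rectangle takes four lookups regardless of its size. The function names and the 3x3 example image are hypothetical.

```python
def integral_image(image):
    """Build a (h+1) x (w+1) table where table[r][c] is the sum of image[0:r][0:c]."""
    h, w = len(image), len(image[0])
    table = [[0] * (w + 1) for _ in range(h + 1)]
    for r in range(h):
        for c in range(w):
            table[r + 1][c + 1] = (image[r][c] + table[r][c + 1]
                                   + table[r + 1][c] - table[r][c])
    return table

def region_mean(table, top, left, bottom, right):
    """Average intensity over rows top..bottom-1 and columns left..right-1, in four lookups."""
    total = (table[bottom][right] - table[top][right]
             - table[bottom][left] + table[top][left])
    return total / ((bottom - top) * (right - left))

image = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]
table = integral_image(image)
mean = region_mean(table, 1, 1, 3, 3)  # covers the pixels 5, 6, 8, 9
print(mean)
```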
Curse of Dimensionality!! • Poses serious challenges! • An important factor influencing the design of pattern recognition techniques • Example data: mixtures of oil, water & gas (homogeneous, annular & laminar) • Each data point is a point in a 12-dimensional space • 100 points plotted along only two dimensions, x6 & x7 • New point marked x: predict its class? Reference: Christopher M Bishop: Pattern Recognition & Machine Learning, 2006 Springer
Curse of Dimensionality!! • Unlikely that it belongs to the blue class! • Surrounded by a lot of red points • Also, many green points nearby • Intuition: the identity of x should be determined strongly by nearby points and less strongly by more distant points • How can we turn this intuition into a learning algorithm? Reference: Christopher M Bishop: Pattern Recognition & Machine Learning, 2006 Springer
Curse of Dimensionality!! • Make grid lines! • Use majority voting • Problems?? Reference: Christopher M Bishop: Pattern Recognition & Machine Learning, 2006 Springer
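The grid-and-vote idea can be sketched as follows: divide the input space into equal cells, remember the class counts in each cell, and label a new point with the majority class of its cell. The four training points, the cell size of 0.5 and the function names are made up for illustration.

```python
from collections import Counter, defaultdict

def cell_of(point, cell_size):
    """Index of the grid cell containing the point, one integer per dimension."""
    return tuple(int(v // cell_size) for v in point)

def fit(points, labels, cell_size):
    """Collect class votes for every cell that contains training points."""
    votes = defaultdict(Counter)
    for point, label in zip(points, labels):
        votes[cell_of(point, cell_size)][label] += 1
    return votes

def predict(votes, point, cell_size):
    """Majority class of the point's cell, or None if the cell has no training points."""
    cell = cell_of(point, cell_size)
    if cell not in votes:
        return None
    return votes[cell].most_common(1)[0][0]

# Hypothetical 2-D training set.
points = [(0.1, 0.2), (0.3, 0.4), (0.2, 0.1), (0.7, 0.8)]
labels = ["red", "red", "green", "blue"]
votes = fit(points, labels, cell_size=0.5)

print(predict(votes, (0.25, 0.25), 0.5))
print(predict(votes, (0.9, 0.6), 0.5))
print(predict(votes, (0.9, 0.1), 0.5))
```

The last query falls in a cell with no training points, so no prediction is possible there. That is one of the problems the slide asks about.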
Curse of Dimensionality • The number of cells grows exponentially with the dimensionality D • Need an exponentially large number of training data points • Not a good approach for more than a few dimensions!
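A quick calculation makes the growth concrete. With M divisions along each of D dimensions there are M^D cells; the choice of 10 divisions per dimension is an assumption for illustration, and D = 12 matches the 12-dimensional data set mentioned earlier.

```python
def num_cells(bins_per_dimension, dimensions):
    """Number of grid cells when every dimension is split into the same number of bins."""
    return bins_per_dimension ** dimensions

for d in (1, 2, 3, 12):
    print(d, num_cells(10, d))
```

To have even one training point per cell in 12 dimensions, this grid would need on the order of 10^12 points.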
Curse of Dimensionality • Solutions • Principal Component Analysis • Singular Value Decomposition Brush up your Linear Algebra…
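As a rough illustration of what Principal Component Analysis computes, the sketch below finds the first principal component (the direction of greatest variance) of a tiny data set. It uses power iteration on the covariance matrix, which is a technique swapped in here to keep the example dependency-free; the slide names only PCA and SVD, and in practice a linear algebra library would be used. The data and function names are hypothetical.

```python
import math

def covariance_matrix(data):
    """Sample covariance matrix of rows of equal length."""
    n, d = len(data), len(data[0])
    means = [sum(row[j] for row in data) / n for j in range(d)]
    centered = [[row[j] - means[j] for j in range(d)] for row in data]
    return [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(d)]
            for i in range(d)]

def first_principal_component(data, iterations=100):
    """Approximate the top eigenvector of the covariance matrix by power iteration."""
    cov = covariance_matrix(data)
    d = len(cov)
    # Arbitrary start vector; assumed not to be orthogonal to the top eigenvector.
    v = [1.0] + [0.5] * (d - 1)
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0:
            raise ValueError("data has no variance")
        v = [x / norm for x in w]
    return v

# Hypothetical 2-D data lying on the line var2 = var1.
data = [(1.0, 1.0), (2.0, 2.0), (3.0, 3.0), (4.0, 4.0)]
component = first_principal_component(data)
print(component)
```

Projecting each point onto this single direction would describe the data with one number instead of two, which is the dimensionality reduction the slide points to.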