Getting a Machine to Fly Learn: Extending Our Reach Beyond Our Grasp • Daniel L. Silver, Acadia University, Wolfville, NS, Canada • danny.silver@acadiau.ca
Key Take Away • A major challenge in artificial intelligence has been how to develop common background knowledge • Machine learning systems are beginning to make headway in this area • They are taking the first steps toward capturing knowledge that can be used for future learning, reasoning, etc.
Outline • Learning – What is it? • History of Machine Learning • Framework and Methods • ML Application Areas • Recent and Future Advances • Challenges and Open Questions
What is Learning? • Animals and humans: 1. Learn using new experiences and prior knowledge 2. Retain new knowledge from what is learned 3. Repeat, starting at 1 • Essential to our survival and thriving
What is Learning? (A little more formally) • Inductive inference/modeling • Developing a general model/hypothesis from examples • Objective is to achieve good generalization for making estimates/predictions • It's like … fitting a curve to data • Also considered modeling the data • Statistical modeling
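A minimal curve-fitting sketch of this idea in Python (assuming NumPy is available; the sine target, noise level, and polynomial degree are illustrative choices, not from the talk):

    import numpy as np

    # Noisy samples of an unknown function: learning as fitting a curve to data.
    rng = np.random.default_rng(0)
    x = np.linspace(0, 1, 20)
    y = np.sin(2 * np.pi * x) + rng.normal(scale=0.1, size=x.shape)

    # Induce a general model/hypothesis (a degree-3 polynomial) from the examples.
    model = np.poly1d(np.polyfit(x, y, deg=3))

    # Generalization: estimate the function at a point not seen during training.
    print(model(0.25))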
What is Learning? • Generalization through learning is not possible without an inductive bias = a heuristic beyond the data
Inductive Bias • Human learners use inductive bias [Figure: street map with Ash St, Fir St, Elm St, Pine St, and Oak St crossing Second and Third Streets] • Inductive bias depends upon: • Having prior knowledge • Selection of the most related knowledge
What is Learning? • Requires an inductive bias = a heuristic beyond the data • Do you know any inductive biases? • How do you choose which to use?
Inductive Biases (Tom Mitchell, 1980) • Universal heuristics, e.g. Occam's Razor • Knowledge of intended use, e.g. medical diagnosis • Knowledge of the source, e.g. a teacher • Knowledge of the task domain • Analogy with previously learned tasks
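One concrete way such a bias shows up in practice is regularization, which encodes an Occam's Razor-style preference for simpler hypotheses. A hedged sketch (assuming scikit-learn and NumPy; the synthetic data and penalty strength are illustrative):

    import numpy as np
    from sklearn.linear_model import LinearRegression, Ridge

    # Few examples, many features; only feature 0 actually matters.
    rng = np.random.default_rng(1)
    X = rng.normal(size=(30, 10))
    y = X[:, 0] + rng.normal(scale=0.1, size=30)

    # Plain least squares can chase noise in the spare features; the ridge
    # penalty biases the search toward "simpler" small-weight hypotheses.
    plain = LinearRegression().fit(X, y)
    ridge = Ridge(alpha=1.0).fit(X, y)

    print(abs(plain.coef_[1:]).sum())  # weight on irrelevant features: larger
    print(abs(ridge.coef_[1:]).sum())  # shrunk toward zero by the bias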
What is Machine Learning? • The study of how to build computer programs that: • Improve with experience • Generalize from examples • Self-program, to some extent
History of Machine Learning
• 1890 Origins: William James, neuronal learning
• 1940–1950s Promise: Donald Hebb, mathematical models, the Perceptron
• 1960s Hiatus: Minsky & Papert paper, research wanes, perceived limited value
• 1970s Exploration: genetic algorithms, version spaces, decision trees
• 1980s Renaissance: PDP Group, multi-layer perceptrons, new applications
• 1990–2000s AI Success: data mining, web mining, user models, new algorithms, Google
• Present Advances: Big Data, web analytics, parallel algorithms, cloud computing, deep learning
Of Interest to Several Disciplines • Computer Science – theory of computation, new algorithms • Math – advances in statistics, information theory • Psychology – as models for human learning, knowledge acquisition and retention • Biology – how a nervous system learns • Physics – analogy to physical systems • Philosophy – epistemology, knowledge acquisition • Application Domains – new knowledge extracted from data, solutions to unsolved problems
Classes of ML Methods • Supervised – Develops models that predict the value of one variable from one or more others: • Artificial Neural Networks, Inductive Decision Trees, Genetic Algorithms, k-Nearest Neighbour, Bayesian Networks, Support Vector Machines • Unsupervised – Generates groups or clusters of data that share similar features (see the sketch below): • k-Means, Self-organizing Feature Maps • Reinforcement Learning – Develops models from the results of a final outcome, e.g. win/loss of a game: • TD-learning, Q-learning (related to Markov Decision Processes) • Hybrids – e.g. semi-supervised learning
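As a small illustration of the unsupervised case, here is a k-means sketch (assuming scikit-learn and NumPy; the two synthetic blobs are illustrative):

    import numpy as np
    from sklearn.cluster import KMeans

    # Unsupervised learning: no target variable, only structure in the inputs.
    rng = np.random.default_rng(2)
    data = np.vstack([rng.normal(loc=0.0, size=(50, 2)),
                      rng.normal(loc=5.0, size=(50, 2))])

    km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(data)
    print(km.cluster_centers_)  # centres of the two discovered groups
    print(km.labels_[:10])      # cluster assignment for each example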
Focus: Supervised Learning • Function approximation (curve fitting) [plot: f(x) vs x] • Classification (concept learning, pattern recognition) [plot: classes A and B in feature space x1, x2]
Supervised Machine Learning Framework [Diagram: training examples (x, f(x)) drawn from instance space X feed an inductive learning system, which produces a model/classifier h; the model is then evaluated on testing examples, with the goal h(x) ~ f(x)]
Supervised Machine Learning • Problem: We wish to learn to classify two people (A and B) based on their keyboard typing • Approach: • Acquire lots of typing examples from each person • Extract relevant features – representation! • M = number of mistakes • T = typing time • Transform the feature representation as needed • Use an algorithm to fit a model to the data – search! • Test the model on an independent set of typing examples from each person (see the sketch below)
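A runnable sketch of this workflow (assuming scikit-learn and NumPy; the typing data here is synthetic stand-in data, with hypothetical means for each person's mistakes M and typing time T):

    import numpy as np
    from sklearn.model_selection import train_test_split
    from sklearn.linear_model import LogisticRegression

    # Hypothetical typing examples: columns are M (mistakes) and T (typing time).
    rng = np.random.default_rng(3)
    person_a = rng.normal(loc=[2.0, 30.0], scale=[1.0, 3.0], size=(100, 2))
    person_b = rng.normal(loc=[6.0, 45.0], scale=[1.0, 3.0], size=(100, 2))
    X = np.vstack([person_a, person_b])
    y = np.array([0] * 100 + [1] * 100)  # 0 = person A, 1 = person B

    # Hold out an independent test set, fit a model, then check generalization.
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)
    clf = LogisticRegression().fit(X_tr, y_tr)
    print(clf.score(X_te, y_te))  # accuracy on unseen typing examples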
Classification: Logistic Regression • Model: Y = f(M, T), with output Y between 0 and 1 [Scatter plot: Typing Speed vs Mistakes, examples labelled A and B separated by a logistic decision boundary]
Classification: Artificial Neural Network [Scatter plot: Typing Speed vs Mistakes, examples labelled A and B; a network with inputs M and T learns a non-linear decision boundary]
Classification: Inductive Decision Tree [Tree diagram: root node tests M, internal nodes test T, leaves assign class A or B; the scatter plot of Typing Speed vs Mistakes is partitioned into axis-aligned regions]
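The same (hypothetical) typing data can be fit with any of the three model families above; only the hypothesis space, and hence the inductive bias, changes. A sketch assuming scikit-learn:

    import numpy as np
    from sklearn.model_selection import train_test_split
    from sklearn.neural_network import MLPClassifier
    from sklearn.tree import DecisionTreeClassifier

    # Same synthetic M (mistakes) and T (typing time) features as the earlier sketch.
    rng = np.random.default_rng(3)
    X = np.vstack([rng.normal([2.0, 30.0], [1.0, 3.0], size=(100, 2)),
                   rng.normal([6.0, 45.0], [1.0, 3.0], size=(100, 2))])
    y = np.array([0] * 100 + [1] * 100)
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)

    # Fit a small neural network and a depth-limited decision tree in turn.
    for model in (MLPClassifier(hidden_layer_sizes=(8,), max_iter=2000, random_state=0),
                  DecisionTreeClassifier(max_depth=3, random_state=0)):
        model.fit(X_tr, y_tr)
        print(type(model).__name__, model.score(X_te, y_te))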
Application Areas Data Mining: • Science and medicine: prediction, diagnosis, pattern recognition, forecasting • Manufacturing: process modeling and analysis • Marketing and Sales: targeted marketing, segmentation • Finance: portfolio trading, investment support • Banking & Insurance: credit and policy approval • Security: bomb, iceberg, fraud detection • Engineering: dynamic load shedding, pattern recognition
Application Areas • Web mining – information filtering and classification, social media predictive modeling • User Modeling – adaptive user interfaces, speech/gesture recognition • Intelligent Personal Agents – email spam filtering, fashion consultant • Robotics – image recognition, adaptive control, autonomous vehicles (space, under-sea) • Military/Defense – target acquisition and classification, tactical recommendations, cyber attack detection
Recent and Future Advances • Robotics • Neuroprosthetics • Lifelong Machine Learning • Deep Learning Architectures • ML and Growing Computing Power • NELL – Never-Ending Language Learner • Cloud-based Machine Learning
OASIS: Onboard Autonomous Science Investigation System • In development since the early 2000s • Goal: to evaluate, and autonomously act upon, science data gathered by spacecraft • Including planetary landers and rovers
DARPA Grand Challenge 2005 • Stanford's Sebastian Thrun holds a $2M check on top of Stanley, a robotic Volkswagen Touareg R5 • 212 km autonomous vehicle race, Nevada • Stanley completed the course in 6h 54m • Four other teams also finished Source: Associated Press – Saturday, Oct 8, 2005
Autonomous Underwater Vehicles Arctic Explorer • AUV designed and built by International Submarine Engineering Ltd. (ISE) of Port Coquitlam, B.C. • Used to map the sea floor underneath the Arctic ice shelf in support of Canadian land claims under the UN Convention on the Law of the Sea • Various military uses, e.g. mine detection and elimination (Source: ISE, Mae Seto)
Literally Extending Our Reach – Neuroprosthetic Decoders • December 2012 • Andrew Schwartz, Univ. of Pittsburgh • Jan Scheuermann, quadriplegic • Brain-machine interface, 96 electrodes • 13 weeks of training • High-performance neuroprosthetic control by an individual with tetraplegia, The Lancet, v381, pp. 557-564, Feb 2013
Lifelong Machine Learning (LML) • Considers methods of retaining and using learned knowledge to improve the effectiveness and efficiency of future learning • We investigate systems that must learn: • From impoverished training sets • For diverse domains of tasks • Where the same tasks are practiced repeatedly • Applications: • Intelligent agents, robotics, user modeling, data mining
Supervised Machine Learning Framework (revisited) [Same diagram as before; note: after the model h is developed and used, it is thrown away]
Lifelong Machine Learning Framework [Diagram: the supervised framework is extended with a long-term memory of domain knowledge; retention & consolidation stores learned models, and knowledge transfer selects an inductive bias for the short-term inductive learning system; training examples (x, f(x)) from instance space X yield a model/classifier h, tested so that h(x) ~ f(x)]
Lifelong Machine Learning: One Implementation [Diagram: a Multiple Task Learning (MTL) network learns related tasks f1(x), f2(x), ..., fk(x) over inputs x1 ... xn with a shared internal representation; a consolidated MTL network serves as the long-term memory of domain knowledge for retention & consolidation, while knowledge transfer and inductive bias selection guide new short-term learning; h(x) ~ f(x)]
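The MTL component can be pictured as a network with one shared hidden layer and one output head per task. A minimal sketch of that idea (assuming PyTorch; this illustrates the shared-representation principle only, not the author's csMTL implementation):

    import torch
    import torch.nn as nn

    # Multiple task learning: a shared hidden layer holds common domain
    # knowledge; each task k gets its own output head f_k(x).
    class MTLNet(nn.Module):
        def __init__(self, n_inputs, n_hidden, n_tasks):
            super().__init__()
            self.shared = nn.Sequential(nn.Linear(n_inputs, n_hidden), nn.Tanh())
            self.heads = nn.ModuleList(
                [nn.Linear(n_hidden, 1) for _ in range(n_tasks)])

        def forward(self, x):
            h = self.shared(x)  # transferable internal representation
            return torch.cat([head(h) for head in self.heads], dim=1)

    net = MTLNet(n_inputs=10, n_hidden=16, n_tasks=3)
    x = torch.randn(4, 10)
    print(net(x).shape)  # (4, 3): one prediction per task per example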
An Environmental Example: stream flow rate prediction, where x = weather data and f(x) = flow rate [Lisa Gaudette, 2006]
Lifelong Machine Learning with csMTL Example: • Learning to learn how to transform images • Requires methods for efficiently and effectively: • Retaining transform model knowledge • Using this knowledge to learn new transforms (Silver and Tu, 2010)
Deep Learning Architectures • Hinton and Bengio (2007+) • Learning deep architectures of neural networks • Layered networks of unsupervised auto-encoders efficiently develop hierarchies of features that capture regularities in their respective inputs • Used to develop models for families of tasks
Deep Learning Architectures • Consider the problem of trying to classify hand-written digits [image: sample hand-written digits]
Deep Learning Architectures [Diagram: network for images of digits 0-9 (28 x 28 pixels): 500 neurons learn low-level features, 500 more learn higher-level features, and 2000 top-level artificial neurons connect to the 10 label units] • Neural network trained on 40,000 examples • Learns to label/recognize images and to generate images from labels • Probabilistic in nature • Demo
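A single unsupervised auto-encoder layer of the kind stacked in such architectures can be sketched as follows (assuming PyTorch; the layer sizes echo the 28 x 28 digit images and 500-unit feature layers above, but this training fragment is an illustration, not the original model):

    import torch
    import torch.nn as nn

    # An auto-encoder learns features by reconstructing its own input;
    # no labels are used. Deep architectures stack several such layers.
    class AutoEncoder(nn.Module):
        def __init__(self, n_visible=784, n_hidden=500):  # 28 x 28 pixels
            super().__init__()
            self.encode = nn.Sequential(nn.Linear(n_visible, n_hidden), nn.Sigmoid())
            self.decode = nn.Sequential(nn.Linear(n_hidden, n_visible), nn.Sigmoid())

        def forward(self, x):
            return self.decode(self.encode(x))

    ae = AutoEncoder()
    opt = torch.optim.Adam(ae.parameters(), lr=1e-3)
    x = torch.rand(32, 784)  # stand-in for a batch of digit images
    opt.zero_grad()
    loss = nn.functional.mse_loss(ae(x), x)  # reconstruction error, no labels
    loss.backward()
    opt.step()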
ML and Computing Power • Moore's Law • Growth in computing power is expected to continue, and even accelerate, through the use of multiple processing cores
ML and Computing Power • IBM's Watson – Jeopardy!, Feb 2011: • Massively parallel data processing system capable of competing with humans in real-time question-answering problems • 90 IBM Power-7 servers • Each with four 8-core processors • 15 TB (220M text pages) of RAM • Tasks divided into thousands of stand-alone jobs distributed across the cluster, which delivers about 80 teraflops (1 teraflop = 1 trillion operations/sec) • Uses a variety of AI approaches including machine learning
ML and Computing Power Andrew Ng's work on Deep Learning Networks (ICML 2012) • Problem: learn to recognize human faces, cats, etc. from unlabeled data • Dataset of 10 million images; each image has 200 x 200 pixels • 9-layer locally connected neural network (1B connections) • Parallel algorithm; 1,000 machines (16,000 cores) for three days Building High-level Features Using Large Scale Unsupervised Learning, Quoc V. Le, Marc'Aurelio Ranzato, Rajat Monga, Matthieu Devin, Kai Chen, Greg S. Corrado, Jeffrey Dean, and Andrew Y. Ng, ICML 2012: 29th International Conference on Machine Learning, Edinburgh, Scotland, June 2012
ML and Computing Power Results: • A face detector that is 81.7% accurate • Robust to translation, scaling, and rotation Further results: • 15.8% accuracy in recognizing 20,000 object categories from ImageNet • 70% relative improvement over the previous state-of-the-art.
Never-Ending Language Learner • Carlson et al. (2010) • Each day: extracts information from the web to populate a growing knowledge base of language semantics • Learns to perform this task better than on the previous day • Uses an MTL approach in which a large number of different semantic functions are trained together
Cloud-Based ML - Google https://developers.google.com/prediction/
Thank You! • danny.silver@acadiau.ca • http://plato.acadiau.ca/courses/comp/dsilver/ • http://ML3.acadiau.ca