Introduction to Machine Learning Algorithms
What is Artificial Intelligence (AI)? • Design and study of computer programs that behave intelligently. • Designing computer programs to make computers smarter. • Study of how to make computers do things at which, at the moment, people are better.
Research Areas and Approaches (Artificial Intelligence) • Research: Learning Algorithms, Inference Mechanisms, Knowledge Representation, Intelligent System Architecture • Application: Intelligent Agents, Information Retrieval, Electronic Commerce, Data Mining, Bioinformatics, Natural Language Proc., Expert Systems • Paradigm: Rationalism (Logical), Empiricism (Statistical), Connectionism (Neural), Evolutionary (Genetic), Biological (Molecular)
Context • Machine Learning in relation to: Computer Science (AI), Cognitive Science, Statistics, Information Theory
Why Machine Learning? • Recent progress in algorithms and theory • Growing flood of online data • Computational power is available • Budding industry
Three niches for machine learning • Data mining: using historical data to improve decisions (medical records --> medical knowledge) • Software applications we can’t program by hand (autonomous driving, speech recognition) • Self-customizing programs (newsreader that learns user interests)
Learning: Definition • Learning is the improvement of performance in some environment through the acquisition of knowledge resulting from experience in that environment. • Equivalently: the improvement of behavior on some performance task through the acquisition of knowledge based on partial task experience.
A Learning Problem: EnjoySport
Sky    Temp  Humid   Wind    Water  Forecast  EnjoySport
Sunny  Warm  Normal  Strong  Warm   Same      Yes
Sunny  Warm  High    Strong  Warm   Same      Yes
Rainy  Cold  High    Strong  Warm   Change    No
Sunny  Warm  High    Strong  Cool   Change    Yes
What is the general concept?
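One way to approach the question on this slide is to search for the most specific conjunctive rule that covers every positive example. The sketch below uses the Find-S procedure for that; the slides do not name an algorithm, so the choice of Find-S, the function names `find_s` and `matches`, and the use of "?" for "any value" are all assumptions made for illustration.

```python
# Training examples copied from the EnjoySport table: (attributes, label).
# Attribute order: Sky, Temp, Humid, Wind, Water, Forecast.
EXAMPLES = [
    (("Sunny", "Warm", "Normal", "Strong", "Warm", "Same"), True),
    (("Sunny", "Warm", "High", "Strong", "Warm", "Same"), True),
    (("Rainy", "Cold", "High", "Strong", "Warm", "Change"), False),
    (("Sunny", "Warm", "High", "Strong", "Cool", "Change"), True),
]


def find_s(examples):
    """Return the most specific conjunctive hypothesis covering all positives.

    A "?" in the hypothesis means any value is accepted for that attribute.
    Returns None if there are no positive examples.
    """
    hypothesis = None
    for attributes, label in examples:
        if not label:
            continue  # Find-S ignores negative examples
        if hypothesis is None:
            hypothesis = list(attributes)  # start from the first positive
            continue
        for i, value in enumerate(attributes):
            if hypothesis[i] != value:
                hypothesis[i] = "?"  # generalize only where forced to
    return hypothesis


def matches(hypothesis, attributes):
    """True if the hypothesis classifies the attribute vector as positive."""
    return all(h == "?" or h == a for h, a in zip(hypothesis, attributes))


concept = find_s(EXAMPLES)
print(concept)  # ['Sunny', 'Warm', '?', 'Strong', '?', '?']
```

On these four rows the learned rule also rejects the single negative example, although Find-S itself never checks negatives.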
Metaphors and Methods • Neurobiology --> Connectionist Learning • Biological Evolution --> Genetic Learning • Heuristic Search --> Tree / Rule Induction • Statistical Inference --> Probabilistic Induction • Memory and Retrieval --> Case-Based Learning
What is the Learning Problem? • Learning = improving with experience at some task • Improve over task T, • With respect to performance measure P, • Based on experience E. E.g., Learn to play checkers • T: Play checkers • P: % of games won in world tournament • E: opportunity to play against self
Machine Learning: Tasks • Supervised Learning • Estimate an unknown mapping from known input-output pairs • Learn f_w from training set D = {(x, y)} s.t. f_w(x) = y • Classification: y is discrete • Regression: y is continuous • Unsupervised Learning • Only input values are provided • Learn f_w from D = {(x)} s.t. • Compression • Clustering • Reinforcement Learning
Machine Learning: Strategies • Rote learning • Concept learning • Learning from examples • Learning by instruction • Inductive learning • Deductive learning • Explanation-based learning (EBL) • Learning by analogy • Learning by observation
Supervised Learning • Given a sequence of input/output pairs of the form <x_i, y_i>, where x_i is a possible input and y_i is the output associated with x_i. • Learn a function f that accounts for the examples seen so far, f(x_i) = y_i for all i, and that makes a good guess for the outputs of the inputs that it has not seen.
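A minimal sketch of such a function f, assuming a 1-nearest-neighbor rule (one of many possible learners, not one the slides prescribe): it reproduces every stored pair exactly, provided the training inputs are distinct, and guesses the output of an unseen input from the closest stored one. The name `fit_nearest_neighbor` and the toy data are hypothetical.

```python
import math


def fit_nearest_neighbor(pairs):
    """Return f with f(x_i) == y_i for every training pair (x_i, y_i).

    Assumes the training inputs are distinct numeric tuples.
    """
    training = list(pairs)

    def f(x):
        # Predict the output attached to the closest stored input
        # (Euclidean distance).
        _, nearest_y = min(training, key=lambda pair: math.dist(pair[0], x))
        return nearest_y

    return f


# Hypothetical toy data: 2-D inputs with a discrete output (classification).
pairs = [
    ((0.0, 0.0), "a"),
    ((0.0, 1.0), "a"),
    ((5.0, 5.0), "b"),
    ((6.0, 5.0), "b"),
]
f = fit_nearest_neighbor(pairs)

print(all(f(x) == y for x, y in pairs))  # accounts for the examples seen so far
print(f((5.5, 4.0)))                     # a guess for an input it has not seen
```

Memorizing the examples is the simplest way to satisfy f(x_i) = y_i; how well the guesses generalize depends entirely on whether nearby inputs really share outputs.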
Examples of Input-Output Pairs • Recognition: inputs are descriptions of objects; outputs are the classes that the objects belong to. • Action: inputs are descriptions of situations; outputs are actions or predictions. • Janitor robot problem: inputs are descriptions of offices (floor, prof’s office); outputs are Yes or No (indicating whether or not the office contains a recycling bin).
Unsupervised Learning • Clustering • A clustering algorithm partitions the inputs into a fixed number of subsets or clusters so that inputs in the same cluster are close to one another. • Discovery learning • The objective is to uncover new relations in the data.
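The clustering bullet can be illustrated with k-means, used here as one assumed example of an algorithm that partitions inputs into a fixed number of clusters; the slides do not name a specific method. The function name `k_means`, the deterministic "first k points" initialization, and the toy points are choices made for this sketch.

```python
import math


def k_means(points, k, iterations=20):
    """Partition points into k clusters; return (centroids, assignment)."""
    # Deterministic initialization for reproducibility (an assumption;
    # random initialization is more common in practice).
    centroids = [points[i] for i in range(k)]
    assignment = []
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        assignment = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:
                centroids[c] = tuple(
                    sum(coord) / len(members) for coord in zip(*members)
                )
    return centroids, assignment


# Hypothetical toy data: two well-separated groups of 2-D points.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
centroids, assignment = k_means(points, k=2)
print(assignment)  # [0, 0, 0, 1, 1, 1]
```

Inputs in the same cluster end up close to one another because each update step reduces the total distance between points and their assigned centroid.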
Online and Batch Learning • Batch methods • Process large sets of examples all at once. • Online (incremental) methods • Process examples one at a time.
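The difference between the two modes can be sketched with about the simplest learner possible, an estimate of a mean. This is an illustrative assumption, not an example from the slides; `batch_mean` and `OnlineMean` are hypothetical names.

```python
def batch_mean(examples):
    """Batch method: needs the whole set of examples at once."""
    return sum(examples) / len(examples)


class OnlineMean:
    """Online (incremental) method: updates its estimate one example at a time."""

    def __init__(self):
        self.count = 0
        self.mean = 0.0

    def update(self, x):
        self.count += 1
        # Move the current estimate toward x by a shrinking step of 1/count.
        self.mean += (x - self.mean) / self.count
        return self.mean


data = [2.0, 4.0, 6.0, 8.0]

learner = OnlineMean()
for x in data:
    learner.update(x)  # an estimate is available after every example

print(batch_mean(data), learner.mean)
```

Both arrive at the same estimate here, but the online version never stores past examples and can be queried at any point in the stream.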
Machine Learning Algorithms • Neural Learning • Multilayer Perceptrons (MLPs) • Self-Organizing Maps (SOMs) • Evolutionary Learning • Genetic Algorithms • Probabilistic Learning • Bayesian Networks (BNs) • Other Machine Learning Methods • Decision Trees (DTs)
Neural Nets for Handwritten Digit Recognition [Figure: digit images pass through pre-processing to the input units, then hidden units, then output units for the digits 0 to 9; shown for both training and test.]
ALVINN System: Neural Network Learning to Steer an Autonomous Vehicle
Learning to Navigate a Vehicle by Observing a Human Expert (1/2) • Inputs • The images produced by a camera mounted on the vehicle • Outputs • The actions taken by the human driver to steer the vehicle or adjust its speed • Result of learning • A function mapping images to control actions
Learning to Navigate a Vehicle by Observing a Human Expert (2/2)
Data Recorrection by a Hopfield Network [Figure panels: original target data; corrupted input data; recorrected data after 10 iterations; recorrected data after 20 iterations; fully recorrected data after 35 iterations.]
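How a Hopfield network pulls corrupted data back toward a stored target can be sketched on a tiny scale. The code assumes Hebbian weights, +1/-1 units, and asynchronous updates in a fixed order; the slide does not give these details, and the 8-unit patterns are hypothetical, not the image data from the figure.

```python
def train_hopfield(patterns):
    """Build a symmetric weight matrix from +1/-1 patterns (Hebbian rule)."""
    n = len(patterns[0])
    weights = [[0] * n for _ in range(n)]
    for p in patterns:
        for i in range(n):
            for j in range(n):
                if i != j:  # no self-connections
                    weights[i][j] += p[i] * p[j]
    return weights


def recall(weights, state, max_sweeps=10):
    """Update units one at a time until a full sweep changes nothing."""
    state = list(state)
    for _ in range(max_sweeps):
        changed = False
        for i in range(len(state)):
            field = sum(w * s for w, s in zip(weights[i], state))
            # Keep the current value when the input field is exactly zero.
            new_value = state[i] if field == 0 else (1 if field > 0 else -1)
            if new_value != state[i]:
                state[i] = new_value
                changed = True
        if not changed:
            break
    return state


# Two hypothetical, mutually orthogonal patterns stored in the network.
target = [1, 1, 1, 1, -1, -1, -1, -1]
other = [1, -1, 1, -1, 1, -1, 1, -1]
weights = train_hopfield([target, other])

corrupted = list(target)
corrupted[0] = -corrupted[0]  # flip one unit

print(recall(weights, corrupted) == target)  # True
```

Real image-sized networks need many more sweeps than this toy case, which settles after correcting the single flipped unit.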
ANN for Face Recognition • A 960 x 3 x 4 network (960 inputs, 3 hidden units, 4 outputs) is trained on gray-level images of faces to predict whether a person is looking to their left, right, ahead, or up.
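To make the 960 x 3 x 4 shape concrete, here is a forward pass through a network of that size. It is only a sketch: the weights are random and untrained, and the sigmoid activation, the bias terms, the order of the output labels, and all function names are assumptions not stated on the slide.

```python
import math
import random

N_INPUTS, N_HIDDEN, N_OUTPUTS = 960, 3, 4
# Assumed ordering of the four output units.
DIRECTIONS = ["left", "right", "ahead", "up"]


def init_layer(n_in, n_out, rng):
    """One row per unit: n_in small random weights plus a trailing bias."""
    return [
        [rng.uniform(-0.05, 0.05) for _ in range(n_in + 1)]
        for _ in range(n_out)
    ]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def layer_forward(layer, inputs):
    """Weighted sum plus bias for each unit, squashed by a sigmoid."""
    return [
        sigmoid(sum(w * x for w, x in zip(unit[:-1], inputs)) + unit[-1])
        for unit in layer
    ]


def predict(hidden, output, pixels):
    """Return the label of the strongest output unit and all four scores."""
    scores = layer_forward(output, layer_forward(hidden, pixels))
    return DIRECTIONS[scores.index(max(scores))], scores


rng = random.Random(0)
hidden = init_layer(N_INPUTS, N_HIDDEN, rng)
output = init_layer(N_HIDDEN, N_OUTPUTS, rng)

# Hypothetical input: a flat list of 960 gray levels scaled to [0, 1].
pixels = [0.5] * N_INPUTS
label, scores = predict(hidden, output, pixels)
print(label, [round(s, 3) for s in scores])
```

With untrained weights the predicted direction is meaningless; training would adjust the 3 x 961 hidden weights and 4 x 4 output weights to fit labeled face images.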
Data Mining • Database/data warehouse --> Selection & Sampling --> Target data --> Preprocessing & Cleaning --> Cleaned data --> Transformation & Reduction --> Transformed data --> Data Mining --> Patterns/model --> Interpretation/Evaluation --> Knowledge --> Performance system
Hot Water Flashing Nozzle with Evolutionary Algorithms • Hans-Paul Schwefel performed the original experiments. [Figure labels: Start; hot water entering; at throat: Mach 1 and onset of flashing; steam and droplet at exit.]
Bayesian Networks for Gene Expression Analysis • Learning: Data --> Preprocessing --> Processed data --> Learning algorithm --> Bayesian network over Gene A, Gene B, Gene C, Gene D, and Target • Inference: Belief propagation. The values of Gene C and Gene B are given. Probability for the target is computed.
Multilayer Perceptrons for Gene Finding and Prediction • Inputs: coding potential value, GC composition, length, donor, acceptor, intron vocabulary • Output: discrete exon score between 0 and 1 along the sequence (bases)
Self-Organizing Maps for DNA Microarray Data Analysis [Figure: the input is linked by a bundle of synaptic connections to a two-dimensional array of postsynaptic neurons, with the winning neurons highlighted.]
Biological Information Extraction [Figure labels: Text Data; Data Analysis & Field Identification; Data Classification & Field Extraction; Field Property Identification & Learning; Information Extraction; Database Template Filling; DB; DB Record (Location, Date).]
Biomolecular Computing 011001101010001 ATGCTCGAAGCT