Introduction Machine Learning: Chapter 1
Contents • Types of learning • Applications of machine learning • Disciplines related with machine learning • Well-posed learning problems • Designing a learning system • Choosing the training experience • Choosing the target function • Choosing a representation for the target function • Choosing a function approximation algorithm • The final design • Issues in machine learning • Summary
Types of Learning • Supervised Learning: • Given training data comprising examples of input vectors along with their corresponding target vectors, the goal is either (1) to assign each input vector to one of a finite number of discrete categories (classification) or (2) to assign each input vector one or more continuous target values (regression). • Unsupervised Learning: • Given training data consisting of a set of input vectors without any corresponding target values, the goal is (1) to discover groups of similar examples within the data (clustering), (2) to determine the distribution of the data within the input space (density estimation), or (3) to project the data from a high-dimensional space down to two or three dimensions for visualization. • Reinforcement Learning: • Given an agent with a set of sensors to observe the state of its environment and a set of actions it can perform to alter this state, the goal is to find suitable actions to take in a given situation in order to maximize the accumulated reward, where each action resulting in a certain state yields a reward (balancing exploration and exploitation).
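To make the three settings concrete, the following minimal sketch (Python, with purely hypothetical toy values) shows the shape of the training data in each case:

```python
# Supervised: input vectors paired with corresponding targets.
classification_data = [([5.1, 3.5], "class_a"),  # discrete category target
                       ([0.2, 1.7], "class_b")]
regression_data = [([5.1, 3.5], 4.3),            # continuous target value
                   ([0.2, 1.7], 0.9)]

# Unsupervised: input vectors only, with no target values.
unsupervised_data = [[5.1, 3.5], [0.2, 1.7]]

# Reinforcement: the agent observes states, performs actions, receives rewards.
episode = [("state_0", "action_0", 0.0),         # (state, action, reward)
           ("state_1", "action_1", 1.0)]
```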
Applications of machine learning • Recognizing spoken words: • Recognizing primitive sounds (phonemes) and words from an observed speech signal • Use of neural networks and hidden Markov models, customized to individual speakers • Applications in signal-interpretation problems • Driving an autonomous vehicle: • Learning to steer while driving on a variety of road types • Applications in sensor-based control problems • Classifying new astronomical structures: • Use of decision tree algorithms to automatically classify objects in sky surveys • Applications in classifying very large databases to learn general regularities • Playing world-class backgammon: • Learning a strategy by playing over a million practice games against itself, becoming competitive with the human world champion • Applications to problems with very large search spaces
Backgammon game rules • Bar: when you have a piece on the middle bar, it must first be re-entered into the opponent's home board before any other move is made • Capture: when you move to a point occupied by exactly one opponent piece, you capture it (you are not allowed to move to a point occupied by two or more opponent pieces) • Removal (bearing off): once all of your pieces are in your home board, they can be removed from the board.
Checkers game rules • A player who cannot move loses the game • Moving: a piece moves forward; a king moves forward or backward • Jumping: you can jump to capture an opponent's piece; a piece jumps forward, a king forward or backward • Kinging: when a piece reaches the last row, it becomes a king; a second piece is placed on top of it to mark it. (A piece that has just been kinged cannot continue jumping until the next move.)
Disciplines related with machine learning • Artificial intelligence: • Learning symbolic representations of concepts • An approach to improving problem solving • Bayesian methods: • Bayes' theorem as the basis for calculating probabilities of hypotheses • Computational complexity theory: • Bounds on the inherent complexity of different learning tasks, measured by the number of training examples, the computational effort, and the number of mistakes required in order to learn • Control theory: • Learning to control processes so as to optimize predefined objectives and to predict the next state of the process
Disciplines related with machine learning • Information theory: • Measures of entropy and information content • Minimum description length approaches to learning • Philosophy: • Occam's razor: the simplest hypothesis consistent with the given data is the best • Justification (as a heuristic rule) for generalizing beyond the observed data • Psychology and neurobiology: • The power law of practice for people's response times in learning problems • Neurobiological studies motivating artificial neural networks (ANNs) • Statistics: • Characterization of errors (e.g. bias and variance) when estimating the accuracy of a hypothesis from a limited sample of data • Statistical tests
Well-Posed Learning Problems • Definition: A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P, if its performance at tasks in T, as measured by P, improves with experience E. • Three features in learning problems • The class of tasks (T) • The measure of performance to be improved (P) • The source of experience (E)
Well-Posed Learning Problems : Examples • A checkers learning problem • Task T : playing checkers • Performance measure P : percent of games won against opponents • Training experience E : playing practice games against itself • A handwriting recognition learning problem • Task T : recognizing and classifying handwritten words within images • Performance measure P : percent of words correctly classified • Training experience E : a database of handwritten words with given classifications
Well-Posed Learning Problems :Examples • A robot driving learning problem • Task T : driving on public four-lane highways using vision sensors • Performance measure P : average distance traveled before an error (as judged by human overseer) • Training experience E : a sequence of images and steering commands recorded while observing a human driver
Designing a Learning System • Choosing the Training Experience • Choosing the Target Function • Choosing a Representation for the Target Function • Choosing a Function Approximation Algorithm • The Final Design
Choosing the Training Experience • Whether the training experience provides direct or indirect feedback regarding the choices made by the performance system: • Example: • Direct training examples in learning to play checkers consist of individual checkers board states and the correct move for each. • Indirect training examples in the same game consist of the move sequences and final outcomes of various games played, in which the correctness of specific moves early in the game must be inferred indirectly from the fact that the game was eventually won or lost – the credit assignment problem.
Choosing the Training Experience • The degree to which the learner controls the sequence of training examples: • Example: • The learner might rely on the teacher to select informative board states and to provide the correct move for each • The learner might itself propose board states that it finds particularly confusing and ask the teacher for the correct move. • The learner may have complete control over the board states and (indirect) classifications, as it does when it learns by playing against itself with no teacher present.
Choosing the Training Experience • How well the training experience represents the distribution of examples over which the final system performance P will be measured: in general, learning is most reliable when the training examples follow a distribution similar to that of future test examples. • Example: • If the training experience in playing checkers consists only of games played against itself, the learner might never encounter certain crucial board states that are very likely to be played by the human checkers champion. (Note, however, that most current theory of machine learning rests on the crucial assumption that the distribution of training examples is identical to the distribution of test examples.)
A checkers learning problem • Task T: playing checkers • Performance measure P: percent of games won in the world tournament • Training experience E: games played against itself • Choose: • The exact type of knowledge to be learned (target function) • A representation for this target knowledge • A learning mechanism (function approximation algorithm)
Choosing the Target Function • To determine what type of knowledge will be learned and how this will be used by the performance program: • Example: • In playing checkers, the program needs to learn to choose the best move among the legal moves: ChooseMove: B → M, which accepts as input any board from the set of legal board states B and produces as output some move from the set of legal moves M.
Choosing the Target Function • Since a target function such as ChooseMove turns out to be very difficult to learn given the kind of indirect training experience available to the system, an alternative target function is an evaluation function that assigns a numerical score to any given board state, V: B → R. • (Non-operational) definition of the target function: • If b is a final board state that is won, then V(b) = 100 • If b is a final board state that is lost, then V(b) = -100 • If b is a final board state that is drawn, then V(b) = 0 • If b is not a final board state, then V(b) = V(b'), where b' is the best final board state that can be reached from b by playing optimally until the end of the game • An operational description of V requires function approximation
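Read literally, this non-operational definition amounts to a full game-tree search, which is exactly why an operational approximation is needed. A minimal Python sketch of that reading, assuming hypothetical helpers is_final, outcome, and legal_successors:

```python
def V(board, our_turn=True):
    """Ideal target function V (non-operational sketch): the value of the
    best final board reachable from `board` under optimal play by both
    players. is_final, outcome, legal_successors are hypothetical helpers."""
    if is_final(board):
        return {"won": 100, "lost": -100, "drawn": 0}[outcome(board)]
    values = [V(b, not our_turn) for b in legal_successors(board)]
    # We choose the successor best for us; the opponent chooses worst for us.
    return max(values) if our_turn else min(values)
```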
Choosing a Representation for the Target Function • Given the ideal target function V, we choose a representation that the learning system will use to describe the function V' that it will learn: • Describing the function: • Tables • Rules • Polynomial functions • Neural nets • Trade-off in the choice: • Expressive power • Size of training data required
Choosing a Representation for the Target Function • Target function: V: B → R • Target function representation (a linear combination of board features): • V'(b) = w0 + w1x1 + w2x2 + w3x3 + w4x4 + w5x5 + w6x6 • where x1 and x2 are the numbers of black and red pieces on the board, x3 and x4 the numbers of black and red kings, and x5 and x6 the numbers of black and red pieces threatened (i.e., subject to capture on the next turn) • The effect of this design choice is to reduce the problem of learning a checkers strategy to the problem of learning values for the coefficients w0 through w6 in the target function representation
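A minimal sketch of this linear representation; the weight values and feature vector below are purely illustrative:

```python
def V_hat(features, weights):
    """Evaluate V'(b) = w0 + w1*x1 + ... + w6*x6 for a feature vector
    (x1, ..., x6) and a weight vector (w0, ..., w6)."""
    w0, *ws = weights
    return w0 + sum(w * x for w, x in zip(ws, features))

weights = [0.0, 1.0, -1.0, 3.0, -3.0, -0.4, 0.4]  # hypothetical coefficients
features = [3, 0, 1, 0, 0, 0]                     # x1..x6 for some board
print(V_hat(features, weights))                   # -> 6.0
```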
Choosing a Function Approximation Algorithm • Each training example is given by <b, Vtrain(b)>, where Vtrain(b) is the training value for a board b. • Example: "black has won the game" (x2 = 0, so no red pieces remain): <<x1=3, x2=0, x3=1, x4=0, x5=0, x6=0>, +100> • Estimating Training Values: • Ambiguity in estimating training values: with only final results, we need to assign specific scores to specific intermediate board states • An effective approach: use the current approximation of V applied to the successor state: Vtrain(b) ← V'(Successor(b))
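A sketch of this estimation step, reusing V_hat from the previous sketch; features_of is a hypothetical feature extractor, and `boards` is the ordered sequence of boards at which it was the program's turn to move in one finished game:

```python
def estimate_training_values(boards, final_value, features_of, weights):
    """Turn one finished game into training pairs (features_of(b), Vtrain(b))."""
    examples = []
    for b, successor in zip(boards, boards[1:]):
        # Vtrain(b) <- V'(Successor(b)): score each intermediate board with
        # the current approximation's value of the next board at which it
        # is again the program's turn to move.
        examples.append((features_of(b), V_hat(features_of(successor), weights)))
    examples.append((features_of(boards[-1]), final_value))  # +100, -100, or 0
    return examples
```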
Choosing a Function Approximation Algorithm • Adjusting the Weights • To specify the learning algorithm for choosing the weights wi that best fit the set of training examples {<b, Vtrain(b)>} • Minimizing the squared error E between the training values and the values predicted by the hypothesis V': E ≡ Σ over observed training examples <b, Vtrain(b)> of (Vtrain(b) – V'(b))² • LMS weight update rule: when the error (Vtrain(b) – V'(b)) is positive, V'(b) is too low, and each weight is adjusted in proportion to its feature value so as to raise V'(b): For each training example <b, Vtrain(b)>, use the current weights to calculate V'(b), then update each weight wi as wi ← wi + η (Vtrain(b) – V'(b)) xi, where η is a small constant that moderates the size of the update
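A sketch of one pass of the LMS rule over the resulting training pairs, again reusing V_hat from the earlier sketch; eta stands in for the small constant η:

```python
def lms_update(examples, weights, eta=0.1):
    """Apply wi <- wi + eta * (Vtrain(b) - V'(b)) * xi to each weight,
    treating w0 as the coefficient of a constant feature x0 = 1."""
    for features, v_train in examples:
        error = v_train - V_hat(features, weights)   # positive => V' too low
        weights[0] += eta * error                    # intercept, x0 = 1
        for i, x in enumerate(features, start=1):
            weights[i] += eta * error * x            # raise/lower V' via wi
    return weights
```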
The Final Design • 4 program modules: Performance System, Critic, Generalizer, Experiment Generator • Performance System: To solve the given performance task (playing checkers) by using the learned target function(s). It takes an instance of a new problem (a new game) as input and produces a trace of its solution (the game history) as output. • Critic: To take as input the history or trace of the game and produce as output a set of training examples of the target function.
The Final Design • Generalizer: To take as input the training examples and produce an output hypothesis that is its estimate of the target function. It generalizes from the specific training examples, hypothesizing a general function that covers these examples and other cases beyond the training examples.
The Final Design • Experiment Generator: To take as input the current hypothesis (the currently learned function) and output a new problem (i.e., an initial board state) for the Performance System to explore. Its role is to pick new practice problems that will maximize the learning rate of the overall system.
The Final Design • Figure 1.1 Final design of the checkers learning program: the Experiment Generator sends a new problem (initial game board) to the Performance System; the Performance System sends a solution trace (game history) to the Critic; the Critic sends training examples { <b1, Vtrain(b1)>, <b2, Vtrain(b2)>, ... } to the Generalizer; the Generalizer sends the hypothesis (V') back to the Experiment Generator
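One training cycle through the four modules might look like the following sketch, reusing estimate_training_values and lms_update from the earlier sketches; generate_problem and play_game are hypothetical stand-ins for the Experiment Generator's board picker and the Performance System:

```python
def training_iteration(weights, features_of, generate_problem, play_game):
    """One pass around the loop of Figure 1.1."""
    board = generate_problem(weights)                # Experiment Generator
    boards, final_value = play_game(board, weights)  # Performance System: game trace
    examples = estimate_training_values(             # Critic: trace -> examples
        boards, final_value, features_of, weights)
    return lms_update(examples, weights)             # Generalizer: new hypothesis V'
```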
Choices in designing the checkers learning problem • Determine type of training experience: games against experts, table of correct moves, games against self, ... • Determine target function: Board → Value, Board → Move, ... • Determine representation of learned function: linear function of six features, artificial neural network, polynomial, ... • Determine learning algorithm: gradient descent, linear programming, ... • Completed design
Issues in Machine Learning • What algorithms exist for learning general target functions from specific training examples? • Under what conditions do they converge, given sufficient training data? • How do they perform with respect to different types of problems and representations? • How much training data is sufficient? • General bounds relating confidence in the learned hypothesis to the amount of training experience and the character of the learner's hypothesis space • When and how can prior knowledge held by the learner guide the process of generalizing from examples? • Can approximately correct prior knowledge be helpful?
Issues in Machine Learning • What is the best strategy for choosing a useful next training experience? • How does the choice of this strategy alter the complexity of the learning problem? • What is the best way to reduce the learning task to one or more function approximation problems? • What specific functions should be learned? • Can this process be automated? • How can the learner automatically alter its representation to improve its ability to represent and learn the target function?
Summary • Usefulness of machine learning algorithms: • Data mining of large data sets • Poorly understood domains (e.g. face recognition) • Domains with dynamically changing conditions (e.g. manufacturing processes) • Ideas from diverse disciplines: • Artificial intelligence • Probability and statistics • Computational complexity • Information theory • Psychology and neurobiology • Control theory • Philosophy • A well-defined learning problem requires a well-specified task, performance measure, and source of training experience
Summary • Designing a machine learning system involves choosing: • the type of training experience • the target function • a representation of this target function • an algorithm for learning this target function • Learning involves searching through a space of possible hypotheses to find the hypothesis that best fits the available training examples and other prior knowledge