EE3P BEng Final Year Project
Group Session 2
Processing Patterns using a Multi-Layer Perceptron (MLP)
Martin Russell
Objectives
• To show you how to classify patterns using a type of artificial neural network
• To introduce the Multi-Layer Perceptron (MLP)
• To describe the MLP software provided and to show you how to use it
• To set some practical work to be completed before the next session
The Application
• Classification of spoken vowels from their first 2 formant frequencies, F1 and F2
• Uses a well-known set of data, collected by Peterson & Barney in 1952
• Called the “Peterson-Barney” data
The Peterson-Barney data
• 76 people (33 male, 28 female, 15 children) each spoke 2 examples of 10 vowels – 1520 utterances
• For each utterance, Peterson & Barney measured:
  • F0: the fundamental frequency
  • F1, F2 and F3: the frequencies of the first 3 formants
• We will just use F1 and F2
The Vowels
• IY “beet”
• IH “bit”
• EH “bet”
• AE “bat”
• AA “bott”
• AO “bought”
• UH “book”
• UW “boot”
• AH “but”
• ER “bird”
The Basic Idea
[Figure: F1 and F2 are fed into the MLP, which has ten outputs, one per vowel class (IY IH EH AE AA AO UH UW AH ER). E.g. if F1, F2 correspond to ‘AO’, the desired output is 0 0 0 0 0 1 0 0 0 0.]
Understanding the MLP
[Figure: a scatter of (F1, F2) points labelled with their vowel classes, e.g. “AA”; given a new (F1, F2) point, does it correspond to “AA”?]
A Multi-Layer Perceptron (MLP)
[Figure: a layer of input units, a layer of “hidden” units (the “artificial neurones”), and a layer of output units.]
An “artificial neurone”
[Figure: inputs x, y and a constant 1 feed the unit h via weights w1, w2 and w3.]
• Input to h is given by: I(h) = w1x + w2y + w3
• Output from h is given by: O(h) = g(I(h)), where g is the ‘sigmoid’ activation function
The ‘sigmoid’ activation function g
[Figure: plots of the sigmoid for several values of the steepness parameter k.]
Artificial neurone (summary)
• The input I(h) to the hidden unit h is the weighted sum of the outputs from the input units above it
• If I(h) > 0, O(h) ≈ 1 (the neurone ‘fires’)
• If I(h) < 0, O(h) ≈ 0 (the neurone doesn’t ‘fire’)
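As a concrete sketch, assuming the usual logistic form g(x) = 1/(1 + e^(-kx)) for the sigmoid (the function names here are illustrative, not taken from the course programs):

    #include <math.h>

    /* Sigmoid activation g with steepness k (the <KSCALE> parameter). */
    double sigmoid(double x, double k)
    {
        return 1.0 / (1.0 + exp(-k * x));
    }

    /* Output O(h) of the neurone h for inputs x, y and the constant 1. */
    double neurone_output(double x, double y,
                          double w1, double w2, double w3, double k)
    {
        double input = w1 * x + w2 * y + w3;   /* I(h) */
        return sigmoid(input, k);              /* O(h) = g(I(h)) */
    }

For a large positive I(h) this returns a value close to 1, and for a large negative I(h) a value close to 0, matching the ‘fires’/‘doesn’t fire’ behaviour above.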
Back to the P&B data • For the Peterson & Barney application we use: • 3 input units: • 1st for F1, • 2nd for F2, and • 3rd set equal to 1 • 10 output units, 1 for each vowel class • Variable number of hidden units
Example ‘P&B’ MLP 3 input units 10 output units
‘Winner takes all’ coding
• This is the scheme for coding the outputs, or ‘targets’, for the MLP
• If F1, F2 correspond to IY, the target is (1,0,0,0,0,0,0,0,0,0)
• If F1, F2 correspond to IH, the target is (0,1,0,0,0,0,0,0,0,0)
• If F1, F2 correspond to EH, the target is (0,0,1,0,0,0,0,0,0,0)
• ….
• If F1, F2 correspond to ER, the target is (0,0,0,0,0,0,0,0,0,1)
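As a minimal sketch of this coding (illustrative code, not from the course programs), the target vector for class i is all zeros with a single 1 in position i:

    #define NUM_CLASSES 10

    /* Winner-takes-all target for vowel class classIndex
       (0 = IY, 1 = IH, ..., 9 = ER). */
    void make_target(double target[NUM_CLASSES], int classIndex)
    {
        int i;
        for (i = 0; i < NUM_CLASSES; i++)
            target[i] = (i == classIndex) ? 1.0 : 0.0;
    }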
“Black Art”: input scaling
• With MLPs it is normal to scale each input so that its average is 0 and its variance is 1
• For the Peterson & Barney F1 and F2 data I have:
  • Replaced F1 with (F1 – av(F1))/sd(F1)
  • Replaced F2 with (F2 – av(F2))/sd(F2)
  • av = average, sd = standard deviation
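A minimal sketch of this scaling, applied to one column of formant values (illustrative code, not from the course programs):

    #include <math.h>

    /* Replace each f[i] with (f[i] - mean) / sd, so the scaled
       values have average 0 and variance 1. */
    void normalise(double f[], int n)
    {
        double sum = 0.0, sumsq = 0.0, mean, sd;
        int i;

        for (i = 0; i < n; i++) sum += f[i];
        mean = sum / n;
        for (i = 0; i < n; i++) sumsq += (f[i] - mean) * (f[i] - mean);
        sd = sqrt(sumsq / n);
        for (i = 0; i < n; i++) f[i] = (f[i] - mean) / sd;
    }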
Data file format
<NUM_DATA_LINES> 200
<NUM_IP> 3
<NUM_OP> 10
0.560000 1.719298 1   1 0 0 0 0 0 0 0 0 0
0.576000 1.754386 1   1 0 0 0 0 0 0 0 0 0
0.800000 1.459649 1   0 1 0 0 0 0 0 0 0 0
0.768000 1.480702 1   0 1 0 0 0 0 0 0 0 0
1.180000 1.333333 1   0 0 1 0 0 0 0 0 0 0
1.166000 1.291228 1   0 0 1 0 0 0 0 0 0 0
1.360000 1.298246 1   0 0 0 1 0 0 0 0 0 0
1.370000 1.249123 1   0 0 0 1 0 0 0 0 0 0
1.320000 0.961403 1   0 0 0 0 1 0 0 0 0 0
…
• The first three lines are the ‘header’ information
• On each data line, the first three columns are the input values (F1 F2 1) and the last ten are the target values
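A sketch of how one data line could be read, with the layout inferred from the slide above (illustrative code, not the parser used by the course programs):

    #include <stdio.h>

    /* Read one data line: F1, F2, the constant input 1,
       then the 10 target values. Returns 1 on success. */
    int read_data_line(FILE *fp, double input[3], double target[10])
    {
        int i;
        for (i = 0; i < 3; i++)
            if (fscanf(fp, "%lf", &input[i]) != 1) return 0;
        for (i = 0; i < 10; i++)
            if (fscanf(fp, "%lf", &target[i]) != 1) return 0;
        return 1;
    }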
Data Files
• There are two data files:
  • pbData_m_f1f2_norm_train (the training set)
  • pbData_m_f1f2_norm_test (the test set)
• Both are on the web site
How to build an MLP
• First decide how many hidden units
• Then choose initial estimates for each of the MLP parameters; often these are chosen at random
• Use MLPInit.c from the website – MLPInit.c sets the weights randomly
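A sketch of what the random initialisation might look like (the actual range and random number generator used by MLPInit.c may differ):

    #include <stdlib.h>

    /* Fill w (length n) with small random initial weights in roughly
       [0, 0.6], similar in scale to the example MLPInit output below. */
    void init_weights(double w[], int n)
    {
        int i;
        for (i = 0; i < n; i++)
            w[i] = 0.6 * ((double)rand() / RAND_MAX);
    }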
Using MLPInit
MLPInit 3 5 10 init_3x5x10
Creates an MLP with:
• 3 input units
• 5 hidden units
• 10 output units
• The result is stored in the file init_3x5x10
Example output from MLPInit
<NUM_INPUT> 3
<NUM_HIDDEN> 5
<NUM_OUTPUT> 10
0.416415 0.418549 0.242699
0.333669 0.199291 0.195465
0.477893 0.556194 0.287547
0.280483 0.388121 0.103558
0.201107 0.378784 0.520226
0.303600 0.012010 0.215942 0.176979 0.171643
0.183424 0.113759 0.223094 0.365760 0.073800
0.210656 0.178952 0.069903 0.289522 0.161591
0.268101 0.260700 0.036500 0.067274 0.300626
0.237691 0.101108 0.058603 0.211381 0.318471
0.139368 0.296624 0.081525 0.020991 0.279433
0.046923 0.592494 0.538003 0.102141 0.095765
0.310450 0.022514 0.281300 0.479829 0.263548
0.201131 0.115437 0.117549 0.219977 0.252530
0.098656 0.306403 0.377581 0.066146 0.082592
• 65 weights in total: 5 × 3 input-to-hidden and 10 × 5 hidden-to-output
• See website
MLP Training
• Suppose you have:
  • input values 1.212 3.127 1, and
  • target values 0 0 0 0 1 0 0 0 0 0
• If you apply 1.212 3.127 1 to the input of your MLP, then you would like the output to equal the target
• The (squared) difference between the target output and the actual output is called the Error
• By adjusting the MLP weights this error can be reduced
• The algorithm for reducing this error is called Error Back-Propagation (a sketch of one update step follows)
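As a sketch of one Error Back-Propagation step for the hidden-to-output weights of a sigmoid MLP (MLPTrain.c’s actual implementation may differ in detail):

    /* One gradient step on the squared error for the hidden-to-output
       weights. o[j] and t[j] are the actual and target outputs of
       output unit j, hid[i] is the output of hidden unit i, w[j][i]
       is the weight from hidden unit i to output unit j, and lrate
       is the learning rate (<LRATE> in mlpconfig). */
    void update_output_weights(double w[10][5], const double o[10],
                               const double t[10], const double hid[5],
                               double lrate)
    {
        int i, j;
        for (j = 0; j < 10; j++) {
            /* Gradient of the squared error through the sigmoid. */
            double delta = (t[j] - o[j]) * o[j] * (1.0 - o[j]);
            for (i = 0; i < 5; i++)
                w[j][i] += lrate * delta * hid[i];
        }
    }

A similar update, with the error propagated back through the output weights, adjusts the input-to-hidden weights.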
MLPTrain.c
• MLPTrain.c implements the Error Back-Propagation training algorithm
• It’s available on the website
How to use MLPTrain.c
• Type: MLPTrain initMLP dataFile numIt trainedMLP
  • initMLP is the ‘initial’ MLP (e.g. init_3x5x10)
  • dataFile is the training data file (e.g. pbData_m_f1f2_norm_train)
  • numIt is the number of iterations
  • trainedMLP is the file for the optimised MLP
• Example: MLPTrain init_3x5x10 pbData_m_f1f2_norm_train 10000 mlp_3x5x10
Example
MLPTrain init_3x5x10 pbData_m_f1f2_norm_train 50000 mlp_50000_3x5x10
Testing the MLP
• Use the test set pbData_m_f1f2_norm_test
• It consists of 200 test F1 F2 patterns
• We want to input each test pattern and see if the MLP gives the ‘correct’ output pattern
• The output is correct if the output unit with the biggest value corresponds to the ‘1’ in the target
Testing the MLP
• Example 1:
  • Output = 0.1 0.85 0.2 0.0 0.0 0.12 0.0 0.0 0.0 0.0
  • Target = 0 1 0 0 0 0 0 0 0 0
  • Correct
• Example 2:
  • Output = 0.1 0.85 0.2 0.0 0.0 0.12 0.0 0.0 0.0 0.0
  • Target = 0 0 1 0 0 0 0 0 0 0
  • Error
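A minimal sketch of this scoring rule (illustrative code, not taken from MLPClass.c):

    /* Index of the largest value in v (length n): the ‘winning’ unit. */
    int argmax(const double v[], int n)
    {
        int i, best = 0;
        for (i = 1; i < n; i++)
            if (v[i] > v[best]) best = i;
        return best;
    }

    /* A pattern is classified correctly if the winning output unit
       coincides with the position of the 1 in the target. */
    int is_correct(const double output[], const double target[], int n)
    {
        return argmax(output, n) == argmax(target, n);
    }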
MLPClass.c
• To test your MLP use MLPClass.c from the website
• MLPClass mlpFile testData
• Where:
  • mlpFile is the trained MLP file
  • testData is the test data file
• Example: MLPClass mlp_3x5x10 pbData_m_f1f2_norm_test
Note on the C programs
• All of the C programs are ANSI C compatible
• They were developed under Linux, but that doesn’t matter
• They should compile on any platform
• I recommend the command-line C compiler in Microsoft Visual Studio .NET
More “black art”
• The programs MLPTrain and MLPClass need a set of parameters
• These are stored in a file called mlpconfig (see website)
• This file must be in your directory when you run MLPTrain or MLPClass
mlpconfig
<KSCALE> 1.0
<LRATE> 1.0
<MOMENTUM> 0.0
<TRMODE> BATCH
• This file is on the website
• Options for <TRMODE> are BATCH or SEQUENTIAL (typically, BATCH updates the weights once per pass through the whole training set, while SEQUENTIAL updates them after every training pattern)
• <KSCALE> is the steepness parameter k of the sigmoid activation function
Tasks
• Download the C programs from the website
• Compile them
• Download the Peterson & Barney data from the website
• Follow the steps outlined in these slides to build and test some MLPs
• How many hidden units do you need?
• How many iterations of training are needed?
• What recognition rate can you achieve?
• What are the ‘learning rate’ <LRATE> and ‘momentum’ <MOMENTUM>, and how do they affect the results?
• Find out more about how an MLP works