Introduction to the TLearn Simulator • CS/PY 399 Lab Presentation # 5 • February 8, 2001 • Mount Union College
TLearn Software • Developed by Cognitive Psychologists to study properties of connectionist models and learning • Kim Plunkett, Oxford • Experimental Psychologist • Jeffrey Elman, U.C. San Diego • Cognitive Psychologist • Simulates massively-parallel networks on serial computer platforms
Notational Conventions • TLearn uses notation slightly different from the one we have been using • Input signals are treated as nodes in the network, and displayed on screen as squares • Other nodes (representing neurons) are displayed as circles • Input and output values can be any real numbers (decimals allowed)
Weight Adjustments: Learning • TLearn uses a more sophisticated rule than the simple one seen last week • Let tkp be the target (desired) output for node k on pattern p • Let okp be the actual (obtained) output for node k on pattern p
Weight Adjustments: Learning • Error for node k on pattern p (δkp) is the difference between target output and observed output, times the derivative of the activation function for node k • why? Don't ask! (actually, this value simulates actual observed learning) • δkp = (tkp - okp) · [okp · (1 - okp)]
Weight Adjustments: Learning • This error value is used to calculate adjustments to weights • Let wkj be the weight on the connection from node j to node k (the backwards notation is what the authors use) • Let Δwkj be the change required for wkj due to training • Δwkj is determined by: the error for node k, the input from node j, and the learning rate (η)
Weight Adjustments: Learning • Δwkj = η · δkp · ojp • η is small (< 1, usually 0.05 to 0.5), to keep weights from making wild swings that overshoot goals for all patterns • This actually makes sense . . . • a larger error (δkp) should make Δwkj larger • if ojp is large, it contributed a great deal to the error, so it should contribute a large value to the weight adjustment
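To make the update concrete, here is a minimal Python sketch of one delta-rule weight change for a single connection; the function name and the sample numbers (target 1, obtained output 0.731, input 1.0, η = 0.1, starting weight 0.5) are illustrative assumptions, not taken from TLearn itself:

def delta_rule_update(w_kj, t_kp, o_kp, o_jp, eta=0.1):
    # Error term: (target - output) times the derivative of the
    # logistic activation function, o * (1 - o)
    delta_kp = (t_kp - o_kp) * o_kp * (1 - o_kp)
    # Weight change: learning rate * error for node k * input from node j
    return w_kj + eta * delta_kp * o_jp

new_w = delta_rule_update(w_kj=0.5, t_kp=1.0, o_kp=0.731, o_jp=1.0)
# delta_kp = (1 - 0.731) * 0.731 * (1 - 0.731) ≈ 0.0529
# new_w ≈ 0.5 + 0.1 * 0.0529 * 1.0 ≈ 0.5053

Note how a larger error or a larger input signal produces a larger weight change, exactly as the bullets above describe.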
Weight Adjustments: Learning • The preceding is called the delta rule • Used in backpropagation training • error adjustments are propagated backwards from the output layer to earlier layers when weight changes are calculated • Luckily, the simulator will perform these calculations for you! • Read more in Ch. 1 of Plunkett & Elman
TLearn Simulation Basics • For each problem on which you will work, the simulator maintains a PROJECT description file • Each project consists of three text files: • .CF file: configuration information about the network’s architecture • .DATA file: input for each of the network’s training cases • .TEACH file: output for each training case
TLearn Simulation Basics • Each file must contain information in EXACTLY the format TLearn expects, or else the simulation won't work • Example: AND project from the Chapter 3 folder • 2 inputs, one output; output = 1 only if both inputs = 1
.DATA File format • first line: distributed or localist • to start, we'll always use distributed • second line: n = # of training cases • next n lines: inputs for each training case • a list of v values, separated by spaces, where v = # of inputs in network (see the sample file below)
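For example, the .data file for the AND project could look like the following (a sketch based on the format just described, assuming the four input patterns are listed in truth-table order):

distributed
4
0 0
0 1
1 0
1 1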
.TEACH File format • first line: distributed or localist • must match the mode used in the .DATA file • second line: n = # of training cases • next n lines: outputs for each training case • a list of w values, separated by spaces, where w = # of outputs in network • a value may be *, meaning output is ignored during training for this pattern
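The matching .teach file for AND would then hold one target value per training case (again a sketch, assuming the same pattern order as the .data file above):

distributed
4
0
0
0
1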
.CF File format • Three sections • NODES: section • nodes = # of non-input units in network • inputs = # of inputs to network • outputs = # of output units • output node is ___ <== fill in which node is the output node • with > 1 output node, the syntax changes to "output nodes are"
.CF File format • CONNECTIONS: section • groups = 0 (explained later) • 1 from i1-i2 (says that node # 1 gets values from input nodes i1 and i2) • 1 from 0 (says that node # 1 gets values from the bias node -- explained below) • input nodes are always named i1, i2, etc. • non-input nodes are named 1, 2, etc.
.CF File format • SPECIAL: section • selected = 1 (special simulator results reporting) • weight-limit = 1.00 (range of random weight values to use in initial network creation)
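Putting the three sections together, a complete .cf file for the AND network (one non-input node, which is the output, fed by both inputs and the bias node) might read as follows; this is assembled from the format rules above, not copied from the Chapter 3 folder:

NODES:
nodes = 1
inputs = 2
outputs = 1
output node is 1
CONNECTIONS:
groups = 0
1 from i1-i2
1 from 0
SPECIAL:
selected = 1
weight-limit = 1.00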
Bias node • TLearn units all have the same threshold • defined by the logistic function • θ values are represented by a bias node • connected to all non-input nodes • signal always = 1 • weight of the connection is -θ • same as a perceptron with a threshold of θ • example on board
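To see why the bias node reproduces a threshold (a one-line check, using θ as above): a unit with threshold θ fires when w1·x1 + ... + wn·xn ≥ θ, and this is exactly the condition w1·x1 + ... + wn·xn + (-θ)·1 ≥ 0, i.e. the same unit with one extra input that is always 1 and whose connection weight is -θ.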
.CF File Example (Draw it!) • NODES: • nodes = 5 • inputs = 3 • outputs = 2 • output nodes are 4-5 • CONNECTIONS: • groups = 0 • 1-3 from i1-i3 • 4-5 from 1-3 • 1-5 from 0
Learning to use TLearn • Chapter 3 of the Plunkett and Elman text is a step-by-step description of several TLearn Training sessions. • Best way to learn: Hands-on! Try Lab Exercise # 5