Multiple Approaches at Hand Written Digit Recognition
Luis Bathen, Mike Munson, Jeremy Smith
Problem and Motivation
• Problem:
  • Given picture data representing a handwritten digit, determine which digit (0-9) it is.
  • The error rate must be extremely low for practical use.
  • If the digit is too poorly drawn or cannot be classified, it should be rejected rather than risk improper classification.
• Motivation:
  • Primary application: postal mail (automatic sorting of mail by destination ZIP code)
  • Other applications: digitizing handwritten spreadsheets, tax forms, etc.
Alvarez, Rodriguez, & Hermida [3]: Kirsch Gradients Network Structure
Scaling the Image
• The algorithm adds the values of 4 neighboring pixels (a 2x2 block) to get the new pixel's value.
• Resolution is halved, depth is doubled.
• This leaves fewer nodes to train and gives invariance to subtle differences.
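The scaling step above can be sketched as follows. This is a minimal illustration, assuming the image is a list of rows with even dimensions and that the 4 summed pixels form a 2x2 block; the function name `downscale_sum` is hypothetical and not taken from the slides.

```python
def downscale_sum(image):
    """Halve the resolution by summing each 2x2 block of pixels.

    Assumes `image` is a list of equal-length rows with even dimensions.
    """
    rows, cols = len(image), len(image[0])
    if rows % 2 or cols % 2:
        raise ValueError("image dimensions must be even")
    return [
        [
            image[r][c] + image[r][c + 1] + image[r + 1][c] + image[r + 1][c + 1]
            for c in range(0, cols, 2)
        ]
        for r in range(0, rows, 2)
    ]


# A 4x4 binary image becomes a 2x2 image whose values range from 0 to 4.
image = [
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 1, 1],
    [0, 0, 0, 1],
]
small = downscale_sum(image)
print(small)
```

Applying the function repeatedly would take a 32x32 image down to 16x16, 8x8, and 4x4, with the range of pixel values growing at each step.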
Edge Detection
• Output only certain edges from an image (vertical, horizontal, left-diagonal, right-diagonal) to new images, using the Kirsch algorithm.
• Goal is to create separate networks, trained over specific features of the image (one network for each edge map).
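The Image
• To extract locality from the image we used the Kirsch detector algorithm.
• The eight neighbors of pixel (i, j) are numbered clockwise from the top-left:
    A0  A1    A2
    A7  (i,j) A3
    A6  A5    A4
• $G(i, j) = \max\{1, \max_{k=0}^{7} |5 S_k - 3 T_k|\}$
• Where $G(i, j)$ is the gradient for pixel (i, j), k = 0..7, and subscripts are taken modulo 8
• $S_k = A_k + A_{k+1} + A_{k+2}$
• $T_k = A_{k+3} + A_{k+4} + A_{k+5} + A_{k+6} + A_{k+7}$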
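The Kirsch gradient can be sketched in a few lines. This is an illustration of the standard formulation, not the authors' code: S_k sums three consecutive neighbors, T_k sums the remaining five, and indices wrap modulo 8. Treating neighbors outside the image as 0 and the name `kirsch_gradient` are assumptions made for this example.

```python
# Clockwise neighbor offsets (row, col) for A0..A7, starting at the top-left.
OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]


def kirsch_gradient(image, i, j):
    """Return G(i, j) = max(1, max_k |5*S_k - 3*T_k|) for one pixel."""
    rows, cols = len(image), len(image[0])
    neighbors = []
    for di, dj in OFFSETS:
        r, c = i + di, j + dj
        # Assumption: pixels outside the image count as 0.
        inside = 0 <= r < rows and 0 <= c < cols
        neighbors.append(image[r][c] if inside else 0)

    best = 0
    for k in range(8):
        s_k = sum(neighbors[(k + n) % 8] for n in range(3))     # A_k .. A_{k+2}
        t_k = sum(neighbors[(k + n) % 8] for n in range(3, 8))  # A_{k+3} .. A_{k+7}
        best = max(best, abs(5 * s_k - 3 * t_k))
    return max(1, best)


# A vertical edge: the right-hand column is set, the rest is empty.
edge = [
    [0, 0, 1],
    [0, 0, 1],
    [0, 0, 1],
]
print(kirsch_gradient(edge, 1, 1))
```

On a uniform region every 5*S_k - 3*T_k term cancels, so the outer max with 1 keeps the gradient from being 0.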
Approaches
• Five-subnet multi-layered network (10-class output)
• Raw 32x32/16x16/8x8 binary images (10-class output)
• Raw 4x4 images with 3-bit values (10-class output)
• 4x4 binary/3-bit images (binary output)
• Simple image correlation (for comparison)
32x32/16x16/8x8 and 10-Class Output
[Figure: network with N inputs, M hidden units, and 10 outputs. Example output activations: 0.45, 0.32, 0.02, 0.12E-23, 0.99, 0.12E-125. Winning number is: 8]
Net Activation
• Activate Inputs
• Activate Hidden Layer
• Activate Output Layer

$$a_i(t + 1) = f(\mathrm{net}_i), \qquad f(\mathrm{net}_i) = \frac{1}{1 + e^{-\mathrm{net}_i}}, \qquad \mathrm{net}_i = \sum_j w_{ij}\, a_j(t) - b_i$$
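A minimal sketch of the activation pass, assuming one hidden layer and weights stored as one row per receiving unit. The helper names `activate_layer` and `forward` are hypothetical and chosen for this example only.

```python
import math


def sigmoid(net):
    """f(net) = 1 / (1 + e^(-net))."""
    return 1.0 / (1.0 + math.exp(-net))


def activate_layer(weights, biases, inputs):
    """Compute a_i = f(sum_j w_ij * a_j - b_i) for every unit i in a layer."""
    return [
        sigmoid(sum(w * a for w, a in zip(row, inputs)) - b)
        for row, b in zip(weights, biases)
    ]


def forward(w_hidden, b_hidden, w_out, b_out, inputs):
    """Activate the hidden layer from the inputs, then the output layer."""
    hidden = activate_layer(w_hidden, b_hidden, inputs)
    output = activate_layer(w_out, b_out, hidden)
    return hidden, output


# Tiny network: 2 inputs, 2 hidden units, 1 output, all weights and biases zero.
hidden, output = forward(
    [[0.0, 0.0], [0.0, 0.0]], [0.0, 0.0],
    [[0.0, 0.0]], [0.0],
    [1.0, 0.0],
)
print(hidden, output)
```

With all weights and biases at zero every net input is 0, so every unit outputs f(0) = 0.5.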
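Back Propagation
• Set Expected Output (for example: 0 0 1 0 0 0)
• Calculate Out Delta
• Calculate Hidden Delta
• Update the Weights and Biases

$$\delta_j = \begin{cases} (1 - a_j)\, a_j\, (t_j - a_j) & \text{if } j \text{ is an output unit} \\ (1 - a_j)\, a_j \sum_k \delta_k w_{kj} & \text{if } j \text{ is a hidden unit} \end{cases}$$

$$\Delta w_{ij} = \eta\, \delta_i\, a_j, \qquad \Delta b_i = -\eta\, \delta_i$$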
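The four steps can be sketched as a single training update. This is an illustrative sketch, not the authors' implementation: it assumes one hidden layer, a bias that is subtracted from the net input (which is why the bias update carries a minus sign), and a learning rate `eta` of 0.5. The names `layer` and `train_step` and all starting weights are hypothetical.

```python
import math


def sigmoid(net):
    return 1.0 / (1.0 + math.exp(-net))


def layer(weights, biases, inputs):
    """a_i = f(sum_j w_ij * a_j - b_i) for each unit i."""
    return [sigmoid(sum(w * a for w, a in zip(row, inputs)) - b)
            for row, b in zip(weights, biases)]


def train_step(w_hid, b_hid, w_out, b_out, inputs, target, eta=0.5):
    """Apply one in-place update; return the outputs computed before it."""
    hidden = layer(w_hid, b_hid, inputs)
    output = layer(w_out, b_out, hidden)

    # Output deltas: (1 - a_j) * a_j * (t_j - a_j)
    d_out = [(1 - a) * a * (t - a) for a, t in zip(output, target)]
    # Hidden deltas: (1 - a_j) * a_j * sum_k delta_k * w_kj (pre-update weights)
    d_hid = [
        (1 - a_j) * a_j * sum(d_out[k] * w_out[k][j] for k in range(len(w_out)))
        for j, a_j in enumerate(hidden)
    ]

    # Weight update: w_ij += eta * delta_i * a_j; bias update: b_i -= eta * delta_i
    for i, d in enumerate(d_out):
        for j, a_j in enumerate(hidden):
            w_out[i][j] += eta * d * a_j
        b_out[i] -= eta * d
    for i, d in enumerate(d_hid):
        for j, a_j in enumerate(inputs):
            w_hid[i][j] += eta * d * a_j
        b_hid[i] -= eta * d
    return output


# Made-up starting weights for a 2-2-2 network and a single training pattern.
w_hid, b_hid = [[0.1, -0.2], [0.3, 0.4]], [0.0, 0.0]
w_out, b_out = [[0.2, -0.1], [-0.3, 0.1]], [0.0, 0.0]
inputs, target = [1.0, 0.0], [0.0, 1.0]

first = train_step(w_hid, b_hid, w_out, b_out, inputs, target)
for _ in range(200):
    last = train_step(w_hid, b_hid, w_out, b_out, inputs, target)


def squared_error(out):
    return sum((t - a) ** 2 for a, t in zip(out, target))


print(squared_error(first), squared_error(last))
```

Repeating the update on one pattern should drive the squared error down, which gives a quick sanity check on the signs of the weight and bias updates.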
Rejection
• It is better to reject a digit than to classify it incorrectly.
• To avoid rejection, the results must meet two criteria:
  • The highest confidence value must be greater than 87.5%.
  • This value must be at least 20% more than the second-highest value.
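The two criteria can be sketched as a small decision function. The slide does not say whether "20% more" is an absolute or a relative margin; this sketch assumes an absolute difference of 0.20 between output activations. The function name `classify_or_reject` is hypothetical.

```python
def classify_or_reject(outputs, min_confidence=0.875, min_margin=0.20):
    """Return the winning digit, or None if the result should be rejected.

    `outputs` holds one activation per digit class (index = digit).
    Assumption: the 20% margin is an absolute difference in activation.
    """
    ranked = sorted(range(len(outputs)), key=lambda d: outputs[d], reverse=True)
    best, runner_up = ranked[0], ranked[1]
    if outputs[best] <= min_confidence:
        return None  # not confident enough
    if outputs[best] - outputs[runner_up] < min_margin:
        return None  # too close to the second-highest value
    return best


clear = [0.01] * 10
clear[8] = 0.99
print(classify_or_reject(clear))      # a confident 8

ambiguous = list(clear)
ambiguous[3] = 0.85                   # a close second choice
print(classify_or_reject(ambiguous))  # rejected
```

Returning `None` for a rejection keeps the caller from confusing a rejected input with the digit 0.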
Trials & Results
• Edge Detection: total failure
  • Edge maps looked nice, but weren't useful
• Bad news: The networks trained over the edge maps were horribly inaccurate
• Good news: The network trained over the simple scaled image proved nearly as accurate
More Trials & Results
• Different scaled image sizes:
  • 16x16
  • 8x8
  • 4x4
• Goal: find the best performance, possibly by taking a vote from more than one network.
• Results: the 16x16 and 8x8 networks are too large and slow to train and use effectively. The 4x4 network is better than roughly 90% accurate, with a rejection rate under 15%.
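The slides do not describe how a vote between networks would work. One possible scheme, shown here purely as an assumption, is a strict majority vote in which each network either names a digit or abstains by returning `None`. The function name `vote` is hypothetical.

```python
from collections import Counter


def vote(predictions):
    """Return the digit chosen by a strict majority of networks, else None.

    `predictions` holds one entry per network: a digit, or None if that
    network rejected the input. Abstentions count against the majority.
    """
    votes = Counter(p for p in predictions if p is not None)
    if not votes:
        return None
    digit, count = votes.most_common(1)[0]
    if count <= len(predictions) / 2:
        return None  # no strict majority, so reject
    return digit


print(vote([8, 8, 3]))     # two of three networks agree
print(vote([8, 3, None]))  # no majority
```

Counting abstentions against the majority keeps the combined classifier conservative, in line with the preference for rejecting a digit over misclassifying it.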
References
• [1] T. Breuel. “Segmentation of Handprinted Letter Strings using a Dynamic Programming Algorithm.” Presented at the Sixth International Conference on Document Analysis and Recognition (ICDAR ’01), September 2001.
• [2] L. Fontaine, L. Shastri. “Handprinted Digit Recognition Using Spatiotemporal Connectionist Models.” Technical Report MS-CIS-92-24, University of Pennsylvania, March 1992.
• [3] D. C. Alvarez, F. M. Rodriguez, X. F. Hermida. “Printed and Handwritten Digits Recognition Using Neural Networks.” Original publication source unknown; paper available at http://wgpi.tsc.uvigo.es/pub/papers/icsp98_1.pdf
• [4] Y. Le Cun. “A Theoretical Framework for Back-Propagation.” Proceedings of the 1988 Connectionist Models Summer School, 21-28.
• [5] Y. Le Cun, B. Boser, J. S. Denker, D. Henderson, R. E. Howard, W. Hubbard, L. D. Jackel. “Handwritten Digit Recognition with a Back-Propagation Neural Network.” Advances in Neural Information Processing Systems, Vol. 2, 598-605. Morgan Kaufmann, 1990.
• [6] O. Matan, C. J. C. Burges, Y. Le Cun, J. S. Denker. “Multi-Digit Recognition Using A Space Displacement Neural Network.” Advances in Neural Information Processing Systems, Vol. 4, 488-495. Morgan Kaufmann, 1992.
• [7] G. Velasquez. “A Distributed Approach to a Neural Network Simulation Program.” Master's thesis, The University of Texas at El Paso, El Paso, TX, 1998.