Divya Raj, Rashmi Mishra, Kavita Sheth, Anusha Buchireddygari, Poonam Ekhelikar. Writer identification in offline handwriting. UCML 2013
Overview
• Problem definition
• Real-time application
• Classes of handwriting recognition
• Data and representation
• Features
• Methodology/Algorithms
  • KNN
  • SVM
  • AdaBoost
• Results
• Questions?
Problem definition
Determine who wrote a sample handwritten document, given a set of writers.
Problem definition
Handwriting varies with factors such as age, handedness, mood, writing speed, and mode of writing.
• Intra-writer variation: variation in the handwriting of the same individual.
• Inter-writer variation: variation between the handwriting of two different people.
Classes of handwriting recognition
• Offline
  • Data is extracted from scanned images of sources such as paper or photographs
  • Preprocessing is performed to obtain features
• Online
  • Text is written on a digitizer
  • Information such as writing speed and direction of the brush stroke is available
Data and representation
• Source: IAM Handwriting Database
• Data: 25 writers with 52 instances each
• Classification is binary, so samples from 2 writers at a time are used for training
• Feature vector: 104 x 3 (104 rows = 2 writers x 52 instances)
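The pairwise setup described on this slide can be sketched as follows. This is only an illustration on synthetic placeholder data: the function name `pairwise_dataset`, the dictionary layout, and the two-value feature tuples are assumptions, not the authors' code or the IAM file format.

```python
from itertools import combinations


def pairwise_dataset(samples, writer_a, writer_b):
    """Build a two-class dataset from the samples of two writers.

    samples: dict mapping a writer id to a list of feature tuples
             (hypothetical layout, chosen for this sketch).
    Returns (X, y), with label 0 for writer_a and label 1 for writer_b.
    """
    X, y = [], []
    for label, writer in ((0, writer_a), (1, writer_b)):
        for features in samples[writer]:
            X.append(tuple(features))
            y.append(label)
    return X, y


# Synthetic stand-in for the data: 25 writers, 52 instances each.
samples = {w: [(float(w), float(i)) for i in range(52)] for w in range(25)}

X, y = pairwise_dataset(samples, 3, 7)
writer_pairs = list(combinations(sorted(samples), 2))

print(len(X))             # rows in one two-writer dataset
print(len(writer_pairs))  # number of distinct writer pairs
```

With 52 instances per writer, each two-writer dataset has 104 rows, which matches the row count quoted on the slide.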
Features
Our data is segmented and preprocessed.
Features: 5 features
• Aspect ratio (width, height)
• Gray scale
• Centroid of word
• Number of connected components
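Three of the listed features can be computed from a binarized word image roughly as below. This is a minimal sketch under stated assumptions: the image is a list of rows of 0/1 values with 1 for ink, the aspect ratio is taken as bounding-box width divided by height, and components use 8-connectivity. None of these choices is confirmed by the slides, and the gray-scale feature is omitted because its definition is not given.

```python
from collections import deque


def bounding_box(img):
    """Return (top, bottom, left, right) of the ink pixels."""
    rows = [r for r, row in enumerate(img) if any(row)]
    cols = [c for c in range(len(img[0])) if any(row[c] for row in img)]
    return min(rows), max(rows), min(cols), max(cols)


def aspect_ratio(img):
    """Bounding-box width divided by height (assumed definition)."""
    top, bottom, left, right = bounding_box(img)
    return (right - left + 1) / (bottom - top + 1)


def centroid(img):
    """Mean (x, y) position of the ink pixels."""
    pts = [(r, c) for r, row in enumerate(img) for c, v in enumerate(row) if v]
    n = len(pts)
    return sum(c for _, c in pts) / n, sum(r for r, _ in pts) / n


def count_components(img):
    """Count 8-connected groups of ink pixels with a breadth-first search."""
    h, w = len(img), len(img[0])
    seen = [[False] * w for _ in range(h)]
    count = 0
    for r in range(h):
        for c in range(w):
            if not img[r][c] or seen[r][c]:
                continue
            count += 1
            seen[r][c] = True
            queue = deque([(r, c)])
            while queue:
                cr, cc = queue.popleft()
                for dr in (-1, 0, 1):
                    for dc in (-1, 0, 1):
                        nr, nc = cr + dr, cc + dc
                        if (0 <= nr < h and 0 <= nc < w
                                and img[nr][nc] and not seen[nr][nc]):
                            seen[nr][nc] = True
                            queue.append((nr, nc))
    return count


# Toy "word": a 2x2 blob and a separate vertical stroke.
word = [
    [0, 0, 0, 0, 0, 0],
    [0, 1, 1, 0, 0, 1],
    [0, 1, 1, 0, 0, 1],
    [0, 0, 0, 0, 0, 0],
]
print(aspect_ratio(word), centroid(word), count_components(word))
```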
Algorithms KNN
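A k-nearest-neighbour classifier of the kind named on this slide can be sketched in a few lines. The slides do not state the value of k or the distance measure, so k = 3 and Euclidean distance are assumptions here, and the training points are made-up centroid-like values for two hypothetical writers "A" and "B".

```python
import math
from collections import Counter


def knn_predict(train_X, train_y, query, k=3):
    """Label a query point by majority vote among its k nearest neighbours."""
    # Pair every training point's distance to the query with its label.
    distances = sorted(
        (math.dist(x, query), label) for x, label in zip(train_X, train_y)
    )
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]


# Made-up (x, y) centroid features for two writers.
train_X = [(1.0, 1.0), (1.2, 0.9), (0.8, 1.1),
           (3.0, 3.2), (3.1, 2.9), (2.9, 3.0)]
train_y = ["A", "A", "A", "B", "B", "B"]

print(knn_predict(train_X, train_y, (1.1, 1.0)))
print(knn_predict(train_X, train_y, (3.0, 3.0)))
```

Using an odd k with two classes avoids tied votes, which suits the two-writers-at-a-time setup.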
AdaBoost (boosting algorithm)
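The boosting idea can be illustrated with a small discrete AdaBoost that uses one-feature threshold stumps as weak learners. This is a sketch, not the MATLAB library the authors used: the choice of stumps, the labels in {-1, +1}, and the toy data are all assumptions made for the example.

```python
import math


def train_stump(X, y, w):
    """Find the threshold stump with the lowest weighted error.

    Returns (error, feature_index, threshold, sign); the stump predicts
    `sign` when x[feature_index] <= threshold and `-sign` otherwise.
    """
    best = None
    for j in range(len(X[0])):
        for thr in sorted({x[j] for x in X}):
            for sign in (1, -1):
                preds = [sign if x[j] <= thr else -sign for x in X]
                err = sum(wi for wi, p, yi in zip(w, preds, y) if p != yi)
                if best is None or err < best[0]:
                    best = (err, j, thr, sign)
    return best


def stump_predict(stump, x):
    _, j, thr, sign = stump
    return sign if x[j] <= thr else -sign


def adaboost(X, y, rounds=20):
    """Train an ensemble of weighted stumps; labels must be -1 or +1."""
    n = len(X)
    w = [1.0 / n] * n
    ensemble = []
    for _ in range(rounds):
        stump = train_stump(X, y, w)
        err = max(stump[0], 1e-10)  # avoid division by zero on a perfect stump
        if err >= 0.5:              # no weak learner better than chance
            break
        alpha = 0.5 * math.log((1 - err) / err)
        ensemble.append((alpha, stump))
        # Increase the weight of misclassified samples, then renormalize.
        w = [wi * math.exp(-alpha * yi * stump_predict(stump, xi))
             for wi, xi, yi in zip(w, X, y)]
        total = sum(w)
        w = [wi / total for wi in w]
    return ensemble


def predict(ensemble, x):
    score = sum(alpha * stump_predict(stump, x) for alpha, stump in ensemble)
    return 1 if score >= 0 else -1


# Toy one-feature data that no single stump can separate.
X = [(1,), (2,), (3,), (4,), (5,), (6,)]
y = [1, 1, -1, -1, 1, 1]
model = adaboost(X, y, rounds=20)
print([predict(model, x) for x in X])
```

The default of 20 rounds mirrors the iteration count quoted in the results; each round reweights the training samples so that the next stump concentrates on the ones still misclassified.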
Results: KNN
25 writers, 52 instances each; two writers at a time.
Features used: centroid {x, y coordinates}
• Training on 80% of instances: test accuracy 66%
• Training on 75% of instances: test accuracy 74%
• Training on 70% of instances: test accuracy 81%
Features used: aspect ratio
• Training on 75% of instances: test accuracy 70%
Results: SVM classifier
25 writers, 52 instances each.
Features used: centroid {x, y coordinates}; 20 iterations
• Training on 80% of instances: accuracy 66.67%
• Training on 75% of instances: accuracy 75%
Features used: aspect ratio {width, height}
• Training on 75% of instances: accuracy 75%
Results: AdaBoost
Features used: aspect ratio {width, height}
After 20 iterations the error on the test data is approximately 22%, which corresponds to about 78% accuracy.
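The results above are reported as test accuracy for a given training fraction. A holdout evaluation of that kind might look like the sketch below. The random shuffling, the fixed seed, and the rounding of the split point are assumptions, since the slides do not say how the training instances were selected.

```python
import random


def holdout_split(X, y, train_fraction, seed=0):
    """Shuffle the sample indices and split them into train and test parts."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    cut = round(train_fraction * len(idx))
    train, test = idx[:cut], idx[cut:]
    return ([X[i] for i in train], [y[i] for i in train],
            [X[i] for i in test], [y[i] for i in test])


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


# Placeholder two-writer dataset: 104 samples, 52 per writer.
X = list(range(104))
y = [0] * 52 + [1] * 52

for fraction in (0.80, 0.75, 0.70):
    X_tr, y_tr, X_te, y_te = holdout_split(X, y, fraction)
    print(fraction, len(X_tr), len(X_te))

# Accuracy is one minus the error rate, e.g. a 22% test error.
accuracy_from_error = 1 - 0.22
print(accuracy_from_error)
```

With only 104 samples per writer pair, the test sets are small (roughly 21 to 31 samples), so individual accuracy figures can shift noticeably with the particular split.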
THANK YOU Questions?