Real-time Computer Vision with Scanning N-Tuple Grids. Simon Lucas, Computer Science Dept.
Outline • Background: N-Tuple Classifiers • The scanning n-tuple grid • Isolated Character Recognition • Isolated Face Recognition • Convolutional Mode OCR • Real time vision demo • Conclusions
N-Tuple Classifiers • Work by randomly sampling the input space • First applied to binary images • Very fast, with reasonable accuracy • Scanning N-Tuple (SNT) classifier (Lucas, 1995): applied to sequence recognition; fast and accurate • Current work: the SNT Grid, specially developed for convolutional (sliding window) applications; recognises patterns independent of location
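To make "randomly sampling input space" concrete, here is a minimal sketch of a plain n-tuple classifier for binary images. It is not taken from the talk: the class name, the parameters, and the add-one smoothed log-likelihood scoring are all assumptions, chosen as one common probabilistic variant of the technique.

```java
import java.util.Random;

// Minimal n-tuple classifier sketch. All names and parameters are hypothetical.
final class NTupleClassifier {
    private final int n;            // pixels sampled per tuple
    private final int[][] tuples;   // tuples[t][i]: index of the i-th pixel sampled by tuple t
    private final int[][][] counts; // counts[c][t][a]: times class c produced address a on tuple t
    private final int[] seen;       // number of training patterns seen per class

    NTupleClassifier(int numClasses, int inputSize, int numTuples, int n, long seed) {
        this.n = n;
        Random rng = new Random(seed);
        tuples = new int[numTuples][n];
        for (int[] tuple : tuples) {
            for (int i = 0; i < n; i++) {
                tuple[i] = rng.nextInt(inputSize); // random sample of the input space
            }
        }
        counts = new int[numClasses][numTuples][1 << n];
        seen = new int[numClasses];
    }

    // Read the pixels sampled by tuple t as an n-bit binary number.
    private int address(boolean[] image, int t) {
        int addr = 0;
        for (int index : tuples[t]) {
            addr = (addr << 1) | (image[index] ? 1 : 0);
        }
        return addr;
    }

    // Training is a single pass: increment one table entry per tuple.
    void train(boolean[] image, int label) {
        for (int t = 0; t < tuples.length; t++) {
            counts[label][t][address(image, t)]++;
        }
        seen[label]++;
    }

    // Return the class with the highest summed log-likelihood
    // (add-one smoothing is an assumption of this sketch).
    int classify(boolean[] image) {
        int best = 0;
        double bestScore = Double.NEGATIVE_INFINITY;
        for (int c = 0; c < counts.length; c++) {
            double score = 0.0;
            for (int t = 0; t < tuples.length; t++) {
                int hits = counts[c][t][address(image, t)];
                score += Math.log((hits + 1.0) / (seen[c] + (double) (1 << n)));
            }
            if (score > bestScore) {
                bestScore = score;
                best = c;
            }
        }
        return best;
    }
}
```

Both training and recognition reduce to table lookups, which is consistent with the "very fast" claim on the slide.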
SNT-Grid System Architecture • Binarise (e.g. Niblack) • Scanning Index (SNT-Grid) • Likelihood Images • Integrated Likelihoods • Further Processing (e.g. Dictionary or Language Model)
Simple Operation • Slide grid over image • Interpret each position as binary number
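The operation on this slide can be sketched directly. This is a naive version under stated assumptions: the grid is taken to be `points` x `points` sample pixels spaced `gap` apart, and the class and method names are hypothetical.

```java
// Naive sliding-grid scan. The grid layout (points x points samples, spaced
// `gap` pixels apart) and all names are assumptions made for illustration.
final class NaiveGridScan {

    // Read the grid whose top-left sample is at (x, y) as a binary number,
    // in row-major order with the first sample as the most significant bit.
    // Requires points * points <= 31 so the address fits in an int.
    static int address(boolean[][] img, int x, int y, int points, int gap) {
        int addr = 0;
        for (int r = 0; r < points; r++) {
            for (int c = 0; c < points; c++) {
                addr = (addr << 1) | (img[y + r * gap][x + c * gap] ? 1 : 0);
            }
        }
        return addr;
    }

    // Slide the grid over every position where it fits entirely inside the image.
    static int[][] scan(boolean[][] img, int points, int gap) {
        int span = (points - 1) * gap;
        int outH = img.length - span;
        int outW = img[0].length - span;
        int[][] out = new int[outH][outW];
        for (int y = 0; y < outH; y++) {
            for (int x = 0; x < outW; x++) {
                out[y][x] = address(img, x, y, points, gap);
            }
        }
        return out;
    }
}
```

Each grid position costs `points * points` pixel reads here.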
Efficient Implementation • Very simple idea • Decompose one 2-d scan into two 1-d scans! • Reduces time complexity • Suppose image is n x n and window is m x m • Reduce from O(n^2 m^2) to O(n^2) • Well worth the effort!
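One way to picture the decomposition is a horizontal pass that encodes each row of sample points, followed by a vertical pass that concatenates those row codes. The sketch below is an illustration only: the grid layout (`points` x `points` samples spaced `gap` apart) and the names are assumptions, and it may differ from the exact decomposition used in the talk.

```java
// Two 1-d passes instead of one 2-d scan. Grid layout and names are
// assumptions; this may not match the talk's exact decomposition.
final class DecomposedGridScan {

    static int[][] scan(boolean[][] img, int points, int gap) {
        int height = img.length;
        int width = img[0].length;
        int span = (points - 1) * gap;
        int outH = height - span;
        int outW = width - span;

        // Pass 1 (horizontal): rowCode[y][x] encodes the `points` samples
        // on row y starting at column x, first sample most significant.
        int[][] rowCode = new int[height][outW];
        for (int y = 0; y < height; y++) {
            for (int x = 0; x < outW; x++) {
                int code = 0;
                for (int c = 0; c < points; c++) {
                    code = (code << 1) | (img[y][x + c * gap] ? 1 : 0);
                }
                rowCode[y][x] = code;
            }
        }

        // Pass 2 (vertical): concatenate the row codes of the `points`
        // sampled rows to obtain the full grid address.
        int[][] out = new int[outH][outW];
        for (int y = 0; y < outH; y++) {
            for (int x = 0; x < outW; x++) {
                int addr = 0;
                for (int r = 0; r < points; r++) {
                    addr = (addr << points) | rowCode[y + r * gap][x];
                }
                out[y][x] = addr;
            }
        }
        return out;
    }
}
```

In this sketch each position costs about `2 * points` operations rather than `points * points`, because every row code is computed once and then shared by all grid positions that include that row.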
SNT-Grid Speed on MNIST • Java Implementation • Chars are 28 x 28 grey level images • Training (60,000 chars) • 8s (> 7,000 cps) • Testing (10,000 chars) • 3.8s (> 2,600 cps)
ORL Face Data • 40 subjects • 10 images from each • Using 5 for training, 5 for testing • Average around 97.5% accuracy • Competitive with other methods • Much faster!
Museum Archive Cards • Hard to read with conventional OCR
‘2’ Detector: Integrated Output (uses the integral array of Viola and Jones)
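The integral array (summed-area table) lets the likelihoods inside any rectangle be summed in constant time. A minimal sketch follows, assuming the likelihood image is a `double[][]`; the class and method names are hypothetical and not taken from the talk.

```java
// Summed-area table over a likelihood image. Names are hypothetical.
final class IntegralImage {
    // sum[y][x] holds the total of all values in rows < y and columns < x.
    private final double[][] sum;

    IntegralImage(double[][] values) {
        int h = values.length;
        int w = values[0].length;
        sum = new double[h + 1][w + 1];
        for (int y = 0; y < h; y++) {
            for (int x = 0; x < w; x++) {
                sum[y + 1][x + 1] = values[y][x]
                        + sum[y][x + 1] + sum[y + 1][x] - sum[y][x];
            }
        }
    }

    // Sum over the rectangle with x0 <= x < x1 and y0 <= y < y1,
    // using four table lookups regardless of the rectangle's size.
    double boxSum(int x0, int y0, int x1, int y1) {
        return sum[y1][x1] - sum[y0][x1] - sum[y1][x0] + sum[y0][x0];
    }
}
```

Building the table takes one pass over the image, after which a detector window of any size can be scored at every position with a fixed number of lookups.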
Real-time Demo • Very efficient • Can use it for real-time expression recognition • Or a ‘video’ joystick! • A bit like EyeToy, but potentially more sophisticated
Conclusions • Basis of simple and efficient computer vision • Trick is the scan decomposition • Also use of integral image to accumulate likelihoods • Currently being applied to reading text in natural scenes • Many other applications also • Further reading: ICDAR 2005 Paper (on my web page)