A Laplacian Method for Video Text Detection • Trung Quy Phan, Palaiahnakote Shivakumara and Chew Lim Tan
Agenda • Introduction • Previous Methods • Laplacian Method • Experimental Results • Conclusion and Future Work
Introduction • Motivation: video indexing • Generates keywords from text • Able to retrieve a particular event or image • Graphic & scene text • Different from camera-based images • Low resolution • Complex background • Text movement & distortion
Previous Methods • Connected component-based • Assumes text pixels have the same colors or grayscale intensities • Edge-based • Works well for high contrast text • Produces false positives for complex backgrounds • Texture-based • Trainable • Computationally expensive
Laplacian Method • Step 1: Text Detection • Identifies candidate text regions • Step 2: Boundary Refinement • Refines the text block boundaries • Step 3: False Positive Elimination
Agenda • Introduction • Previous Methods • Laplacian Method • Text Detection • Boundary Refinement • False Positive Elimination • Experimental Results • Conclusion and Future Work
Text Detection • Text regions have a large number of discontinuities • The input image is converted to grayscale and then Laplacian-filtered to detect the discontinuities in four directions
Text Detection • Text regions typically have many positive and negative peaks of large magnitudes
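A minimal sketch of the filtering step, to show where the positive and negative peaks come from. The slides do not give the mask, so the 3 × 3 mask below (8 at the centre, -1 elsewhere) is an assumption, as are the names `LAPLACIAN_MASK` and `laplacian_filter` and the choice to leave border pixels at zero.

```python
# Assumed 3x3 Laplacian mask: it responds to intensity changes along the
# horizontal, vertical and both diagonal directions. Not taken from the slides.
LAPLACIAN_MASK = [
    [-1, -1, -1],
    [-1,  8, -1],
    [-1, -1, -1],
]


def laplacian_filter(gray):
    """Filter a grayscale image given as a list of rows; border pixels stay 0."""
    height, width = len(gray), len(gray[0])
    out = [[0] * width for _ in range(height)]
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            out[y][x] = sum(
                LAPLACIAN_MASK[dy + 1][dx + 1] * gray[y + dy][x + dx]
                for dy in (-1, 0, 1)
                for dx in (-1, 0, 1)
            )
    return out


# Toy 5x7 image: one bright vertical stroke (200) on a dark background (10).
image = [[200 if x == 3 else 10 for x in range(7)] for _ in range(5)]
filtered = laplacian_filter(image)
print(filtered[2])  # [0, 0, -570, 1140, -570, 0, 0]
```

A single stroke already yields a positive peak flanked by two negative peaks; a line of text would produce many such peaks side by side.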
Text Detection • Maximum gradient difference (MGD) [2] • For each 1 × N window, MGD is the difference between the maximum and minimum values • Text regions have larger MGD values because of the peaks of large magnitudes
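The MGD computation can be sketched as follows. The slides only say "for each 1 × N window", so centring the window on each pixel and clipping it at the row ends are assumptions, and `mgd_map` is a hypothetical name. N = 5 is the value reported in the experiments.

```python
def mgd_map(filtered, n=5):
    """Compute max - min over a 1 x n window centred on each pixel.

    Windows are clipped at the ends of each row (an assumption).
    """
    half = n // 2
    result = []
    for row in filtered:
        out_row = []
        for x in range(len(row)):
            window = row[max(0, x - half): x + half + 1]
            out_row.append(max(window) - min(window))
        result.append(out_row)
    return result


# Row 0 imitates Laplacian output over text (large alternating peaks);
# row 1 imitates a smooth background (small fluctuations).
text_like = [0, -570, 1140, -570, 0, -570, 1140, -570, 0]
background = [0, 3, -2, 1, 0, -1, 2, 0, 1]
mgd = mgd_map([text_like, background], n=5)
print(mgd[0])
print(mgd[1])
```

Every value in the text-like row is far larger than any value in the background row, which is the separation the clustering step relies on.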
Text Detection • K-means clustering on the MGD map • K = 2, Euclidean distance
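A minimal sketch of two-cluster K-means on scalar MGD values. Seeding the centroids at the minimum and maximum value, the iteration cap, and treating the cluster with the larger mean as candidate text are all assumptions; the slides state only K = 2 and Euclidean distance. `kmeans_two_clusters` is a hypothetical name.

```python
def kmeans_two_clusters(values, iterations=20):
    """Split scalar values into two clusters; returns (labels, centroids).

    Label 1 marks the cluster with the larger centroid (assumed to be text).
    """
    low, high = float(min(values)), float(max(values))
    labels = [0] * len(values)
    for _ in range(iterations):
        # Assignment: Euclidean distance in one dimension is |v - centroid|.
        labels = [1 if abs(v - high) < abs(v - low) else 0 for v in values]
        group0 = [v for v, lab in zip(values, labels) if lab == 0]
        group1 = [v for v, lab in zip(values, labels) if lab == 1]
        if not group0 or not group1:
            break  # all values fell into one cluster
        new_low = sum(group0) / len(group0)
        new_high = sum(group1) / len(group1)
        if (new_low, new_high) == (low, high):
            break  # converged
        low, high = new_low, new_high
    return labels, (low, high)


mgd_values = [5, 3, 1710, 1650, 4, 1700, 2, 6]
labels, centroids = kmeans_two_clusters(mgd_values)
print(labels)  # [0, 0, 1, 1, 0, 1, 0, 0]
```

On a real MGD map the same function would be applied to the flattened pixel values, and the labels reshaped back into a binary mask of candidate text pixels.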
Boundary Refinement • Binary Sobel edge map SM of the input image (only for text regions) • Horizontal and vertical projection profiles
Boundary Refinement • Horizontal projection profile • Vertical projection profile
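The projection profiles can be sketched as below, assuming a binary edge map stored as rows of 0/1 values. The slides do not say how the profiles are turned into boundaries, so `runs_above` and its threshold of 1 are purely illustrative assumptions.

```python
def projection_profiles(edge_map):
    """Return (horizontal, vertical) profiles of a binary edge map.

    horizontal[y] counts edge pixels in row y; vertical[x] in column x.
    """
    horizontal = [sum(row) for row in edge_map]
    vertical = [sum(col) for col in zip(*edge_map)]
    return horizontal, vertical


def runs_above(profile, threshold):
    """Return inclusive (start, end) index pairs where profile >= threshold."""
    runs, start = [], None
    for i, value in enumerate(profile):
        if value >= threshold and start is None:
            start = i
        elif value < threshold and start is not None:
            runs.append((start, i - 1))
            start = None
    if start is not None:
        runs.append((start, len(profile) - 1))
    return runs


edge_map = [
    [0, 0, 0, 0, 0, 0],
    [0, 1, 1, 0, 1, 0],
    [0, 1, 0, 1, 1, 0],
    [0, 0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0, 0],
]
horizontal, vertical = projection_profiles(edge_map)
print(runs_above(horizontal, 1))  # [(1, 2), (4, 4)]
print(runs_above(vertical, 1))    # [(1, 4)]
```

Here the empty row 3 splits the region into two text lines (rows 1 to 2 and row 4), while the vertical profile trims the empty columns on either side.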
False Positive Elimination • Text block: (1) aspect_ratio ≥ T1 and (2) edge_area / area ≥ T2 • edge_area = number of edge pixels • Otherwise, false positive • T1 = 0.5 and T2 = 0.1
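The two rules can be written as a short predicate. T1 and T2 are the values from the slide; taking aspect_ratio as width / height and area as width × height are assumptions, since the slide does not define them, and `is_text_block` is a hypothetical name.

```python
T1 = 0.5  # minimum aspect ratio (from the slide)
T2 = 0.1  # minimum edge density (from the slide)


def is_text_block(width, height, edge_area):
    """Return True if a block passes both rules, False if it is a false positive.

    aspect_ratio is assumed to be width / height, and area width * height.
    """
    aspect_ratio = width / height
    edge_density = edge_area / (width * height)
    return aspect_ratio >= T1 and edge_density >= T2


print(is_text_block(120, 20, 600))  # True: wide block with dense edges
print(is_text_block(10, 40, 100))   # False: fails the aspect ratio rule
print(is_text_block(100, 50, 200))  # False: fails the edge density rule
```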
False Positive Elimination • 2 false positives
False Positive Elimination • 1st false positive removed due to the aspect ratio rule
False Positive Elimination • Sobel edge map • 2nd false positive removed due to the edge density rule
False Positive Elimination • Final output
Experimental Results • 101 images: news, sports, movies, etc. • Sizes from 320 × 240 to 816 × 448 • English, Chinese and Korean text • Three implemented methods: edge-based method [1], gradient-based method [2] and uniform-colored method [3]
Agenda • Introduction • Previous Methods • Laplacian Method • Experimental Results • Sample Results • Performance Measures • Evaluation • Conclusion and Future Work
Sample Results • Low contrast text • Panels: Input, Edge-based, Gradient-based, Uniform-colored, Proposed
Sample Results • Scene text • Panels: Input (from [4]), Edge-based, Gradient-based, Uniform-colored, Proposed
Sample Results • The proposed method fails if the contrast is too low • Panels: Input, Proposed, Edge-based, Gradient-based, Uniform-colored
Sample Results • Different font sizes • Different languages (from [4])
Sample Results • Different window sizes • N = 5 in our experiment • Panels: Input, N = 5, N = 21
Performance Measures • Detection Rate (DR) = number of localized text / number of text • False Positive Rate (FPR) = number of non-text / number of localized blocks • Misdetection Rate (MDR) = number of text with missing characters / number of localized text • Example: DR = 100%, FPR = 25%, MDR = 67%
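The three measures can be computed as below. The counts in the example are hypothetical, chosen only because they reproduce the percentages quoted on the slide; the slide does not give the underlying counts. Treating the number of localized blocks as localized text plus non-text blocks is also an assumption, and `detection_measures` is a hypothetical name.

```python
def detection_measures(num_text, num_localized_text, num_non_text, num_missing_chars):
    """Return (DR, FPR, MDR) as fractions in [0, 1].

    Assumes localized blocks = localized text blocks + non-text blocks.
    """
    num_localized_blocks = num_localized_text + num_non_text
    dr = num_localized_text / num_text
    fpr = num_non_text / num_localized_blocks
    mdr = num_missing_chars / num_localized_text
    return dr, fpr, mdr


# Hypothetical counts: 3 text blocks in the image, all 3 localized,
# 1 non-text block reported, 2 localized blocks with missing characters.
dr, fpr, mdr = detection_measures(3, 3, 1, 2)
print(f"DR = {dr:.0%}, FPR = {fpr:.0%}, MDR = {mdr:.0%}")
# DR = 100%, FPR = 25%, MDR = 67%
```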
Evaluation • The proposed method outperforms the edge-based and uniform-colored methods in all performance measures
Evaluation • Compared to the gradient-based method, the proposed method has a slightly worse MDR but a significantly higher DR
Conclusion and Future Work • The proposed method performs well on the dataset • Gradient information → candidate text regions • Edge information → localized text blocks • May fail if the contrast is too low • Can be extended for non-horizontal text
References • [1] C. Liu, C. Wang and R. Dai, “Text Detection in Images Based on Unsupervised Classification of Edge-based Features”, ICDAR 2005, pp. 610-614. • [2] E. K. Wong and M. Chen, “A new robust algorithm for video text extraction”, Pattern Recognition 36, 2003, pp. 1397-1406. • [3] V. Y. Mariano and R. Kasturi, “Locating Uniform-Colored Text in Video Frames”, 15th ICPR, Volume 4, 2000, pp. 539-542. • [4] X. S. Hua, W. Liu and H. J. Zhang, “Automatic Performance Evaluation for Video Text Detection”, ICDAR 2001, pp. 545-550.