A Laplacian Method for Video Text Detection

A Laplacian Method for Video Text Detection • Trung Quy Phan, Palaiahnakote Shivakumara and Chew Lim Tan

Agenda • Introduction • Previous Methods • Laplacian Method • Experimental Results • Conclusion and Future Work

Introduction • Motivation: video indexing • Generates keywords from text • Able to retrieve a particular event or image • Graphic & scene text • Different from camera-based images • Low resolution • Complex background • Text movement & distortion

Previous Methods • Connected component-based • Assumes text pixels have the same colors or grayscale intensities • Edge-based • Works well for high contrast text • Produces false positives for complex backgrounds • Texture-based • Trainable • Computationally expensive

Laplacian Method • Step 1: Text Detection • Identifies candidate text regions • Step 2: Boundary Refinement • Refines the text block boundaries • Step 3: False Positive Elimination

Agenda • Introduction • Previous Methods • Laplacian Method • Text Detection • Boundary Refinement • False Positive Elimination • Experimental Results • Conclusion and Future Work

Text Detection • Text regions have a large number of discontinuities • Input → grayscale → Laplacian-filtered to detect the discontinuities in four directions
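The filtering step can be sketched in a few lines of plain Python. The slide does not give the mask, so the 3 × 3 kernel below (a common 8-neighbour Laplacian that responds to horizontal, vertical and both diagonal discontinuities) is an assumption, as are the names `KERNEL` and `laplacian_filter` and the choice to leave border pixels at 0.

```python
# Assumed 3x3 Laplacian mask (not given on the slide): it sums second-order
# differences along the horizontal, vertical and both diagonal directions.
KERNEL = [[1, 1, 1],
          [1, -8, 1],
          [1, 1, 1]]


def laplacian_filter(gray):
    """Return the Laplacian response of a grayscale image (list of rows).

    Border pixels are left at 0 because the 3x3 mask does not fit there.
    """
    rows, cols = len(gray), len(gray[0])
    out = [[0] * cols for _ in range(rows)]
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            out[r][c] = sum(
                KERNEL[dr + 1][dc + 1] * gray[r + dr][c + dc]
                for dr in (-1, 0, 1)
                for dc in (-1, 0, 1)
            )
    return out


# A single bright pixel on a dark background yields a strong negative peak
# at the pixel and positive responses around it.
image = [[0] * 5 for _ in range(5)]
image[2][2] = 10
response = laplacian_filter(image)
```

A flat region gives a zero response, which is why the filtered image highlights only the discontinuities.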

Text Detection • Text regions typically have many positive and negative peaks of large magnitudes

Text Detection • Maximum gradient difference (MGD) [2] • For each 1 × N window, MGD is the difference between the maximum and minimum values in the window • Text regions have larger MGD values because of the peaks of large magnitudes
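As a rough illustration of the MGD definition above, the sketch below slides a 1 × N window along each row of the Laplacian-filtered image. The function name `mgd_map`, the centring of the window on each pixel and the clipping at the image borders are assumptions; the slides only define the max-minus-min computation.

```python
def mgd_map(laplacian, n=5):
    """Maximum gradient difference for a 1 x n window centred on each pixel.

    Windows are clipped at the left and right image borders (an assumption).
    """
    half = n // 2
    result = []
    for row in laplacian:
        out_row = []
        for c in range(len(row)):
            window = row[max(0, c - half):c + half + 1]
            out_row.append(max(window) - min(window))
        result.append(out_row)
    return result


# One row with a positive and a negative peak next to a flat area.
row = [[0, 5, -5, 0, 0, 0, 0]]
mgd = mgd_map(row, n=3)
```

Pixels near the pair of opposite peaks receive large MGD values, while the flat part of the row stays at 0.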

Text Detection • K-means clustering on the MGD map • K = 2, Euclidean distance
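A minimal sketch of the clustering step, assuming the MGD map has been flattened into a list of values. With K = 2 and scalar values, the Euclidean distance reduces to an absolute difference. The function name `kmeans_text_mask`, the initial centres (minimum and maximum value) and the iteration cap are assumptions; the cluster with the higher mean is treated as text because text regions have larger MGD values.

```python
def kmeans_text_mask(values, iterations=20):
    """Split 1-D MGD values into two clusters; True marks the higher-mean one."""
    low, high = float(min(values)), float(max(values))  # assumed initial centres
    labels = [False] * len(values)
    for _ in range(iterations):
        # Assignment step: in one dimension the Euclidean distance is |v - centre|.
        labels = [abs(v - high) < abs(v - low) for v in values]
        text = [v for v, is_text in zip(values, labels) if is_text]
        other = [v for v, is_text in zip(values, labels) if not is_text]
        if not text or not other:
            break  # one cluster is empty, nothing left to update
        new_low, new_high = sum(other) / len(other), sum(text) / len(text)
        if (new_low, new_high) == (low, high):
            break  # converged
        low, high = new_low, new_high
    return labels


mgd_values = [1, 2, 1, 50, 52, 49]
mask = kmeans_text_mask(mgd_values)
```

Because the clustering is unsupervised, no threshold on the MGD values has to be chosen by hand.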

Boundary Refinement • Binary Sobel edge map SM of the input image (only for text regions) • Horizontal and vertical projection profiles

Boundary Refinement • Horizontal • Vertical
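The projection-profile idea can be sketched as follows, assuming the Sobel edge map SM is already available as a binary grid (1 = edge pixel). The helper names `projection_profile` and `runs_above`, and the `min_count` threshold, are hypothetical; the slides do not state how the profiles are thresholded.

```python
def projection_profile(edge_map, axis):
    """Count edge pixels per row (axis='horizontal') or per column ('vertical')."""
    if axis == "horizontal":
        return [sum(row) for row in edge_map]
    return [sum(col) for col in zip(*edge_map)]


def runs_above(profile, min_count=1):
    """Return half-open (start, end) index ranges where profile >= min_count."""
    runs, start = [], None
    for i, value in enumerate(profile):
        if value >= min_count and start is None:
            start = i
        elif value < min_count and start is not None:
            runs.append((start, i))
            start = None
    if start is not None:
        runs.append((start, len(profile)))
    return runs


# Binary edge map (1 = edge pixel) containing two text lines.
sm = [
    [0, 0, 0, 0, 0, 0],
    [0, 1, 1, 0, 1, 0],
    [0, 1, 1, 0, 1, 0],
    [0, 0, 0, 0, 0, 0],
    [0, 0, 1, 1, 0, 0],
]
row_runs = runs_above(projection_profile(sm, "horizontal"))
col_runs = runs_above(projection_profile(sm[1:3], "vertical"))
```

Here the horizontal profile separates the two text lines, and the vertical profile of the first line then tightens its left and right boundaries.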

False Positive Elimination • Text block: (1) aspect_ratio ≥ T1 and (2) edge_area / area ≥ T2 • edge_area = number of edge pixels • Otherwise, false positive • T1 = 0.5 and T2 = 0.1
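The two rules translate directly into a small predicate. The thresholds come from the slide; the function name `is_text_block` is hypothetical, and defining the aspect ratio as width divided by height is an assumption, since the slide does not spell it out.

```python
T1 = 0.5  # minimum aspect ratio, from the slide
T2 = 0.1  # minimum edge density, from the slide


def is_text_block(width, height, edge_area, t1=T1, t2=T2):
    """Apply both rules; a block failing either one is a false positive."""
    aspect_ratio = width / height  # assumed definition: width over height
    area = width * height
    return aspect_ratio >= t1 and edge_area / area >= t2


wide_dense = is_text_block(100, 20, 400)   # ratio 5.0, density 0.2
tall_block = is_text_block(10, 40, 200)    # ratio 0.25 fails rule (1)
sparse_block = is_text_block(100, 20, 100)  # density 0.05 fails rule (2)
```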

False Positive Elimination • 2 false positives

False Positive Elimination • 1st false positive removed due to the aspect ratio rule

False Positive Elimination • Sobel edge map • 2nd false positive removed due to the edge density rule

False Positive Elimination • Final output

Experimental Results • 101 images: news, sports, movies, etc. • Sizes from 320 × 240 to 816 × 448 • English, Chinese and Korean text • Three existing methods implemented for comparison: edge-based method [1], gradient-based method [2] and uniform-colored method [3]

Agenda • Introduction • Previous Methods • Laplacian Method • Experimental Results • Sample Results • Performance Measures • Evaluation • Conclusion and Future Work

Sample Results • Low contrast text • Panels: Input, Edge-based, Gradient-based, Uniform-colored, Proposed

Sample Results • Scene text • Panels: Input (from [4]), Edge-based, Gradient-based, Uniform-colored, Proposed

Sample Results • The proposed method fails if the contrast is too low • Panels: Input, Proposed, Edge-based, Gradient-based, Uniform-colored

Sample Results (from [4]) • Different font sizes • Different languages

Sample Results • Different window sizes • N = 5 in our experiment • Panels: Input, N = 5, N = 21

Performance Measures • Detection Rate (DR) • number of localized text / number of text • False Positive Rate (FPR) • number of non-text / number of localized blocks • Misdetection Rate (MDR) • number of text with missing characters / number of localized text • Example: DR = 100%, FPR = 25%, MDR = 67%
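The three ratios can be computed as below. The function name `performance_measures` and the counts are hypothetical; the counts were chosen only because they reproduce the percentages quoted on the slide and are not taken from the experiments.

```python
def performance_measures(num_text, num_localized_text, num_localized_blocks,
                         num_non_text, num_missing_chars):
    """Return (DR, FPR, MDR) as fractions, following the slide's definitions."""
    dr = num_localized_text / num_text
    fpr = num_non_text / num_localized_blocks
    mdr = num_missing_chars / num_localized_text
    return dr, fpr, mdr


# Hypothetical counts: 3 text blocks, all localized; 4 localized blocks in
# total, 1 of them non-text; 2 localized text blocks with missing characters.
dr, fpr, mdr = performance_measures(3, 3, 4, 1, 2)
percentages = [round(100 * x) for x in (dr, fpr, mdr)]
```

Note that the measures use different denominators: FPR is relative to all localized blocks, while MDR is relative to the localized text blocks only.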

Evaluation

Evaluation • The proposed method outperforms the edge-based and gradient-based methods in all performance measures

Evaluation • Compared to the gradient-based method, the proposed method has a slightly worse MDR but a significantly higher DR

Conclusion and Future Work • The proposed method performs well on the dataset • Gradient information → candidate text regions • Edge information → localized text blocks • May fail if the contrast is too low • Can be extended to non-horizontal text

References • [1] C. Liu, C. Wang and R. Dai, “Text Detection in Images Based on Unsupervised Classification of Edge-based Features”, ICDAR, 2005, pp. 610-614. • [2] E. K. Wong and M. Chen, “A New Robust Algorithm for Video Text Extraction”, Pattern Recognition 36, 2003, pp. 1397-1406. • [3] V. Y. Mariano and R. Kasturi, “Locating Uniform-Colored Text in Video Frames”, 15th ICPR, Volume 4, 2000, pp. 539-542. • [4] X. S. Hua, W. Liu and H. J. Zhang, “Automatic Performance Evaluation for Video Text Detection”, ICDAR, 2001, pp. 545-550.

Thank You
