Enhancing Object Recognition with Scene Text Detection

Con-Text: Text Detection Using Background Connectivity for Fine-Grained Object Classification
Sezer Karaoglu, Jan van Gemert, Theo Gevers

Can we achieve better object recognition with the help of scene text?

Goal
• Exploit details hidden in scene text to improve visual classification of very similar instances.
• Applications: linking images from Google Street View to textual business information (e.g. the Yellow Pages), geo-referencing, information retrieval.
[Figure: example street-scene images; text and labels shown: DJ SUBS, Breakfast, Starbucks Coffee, SKY, CAR]

Challenges of Text Detection in Natural Scene Images
• Lighting
• Surface reflections
• Unknown background
• Non-planar objects
• Unknown text font
• Unknown text size
• Blur

Literature Review: Text Detection
• Texture based: Wang et al., “End-to-End Scene Text Recognition”, ICCV '11
  • Computational complexity
  • Dataset specific
  • Do not rely on heuristic rules
• Region based: Epshtein et al., “Detecting Text in Natural Scenes with Stroke Width Transform”, CVPR '10
  • Hard to define connectivity
  • Segmentation helps to improve OCR performance

Motivation to Remove Background for Text Detection
• To exclude the majority of image regions from further processing.
• To reduce false positives caused by text-like image regions (fences, bricks, windows, and vegetation).
• To reduce dependency on text style.

Proposed Text Detection Method
• Text detection by background (BG) subtraction
• Automatic BG seed selection
• BG reconstruction

Background Seed Selection
• Color, contrast and objectness responses are used as features.
• A Random Forest classifier with 100 trees is used; the number of trees is chosen based on the out-of-bag error.
• Each tree is constructed with three random features.
• Node splits are chosen using the Gini criterion.
[Figure panels: Original Image, Color Boosting, Contrast, Objectness]
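The Gini criterion mentioned above can be illustrated with a small, self-contained sketch. It is not the authors' implementation: the feature values, the labels, and the helper names (`gini_impurity`, `split_gini`, `best_threshold`) are hypothetical, and a real Random Forest would repeat this search over random feature subsets and bootstrap samples.

```python
from collections import Counter


def gini_impurity(labels):
    """Gini impurity of a list of class labels: 1 - sum(p_k ** 2)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def split_gini(values, labels, threshold):
    """Size-weighted Gini impurity of the two children of a threshold split."""
    left = [y for x, y in zip(values, labels) if x <= threshold]
    right = [y for x, y in zip(values, labels) if x > threshold]
    n = len(labels)
    return (len(left) / n) * gini_impurity(left) + (len(right) / n) * gini_impurity(right)


def best_threshold(values, labels):
    """Try midpoints between consecutive distinct values; keep the purest split."""
    distinct = sorted(set(values))
    midpoints = [(a + b) / 2 for a, b in zip(distinct, distinct[1:])]
    return min(midpoints, key=lambda t: split_gini(values, labels, t))


# Hypothetical contrast responses for six pixels (assumed data, not from the slides).
contrast = [0.0, 0.125, 0.25, 0.75, 0.875, 1.0]
is_background = [1, 1, 1, 0, 0, 0]  # 1 = background seed, 0 = not a seed

threshold = best_threshold(contrast, is_background)
print(threshold, split_gini(contrast, is_background, threshold))
```

In this toy case the low-contrast pixels are all background, so a single threshold separates the two classes and the weighted impurity of the split is zero.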

Conditional Dilation for BG Connectivity
• The conditional dilation of M by B, constrained by X, is repeated until the result no longer changes.
• B is the structuring element (3-by-3 square), M is the binary image in which BG seeds are ones, and X is the gray-level input image.
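A minimal sketch of this iteration follows, using the textbook form of gray-level reconstruction by dilation rather than the authors' exact formulation. Three things are assumptions: the marker is initialised with the values of X at the seed pixels of M, each step takes the 3-by-3 maximum of the marker clipped pointwise to X, and the difference between X and the reconstruction is treated as the candidate (non-background) response. The function name and the toy image are hypothetical.

```python
def conditional_dilation(seeds, image):
    """Reconstruct the background of `image` (X) from binary `seeds` (M).

    Each step dilates the marker with a 3x3 square (B) and clips the result
    to `image`; the loop stops when the marker no longer changes.
    Pixel values are assumed to be non-negative.
    """
    rows, cols = len(image), len(image[0])
    # Assumed initialisation: copy X at seed pixels, zero elsewhere.
    marker = [[image[r][c] if seeds[r][c] else 0 for c in range(cols)]
              for r in range(rows)]
    while True:
        dilated = [[0] * cols for _ in range(rows)]
        for r in range(rows):
            for c in range(cols):
                neighbourhood = [marker[rr][cc]
                                 for rr in range(max(0, r - 1), min(rows, r + 2))
                                 for cc in range(max(0, c - 1), min(cols, c + 2))]
                # Dilation (max over B) conditioned on X (pointwise min).
                dilated[r][c] = min(max(neighbourhood), image[r][c])
        if dilated == marker:
            return marker
        marker = dilated


# Toy 3x3 image: uniform background of 5 with one brighter pixel in the centre.
image = [[5, 5, 5],
         [5, 9, 5],
         [5, 5, 5]]
seeds = [[1, 0, 0],
         [0, 0, 0],
         [0, 0, 0]]

background = conditional_dilation(seeds, image)
residue = [[image[r][c] - background[r][c] for c in range(3)] for r in range(3)]
print(background)
print(residue)
```

The seed in the corner spreads across the connected background, while the brighter centre pixel cannot be reached at its full intensity and therefore survives in the residue.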

Text Recognition Experiments • ICDAR’03 Dataset with 251 test images, 5370 characters, 1106 words.

ICDAR 2003 Dataset: Character Recognition Results
• The proposed system removes 87% of the non-text regions; on average, non-text regions make up 91% of the test set.
• It retains approximately 98% of the text regions.
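To see what these percentages imply together, the following back-of-the-envelope sketch combines them. It assumes all three figures are fractions of image area, which the slide does not state explicitly; the function and variable names are hypothetical.

```python
def remaining_composition(non_text_share, non_text_removed, text_retained):
    """Return (fraction of the image kept, share of text within what is kept)."""
    text_share = 1.0 - non_text_share
    kept_non_text = non_text_share * (1.0 - non_text_removed)
    kept_text = text_share * text_retained
    kept_total = kept_non_text + kept_text
    return kept_total, kept_text / kept_total


# Figures quoted on the slide, interpreted as area fractions (assumption).
kept_total, text_fraction = remaining_composition(
    non_text_share=0.91, non_text_removed=0.87, text_retained=0.98)
print(f"kept: {kept_total:.1%}, of which text: {text_fraction:.1%}")
```

Under that reading, roughly a fifth of the image area is passed on to later stages, and the share of text within it rises from 9% to a little over 40%.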

ImageNet Dataset
[Figure: example classes: Bakery, Country House, Discount House, Funeral, Pizzeria, Steak]
• ImageNet building and place-of-business dataset (24,255 images, 28 classes; the largest dataset ever used for scene text recognition).
• The images do not necessarily contain scene text.
• Visual features: 4000 visual words, standard gray SIFT only.
• Text features: bag-of-bigrams, computed from the OCR results obtained for each image in the dataset.
• 3 repeats, to compute standard deviations in average precision.
• Histogram Intersection Kernel in libsvm.
• Text-only, visual-only and fused results are compared.
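The text features and the kernel listed above can be sketched in a few lines. This is an illustration under assumptions, not the authors' pipeline: bigrams are taken to be character bigrams within each OCR word, raw counts are used without normalisation, and the OCR strings and function names are hypothetical.

```python
from collections import Counter


def char_bigrams(ocr_text):
    """Bag of character bigrams, computed within each whitespace-separated word."""
    bag = Counter()
    for word in ocr_text.lower().split():
        letters = "".join(ch for ch in word if ch.isalnum())
        bag.update(letters[i:i + 2] for i in range(len(letters) - 1))
    return bag


def histogram_intersection(h1, h2):
    """Histogram intersection kernel: sum of bin-wise minima."""
    return sum(min(count, h2.get(key, 0)) for key, count in h1.items())


def gram_matrix(histograms):
    """Kernel matrix over a list of histograms, e.g. for a precomputed-kernel SVM."""
    return [[histogram_intersection(a, b) for b in histograms] for a in histograms]


# Hypothetical OCR outputs for three images.
ocr_outputs = ["Pizza", "PIZZERIA open", "bakery"]
bags = [char_bigrams(text) for text in ocr_outputs]
kernel = gram_matrix(bags)
print(kernel[0][1], kernel[0][2])
```

Bigrams tolerate partial OCR errors better than whole words: "pizza" and "pizzeria" still share the bigrams "pi", "iz" and "zz", whereas an exact word match would score zero.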

Fine-Grained Building Classification Results
• Text (OCR): 15.6 ± 0.4
• Visual (BoW): 32.9 ± 1.7
• Fusion (BoW + OCR): 39.0 ± 2.6
Discount House example (ranks):
• Visual: #269, #431, #584, #2752
• Text: #1, #4, #5, #8
• Proposed: #1, #4, #5, #8
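To make the effect of such rank changes concrete, the sketch below computes average precision from rank positions. It is purely illustrative: it assumes the "#" numbers are retrieval ranks and treats the four listed images as the only relevant items, which is not how the reported scores were obtained. The function name is hypothetical.

```python
def average_precision(relevant_ranks):
    """Average precision given the 1-based ranks of all relevant items."""
    ranks = sorted(relevant_ranks)
    # Precision at the i-th relevant item is i / rank_i.
    return sum((i + 1) / rank for i, rank in enumerate(ranks)) / len(ranks)


# Ranks quoted on the slide; treating them as the full relevant set is an assumption.
ap_visual = average_precision([269, 431, 584, 2752])
ap_text = average_precision([1, 4, 5, 8])
print(f"visual: {ap_visual:.4f}, text: {ap_text:.4f}")
```

Moving the same four images from ranks in the hundreds to the top ten raises this toy average precision by more than two orders of magnitude.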

Conclusion
• Background removal is a suitable approach for scene text detection.
• A new text detection method is proposed, using background connectivity together with color, contrast and objectness cues.
• Improved scene text recognition performance.
• Improved fine-grained object classification performance by fusing visual and scene text information.

