Content Based Image Retrieval Romit Das · Ryan Scotka
GIS (Google Image Search) Problems • Search based on filename • Verbatim match • Noun replacement • Potential for abuse (Google Hack)
Possible Solutions • Metadata: standards, re-indexing existing images • Manual classification: time • Content-based classification
CBIR – Training • Choose features to distinguish images. • Extract said features. • Apply statistical method to model features. • Categorize based on textual description.
Example • Dimensions: 200 x 200 • Color frequencies: mostly flesh tones • Spatial distribution: flesh tones concentrated in the center • Result: baby
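The example above combines two cues: how much of the image is flesh-toned, and whether those pixels cluster in the center. A minimal sketch of that combination follows. It assumes a flesh-tone mask has already been computed by some other step. The names `flesh_stats` and `guess_label`, the 0.5 threshold, and the definition of "center" as the middle half of the rows and columns are all hypothetical choices for illustration, not taken from the slides.

```python
def flesh_stats(mask):
    """mask: 2D list of booleans, True where a pixel was judged a flesh tone.

    Returns (overall fraction of flesh pixels, fraction of flesh pixels
    inside the central region).
    """
    rows, cols = len(mask), len(mask[0])
    overall = sum(v for row in mask for v in row) / (rows * cols)
    # Central region (assumption): the middle half of the rows and columns.
    r0, r1 = rows // 4, rows - rows // 4
    c0, c1 = cols // 4, cols - cols // 4
    center = [mask[r][c] for r in range(r0, r1) for c in range(c0, c1)]
    center_density = sum(center) / len(center)
    return overall, center_density


def guess_label(mask, min_overall=0.5):
    """Toy rule: mostly flesh tones AND denser in the center than overall."""
    overall, center_density = flesh_stats(mask)
    if overall >= min_overall and center_density > overall:
        return "baby"
    return "unknown"
```

A hand-written rule like this is only meant to show how the cues combine; the training approach in the following slides replaces such rules with a statistical model.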
Author’s Feature Set • Feature Set (6 dimensions): • Color averages (LUV) • High-frequency energy bands • “Effectively discern local texture” • Wavelet transform on 4x4 blocks • Use HL, LH, and HH “high energy bands” • Use the LL for lower resolution analysis
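The six dimensions listed above can be read as three color averages plus one energy value for each of the HL, LH, and HH bands. A minimal sketch of that computation for a single 4x4 block follows. It assumes the block has already been converted to LUV, that a one-level Haar wavelet is used, that texture is measured on the L channel, and that band energy is the root mean square of the band's coefficients. None of these details are stated on the slide, and the HL/LH naming convention varies between texts. All function names are hypothetical.

```python
import math


def haar_level1(block):
    """One-level 2D Haar transform of a 4x4 block (4 rows of 4 numbers).

    Returns four 2x2 sub-bands as (LL, HL, LH, HH).
    """
    ll = [[0.0, 0.0], [0.0, 0.0]]
    hl = [[0.0, 0.0], [0.0, 0.0]]
    lh = [[0.0, 0.0], [0.0, 0.0]]
    hh = [[0.0, 0.0], [0.0, 0.0]]
    for i in range(2):
        for j in range(2):
            a, b = block[2 * i][2 * j], block[2 * i][2 * j + 1]
            c, d = block[2 * i + 1][2 * j], block[2 * i + 1][2 * j + 1]
            ll[i][j] = (a + b + c + d) / 2.0  # local average (lower resolution)
            hl[i][j] = (a - b + c - d) / 2.0  # change along a row
            lh[i][j] = (a + b - c - d) / 2.0  # change along a column
            hh[i][j] = (a - b - c + d) / 2.0  # diagonal change
    return ll, hl, lh, hh


def band_energy(band):
    """Root mean square of the four coefficients in a 2x2 band (assumption)."""
    return math.sqrt(sum(c * c for row in band for c in row) / 4.0)


def block_mean(block):
    return sum(sum(row) for row in block) / 16.0


def block_features(l_block, u_block, v_block):
    """Six features for one 4x4 block: mean L, U, V and HL, LH, HH energy."""
    _, hl, lh, hh = haar_level1(l_block)
    return [
        block_mean(l_block), block_mean(u_block), block_mean(v_block),
        band_energy(hl), band_energy(lh), band_energy(hh),
    ]
```

The LL band returned by `haar_level1` is a half-resolution version of the block, which is what the lower-resolution analysis mentioned on the slide would operate on.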
Author’s Implementation • Statistical Modeling • Use machine learning to build concepts • Example: Concept = Paris, Training Set = [training images not preserved]
Markov Models • Take known facts • Deduce hidden/unknown data
Markov Model Example • Given: • Queues of people, shelves, price labels, disgruntled workers • Possible Results: • Post office • Supermarket • Record Store
Markov Model Example • Given: • Queues of people, shelves, price labels, disgruntled workers, food products • Possible Results: • Post office • Supermarket • Record Store
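The two example slides show how one extra observation ("food products") can settle which hidden explanation is most likely. A minimal sketch of that inference step follows. It scores a single hidden state from independent cues with a uniform prior, so it is a simplification and not a full hidden Markov model with state transitions. Every probability in `CUE_PROB` is invented for illustration.

```python
# Hypothetical probability of seeing each cue at each place (illustrative only).
CUE_PROB = {
    "post office": {
        "queues": 0.8, "shelves": 0.5, "price labels": 0.5,
        "disgruntled workers": 0.7, "food products": 0.05,
    },
    "supermarket": {
        "queues": 0.7, "shelves": 0.6, "price labels": 0.6,
        "disgruntled workers": 0.6, "food products": 0.9,
    },
    "record store": {
        "queues": 0.5, "shelves": 0.7, "price labels": 0.7,
        "disgruntled workers": 0.6, "food products": 0.05,
    },
}


def posterior(observed):
    """Probability of each place given the observed cues (uniform prior)."""
    scores = {}
    for place, probs in CUE_PROB.items():
        score = 1.0
        for cue in observed:
            score *= probs[cue]  # cues treated as independent (assumption)
        scores[place] = score
    total = sum(scores.values())
    return {place: s / total for place, s in scores.items()}


first = ["queues", "shelves", "price labels", "disgruntled workers"]
before = posterior(first)                     # no place clearly wins
after = posterior(first + ["food products"])  # supermarket dominates
```

With these made-up numbers, the first four cues leave the three places roughly tied, and adding "food products" moves most of the probability to the supermarket.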
Ninja Model Person, outdoors
Ninja Model People, ninjas, outdoors
Ninja Model People, ninjas, weapons, outdoors
Ninja Markov Model • Person, outdoors • People, ninjas, outdoors • People, ninjas, outdoors, weapons, class photo
Creating Concepts • Training Concept • Created from hand-picked images • Must choose statistically significant training size • Resulting Concept • Used in automatic cataloging of future images
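The train-then-catalog flow above can be sketched in a few lines. This sketch swaps in a much simpler model than the Markov models discussed earlier: each concept is an independent Gaussian per feature dimension, fitted to feature vectors from hand-picked images. The function names, the variance floor of `1e-6`, and the `top_k` parameter are hypothetical choices for illustration.

```python
import math


def train_concept(vectors):
    """Fit a per-dimension mean and variance from training feature vectors."""
    n = len(vectors)
    dims = len(vectors[0])
    means = [sum(v[d] for v in vectors) / n for d in range(dims)]
    variances = [
        # Floor the variance so identical training values do not divide by zero.
        max(sum((v[d] - means[d]) ** 2 for v in vectors) / n, 1e-6)
        for d in range(dims)
    ]
    return means, variances


def log_likelihood(concept, x):
    """Log-probability of feature vector x under a trained concept."""
    means, variances = concept
    return sum(
        -0.5 * (math.log(2 * math.pi * var) + (xi - m) ** 2 / var)
        for xi, m, var in zip(x, means, variances)
    )


def catalog(concepts, x, top_k=2):
    """Return the names of the top_k best-matching concepts for a new image."""
    ranked = sorted(
        concepts,
        key=lambda name: log_likelihood(concepts[name], x),
        reverse=True,
    )
    return ranked[:top_k]
```

For example, after `concepts = {"ninja": train_concept(ninja_vectors), "Paris": train_concept(paris_vectors)}` (with `ninja_vectors` and `paris_vectors` standing for features from hand-picked images), `catalog(concepts, new_vector)` ranks the concepts for a new image. Returning more than one name lets an image carry several concepts at once.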
Observations • Images are associated with multiple concepts. • Not foolproof • Example: People, ninjas, outdoors, weapons, class photo
Advantages • Automatic categorization
Disadvantages • False positives • Concepts may require a large number of training images, which increases training time • Dissimilar images are needed to train a concept
Future Additions • Further refinement of conflicting semantics • Weights assigned to classifications
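One possible reading of "weights assigned to classifications" is to turn each concept's raw score into a weight between 0 and 1, so that weak labels can be ranked below strong ones or dropped. The sketch below uses a softmax over the scores. That choice and the name `classification_weights` are assumptions for illustration, not something the slides specify.

```python
import math


def classification_weights(scores):
    """Convert per-concept scores (e.g. log-likelihoods) to weights summing to 1."""
    top = max(scores.values())
    # Subtract the largest score before exponentiating for numerical stability.
    exps = {name: math.exp(s - top) for name, s in scores.items()}
    total = sum(exps.values())
    return {name: e / total for name, e in exps.items()}
```

With weights attached, a conflicting label such as "class photo" on a ninja image could be kept but ranked low instead of being reported on equal footing.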
Our Implementation • Perform classification with alternate learners (Weka)