Overview

A Comparison Between Bayesian Networks and Generalized Linear Models in the Indoor/Outdoor Scene Classification Problem

Overview • Introduce Scene Classification Problems • Motivation for Scene Classification • Kodak's JBJL Database and Features • Bayesian Networks • Brief Overview (description, inference, structure learning) • Classification Results • GLM • Briefer Overview • Classification Results • Comparison and Conclusion

Problem Statement: Given a set of consumer digital images, can we use a statistical model to distinguish between indoor images and outdoor images?

Motivation • Kodak • Increase visual appeal by processing based on classification • Object Recognition • Provide context information which may give clues to scale, location, identity, etc.

Procedure • Establish ground truth for all images • Perform feature extraction and confidence/probability mapping for features • Divide images into training and testing sets • Use the training images to train a model to predict ground truth • Use the model to predict ground truth for the test set • Evaluate performance
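The split-and-evaluate steps of the procedure can be sketched as below. This is a minimal illustration only: the 50/50 split, the fixed seed, and the helper names `split_train_test` and `accuracy` are assumptions, since the slides do not state how the images were divided or scored.

```python
import random

def split_train_test(items, train_fraction=0.5, seed=0):
    """Shuffle a copy of the items and cut it into training and test sets."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]

def accuracy(predicted, truth):
    """Fraction of predictions that match the ground-truth labels."""
    correct = sum(p == t for p, t in zip(predicted, truth))
    return correct / len(truth)

# Ground truth for a database of 615 indoor and 693 outdoor images.
labels = ["indoor"] * 615 + ["outdoor"] * 693
images = list(enumerate(labels))  # (image id, ground-truth label) pairs
train, test = split_train_test(images)
```

A model would be fitted on `train` only, and `accuracy` applied to its predictions on `test`.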

Kodak JBJL • Consumer image database • 615 indoor and 693 outdoor images • Some images are difficult for HSV to classify as indoor or outdoor • Some images have both indoor and outdoor parts

Features and Probability Mapping • “Low-level” Features • Ohta-space color histogram (color information) • MSAR model (texture information) • “Mid-level” Features • Grass classifier • Sky classifier • K-NN Used to Extract Probs from Features • Quantized to nearest 10% (11 states for Mid-level, 3 states for Low-level)
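The probability mapping can be sketched as follows, assuming a plain k-NN vote (the fraction of the k nearest training points labeled outdoor) followed by rounding to the nearest 10%. The Euclidean distance, the value of k, the function names, and the toy feature vectors are all illustrative assumptions; only the 11-state quantization is shown, since the slides do not say how the 3 low-level states were defined.

```python
import math

def knn_outdoor_probability(query, training, k=5):
    """Fraction of the k nearest training points labeled outdoor (1)."""
    # training: list of (feature_vector, label) pairs, 1 = outdoor, 0 = indoor
    by_distance = sorted(training, key=lambda item: math.dist(query, item[0]))
    nearest = by_distance[:k]
    return sum(label for _, label in nearest) / len(nearest)

def quantize_to_tenths(p):
    """Map a probability in [0, 1] to one of 11 states: 0 (0%) to 10 (100%)."""
    return int(math.floor(p * 10 + 0.5))

# Made-up 2-D feature vectors, three indoor and three outdoor.
training = [
    ((0.10, 0.20), 0), ((0.20, 0.10), 0), ((0.15, 0.25), 0),
    ((0.80, 0.90), 1), ((0.90, 0.80), 1), ((0.85, 0.70), 1),
]
p = knn_outdoor_probability((0.7, 0.8), training, k=5)
state = quantize_to_tenths(p)
```

Here the five nearest neighbors contain three outdoor points, so the query maps to the 60% state.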

Feature Probs and Classes

Stat. Model 1: Bayesian Network • Graphical Model • Variables are represented by vertices of a graph • Conditional relationships are represented by directed edges • Conditional Probability table associated with each vertex • Quantifies vertex relationships • Facilitates automated inference

Exact Inference • Model Joint Probability • Inference
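As a rough illustration of exact inference, the sketch below builds the joint probability as a product of each node's conditional probability given its parents, then obtains a posterior by summing the joint over the unobserved variables and normalizing. The three-node network (Class pointing to Sky and Grass), the binary states, and every CPT value are hypothetical, chosen only to keep the example small.

```python
from itertools import product

# Hypothetical network: Class -> Sky, Class -> Grass. All numbers are made up.
parents = {"Class": (), "Sky": ("Class",), "Grass": ("Class",)}
states = {"Class": ("indoor", "outdoor"), "Sky": (0, 1), "Grass": (0, 1)}
cpt = {
    "Class": {(): {"indoor": 0.5, "outdoor": 0.5}},
    "Sky": {("indoor",): {0: 0.9, 1: 0.1}, ("outdoor",): {0: 0.3, 1: 0.7}},
    "Grass": {("indoor",): {0: 0.95, 1: 0.05}, ("outdoor",): {0: 0.6, 1: 0.4}},
}

def joint(assignment):
    """P(x_1, ..., x_n) as the product over nodes of P(x_i | parents of x_i)."""
    prob = 1.0
    for node, pa in parents.items():
        pa_values = tuple(assignment[p] for p in pa)
        prob *= cpt[node][pa_values][assignment[node]]
    return prob

def posterior(query, evidence):
    """P(query | evidence), summing the joint over all unobserved variables."""
    hidden = [n for n in parents if n != query and n not in evidence]
    scores = {}
    for q in states[query]:
        total = 0.0
        for values in product(*(states[h] for h in hidden)):
            assignment = dict(evidence, **dict(zip(hidden, values)))
            assignment[query] = q
            total += joint(assignment)
        scores[q] = total
    z = sum(scores.values())  # normalizing constant P(evidence)
    return {q: s / z for q, s in scores.items()}

post = posterior("Class", {"Sky": 1, "Grass": 0})
```

Enumeration like this is exponential in the number of hidden variables, which is acceptable only for very small networks.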

Structure Learning Search Space • Space of BNs • Variable-State Combinations • Product of (#States per Node) over all nodes • 2178 possible • Structures • Limited to DAGs • 29281 possible
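Both counts can be reproduced with a short calculation. The sketch assumes five nodes: two low-level features with 3 states each, two mid-level features with 11 states each, and a binary indoor/outdoor class. That assignment is inferred from the state counts on the features slide rather than stated outright. The DAG count uses Robinson's recurrence for labeled directed acyclic graphs.

```python
from math import comb, prod

def count_dags(n):
    """Number of DAGs on n labeled nodes, via Robinson's recurrence."""
    a = [1]  # a[0] = 1: the empty graph
    for m in range(1, n + 1):
        a.append(sum(
            (-1) ** (k + 1) * comb(m, k) * 2 ** (k * (m - k)) * a[m - k]
            for k in range(1, m + 1)
        ))
    return a[n]

# Assumed node layout: color, texture, grass, sky, class.
states_per_node = [3, 3, 11, 11, 2]
combinations = prod(states_per_node)
structures = count_dags(len(states_per_node))
print(combinations)  # 2178
print(structures)    # 29281
```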

Scoring Metric • Score a structure based on how well the structure models the data • We do have an expression to estimate the probability of the data given the structure • Unfortunately, the data probability is difficult to estimate

The Bayes Dirichlet Likelihood Equivalent • Can compare structures 2 at a time • What is the prior on structure? • Assume all structures are equally likely • Use #edges to penalize complex networks

Challenges • Not all structures can be considered if there is only a small amount of data • Context dilution • Can't consider cases where the CPT cannot be filled in • Finding an optimal structure is NP-hard

BDe Structure For I/O Classification • Greedy algorithm with BDe scoring • Naïve Bayes Model!

Result Compared to Previous • Previous Results • Our Results

Misclassified: Inferred Outdoor

Misclassified: Inferred Indoor

Generalized Linear Model • Outdoor and Indoor can be thought of as a binary output • Logit kernel

Likelihood for GLM • Newton-Raphson • Get estimates of mean and variance (1st and 2nd derivative) • Find optimal based on estimates (Taylor Expansion) • Iterate • Generally, this quickly converges to the optimal solution
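The Newton-Raphson iteration can be sketched for the simplest logit case: one feature plus an intercept. Each pass computes the first derivative of the log-likelihood (gradient) and the second derivative (Hessian), takes the step implied by the second-order Taylor expansion, and repeats. The function names, the toy data, and the stopping tolerance are assumptions for illustration; the slides do not give the actual feature set or implementation.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic_newton(xs, ys, iterations=25, tol=1e-10):
    """Fit P(y=1 | x) = sigmoid(b0 + b1 * x) by Newton-Raphson."""
    b0, b1 = 0.0, 0.0
    for _ in range(iterations):
        g0 = g1 = 0.0           # gradient of the log-likelihood
        h00 = h01 = h11 = 0.0   # entries of the negative Hessian
        for x, y in zip(xs, ys):
            p = sigmoid(b0 + b1 * x)
            w = p * (1.0 - p)
            g0 += y - p
            g1 += (y - p) * x
            h00 += w
            h01 += w * x
            h11 += w * x * x
        # Newton step: solve (negative Hessian) * step = gradient (2x2 case).
        det = h00 * h11 - h01 * h01
        d0 = (h11 * g0 - h01 * g1) / det
        d1 = (h00 * g1 - h01 * g0) / det
        b0, b1 = b0 + d0, b1 + d1
        if abs(d0) + abs(d1) < tol:
            break
    return b0, b1

# Made-up, non-separable data: x is a feature probability, y = 1 means outdoor.
xs = [0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0]
ys = [0, 0, 0, 1, 0, 0, 1, 1, 0, 1, 1]
b0, b1 = fit_logistic_newton(xs, ys)
```

The toy classes overlap on purpose: with perfectly separable data the maximum-likelihood coefficients are unbounded and the iteration does not settle.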

Side by Side Comparison

Misclassified: Predicted Outdoor

Misclassified: Predicted Indoor

Conclusion • The newer Bayesian Network model may perform classification slightly better than GLM • BN is more computationally intensive • Unclear if there is in fact a difference • Both models have difficulty with the same images • Better to introduce new data than to use a new model • A new model gives (at most) marginal improvement

References • Heckerman, D. A Tutorial on Learning with Bayesian Networks. In Learning in Graphical Models, M. Jordan, ed. MIT Press, Cambridge, MA, 1999. • Murphy, K. A Brief Introduction to Graphical Models and Bayesian Networks. http://www.cs.ubc.ca/~murphyk/Bayes/bnintro.html (viewed 4/1/08). • Lehmann, E.L. and Casella, G. Theory of Point Estimation (2nd edition). • Weisberg, S. Applied Linear Regression (3rd edition).

Data Given Model Prob
