Inferring Semantic Concepts from Community-Contributed Images and Noisy Tags Jinhui Tang†, Shuicheng Yan†, Richang Hong†, Guo-Jun Qi‡, Tat-Seng Chua† † National University of Singapore ‡ University of Illinois at Urbana-Champaign
Outline • Motivation • Sparse-Graph-based Semi-supervised Learning • Handling of Noisy Tags • Inferring Concepts in Semantic Concept Space • Experiments • Summary and Future Work
Our task • No manual annotation is required.
Methods That Can Be Used • With models: • SVM • GMM • … • Infer labels directly: • k-NN • Graph-based semi-supervised methods
Normal Graph-based Methods • Common disadvantages: • They have parameters that require manual tuning • Performance is sensitive to the parameter settings • The graphs are constructed based on visual distance • Many links connect samples with unrelated concepts • Label information is therefore propagated incorrectly • Locally linear reconstruction: • Still needs to select neighbors based on visual distance
Key Ideas of Our Approach • Sparse Graph based Learning • Noisy Tag Handling • Inferring Concepts in the Concept Space
Why Sparse Graph? • The human visual system seeks a sparse representation of an incoming image using a few visual words from a feature vocabulary (neuroscience). • Advantages: • Reduces concept-unrelated links to avoid the propagation of incorrect information; • Practical for large-scale applications, since the sparse representation reduces the storage requirement and is feasible for large-scale numerical computation.
Normal Graph vs. Sparse Graph (figures: Normal Graph Construction; Sparse Graph Construction)
Sparse Graph Construction • The ℓ1-norm-based linear reconstruction error minimization naturally leads to a sparse representation for the images*. • The sparse reconstruction can be obtained by solving the following convex optimization problem: min_w ||w||_1, s.t. x = Dw • w ∈ R^n: the vector of reconstruction coefficients; • x ∈ R^d: the feature vector of the image to be reconstructed; • D ∈ R^{d×n} (d < n): a matrix formed by the feature vectors of the other images in the dataset. * J. Wright, A. Yang, A. Ganesh, S. Sastry, and Y. Ma. Robust face recognition via sparse representation. IEEE Transactions on Pattern Analysis and Machine Intelligence, 31(2):210–227, Feb. 2009.
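For concreteness, the ℓ1 problem above can be solved as a linear program via the standard basis-pursuit split w = u − v. The sketch below is an illustration with toy data, not the authors' implementation; the SciPy solver choice is our own assumption.

```python
import numpy as np
from scipy.optimize import linprog

def sparse_coefficients(D, x):
    """Solve min ||w||_1  s.t.  x = D w  via the split w = u - v, u, v >= 0."""
    d, n = D.shape
    c = np.ones(2 * n)            # objective: sum(u) + sum(v) = ||w||_1
    A_eq = np.hstack([D, -D])     # equality constraint: D (u - v) = x
    res = linprog(c, A_eq=A_eq, b_eq=x, bounds=(0, None), method="highs")
    u, v = res.x[:n], res.x[n:]
    return u - v                  # the sparse reconstruction coefficients w

# Toy usage: a feature vector that truly is a sparse combination of D's columns.
rng = np.random.default_rng(0)
D = rng.standard_normal((10, 40))            # d = 10 feature dims, n = 40 images
x = D[:, :3] @ np.array([0.5, 0.3, 0.2])     # built from the first 3 columns
w = sparse_coefficients(D, x)                # w is (near) zero outside those 3
```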
Sparse Graph Construction (cont.) • Handle the noise on certain elements of x: • Reformulate x = Dw + ξ, where ξ ∈ R^d is the noise term. • Then solve: min_{w,ξ} ||w||_1 + ||ξ||_1, s.t. x = Dw + ξ, i.e., the same ℓ1 problem over the augmented dictionary B = [D, I]. • Set the edge weights of the sparse graph from the reconstruction coefficients (e.g., the weight between the image and image j is |w_j|).
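The noise-robust variant is the same ℓ1 minimization over the augmented dictionary [D, I], so the solver sketched above can be reused. Note that the edge-weight rule below (absolute coefficient magnitude) is a common sparse-graph convention assumed for illustration, not a quote of the paper's exact formula.

```python
import numpy as np

def sparse_edge_weights(D, x):
    """Noise-robust coefficients: solve min ||w'||_1 s.t. x = [D, I] w'."""
    d, n = D.shape
    B = np.hstack([D, np.eye(d)])        # augmented dictionary [D, I]
    w_aug = sparse_coefficients(B, x)    # reuses the LP solver sketched above
    w, xi = w_aug[:n], w_aug[n:]         # image coefficients and noise term
    return np.abs(w)                     # assumed edge weights to the n other images
```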
Semi-supervised Inference • Result: the labels of the unlabeled samples are given in closed form by f_u = −M_uu^{-1} M_ul f_l.
Semi-supervised Inference (cont.) • The problem with f_u = −M_uu^{-1} M_ul f_l: • M_uu is typically very large for image annotation • It is often computationally prohibitive to calculate its inverse directly • An iterative solution with non-negativity constraints may not be reasonable, since some samples may have negative contributions to other samples • Solution: • Reformulate as a sparse system of linear equations: M_uu f_u = −M_ul f_l • The generalized minimal residual method (GMRES) can then be used to solve this large-scale sparse system iteratively, effectively and efficiently.
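A minimal sketch of the GMRES step with SciPy's sparse solver; M_uu and M_ul below stand for the matrix blocks named on the slide, whose exact construction is given in the paper.

```python
from scipy.sparse.linalg import gmres

def infer_unlabeled(M_uu, M_ul, f_l):
    """Solve M_uu f_u = -M_ul f_l iteratively instead of inverting M_uu."""
    rhs = -(M_ul @ f_l)            # right-hand side of the sparse linear system
    f_u, info = gmres(M_uu, rhs)   # info == 0 signals convergence
    return f_u
```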
Different Types of Tags √: correct; ?: ambiguous; m: missing
Handling of Noisy Tags • We cannot assume that the training tags stay fixed during the inference process. • The noisy training tags should be refined during label inference. • Solution: add two regularization terms to the inference framework to handle the noise.
Handling of Noisy Tags (cont.) • Solution: • Set the original label vector y as the initial estimate of the ideal label vector f_l, then solve the regularized objective to obtain a refined f_l. • Fix f_l and solve for the noise term. • Use the refined f_l to replace y in the previous graph-based method, then solve the sparse system of linear equations to infer the labels of the unlabeled samples.
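The exact regularizers are specified in the paper; the sketch below only illustrates the alternation, under an assumed quadratic-plus-sparse objective (graph smoothness on the labeled subgraph, a fidelity term tying f_l to the observed tags y, and an ℓ1-penalized tag-noise term e). Every symbol and weight here is an illustrative assumption.

```python
import numpy as np

def soft_threshold(z, t):
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def refine_labels(M_ll, y, lam1=1.0, lam2=0.1, n_iters=10):
    """Alternating refinement under the assumed objective
    min_{f,e}  f^T M_ll f + lam1 ||f - (y - e)||^2 + lam2 ||e||_1."""
    n = len(y)
    f, e = y.astype(float).copy(), np.zeros(n)
    A = M_ll + lam1 * np.eye(n)
    for _ in range(n_iters):
        f = np.linalg.solve(A, lam1 * (y - e))        # f-step: quadratic, closed form
        e = soft_threshold(y - f, lam2 / (2 * lam1))  # e-step: sparse noise estimate
    return f                                          # refined labels replacing y
```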
Why Concept Space? • It is well known that inferring concepts directly from low-level visual features does not work well, due to the semantic gap. • To bridge this semantic gap: • Construct a concept space and then infer the semantic concepts in this space. • The semantic relations among different concepts are inherently embedded in this space and help the concept inference.
The requirements for the concept space • Low-semantic-gap: Concepts in the constructed space should have small semantic gaps; • Informative: These concepts can cover the semantic space spanned by all useful concepts (tags), that is, the concept space should be informative; • Compact: The set including all the concepts forming the space should be compact (i.e., the dimension of the concept space is small).
Concept Space Construction • Basic terms: • Ω: the set of all concepts; • Θ: the constructed concept set. • Three measures: • Semantic modelability: SM(Θ) • Coverage of the semantic concept space: CE(Θ, Ω) • Compactness: CP(Θ) = 1/#(Θ) • Objective: select the Θ that maximizes a combination of SM(Θ), CE(Θ, Ω), and CP(Θ).
Solution for Concept Space Construction • Simplification: fix the size of the concept space. • The maximization can then be transformed into a standard quadratic programming problem. • See the paper for more details.
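As a rough stand-in only (the paper's actual solution is the QP mentioned above), the greedy sketch below picks a fixed-size concept set by trading per-concept modelability against coverage of the full tag vocabulary; the inputs (a modelability score vector sm and a concept-similarity matrix S) and the scoring rule are assumptions.

```python
import numpy as np

def select_concepts(sm, S, k, alpha=1.0):
    """sm: (n,) modelability scores; S: (n, n) concept similarities; pick k."""
    n = len(sm)
    chosen, covered = [], np.zeros(n)
    for _ in range(k):
        # coverage gain: how much each candidate raises the per-concept
        # max-similarity coverage of the whole vocabulary
        gain = np.maximum(S, covered).mean(axis=1) - covered.mean()
        score = sm + alpha * gain
        score[chosen] = -np.inf          # never re-pick a chosen concept
        c = int(np.argmax(score))
        chosen.append(c)
        covered = np.maximum(covered, S[c])
    return chosen
```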
Inferring Concepts in Concept Space • Image mapping: x_i → D(i) • Query concept mapping: c_x → Q(c_x) • Rank the given images by the similarity between D(i) and Q(c_x) in the concept space.
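A sketch of the ranking step: each image i is represented by its concept-space vector D(i) and the query concept by Q(c_x), and images are ranked by cosine similarity. The exact definitions of the two mappings are in the paper; the similarity choice here is an assumption.

```python
import numpy as np

def rank_images(image_vecs, query_vec):
    """image_vecs: (num_images, dim) concept-space vectors D(i);
    query_vec: (dim,) concept-space vector Q(c_x). Returns indices, best first."""
    A = image_vecs / (np.linalg.norm(image_vecs, axis=1, keepdims=True) + 1e-12)
    q = query_vec / (np.linalg.norm(query_vec) + 1e-12)
    return np.argsort(-(A @ q))          # cosine similarity, descending
```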
Experiments • Dataset • NUS-WIDE-Lite version (55,615 images) • Low-level features • Color Histogram (CH) and Edge Direction Histogram (EDH), concatenated directly. • Evaluation • 81 concepts • Average Precision (AP) and Mean Average Precision (MAP)
Experiments Ex1: Comparisons among Different Learning Methods
Experiments • Ex2: Concept Inference with and without Concept Space
Experiments Ex3: Inference with Tags vs. Inference with Ground-truth • We achieve an MAP of 0.1598 by inferring from tags in the concept space, which is comparable to the MAP obtained by inferring from the ground-truth training labels.
Summary • Explored the problem of inferring semantic concepts from community-contributed images and their associated noisy tags. • Three key points: • Sparse-graph-based label propagation • Noisy tag handling • Inference in a low-semantic-gap concept space
Future Work • Training set construction from web resources