The Multimedia Semantic Web Bill Grosky Multimedia Information Systems Laboratory University of Michigan-Dearborn Dearborn, Michigan
Contents • Introduction • CBR – Where are we? • Multimedia annotation • Context-rich environments • Semantic web • Our work • Anglograms • Finding latent semantics • Using text for improved image search • Using images for improved text search • Web page structure • A cross-modal theory of linked document semantics
CBR – Where are We? • Development of feature-based techniques for content-based retrieval is a mature area, at least for images • CBR researchers should now concentrate on extracting semantics from multimedia documents so that retrievals using concept-based queries can be tailored to individual users • The semantic gap • (Semi)-automated multimedia annotation
Multimedia Annotation • Multimedia annotations should be semantically rich • Multiple semantics • A social theory based on how multimedia information is used • This can be discovered by placing multimedia information in a natural, context-rich environment
Context-Rich Environments • Structural context – Author’s contribution • Document’s author places semantically similar pieces of information close to each other • User can cluster together semantically similar pieces of information • Dynamic context – User’s contribution • Short browsing sub-paths are semantically coherent
Context-Rich Environments • The Web is a perfect example of a context-rich environment • Develop multimedia annotations through cross-modal techniques • Audio • Images • Text • Video
Semantic Web • This program overlaps another very important current research topic, the semantic web • Web page annotations are the backbone of this research effort • We have something very important to offer to this area • Multimedia documents • Deriving multiple semantics for a single document • Combining our efforts will enrich both communities
Semantic Web • “The Semantic Web is a new initiative to transform the web into a structure that supports more intelligent querying and browsing, both by machines and by humans. This transformation is to be supported through the generation and use of metadata constructed via web annotation tools using user-defined ontologies that can be related to one another.” Somewhere on the web
Semantic Web • [Figure: semantic web architecture. Components shown: End User, Community Portal, Agents, Inference Engine, Ontology Articulation Toolkit, Ontology Construction Tool, Ontologies, Web-Page Annotation Tool, Annotated Web Pages, Metadata Repository.] Based on www.semanticweb.org
Semantic Web • Plan a vacation within the next month • Bill instructed his semantic web agent through his handheld browser. • An agent retrieved Bill’s vacation profile from his travel agent, retrieved Bill’s availability from his calendar, checked availability of airlines, hotels and restaurants, and made all the necessary arrangements.
Semantic Web • Multimedia semantic web • Plan a vacation close to where [image] is being exhibited.
Anglograms • Image object • Entire image • Some meaningful portion of an image • semcon • Point-based features • corner points • color histograms
Anglograms Point feature map for shape
Anglograms Point feature map for color
Anglograms Voronoi diagram of n = 18 sites
Anglograms Delaunay triangulation of n = 18 sites (the dual graph of the Voronoi diagram)
Anglograms • Delaunay triangulation of a set of n points • O(n log n) algorithm • Invariance of Delaunay triangles of a set of points to • translation • rotation • scaling
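The invariance claimed above rests on a simple fact: translation, rotation, and uniform scaling leave the interior angles of a triangle unchanged. The following is a minimal sketch of that fact for a single triangle, not the deck's own code. The helper names `triangle_angles` and `similarity_transform` and the sample coordinates are illustrative assumptions.

```python
import math

def triangle_angles(a, b, c):
    """Return the three interior angles (in degrees) of triangle abc, sorted."""
    def angle_at(p, q, r):
        # Angle at vertex p between the edges p->q and p->r.
        v1 = (q[0] - p[0], q[1] - p[1])
        v2 = (r[0] - p[0], r[1] - p[1])
        dot = v1[0] * v2[0] + v1[1] * v2[1]
        cross = v1[0] * v2[1] - v1[1] * v2[0]
        return math.degrees(math.atan2(abs(cross), dot))
    return sorted([angle_at(a, b, c), angle_at(b, c, a), angle_at(c, a, b)])

def similarity_transform(p, scale, theta, shift):
    """Scale a 2-D point, rotate it by theta radians, then translate it."""
    x, y = p
    xr = scale * (x * math.cos(theta) - y * math.sin(theta))
    yr = scale * (x * math.sin(theta) + y * math.cos(theta))
    return (xr + shift[0], yr + shift[1])

# Hypothetical sample triangle and an arbitrary similarity transform.
triangle = [(0.0, 0.0), (4.0, 0.0), (1.0, 3.0)]
moved = [similarity_transform(p, 2.5, math.radians(40), (10.0, -7.0))
         for p in triangle]

original_angles = triangle_angles(*triangle)
moved_angles = triangle_angles(*moved)

# The two angle lists agree up to floating-point rounding.
print(original_angles)
print(moved_angles)
```

Because the empty-circumcircle property of a Delaunay triangle is also preserved by these transforms, the same triangles, and therefore the same angles, are obtained before and after the transform.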
Anglograms • Spatial layout of point set • Anglogram • Computed by discretizing and counting the angles of the Delaunay triangles • Which angles are counted? • O(max(n, #bins)) algorithm • What is bin size?
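The anglogram computation described above can be sketched as follows, under stated assumptions. For brevity the triangulation uses a brute-force empty-circumcircle test (roughly O(n^4)) rather than the O(n log n) algorithm mentioned earlier, and it assumes points in general position (no four co-circular). The function names, the `which` options, and the default 10-degree bin size are illustrative choices, since the slide leaves both questions open.

```python
import math
from itertools import combinations

def circumcircle(a, b, c):
    """Return (center_x, center_y, radius_squared), or None if collinear."""
    d = 2.0 * (a[0] * (b[1] - c[1]) + b[0] * (c[1] - a[1]) + c[0] * (a[1] - b[1]))
    if abs(d) < 1e-12:
        return None
    a2, b2, c2 = (p[0] ** 2 + p[1] ** 2 for p in (a, b, c))
    ux = (a2 * (b[1] - c[1]) + b2 * (c[1] - a[1]) + c2 * (a[1] - b[1])) / d
    uy = (a2 * (c[0] - b[0]) + b2 * (a[0] - c[0]) + c2 * (b[0] - a[0])) / d
    return ux, uy, (a[0] - ux) ** 2 + (a[1] - uy) ** 2

def delaunay_triangles(points):
    """Brute force: keep index triples whose circumcircle contains no other point."""
    triangles = []
    for i, j, k in combinations(range(len(points)), 3):
        circle = circumcircle(points[i], points[j], points[k])
        if circle is None:
            continue
        ux, uy, r2 = circle
        if all((p[0] - ux) ** 2 + (p[1] - uy) ** 2 >= r2 * (1.0 - 1e-9)
               for n, p in enumerate(points) if n not in (i, j, k)):
            triangles.append((i, j, k))
    return triangles

def interior_angles(a, b, c):
    """Return the three interior angles of triangle abc in degrees."""
    def angle_at(p, q, r):
        v1 = (q[0] - p[0], q[1] - p[1])
        v2 = (r[0] - p[0], r[1] - p[1])
        dot = v1[0] * v2[0] + v1[1] * v2[1]
        cross = v1[0] * v2[1] - v1[1] * v2[0]
        return math.degrees(math.atan2(abs(cross), dot))
    return [angle_at(a, b, c), angle_at(b, c, a), angle_at(c, a, b)]

def anglogram(points, bin_size=10, which="all"):
    """Histogram of Delaunay triangle angles in bins of bin_size degrees."""
    bins = [0] * math.ceil(180 / bin_size)
    for i, j, k in delaunay_triangles(points):
        angles = sorted(interior_angles(points[i], points[j], points[k]))
        # Assumed options: count every angle, or only the two largest.
        chosen = {"all": angles, "two_largest": angles[1:]}[which]
        for angle in chosen:
            bins[min(int(angle // bin_size), len(bins) - 1)] += 1
    return bins

# Hypothetical point set: four hull points and one interior point.
points = [(0, 0), (10, 0), (11, 9), (1, 8), (5, 4)]
histogram = anglogram(points, bin_size=10)
print(histogram)
```

A smaller bin size gives a more discriminating but less noise-tolerant signature. Angles that fall close to a bin boundary can move to a neighboring bin under small perturbations of the points.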
A set of 26 points • Delaunay triangulations of the point set and its two transformed variants
Anglograms • Computation of color anglogram of an image • Divide image evenly into a number of M*N non-overlapping blocks • Each individual block is abstracted as a unique feature point labeled with its spatial location and dominant colors
Anglograms • Computation of color anglogram of an image • Point feature map • Normalized feature points, after adjusting any two neighboring feature points to a fixed distance • Construct Delaunay triangulation for each set of feature points labeled with identical color
Anglograms • Computation of color anglogram of an image • Compute anglogram based on each Delaunay triangulation • Color anglogram for image • Concatenating all the anglograms together
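The first two steps above (abstracting each block as a feature point, then grouping the feature points by color) can be sketched as below. This is an illustration under simplifying assumptions. The image is taken to be already quantized into integer color labels, each block keeps a single dominant label, and block grid coordinates stand in for the normalized feature-point positions. The function names and the toy image are hypothetical. The triangulation and concatenation steps are not shown.

```python
from collections import Counter

def point_feature_map(image, rows, cols):
    """Abstract each block of a labeled image as (row, col, dominant_color).

    `image` is a list of rows of quantized color labels (integers), and the
    grid has rows x cols non-overlapping blocks of equal size.
    """
    height, width = len(image), len(image[0])
    if height % rows or width % cols:
        raise ValueError("image size must be divisible by the block grid")
    bh, bw = height // rows, width // cols
    features = []
    for r in range(rows):
        for c in range(cols):
            labels = Counter(
                image[y][x]
                for y in range(r * bh, (r + 1) * bh)
                for x in range(c * bw, (c + 1) * bw)
            )
            dominant, _ = labels.most_common(1)[0]
            features.append((r, c, dominant))
    return features

def group_by_color(features):
    """Collect the grid coordinates of the feature points that share a color."""
    groups = {}
    for r, c, color in features:
        groups.setdefault(color, []).append((r, c))
    return groups

# Toy 4x4 image of color labels, divided into a 2x2 grid of blocks.
image = [
    [0, 0, 1, 1],
    [0, 2, 1, 1],
    [2, 2, 0, 0],
    [2, 2, 0, 1],
]
features = point_feature_map(image, rows=2, cols=2)
groups = group_by_color(features)
print(features)
print(groups)
```

Each entry of `groups` is one point set. Triangulating each set, computing its anglogram, and concatenating the results in a fixed color order would give the color anglogram.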
Anglograms Pyramid image
Anglograms Hue component
Anglograms Saturation component
Anglograms Point feature map
Anglograms Feature points of hue 2
Anglograms Delaunay triangulation of hue 2
Anglograms Delaunay triangulation of saturation 5
Anglograms Anglogram of saturation 5
Finding Latent Semantics • We want to transform low-level features to a higher level of meaning • Used for dimension reduction in QBIC • Searching in high-dimensional spaces • More importantly, it creates clusters of co-occurring features • So-called concepts
Finding Latent Semantics • Latent Semantic Analysis (LSA) was introduced to overcome a fundamental problem in textual information retrieval • Users want to retrieve on the basis of conceptual content • Individual words provide unreliable evidence about conceptual meanings • Synonymy • Many ways to refer to the same object • Polysemy • Most words have more than one distinct meaning
Finding Latent Semantics • Searching for documents concerning automobiles • Users tend to use the keyword automobile • A statistical analysis determines that the keywords automobile and car tend to co-occur • LSA will therefore also retrieve documents in which the keyword car appears but the keyword automobile does not
Finding Latent Semantics • Term-document association • It is assumed that there exists some underlying latent semantic structure in the data that is partially obscured by the randomness of term choice • By semantic structure we mean the correlation structure in which individual terms appear in documents • Semantic implies only the fact that terms in a document may be taken as referents to the document itself or to its topic • Statistical techniques are used to estimate this latent semantic structure, and to get rid of obscuring noise
Finding Latent Semantics • Singular-value decomposition (SVD) • Take a large matrix of term-document association • Construct a semantic space wherein terms and documents that are closely associated are placed near to each other • SVD allows the arrangement of space to reflect the major associative patterns and ignore smaller, less important influences • As a result, terms that did not actually appear in a document may still end up close to the document, if that is consistent with the major patterns of association • Position in the space serves as the semantic indexing • Retrieval proceeds by using the terms in a query to identify a point in the semantic space, and documents in its neighborhood are returned as relevant results
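The retrieval procedure described above can be sketched with NumPy as follows. This is a minimal illustration, not the authors' system. The five toy documents, raw term counts as matrix entries, the choice k = 2, the fold-in formula q^T U_k Σ_k^{-1} for placing the query, and cosine similarity as the neighborhood measure are all assumptions made for the example.

```python
import numpy as np

documents = [
    "automobile engine",      # d0
    "car engine",             # d1: never uses the word "automobile"
    "automobile car engine",  # d2
    "flower garden",          # d3
    "garden",                 # d4
]

# Term-document matrix of raw counts: one row per term, one column per document.
tokenized = [doc.split() for doc in documents]
terms = sorted({w for words in tokenized for w in words})
row_of = {t: i for i, t in enumerate(terms)}
A = np.zeros((len(terms), len(documents)))
for j, words in enumerate(tokenized):
    for w in words:
        A[row_of[w], j] += 1.0

# Keep only the k largest singular values and their singular vectors.
k = 2
U, s, Vt = np.linalg.svd(A, full_matrices=False)
Uk, sk, Vk = U[:, :k], s[:k], Vt[:k, :].T  # rows of Vk are document coordinates

def fold_in(query):
    """Map a query string to a point in the k-dimensional semantic space."""
    q = np.zeros(len(terms))
    for w in query.split():
        if w in row_of:
            q[row_of[w]] += 1.0
    return (q @ Uk) / sk

def cosine(a, b):
    """Cosine similarity, defined as 0.0 when either vector is zero."""
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom > 0 else 0.0

query_point = fold_in("automobile")
scores = [cosine(query_point, Vk[j]) for j in range(len(documents))]
ranking = sorted(range(len(documents)), key=lambda j: -scores[j])

for j in ranking:
    print(f"d{j}  {scores[j]:+.3f}  {documents[j]}")
```

In this toy corpus the document "car engine" ranks alongside the documents that contain "automobile", even though it shares no term with the query, while the gardening documents score near zero.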
Finding Latent Semantics • Term-document matrix • d documents • t terms • Represented by a $t \times d$ term-document matrix $A$ • Each document is represented by a column • document vector • Each term is represented by a row • term vector
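As a small illustration of the layout just described, the sketch below builds a term-document matrix from three toy documents. Raw occurrence counts are an assumption here, since the slides do not say which term weighting is used. The function name `term_document_matrix` and the documents are hypothetical.

```python
def term_document_matrix(documents):
    """Return (terms, A), where A[i][j] counts term i in document j."""
    tokenized = [doc.lower().split() for doc in documents]
    terms = sorted({word for words in tokenized for word in words})
    row_of = {term: i for i, term in enumerate(terms)}
    matrix = [[0] * len(documents) for _ in terms]
    for j, words in enumerate(tokenized):
        for word in words:
            matrix[row_of[word]][j] += 1
    return terms, matrix

documents = [
    "car engine repair",
    "automobile engine",
    "car automobile dealer",
]
terms, A = term_document_matrix(documents)

# Each row is a term vector; each column is a document vector.
for term, row in zip(terms, A):
    print(f"{term:<12}{row}")
```

Here t = 5 and d = 3, so `A` has five rows and three columns.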
Finding Latent Semantics • SVD is a dimension reduction technique • Reduced-rank approximation to both column space and row space • Find a rank-k approximation to matrix A with minimal change to that matrix for a given value of k • This decomposition exists for any matrix A
Finding Latent Semantics • SVD of a term-document matrix $A$ • $A = U \Sigma V^T$ • $A$ is $t \times d$ • $U$ is a $t \times r$ matrix with orthonormal columns, where $r$ is rank($A$) • The columns of $U$ are a basis for the column space of $A$ • The columns of $U$ are eigenvectors of the matrix $AA^T$ • $\Sigma$ is an $r \times r$ diagonal matrix having the singular values $\sigma_1 \ge \sigma_2 \ge \dots \ge \sigma_r > 0$ of $A$ in order along its diagonal • $\Sigma^2$ is the matrix of nonzero eigenvalues of $AA^T$ or $A^TA$ • $V^T$ is an $r \times d$ matrix with orthonormal rows • The rows of $V^T$ are a basis for the row space of $A$ • The columns of $V$ are eigenvectors of the matrix $A^TA$
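The properties listed above can be checked numerically. The sketch below uses NumPy's `numpy.linalg.svd` on a small made-up term-document matrix. Truncating to the numerical rank r with a 1e-10 threshold is an assumption for the example.

```python
import numpy as np

# Hypothetical 5-term x 3-document count matrix (t = 5, d = 3).
A = np.array([
    [0, 1, 1],
    [1, 0, 1],
    [0, 0, 1],
    [1, 1, 0],
    [1, 0, 0],
], dtype=float)

U, s, Vt = np.linalg.svd(A, full_matrices=False)

# Keep only the r = rank(A) nonzero singular values and their vectors.
r = int(np.sum(s > 1e-10))
U, s, Vt = U[:, :r], s[:r], Vt[:r, :]
Sigma = np.diag(s)

# A is recovered exactly (up to rounding) from the three factors.
reconstruction_ok = np.allclose(U @ Sigma @ Vt, A)

# U has orthonormal columns and V^T has orthonormal rows.
columns_orthonormal = np.allclose(U.T @ U, np.eye(r))
rows_orthonormal = np.allclose(Vt @ Vt.T, np.eye(r))

# The squared singular values are the nonzero eigenvalues of A^T A.
eigenvalues = np.sort(np.linalg.eigvalsh(A.T @ A))[::-1][:r]
squares_match = np.allclose(s ** 2, eigenvalues)

print(U.shape, Sigma.shape, Vt.shape)
print(reconstruction_ok, columns_orthonormal, rows_orthonormal, squares_match)
```

With `full_matrices=False`, NumPy returns the economy-size factors, which match the $t \times r$, $r \times r$, and $r \times d$ shapes once columns belonging to zero singular values are dropped.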
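Finding Latent Semantics • Dimensions of the factors: $A$ ($t \times d$) = $U$ ($t \times r$) · $\Sigma$ ($r \times r$) · $V^T$ ($r \times d$)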
Finding Latent Semantics • A special rank-k approximation, $A_k$ • $A_k = U_k \Sigma_k V_k^T$ • $U_k$ • First $k$ columns of $U$ • $\Sigma_k$ • First $k$ diagonal values of $\Sigma$ • $V_k^T$ • First $k$ rows of $V^T$
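The truncation above can be sketched in a few lines of NumPy. The matrix and the helper name `rank_k_approximation` are illustrative assumptions. The final check relies on the Eckart-Young theorem, which states that the Frobenius-norm error of this approximation equals the square root of the sum of the squared discarded singular values.

```python
import numpy as np

def rank_k_approximation(A, k):
    """Return (A_k, singular_values), where A_k = U_k Sigma_k V_k^T."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    Ak = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]
    return Ak, s

# Hypothetical 5-term x 3-document count matrix of rank 3.
A = np.array([
    [0, 1, 1],
    [1, 0, 1],
    [0, 0, 1],
    [1, 1, 0],
    [1, 0, 0],
], dtype=float)

k = 2
A2, s = rank_k_approximation(A, k)

# Frobenius-norm distance between A and its rank-2 approximation.
error = np.linalg.norm(A - A2)
# Error predicted from the discarded singular values.
expected_error = np.sqrt(np.sum(s[k:] ** 2))

print(np.linalg.matrix_rank(A2), error, expected_error)
```

Note that `A2` has the same shape as `A` but is generally dense: entries that were zero in `A` can become nonzero, which is how a term can end up associated with a document it never appeared in.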
Finding Latent Semantics • Reduce the rank to 3
Finding Latent Semantics Query Score
Finding Latent Semantics Query Score