520 likes | 663 Views
Semantic Multimedia. Steffen Staab, Univ. Sheffield March 28, 2006. My private challenge. ….and more than 17,000 other images and mini-movies. Send family recent christmas photos… Send friends pictures that include them…
E N D
Semantic Multimedia Steffen Staab, Univ. Sheffield March 28, 2006
My private challenge ….and more than 17,000 other images and mini-movies
Send family recent christmas photos… Send friends pictures that include them… Ask for pictures that depict my children at carnival with a big smile in order to make a presentation about semantic multimedia… Show a friend where we live… Exchange opinions about what are the best shots Record „photo copies“ of signed contracts Query for all architecture images built for X-Media proposal… List to be continued… Store pictures in a folder Query for picture name and date What I would like to do… vs What I can do The Semantic Gap
Strategies for Narrowing the Semantic Gap • Image understanding • Scene classification • People recognition (that, who) • Artifact recognition • Context understanding • Shared Annotation • User Feedback • Highly domain-dependent, but so far: • little domain knowledge Google Flickr Little explored wrt Semantics (e.g.Santini etal.)
Annotation is expensive as a dedicated process Allow for on-the-fly annotation from all existing applications Motivation Semantically Enabled App Arbitrary Application Semantic Metadata Repository Arbitrary Metadata Format
Retrieval strategies cannot be easily anticipated by users No fixed schema Motivation
Allow for on-the-fly annotation from all existing applications Annotation through a filesystem interface No fixed schema Avoid hierarchical filesystem organisation Concept
Use the Filesytem to Link Arbitrary Applications with a Metadata Repository Semantically Enabled App Arbitrary Application ? Virtual Filesystem OS Kernel Semantic Metadata Repository WebDAV FAT32
Queries filesystem paths correspond to queries each directory translates to a parametrized view on the metadata repository views can be nested Example “/tag beach/depictssteffen“ translates into depicts(„steffen“, tag(„beach“, /)) “/” represents metadata repository Metadata Every file is associated with tags given by names of super directories Allowing for linking of directories Multiple hierarchies Directories represent… at creation time at exploration time
listing a directory (= view) returns all information objects corresponding to the view information objects may be files, bookmarks, chapters, ... Class handlers implement file system operations (read, write, ...) for a class of information objects (files, bookmarks,…) RDFFS maps filesystem operations to operations on a RDF-repository Views and Class Handlers provide tagging for files Annotated Information Objects
Architecture II Semantically Enabled App TagFS Arbitrary Application Arbitrary Application Views Class Handlers RDFFs Virtual Filesystem OS Kernel Fuse Semantic Metadata Repository WebDAV FAT32
Take Picture of Steffen and Simon in Sheffield and store it at /tag image/tag sheffield/depicts simon/picture1 Take Picture of Steffen in Rome and store it at /tag rome/depicts steffen/tag image/picture2 Record video of Sheffield and store it at /tag video/tag sheffield/video1 notice that picture1 also depicts Steffen. link picture1 to /depicts Steffen Usage Examples: Annotate
all documents related to Sheffield: /tag sheffield picture1 video1 all images depicting Steffen: /tag image/depicts steffen picture1 picture2 Usage Examples: Retrieve
Use Case: Virtual Organizations • Project members need to share, i.e. distribute and retrieve, confidential information among each other • Different members have different roles, e.g. manager, researcher, that require different views onto the shared data Lucy@Sheffield Sergej@Koblenz X-Media X-Media Deliverables Contracts Work Package 1 Work Package 2
Use Case: Image Sharing • User X has many images • X wants to share some images publicly, some only with dedicated persons, and some not at all • Due to the amount of images, uploading many images to a central repository is not an option
Use Case: Image Sharing • Neither X nor X's friends want to pay for a dedicated server or hand over their images to a server managed by a 3rd party • They would like to user their own storage
SEA Purpose: • Decentralizedinformation sharing, e.g. image sharing • Tagging as means for • Personal and collaborative organization of information • Information retrieval • Access control
SEA: Architecture • RDF store for meta data • DHT implementation for efficient distribution of shared information
SEA: Features • Autonomy for information distribution and sharing • Flexible information organization • Simple setup and administration of sharing environment • Privacy, data security • Ad-hoc collaboration
Centralized vs. Distributed Sharing with SEA Conventional informationsharing characterized by centralization SEA follows a distributed approach
Access Control in SEA • Access control mechanisms allow to define with whom to share data • based on taggings • e.g. everything tagged as „public“ is public • e.g. everything tagged as „forSteffen“ is accessible for Steffen • based on rules for access
SEA: Data Model • Ontological meta model
Solution Better use context and background knowledge
Region-Based Image Labelling • Find semantically meaningful regions • Label them with concepts • Infer higher level annotations from initial labellings • Provide user-centred, semantic annotation The overall aim is to improve the access to multimedia content.
Scene classification Image Labeling: 1. Initial, region-based Output: • Segment Classification • Hypothesis set of possible labels for image segments • Degree of confidence
Scene classification Image Labeling: 1. Initial, region-based Output of Person/Face Detection: • Bounding boxes for detected persons/faces • Degree of confidence
B2={l1,l2} A1 A2 A1 over A2 A1 overlaps B2 … Spatial & topological information Multi-Tier Image Model Segments Label hypotheses Confidence values Bounding boxes Classification of picture ….
Multimedia Reasoning • Aim, now: • integrate available information towards • global, • consistent and • user-oriented annotation • 3 tasks: • Consistency Checking • Region Merging • Generation of a higher-level, user-centered annotation Current Focus
Constraint Satisfaction Problem (CSP) Check that label(s) of a region are consistent wrt labels of neighboring regions Ideally: Leaves one correct label per region More often: more than one label remains decision in favor of highest confidence values Process consists of Transformation of multi-tier description into a CSP Application of constraint reasoning to solve the CSP Computing the “best” labeling using the confidence values Consistency Checking
Constraint Satisfaction Problem (CSP) Definition • Consists of set of variables and set of constraints relating several variables • Each variable may have values from it’s domain • A constraint defines which values can be assigned to a variable depending on the related variables • Standard methods exist to solve the CSP • Two steps: • Consistency checking, i.e. removal of values from the domain that never satisfy the constraints • Computation of full solutions using search algorithms(i.e. model generation)
Constraint Satisfaction Problems Example: • Variables: x, y, z • Domains: D(x) = {1, 2, 3}, D(y) = {2, 3, 4}, D(z) = {2, 3, 4, 5} • Constraints: x >= y, y >= z • After consistency checking: D(x) = {2, 3}, D(y) ={2, 3}, D(z) = {2, 3} • Concrete Solutions (models): (2, 2, 2), (3, 2, 2), … • Not all possible combinations of domain values are a solution!
Image Labelling as a CSP • Each segment s is transformed into a variable vs • Initial Labellings Ls are the domains of the segment variables, D(vs)=Ls • For each spatial relation type, a constraint sp-rel(v,w) is defined • the spatial constraints define which value combinations are legal for the given relation • e.g. left-of(v,w):={(sea,sea),(sky,sky)}, but not (sky,sea) • If two segments s,t are related with a spatial relation sp-rel, a corresponding spatial constraint sp-rel(vs,vt) is instantiated
Example Initial image Segmentation Mask
Confidence Values
Sky can not be left of Sea Confidence Values Spatial Relations
Sky can not be left of Sea Confidence Values Spatial Relations
Sea can not be above Sky Confidence Values Spatial Relations
Sea can not be above Sky Confidence Values Spatial Relations
Definition of Spatial Constraints • Spatial constraints form an integral part of the domain knowledge used for multimedia reasoning • Currently they are explicitly defined by a domain expert • But: • Seems not feasible for large amounts of concepts and relations • Preferably each constraint should be accompanied by a confidence value, which can hardly be defined by an expert • Idea: • Learn constraints from pre-annotated images • Allow for later refinement during run-time by user interaction. • Planned extension of M-OntoMat-Annotizer for this purpose • http://www.acemedia.org/aceMedia/results/software/m-ontomat-annotizer.html
Current Status: • Use of segment classification • Very recently: integration of person/face detection module
Initial Evaluation • An evaluation framework for region-based image labelling was defined within the aceMedia project. • Ground Truth is defined on a grid-basis • a N x N grid is layered on top of each image • each cell is annotated with all depicted concepts • For evaluation the segments of the segmentation, or the bounding boxes, are mapped to the respective cells. • For each concept it is counted how often • the concept was found correctly, i.e. a correspondence between the segment label and a grid label is found • the concept was found in general • the concept exists in the GT • Based on these values precision and recall for each concept, and the overall process can be defined.
Evaluation Results • Since the method is based on content analysis modules, we evaluated the improvement reached by applying the constraint reasoning to the segment classification. • First, precision, recall and the F-Measure were computed for the segment classification • Then, for the CSP method applied to the initial labelling • Finally the average improvement was calculated
Evaluation Results Segment Classification Constraint Reasoning • Set Up: • Evaluation with ~60 images • A 8x8 grid was used for the ground truth • The segmentation was set up to always produce 8 segments per image • Results are promising, showing an 10% increase in average. • However, results in the overall performance are needed
Next Steps • Soft Constraint Reasoning • Fuzzy Constraints to integrate the confidence values into the reasoning • Incremental Constraints to flexibly add constraints during reasoning • both should provide for more robust results and lead to better reduction of the initial label sets • Incorporation of a region merging step • Would enable an iterative process and a knowledge-based segmentation • Derivation of a higher-level annotation • Currently a simple combination of confidence values is applied • the maximum degree for each concept is kept, and each concept is added to the final annotation • later also relations and additional concepts should be inferred