Multimedia Information Retrieval
"It is of the highest importance, in the art of detection, to be able to recognize out of a number of facts which are incidental and which are vital ..." - Sherlock Holmes
Presented by Dipti Vaidya
Overview of the Presentation
• Overview of Multimedia Information Retrieval
• Research Issues and Approaches
• Case Studies of Two of These Approaches:
  - Content-Based Approximate Picture Retrieval
  - A Knowledge-Based Approach for Retrieving Images by Content
• What next?
Multimedia Information Retrieval
• Types of associated information:
  - Content-independent metadata (CIM): format, author's name, date
  - Content-dependent metadata (CDepM): low-level features concerned with perceptual facts, e.g. color, texture, shape, spatial relationships
  - Content-descriptive metadata (CDesM): high-level content semantics, e.g. "good weather"
New Generation MMIR
• Retrieval not only by concepts but also by perception of visual content
• Objective measurements of visual content and appropriate similarity models
• Automatic extraction of features from raw data by image processing, speech recognition, pattern recognition, and other computer vision techniques
Image Retrieval
• By perceptual features:
  - For each image in the database, a set of features is computed
  - The image database is queried through visual examples authored by the user or extracted from image samples
  - Features are selected and a similarity measure is chosen
  - The system computes the degree of similarity, ranks the results, and applies relevance feedback
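The feature/similarity/ranking steps above can be sketched in a few lines. This is a minimal illustration, not the method of any system named in this presentation: it assumes a coarse RGB color histogram as the feature and histogram intersection as the similarity measure, and the function names and toy pixel data are hypothetical. Relevance feedback is not shown.

```python
from collections import Counter

def color_histogram(pixels, bins=4):
    """Normalized histogram over quantized (r, g, b) pixels with 0-255 channels."""
    step = 256 // bins
    counts = Counter((r // step, g // step, b // step) for r, g, b in pixels)
    total = len(pixels)
    return {bucket: n / total for bucket, n in counts.items()}

def histogram_intersection(h1, h2):
    """Similarity in [0, 1]: sum of bucket-wise minima of two normalized histograms."""
    return sum(min(v, h2.get(bucket, 0.0)) for bucket, v in h1.items())

def rank(query_pixels, database):
    """Return (image_id, similarity) pairs, most similar first."""
    q = color_histogram(query_pixels)
    scored = [(image_id, histogram_intersection(q, color_histogram(px)))
              for image_id, px in database.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

# Toy database: each "image" is just a list of RGB pixels (hypothetical data).
database = {
    "sky":    [(10, 20, 250)] * 8 + [(250, 250, 250)] * 2,
    "forest": [(10, 200, 20)] * 9 + [(90, 60, 30)] * 1,
}
query = [(0, 0, 255)] * 10   # a mostly blue visual example
ranking = rank(query, database)
```

In a real system the histograms would be computed once at database-creation time rather than per query.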
Research Issues
• Extraction of perceptual features (CDepM)
  - Color, texture, shape: because perception is subjective, there is no single best representation for a given feature
  - Segmentation
Comparison of existing methods:
Comparison of Existing Techniques
• Histogram matching: histograms are large, which complicates database creation; sensitive to brightness changes; lacks spatial information; partial matches are difficult
• Texture mapping: not applicable to non-textured regions; it is hard to decide which model to use for a given application
• Region-based mapping: results degrade badly when segmentation is not done properly
Research Issues
• To make CBIR truly scalable to large image collections, efficient multidimensional indexing techniques need to be explored
• Retrieval speed
• Multidimensional indexing techniques:
  - Bucketing algorithm
  - K-d tree
  - K-D-B tree
  - R-tree and its variants, R+-tree and R*-tree
Problems: these techniques cluster data so as to minimize the number of disk accesses per retrieval, but they do not consider semantic differences between image features; thus no global conceptual view of the image clustering can be provided, e.g. a LARGE, NEARBY tumor?
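To make the indexing idea concrete, here is a minimal in-memory k-d tree sketch with nearest-neighbor search over feature vectors. The feature values and image labels are hypothetical, and a production index would be disk-based. Note that the tree groups vectors purely by numeric proximity, which is exactly the limitation raised above: nothing in it knows what "large" or "nearby" means.

```python
def build_kdtree(points, depth=0):
    """Recursively split (vector, label) pairs on one coordinate per level."""
    if not points:
        return None
    axis = depth % len(points[0][0])
    points = sorted(points, key=lambda p: p[0][axis])
    mid = len(points) // 2
    return {
        "point": points[mid],
        "axis": axis,
        "left": build_kdtree(points[:mid], depth + 1),
        "right": build_kdtree(points[mid + 1:], depth + 1),
    }

def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest(node, target, best=None):
    """Return the stored (vector, label) pair closest to target."""
    if node is None:
        return best
    if best is None or sq_dist(node["point"][0], target) < sq_dist(best[0], target):
        best = node["point"]
    diff = target[node["axis"]] - node["point"][0][node["axis"]]
    near, far = (node["left"], node["right"]) if diff < 0 else (node["right"], node["left"])
    best = nearest(near, target, best)
    # Search the far side only if the splitting plane is closer than the best match.
    if diff ** 2 < sq_dist(best[0], target):
        best = nearest(far, target, best)
    return best

# Hypothetical 2-D feature vectors for four images.
features = [((0.1, 0.9), "img_a"), ((0.8, 0.2), "img_b"),
            ((0.5, 0.5), "img_c"), ((0.9, 0.9), "img_d")]
tree = build_kdtree(features)
match = nearest(tree, (0.75, 0.25))
```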
MMIR Systems
• QBIC
• VisualSEEk and WebSEEk
• MARS
• Photobook
Content-Based Approximate Picture Retrieval - A. Prasad Sistla, Clement Yu (UIC)
Problem Formulation
• The objective is to find the pictures stored in the system that are most similar to the user's description
• A similarity measure is used to find the degree to which two descriptions match
Solution Approach
• Each picture in the database has associated metadata that describes its content
• The metadata contains information about the objects in the picture, their properties, and the relationships among them
• A knowledge base containing a thesaurus and certain rules is used to make deductions
Representation of Pictures
A picture can be represented by an E-R graph as follows:
• Entities: objects identified in a picture
• Relationships: associations between the identified objects
• Attributes: properties (color, size, etc.) that qualify or characterize objects
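One way to hold such an E-R description in memory is sketched below. The class and field names are hypothetical, and the attribute assignments in the sample picture are illustrative assumptions rather than a transcription of any figure in this presentation.

```python
from dataclasses import dataclass, field

@dataclass
class Entity:
    name: str
    attributes: dict = field(default_factory=dict)   # e.g. {"color": "blue"}

@dataclass
class Relationship:
    name: str
    subject: Entity   # entity the relationship starts from
    obj: Entity       # entity the relationship points to

@dataclass
class Picture:
    pid: int
    entities: list
    relationships: list

    def related(self, name):
        """Return (subject, object) name pairs for every relationship called `name`."""
        return [(r.subject.name, r.obj.name)
                for r in self.relationships if r.name == name]

# Illustrative picture description (assumed values).
river = Entity("river", {"color": "blue"})
bridge = Entity("bridge")
tree = Entity("tree", {"height": "tall", "color": "green"})
picture = Picture(
    pid=1,
    entities=[river, bridge, tree],
    relationships=[Relationship("over", bridge, river),
                   Relationship("left of", tree, bridge)],
)
pairs = picture.related("over")
```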
Example
[Figure: E-R graph of a sample picture. Entities: BRIDGE, RIVER, TREE, HILL. Relationships: Over, Right Of, Left Of. Attribute values shown: blue, tall, green, short.]
Relationship Classification
• Action
  - Mutual, e.g. handshake
  - Directional, e.g. moving
• Spatial
  - Mutual, e.g. adjacent
  - Directional, e.g. above
    - Subject entity
    - Object entity
User Interface
Queries are entered by the user via the interactive interface:
• Identify all the objects
• Identify the attributes of each object, e.g. color, size
• Choose the appropriate relationship from the list provided
• A visual language is also provided to specify part of a query
Mapping Queries to Formal Language Expressions Using UniSQL/X
Example: "Find the picture(s) of President Clinton"
  SELECT P.pid
  FROM Picture P
  WHERE P.Entities[X] AND X.Cname = 'president' AND X.Pname = 'Clinton'
where Cname = common name and Pname = proper name.
Plan to write 2 more query examples…
Retrieval
• Two pictures are similar if both the corresponding entities and the corresponding relationships are similar
Similarity of 2 entities:
• The names of the entities are the same, are synonyms, or are related in an IS-A hierarchy
• The attribute values of the 2 entities do not conflict
Similarity of 2 Entities
Let e be a user-specified entity and E be an entity stored in the system:
  e = {n, a_1, a_2, ..., a_k}
  E = {N, A_1, A_2, ..., A_k}
where n and N are the names of the entities e and E respectively, and a_i and A_i are the i-th attribute values of the corresponding entities. The similarity of the entities can now be defined as:
$$\mathrm{Sim}(e, E) = \frac{1}{2}\left(\mathrm{Sim}(n, N) + \frac{1}{k}\sum_{i=1}^{k}\mathrm{Sim}(a_i, A_i)\right)$$
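A small worked example of the entity-similarity formula may help. The sketch below takes the name and attribute similarity functions as parameters and, purely for illustration, plugs in an exact-match stand-in for both; that stand-in is an assumption, not the measure used by the authors.

```python
def entity_similarity(e, E, name_sim, attr_sim):
    """Sim(e, E) = 1/2 * (Sim(n, N) + (1/k) * sum_i Sim(a_i, A_i))."""
    (n, attrs), (N, Attrs) = e, E
    k = len(attrs)
    attr_part = sum(attr_sim(a, A) for a, A in zip(attrs, Attrs)) / k if k else 0.0
    return 0.5 * (name_sim(n, N) + attr_part)

# Stand-in component similarity (assumption): 1 for equal values, 0 otherwise.
def exact(x, y):
    return 1.0 if x == y else 0.0

query_entity = ("tree", ["green", "tall"])     # (name, attribute values)
stored_entity = ("tree", ["green", "short"])
score = entity_similarity(query_entity, stored_entity, exact, exact)
# Names match (1.0), one of two attributes matches (0.5): 0.5 * (1.0 + 0.5) = 0.75
```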
Similarity of 2 Entity Names
1. The 2 names are the same => Sim(n,N) = 1
2. The 2 names are synonyms => Sim(n,N) > 0
3. The 2 names are antonyms => Sim(n,N) = -inf
4. One or both of the names are NULL => Sim(n,N) = 0
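The four rules translate directly into a lookup against a thesaurus. In this sketch the thesaurus entries and the synonym score of 0.8 are hypothetical (the slide only requires a positive value), and the score of 0 for unrelated names is an assumption, since the slide does not cover that case.

```python
NEG_INF = float("-inf")

# Hypothetical thesaurus entries; a real knowledge base would be far larger.
SYNONYMS = {frozenset({"boat", "ship"}), frozenset({"hill", "mound"})}
ANTONYMS = {frozenset({"day", "night"})}
SYNONYM_SCORE = 0.8   # assumed value; the slide only requires Sim > 0

def name_similarity(n, N):
    if n is None or N is None:
        return 0.0              # rule 4: a NULL name contributes nothing
    if n == N:
        return 1.0              # rule 1: identical names
    pair = frozenset({n, N})
    if pair in SYNONYMS:
        return SYNONYM_SCORE    # rule 2: synonyms
    if pair in ANTONYMS:
        return NEG_INF          # rule 3: antonyms rule the match out
    return 0.0                  # unrelated names: assumed, not given on the slide
```

Because an antonym pair yields negative infinity, any average that includes it is also negative infinity, so a single conflict is enough to eliminate a candidate picture.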
Similarity of 2 Attributes
We say that an attribute is neighborly if there is a closeness predicate that determines, for any 2 values of the attribute, whether the 2 values are close or not.
For example, the "age" attribute is neighborly: it can take one of the values "very young", "young", "middle age", "old", and "very old". Two values are considered "close" if they occur next to each other in the list.
Similarity of 2 Neighborly Attributes
• Sim(a,A) = w if a = A
• Sim(a,A) = cw if a and A are close (but not equal)
• Sim(a,A) = -inf if a and A are not close
Here, w is determined using the inverse document frequency method and c is a positive constant less than 1.
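For an ordered value list such as the "age" example, the closeness predicate reduces to checking adjacent positions. The sketch below assumes w = 1.0 and c = 0.5 as placeholder values; in the approach described here w would come from inverse document frequency, which is not computed in this example.

```python
NEG_INF = float("-inf")
AGE_SCALE = ["very young", "young", "middle age", "old", "very old"]

def neighborly_similarity(a, A, scale, w=1.0, c=0.5):
    """w for equal values, c*w for adjacent values on the scale, -inf otherwise."""
    if a == A:
        return w
    if abs(scale.index(a) - scale.index(A)) == 1:
        return c * w            # "close": the values are neighbors in the list
    return NEG_INF              # not close: the attribute values conflict

same = neighborly_similarity("young", "young", AGE_SCALE)
close = neighborly_similarity("young", "middle age", AGE_SCALE)
far = neighborly_similarity("young", "old", AGE_SCALE)
```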
Similarity of 2 Relationships
Let r be a relationship among entities specified in the user's description, and R be a relationship among entities in a picture stored in the system:
  r = {n, (e_1,t_1), (e_2,t_2), ..., (e_m,t_m)}
  R = {N, (E_1,T_1), (E_2,T_2), ..., (E_m,T_m)}
where t_i and T_i are entity types (O = object, S = subject).
$$\mathrm{Sim}(r, R) = \frac{1}{3}\left(\mathrm{Sim}(n, N) + \frac{1}{m}\sum_{i=1}^{m}\mathrm{Sim}(e_i, E_i) + \frac{1}{m}\sum_{i=1}^{m}\mathrm{Sim}(t_i, T_i)\right)$$
Sim(t_i,T_i) = 1 if t_i = T_i
Sim(t_i,T_i) = 0 if one is type N and the other is not
Sim(t_i,T_i) = -inf if one is type S and the other is type O
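The relationship formula can be exercised the same way as the entity formula. In this sketch the members of the two relationships are assumed to be already aligned by position, and an exact-match stand-in is used for the name and entity similarities; both are simplifying assumptions made for illustration.

```python
NEG_INF = float("-inf")

def type_similarity(t, T):
    if t == T:
        return 1.0
    if "N" in (t, T):
        return 0.0      # one side is type N, the other is not
    return NEG_INF      # one is subject (S), the other object (O)

def relationship_similarity(r, R, name_sim, entity_sim):
    """Sim(r, R) = 1/3 * (Sim(n, N) + mean Sim(e_i, E_i) + mean Sim(t_i, T_i))."""
    (n, members), (N, Members) = r, R
    m = len(members)
    ent = sum(entity_sim(e, E) for (e, _), (E, _) in zip(members, Members)) / m
    typ = sum(type_similarity(t, T) for (_, t), (_, T) in zip(members, Members)) / m
    return (name_sim(n, N) + ent + typ) / 3

# Stand-in component similarity (assumption): 1 for equal values, 0 otherwise.
def exact(x, y):
    return 1.0 if x == y else 0.0

query_rel = ("over", [("bridge", "S"), ("river", "O")])
stored_same = ("over", [("bridge", "S"), ("river", "O")])
stored_flipped = ("over", [("bridge", "O"), ("river", "S")])

sim_same = relationship_similarity(query_rel, stored_same, exact, exact)
sim_flipped = relationship_similarity(query_rel, stored_flipped, exact, exact)
```

Swapping subject and object drives the score to negative infinity, so "bridge over river" never matches "river over bridge".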
A Sample Retrieval: E-R Diagram (Query)
[Figure: E-R diagram of the query. Entities: River, Mountain, Sail Boat, Tree, Sailor, Fisherman. Relationships: In, By. Attribute values shown: blue, black, white, brown, green; small, medium, large; moving, still. Other labels: Lu,mu,ru; Ll,ml,rl; rl.]
A Sample Retrieval: E-R Diagram (DB-pic1)
[Figure: E-R diagram of stored picture 1. Entities: Bridge, River, Tree, Sky, Mountain. Relationships: Over, Right Of, Left Of, Above. Attribute values shown: blue, green, brown; small; moving, still, stationary. Other labels: Lm,mm,rm; Rm,ru; mm; Lm,mm; Lu,um,ru.]
A Sample Retrieval: E-R Diagram (DB-pic2)
[Figure: E-R diagram of stored picture 2. Entities: Tree, River, Sail Boat, Person (two instances), Beach, Mountain. Relationships: Behind, In, On. Attribute values shown: blue, black, green, white, brown; very large; 3d-irregular; moving, still, stationary. Other labels: Ll,ml,rl; Lm,mm,rm; Lu,mu,ru; ll.]
Research Issues
• Not very feasible for large databases
• Coding the descriptions of the pictures is a labor-intensive task
• Handling type matches, e.g. clouds "in" the sky versus a "cloudy" sky
• Deduction of spatial relationships
A Knowledge-Based Approach for Retrieving Images by Content - Chih-Cheng Hsu, Wesley Chu (UCLA)
Overview
• A knowledge-based spatial image model (KSIM) supports queries with semantic and similar-to predicates
• Objects of interest in the images are represented by contours segmented from the images
• Image content is derived from these object contours using domain-specific knowledge
• These image features are classified using MDISC and represented by type abstraction hierarchies (TAHs) for knowledge-based query processing
KSIM
• A three-layered model is used to integrate the image representations and image features with image content interpretation knowledge:
  - The Representation Layer (RL)
  - The Semantic Layer (SL)
  - The Knowledge Layer (KL)
Image Representation
• Raw images are stored in the RL
• Image objects are represented in the RL by contours, which can be segmented manually or semi-automatically
Difficulty: fully automated segmentation of these objects has not yet been achieved, which hinders the deployment of this technique.
Semantic Layer
• The shape model and spatial relationship model in the SL are used to extract image features from the contours
Example:
  Object feature | Conceptual terms
  Tumor.size | small, medium, large
  Tumor.roundness | circular, non-circular
  Lateral_ventricle.L_R_Symmetry | symmetric, upper_protrusion_pressed_right, upper_protrusion_pressed_left
Semantic Layer (continued)
• Spatial relationship table:
  Spatial relationship | Rep. features | Defined semantic terms
  SR(t,b) | (xc, yc, ia) | slightly occupied, extremely occupied
  SR(t,l) | (oc, dc, xc, yc) | nearby, far_away
  … | … | …
Mapping Queries to Formal Language Expressions Using CoBase
Example query: "Find large tumor nearby the lateral ventricle"
  SELECT patientWithImage(patient: t.patient, image: t.image)
  FROM Tumors t, Lateral_ventricle l
  WHERE t NEARBY l AND t.size IS 'large'
Query Interpretation via TAH
• The concept in a TAH node is represented as the value range of the features
• TAH nodes can be labeled with conceptual terms (e.g. large, small) to represent specific knowledge
• A TAH directory stores information such as object names, sets of features, spatial relationships, user type, and purpose of the TAH
• Based on this information, the system selects and retrieves the appropriate TAH for processing the query
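The idea of a TAH node as a labeled value range can be sketched with a small tree. The ranges (in millimeters) and the dictionary layout below are hypothetical and only illustrate how a conceptual term such as "large" might be translated into a range condition, and a measured value back into a term.

```python
# Hypothetical TAH for tumor size in mm; ranges and labels are illustrative only.
TUMOR_SIZE_TAH = {
    "label": "any", "range": (0.0, 100.0),
    "children": [
        {"label": "small",  "range": (0.0, 10.0),   "children": []},
        {"label": "medium", "range": (10.0, 30.0),  "children": []},
        {"label": "large",  "range": (30.0, 100.0), "children": []},
    ],
}

def find_node(node, label):
    """Depth-first search for the node carrying a conceptual term."""
    if node["label"] == label:
        return node
    for child in node["children"]:
        found = find_node(child, label)
        if found:
            return found
    return None

def term_to_range(tah, label):
    """Translate a conceptual term into the value range of its TAH node."""
    node = find_node(tah, label)
    return node["range"] if node else None

def value_to_term(tah, value):
    """Most specific label whose half-open range [low, high) contains the value."""
    low, high = tah["range"]
    if not (low <= value < high):
        return None
    for child in tah["children"]:
        term = value_to_term(child, value)
        if term:
            return term
    return tah["label"]

large_range = term_to_range(TUMOR_SIZE_TAH, "large")
term = value_to_term(TUMOR_SIZE_TAH, 12.5)
```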
TAH • Example:
Query Processing
• Query analysis and feature selection phase
• Knowledge-based content matching phase
• Query relaxation phase
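The matching and relaxation phases can be illustrated with a toy loop: match a range condition, and if too few answers come back, widen the range by moving up a TAH. Everything below is a simplified assumption (a single numeric feature, a precomputed leaf-to-root list of ranges, hypothetical data), and the analysis and feature selection phase is not modeled.

```python
def relax(query_range, tah_path):
    """Return the next wider range on a leaf-to-root TAH path, or None at the root."""
    for wider in tah_path:
        covers = wider[0] <= query_range[0] and wider[1] >= query_range[1]
        if covers and wider != query_range:
            return wider
    return None

def answer(values, query_range, tah_path, min_answers=1):
    """Match the range; relax it step by step until enough answers are found."""
    current = query_range
    while current is not None:
        hits = [v for v in values if current[0] <= v < current[1]]
        if len(hits) >= min_answers:
            return current, hits
        current = relax(current, tah_path)
    return None, []

# Hypothetical data: tumor sizes in mm and a leaf-to-root path of TAH ranges.
sizes = [4.0, 12.0, 22.0]
path = [(30.0, 100.0), (10.0, 100.0), (0.0, 100.0)]
used_range, hits = answer(sizes, (30.0, 100.0), path)
```

Here the original condition ("large", 30 to 100 mm) finds nothing, so it is relaxed once to 10 to 100 mm, which returns two approximate answers.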
Flow Diagram of Query Processing
[Figure: flow diagram. Components shown: Query, Query Processing, Satisfactory Answers, Post Processing, Relaxation Manager (TAHs, user model), Query Modification.]
A Sample Retrieval Example
• User: brain surgeon
• Mandatory matched objects: lesion and brain
• Optional matched objects: lateral ventricle and frontal lobe
• Relaxation order: SR(l,lv) and SR(l,f) are more important than SR(l,b)
[Figure: Lesion connected to Frontal Lobe via SR(l,f), to Brain via SR(l,b), and to Lateral Ventricle via SR(l,lv); the edges carry the numeric labels 1, 2, and 1.]
Research Issues
• Using XML metadata to describe images, and using this data to generate web pages automatically