
MediaView: A Semantic Multimedia Database Model

This paper presents MediaView, a model that bridges the semantic gap between multimedia systems and databases for effective management of multimedia data. It explores the challenges of semantic modeling in multimedia and introduces a solution to address the context-dependency and modality-independency of semantics. The MediaView mechanism and its basic concepts, including media views and semantic graphs, are discussed along with view operators for query processing and navigation. Real-world applications and the future potential of MediaView are also explored.


Presentation Transcript


  1. MediaView -- Towards a “Semantic” Multimedia Database Model • Qing Li, Dept of Computer Science, City University of Hong Kong

  2. Outline • Motivation & Introduction • Modeling Constructs • Logical Implementation • Real-World Applications • Conclusion

  3. State-of-the-art • Multimedia Systems and Applications • an explosive growth in recent years • demand on managing multimedia using databases • Database techniques for multimedia • data modeling • indexing • query processing • presentation & synchronization

  4. “Semantic Gap” • Semantics-intensive multimedia systems & applications require the semantic meaning of the data • Non-semantic multimedia data models model raw data and primitive properties (size, format, etc.) • The mismatch between the two is the semantic gap

  5. Semantic modeling of multimedia -- Why hard? • Context-dependency • Semantics is not a static, intrinsic property • The semantics of an object often depends on: • the application/user who manipulates the object • the role that the object plays • other objects in the same “context” • Example (figure labels): Van Gogh’s paintings; flower

  6. Why hard? (cont.) • Modality-independency • Media objects of different modalities may suggest similar or related semantic meanings. • Example: a single query returns results of several modalities (image, video, text). Text result: “Harry Potter has never been the star of a Quidditch team, scoring points while riding a broom far above the ground. He knows no spells, has never helped to hatch a dragon, and has never worn a cloak of invisibility.”

  7. MediaView -- A “Semantic Bridge” • An object-oriented view mechanism that bridges the semantic gap between multimedia systems and databases • Core concept: media view (MV) • a customized context for semantic interpretation of media objects (text docs, images, video, etc.) • media views collectively constitute the conceptual infrastructure of a multimedia system & application

  8. Architecture MediaView Mechanism

  9. Basic Concepts • A media view MVi can be represented as a triple: MVi = <Mi, Pi, Ri>, where: • Mi is the set of objects included in MVi as its members. Each object o ∈ Mi belongs to a certain source class, and different members of MVi may belong to different source classes. • Pi is a set of properties (attributes and methods) applied either to MVi itself (Piv) or to all of its members (Pim). • Ri is a set of relationships; each r ∈ Ri has the form <oj, ok, t>, which denotes a relationship of type t between members oj and ok in MVi. Ri itself may form a graph.
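
The triple above can be sketched as a small data structure. This is only an illustration under stated assumptions: the class names `MediaObject` and `MediaView`, the field names, and the sample objects are hypothetical and do not come from the slides.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class MediaObject:
    oid: str            # hypothetical object identifier
    source_class: str   # e.g. "Image", "Video", "Text"


@dataclass
class MediaView:
    name: str
    members: set = field(default_factory=set)         # Mi: member objects
    view_props: dict = field(default_factory=dict)    # Piv: view-level properties
    member_props: dict = field(default_factory=dict)  # Pim: oid -> {property: value}
    relationships: set = field(default_factory=set)   # Ri: (oid_j, oid_k, type)

    def add_member(self, obj, **props):
        self.members.add(obj)
        self.member_props.setdefault(obj.oid, {}).update(props)

    def relate(self, oj, ok, rel_type):
        # Both endpoints must already be members of this view.
        if oj not in self.members or ok not in self.members:
            raise ValueError("both objects must be members of the view")
        self.relationships.add((oj.oid, ok.oid, rel_type))


# Members may come from different source classes.
painting = MediaObject("img1", "Image")
biography = MediaObject("txt1", "Text")
vg = MediaView("Van Gogh", view_props={"topic": "painter"})
vg.add_member(painting, caption="Sunflowers")
vg.add_member(biography)
vg.relate(biography, painting, "describes")
```

Storing relationships as `(oid_j, oid_k, type)` tuples mirrors the `<oj, ok, t>` form, so the set can be read directly as a labeled edge list.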

  10. Basic Concepts An example…

  11. Basic Concepts Semantics-based data reorganization via media views

  12. Basic Concepts • Definition 5: The semantic graph (SG) is an undirected graph G = {V, E}, where V is a finite set of vertices and E is a finite set of edges. Each element Vi ∈ V corresponds to a multimedia object Oi in the database. E is a ternary relation defined on V × V × N. Each e = <Vi, Vj, n> ∈ E represents a semantic link of degree n between objects Oi and Oj, where n is the number of media views to which both objects belong. We define n as the correlation factor between Oi and Oj.

  13. Basic Concepts • Definition 6: The correlation matrix M = [Mij] is the adjacency matrix of the semantic graph. Specifically, each element Mij contains the correlation factor between Oi and Oj, with all diagonal elements set to zero.
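
As a rough sketch of Definitions 5 and 6, the correlation matrix can be computed by counting, for every pair of objects, the media views that contain both. The function name `correlation_matrix`, the representation of a view as a set of object ids, and the sample data are assumptions made for illustration.

```python
from itertools import combinations


def correlation_matrix(objects, views):
    """Return M where M[i][j] is the number of views containing both objects."""
    index = {oid: i for i, oid in enumerate(objects)}
    n = len(objects)
    matrix = [[0] * n for _ in range(n)]  # diagonal stays zero
    for members in views.values():
        known = sorted(m for m in members if m in index)
        for a, b in combinations(known, 2):
            i, j = index[a], index[b]
            matrix[i][j] += 1
            matrix[j][i] += 1  # the semantic graph is undirected
    return matrix


objects = ["o1", "o2", "o3"]
views = {"flower": {"o1", "o2"}, "van_gogh": {"o1", "o2", "o3"}}
M = correlation_matrix(objects, views)
# M == [[0, 2, 1], [2, 0, 1], [1, 1, 0]]
```

Here `o1` and `o2` share two views, so their correlation factor is 2, while `o3` shares only one view with each of them.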

  14. Basic Concepts Semantic Graph Model

  15. View Operators • A set of operators that take media views and view instances as operands. • Our intention is not to come up with a complete set of operators, but to focus on those that are indispensable in supporting queries and navigation over multimedia objects.

  16. View Operators (type-level) • V-overlap • syntax: <boolean> := v-overlap (<media view1>, <media view2>) • semantics: true, if and only if (∃ o ∈ O)(o ∈ extent(<media view1>) and o ∈ extent(<media view2>)) • Cross • syntax: {<object>} := cross (<media view1>, <media view2>) • semantics: {<object>} := {o ∈ O | o ∈ extent(<media view1>) and o ∈ extent(<media view2>)} • Sum • syntax: {<object>} := sum (<media view1>, <media view2>) • semantics: {<object>} := {o ∈ O | o ∈ extent(<media view1>) or o ∈ extent(<media view2>)} • Subtract • syntax: {<object>} := subtract (<media view1>, <media view2>) • semantics: {<object>} := {o ∈ O | o ∈ extent(<media view1>) and o ∉ extent(<media view2>)}
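
If the extent of a media view is modeled as a Python set of object ids (an assumption made here for illustration), the four type-level operators reduce to standard set operations. The function names follow the slide, except `sum_views`, which is renamed to avoid shadowing Python's built-in `sum`.

```python
def v_overlap(mv1, mv2):
    """True if and only if some object belongs to both extents."""
    return not mv1.isdisjoint(mv2)


def cross(mv1, mv2):
    """Objects that belong to both extents."""
    return mv1 & mv2


def sum_views(mv1, mv2):
    """Objects that belong to either extent."""
    return mv1 | mv2


def subtract(mv1, mv2):
    """Objects in the first extent but not in the second."""
    return mv1 - mv2


# Hypothetical extents, given as sets of object ids.
flowers = {"img1", "img2", "vid1"}
van_gogh = {"img2", "txt1"}

shared = cross(flowers, van_gogh)          # {"img2"}
only_flowers = subtract(flowers, van_gogh)  # {"img1", "vid1"}
```

Note that `v_overlap(a, b)` is equivalent to `bool(cross(a, b))`; `isdisjoint` simply avoids building the intersection.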

  17. View Operators (instance-level) • Class • syntax: <base class> := class (<view instance>) • semantics: <view instance> is an instance of <base class> • Components • syntax: {<object>} := components (<view instance>) • semantics: {<object>} := {o ∈ O | o is a component (direct or indirect) of <view instance>} • I-overlap • syntax: <boolean> := i-overlap (<view instance1>, <view instance2>) • semantics: true, if and only if (∃ o ∈ O)(o ∈ components(<view instance1>) and o ∈ components(<view instance2>))
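
Because components may be direct or indirect, the `components` operator amounts to a transitive closure over a part-of relation. The sketch below assumes that relation is available as a dictionary from an instance to its direct components; that dictionary and the sample instances are hypothetical.

```python
def components(instance, direct_parts):
    """All direct and indirect components of an instance (cycle-safe)."""
    seen = set()
    stack = list(direct_parts.get(instance, ()))
    while stack:
        part = stack.pop()
        if part not in seen:
            seen.add(part)
            stack.extend(direct_parts.get(part, ()))
    return seen


def i_overlap(inst1, inst2, direct_parts):
    """True if and only if the two instances share at least one component."""
    return not components(inst1, direct_parts).isdisjoint(
        components(inst2, direct_parts)
    )


# Hypothetical part-of relation between view instances and media objects.
direct_parts = {
    "slideshow": ["clip1", "img1"],
    "clip1": ["frame1", "frame2"],
    "album": ["img1", "img2"],
    "poster": ["img3"],
}
```

With this data, `components("slideshow", direct_parts)` includes the frames of `clip1` as indirect components, and `slideshow` overlaps with `album` through `img1`.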

  18. View Algebra • Functions -- derivation of new MVs from existing MVs • Heuristic Enumeration • Blind enumeration • Content-based enumeration • Semantics-based enumeration

  19. View Algebra • Algebra Operators • select from src-MV where <predicate> • project <property-list> from src-MV • intersect (src-MV1, src-MV2) • union (src-MV1, src-MV2) • difference (src-MV1, src-MV2)
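
The `select` and `project` operators above can be illustrated with a minimal sketch in which a source media view is a list of member records (dictionaries). That representation, the property names, and the sample records are assumptions; the slides do not specify them.

```python
def select(src_mv, predicate):
    """select from src-MV where <predicate>: keep members satisfying the predicate."""
    return [member for member in src_mv if predicate(member)]


def project(src_mv, property_list):
    """project <property-list> from src-MV: keep only the listed properties."""
    return [
        {prop: member[prop] for prop in property_list if prop in member}
        for member in src_mv
    ]


# Hypothetical source media view.
paintings = [
    {"oid": "img1", "artist": "Van Gogh", "modality": "image"},
    {"oid": "img2", "artist": "Monet", "modality": "image"},
]

by_van_gogh = select(paintings, lambda m: m["artist"] == "Van Gogh")
ids = project(by_van_gogh, ["oid"])  # [{"oid": "img1"}]
```

Under the same assumptions, `intersect`, `union`, and `difference` would operate on the member sets of the two source views, producing a new media view rather than a plain set of objects.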

  20. Comparison (vs. class)

  21. Comparison (vs. traditional object view)

  22. Logical Implementation • MediaView Construction • MediaView Customization • MediaView Evolution

  23. MediaViews Construction • Work with CBIR systems to acquire the knowledge from queries • Learn from previously performed queries • A multi-system approach to support multi-modality of media objects • Organize the semantics by following WordNet

  24. Why WordNet? • Different queries may vary greatly because users are free to choose their query keywords • We need an approach to organize that knowledge into a logical structure • A simple “context”: a concept in WordNet • Common media views: correspond to simple contexts • We provide all common media views, based on which users can build complex ones.

  25. Navigating the Multimedia Database • Navigating via semantic relationships of WordNet (semantic relationship: examples) • Synonymy (similar): pipe, tube • Antonymy (opposite): fast, slow • Hyponymy (subordinate): tree, plant • Meronymy (part): chimney, house • Troponymy (manner): march, walk • Entailment: drive, ride

  26. Navigating the Multimedia Database

  27. MediaViews Construction

  28. Two level MediaView Framework MediaView Customization

  29. MediaView Customization • Dynamically construct complex-context-based media views based on simple ones • An example complex context: “the Grand Hall in City University” • Several user-level operators are devised to support more complex/advanced contexts, besides the basic operators

  30. User-level Operators • INHERIT_MV(N: mv-name, NS: set-of-mv-refs, VP: set-of-property-ref, MP: set-of-property-ref): mv-ref • UNION_MV(N: mv-name, NS: set-of-mv-refs): mv-ref • INTERSECTION_MV(N: mv-name, NS: set-of-mv-refs): mv-ref • DIFFERENCE_MV(N1: mv-ref, N2: mv-ref): mv-ref

  31. Build a MediaView at Run-time • Example: find out info about "Van Gogh" • Who is "Van Gogh"? • What is his work? • Know more about his whole life. • Know more about his country. • See his famous painting "sunflower"

  32. Build a MediaView at Run-time • Who is “Van Gogh”? • Set vg = INHERIT_MV(“V. Gogh”, {<painter>}, name=“Van Gogh”, ); • What is his work? • Set vg_work = INTERSECTION_MV(“work”, {<painting>, vg}); • Know more about his whole life. • INTERSECTION_MV(“life”, {<biography>, vg}); • Know more about his country. • INTERSECTION_MV(“country”, {<country>, vg}); • See his famous painting “sunflower” • Set sunflower = INTERSECTION_MV(“sunflower”, {<sunflower>, <painting>}); Set vg_sunflower = INTERSECTION_MV(“vg_sunflower”, {vg_work, sunflower});
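
The run-time construction above can be imitated with a small sketch of the set-based user-level operators. Everything here is an assumption made for illustration: the `MV` class, the lowercase function names, and the toy member ids are hypothetical. `INHERIT_MV` is not modeled because the slides do not define its semantics, so the `vg` view is simply given as data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MV:
    name: str
    members: frozenset


def union_mv(name, views):
    return MV(name, frozenset().union(*(v.members for v in views)))


def intersection_mv(name, views):
    views = list(views)
    if not views:
        return MV(name, frozenset())
    return MV(name, frozenset.intersection(*(v.members for v in views)))


def difference_mv(v1, v2):
    # The slide's DIFFERENCE_MV takes no name, so one is derived here.
    return MV(f"{v1.name}-{v2.name}", v1.members - v2.members)


# Toy common media views (hypothetical member ids).
painting = MV("painting", frozenset({"img1", "img2", "img3"}))
sunflower_concept = MV("sunflower-concept", frozenset({"img2", "img4"}))
vg = MV("V. Gogh", frozenset({"img1", "img2", "txt1"}))

vg_work = intersection_mv("work", [painting, vg])
sunflower = intersection_mv("sunflower", [sunflower_concept, painting])
vg_sunflower = intersection_mv("vg_sunflower", [vg_work, sunflower])
```

Each call returns a new named view, so intermediate results such as `vg_work` can be reused in later steps, as the slide does when building `vg_sunflower`.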

  33. Authoring Scenario • The user creates a new media view named after the subject • All multimedia materials used in the document are put into this media view for further reference. • To collect the most relevant materials for authoring, the user performs the MediaView building process. • Import suitable media objects by browsing media views • Find other media views with similar topics to reference their manner and style of authoring. • Drag & Drop • “learning-from-references”

  34. Interface of Our Authoring System

  35. System Features • A Dynamic Environment • Helps a user select materials from the database to incorporate into the document • Query other similar media views for referencing the manner and/or style of authoring

  36. Real-World Applications • A Multimedia Recipe Database • Modeling basis • Personalized (context-aware) manipulation • Cross-media indexing and retrieval system • Novel way of annotating and retrieving media objects • Lead to new indexing strategies

  37. A Personalized Recipe Database System • People cannot live without food • Existing recipe websites provide huge numbers of recipes from around the world • They fail to support analyzing and comparing recipes (What are the important cooking principles & skills? What makes two dishes taste so different? etc.) • They are unable to help users find similar recipes in a comprehensive manner (only keyword-based search on recipe names) • They fail to adapt recipes to real-world situations (e.g. a lack of ingredients or user preferences)

  38. A Personalized Recipe Database System -- Our Contributions • Propose a recipe model which encompasses static attributes as well as dynamic behaviours (e.g. cooking procedures and constraints) • Present a novel perspective of evaluating the “quality” of a recipe by constructing and analysing its cooking graph (capture both action flows and data/ingredient flows) • Provide a promising way to address the problem of recipe adaptation heuristically (with flexible and feasible solutions)

  39. Recipe on the Web

  40. Sample Recipe -- The Cooking Procedure of “Triple Cheese Pasta Primavera”

  41. Sample Recipe Parsing the Cooking Procedure of “Triple Cheese Pasta Primavera”

  42. Recipe Model • A recipe R is modeled and represented by a tuple of three elements: R = <M, RP, SP> where • (a) M={Mi | i = 1.. m} – a set of ingredients. An ingredient Mi is either a basic ingredient or a set of ingredients: • Mi = <MID, MP>, MID—unique identity, MP—member level properties (and functions) such as the name, quantity and image • An ingredient Mi belongs to one of the three classes: Main, Minor and Seasoning; • (b) RP is a set of recipe-level properties (and functions) applied on R itself, such as the main cooking style, region, nutrition and images of the dish of the recipe;

  43. Recipe Model • (c) SP = (V, E, Cons, Ingr) is a labeled directed “Cooking Graph”: • V = {vi | i = 1..n} is a set of nodes; each vi is a cooking action. “Cooking action constraints”: Cons(vi) is the set of constraint conditions that should be satisfied when the action vi takes place, e.g. conditions on temperature and duration. • E is a set of directed edges on V giving the temporal execution flow of the cooking actions, named “action flows”. An edge <vi, vj> means vj should take place after vi. “Cooking transition constraints”: Cons(vi, vj) is the set of conditions that should be satisfied for the flow to take place. • Ingr(vi) is the set of ingredients that should be added into vi, and O(vi) is the set of output ingredients of vi. These inputs and outputs of the nodes are called “ingredient flows”.
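
One possible in-memory shape for the cooking graph SP is sketched below. The class name `CookingGraph`, its field and method names, and the sample actions are hypothetical; the sample is not the “Triple Cheese Pasta Primavera” graph from the slides.

```python
from dataclasses import dataclass, field


@dataclass
class CookingGraph:
    actions: set = field(default_factory=set)  # V: cooking actions
    flows: set = field(default_factory=set)    # E: (vi, vj) action flows
    cons: dict = field(default_factory=dict)   # Cons: action or edge -> conditions
    ingr: dict = field(default_factory=dict)   # Ingr(vi): input ingredients
    out: dict = field(default_factory=dict)    # O(vi): output ingredients

    def add_action(self, v, ingredients=(), outputs=(), **conditions):
        self.actions.add(v)
        self.ingr[v] = set(ingredients)
        self.out[v] = set(outputs)
        self.cons[v] = dict(conditions)

    def add_flow(self, vi, vj, **conditions):
        # An edge (vi, vj) means vj should take place after vi.
        if vi not in self.actions or vj not in self.actions:
            raise ValueError("both actions must be added before linking them")
        self.flows.add((vi, vj))
        self.cons[(vi, vj)] = dict(conditions)

    def successors(self, v):
        return sorted(vj for vi, vj in self.flows if vi == v)


g = CookingGraph()
g.add_action("boil", ingredients={"pasta", "water"},
             outputs={"cooked pasta"}, duration_min=10)
g.add_action("saute", ingredients={"vegetables", "oil"},
             outputs={"sauteed vegetables"})
g.add_action("toss", ingredients={"cheese"}, outputs={"dish"})
g.add_flow("boil", "toss", pasta_drained=True)
g.add_flow("saute", "toss")
```

Keeping action constraints and transition constraints in one `cons` mapping, keyed by a node or by an edge tuple, follows the slide's use of a single Cons label for both Cons(vi) and Cons(vi, vj).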

  44. Cooking Graph The Cooking Graph of “Triple Cheese Pasta Primavera”

  45. Basic Properties • Definition 1. (Reachability) A cooking graph is defined as “reachable” if each of its nodes is “reachable”; a node is “reachable” if it is on a directed path from a starting node to the end node. • Definition 2. (Consistency) A cooking graph is defined to be “consistent” if the conditions for each node/edge are consistent (i.e. there exists an assignment to the variables that makes the conditions true).
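
Definition 1 can be checked with two graph traversals: a node lies between a starting node and the end node if it is reachable forward from some start and the end is reachable from it. The sketch below assumes the graph is given as a node set and an edge list, and it treats a “directed path” as a directed walk, which is the same thing when the cooking graph is acyclic. The function names and sample actions are hypothetical.

```python
def _reach(adjacency, roots):
    """Nodes reachable from any root by following the adjacency lists."""
    seen, stack = set(roots), list(roots)
    while stack:
        v = stack.pop()
        for w in adjacency.get(v, ()):
            if w not in seen:
                seen.add(w)
                stack.append(w)
    return seen


def is_reachable(nodes, edges, starts, end):
    """True if every node is on a directed walk from a start node to the end node."""
    forward, backward = {}, {}
    for a, b in edges:
        forward.setdefault(a, []).append(b)
        backward.setdefault(b, []).append(a)
    from_start = _reach(forward, starts)
    to_end = _reach(backward, [end])
    return all(v in from_start and v in to_end for v in nodes)


nodes = {"boil", "chop", "mix", "serve"}
edges = [("boil", "mix"), ("chop", "mix"), ("mix", "serve")]
ok = is_reachable(nodes, edges, starts={"boil", "chop"}, end="serve")

# An isolated action makes the graph unreachable.
broken = is_reachable(nodes | {"garnish"}, edges,
                      starts={"boil", "chop"}, end="serve")
```

Here `ok` is `True` and `broken` is `False`, since `garnish` is connected to neither a starting node nor the end node.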

  46. Constraints and Rules • Definition 3. (Constraint) A constraint is a predicate followed by one or more terms, enclosed in parentheses and separated by commas; a term is either a constant, variable or function expression. • Constraints specify all kinds of conditions or restrictions in the recipe model; • Three categories: intra-recipe constraints, inter-recipe constraints and outer-recipe constraints. • Incompatible(Spinach, Tofu) says spinach and tofu are incompatible and should not be cooked together.
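
A constraint such as Incompatible(Spinach, Tofu) can be checked against a recipe's ingredient list with a simple lookup. The sketch below is only an illustration: the `INCOMPATIBLE` table and the `violated_constraints` function are hypothetical names, and a real system would likely hold such facts in a rule base.

```python
# Each entry encodes one Incompatible(x, y) constraint; order does not matter.
INCOMPATIBLE = {frozenset({"spinach", "tofu"})}


def violated_constraints(ingredients, incompatible=INCOMPATIBLE):
    """Return the incompatible pairs that are both present in the ingredients."""
    present = set(ingredients)
    return sorted(tuple(sorted(pair)) for pair in incompatible if pair <= present)


conflicts = violated_constraints(["spinach", "tofu", "garlic"])
no_conflicts = violated_constraints(["spinach", "garlic"])
```

Using `frozenset` for each pair makes the constraint symmetric, so Incompatible(Spinach, Tofu) and Incompatible(Tofu, Spinach) are the same entry.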

  47. Constraints and Rules • Definition 4. (Rule) A rule is a logical implication of the form “If Ф Then Ψ” (or Ф → Ψ), where Ф and Ψ are sentences. • Validate the correctness of a recipe through a reasoning and recognition process. • Handle complex situations, such as making the necessary adjustment or compensation once an improper cooking action occurs. • Describe cooking skills that have been widely accepted and commonly used. • Over_Put(salt) → Add(vinegar|water) says that if too much salt has been put into a dish, then neutralize the salty taste by adding either vinegar or water.
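
A rule of the form “If Ф Then Ψ” can be sketched as a pair of a condition and a list of alternative actions, applied to a set of observed facts. The encoding of predicates as `(name, argument)` tuples, the `RULES` table, and the `suggest` function are assumptions for illustration only.

```python
# Over_Put(salt) -> Add(vinegar|water): either alternative compensates.
RULES = [
    (("Over_Put", "salt"), [("Add", "vinegar"), ("Add", "water")]),
]


def suggest(facts, rules=RULES):
    """For each rule whose condition holds, return its alternative actions."""
    return [alternatives for condition, alternatives in rules if condition in facts]


advice = suggest({("Over_Put", "salt")})
nothing = suggest({("Over_Put", "sugar")})
```

Each element of the result is a list of alternatives, which keeps the “either vinegar or water” choice explicit instead of forcing one action.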

  48. Recipe Cooking Graph Mining • Pattern: a subgraph that occurs in one or more cooking graphs and has a certain influence on the cooking effects (e.g. taste, appearance). • Find patterns for a set of recipes • What is usually done and what is usually put in during the cooking procedure (one action, a series of actions, an ingredient, a set of ingredients, actions combined with ingredients) • Cooking graphs of different recipes may share the same pattern • Distinct subgraphs that determine the cooking effect (e.g. taste) should be identified

  49. Sample Patterns

  50. Sample Cooking Style • A cooking style generally describes how a recipe is cooked, expressed as a pattern combination or as a graph abstraction.
