Explore the automation of metadata delivery and processing with MPEG-7 standards, emphasizing the importance of high-quality metadata in multimedia search results. Learn how to transform content partner metadata into MPEG-7 for improved indexing and search relevancy.
Internet Streaming Media Metadata Interchange with MPEG-7 Eric Rehm CTO, singingfish.com Thomson multimedia 4 May 2001, Hong Kong
Overview • Brief look at Singingfish • Indexing Internet streaming media • Automating metadata delivery and processing • Case Study: Using XSL to transform MSNBC schema to MPEG-7
singingfish.com • Wholly-owned subsidiary of Thomson Multimedia • B2B Streaming Media Search Service • Pay per query business model • Over 15 M streams indexed • Live with customers since Jan 2000 • InfoSpace: Metacrawler, Dogpile • Inside Internet AG: Swiss-Search, Austria-Search • Involved with MPEG-7 standards development since Sept 1999
Indexing Streaming Media • High quality metadata improves relevancy of multimedia search results • Crawl….or…work directly with multimedia “Content Producers” to acquire quality metadata • Solution: Implement FTP push/pull of metadata • Automated processing upon FTP close • Support bulk or incremental operations: add, update, delete, reset • Future: SOAP or other W3C XML protocol
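The bulk and incremental operations listed above (add, update, delete, reset) can be sketched as a small batch processor. This is a minimal illustration, not the Singingfish implementation: the `(operation, stream_id, metadata)` record shape, the in-memory dictionary index, and the `placeholder_stream` entry are all assumptions made for the example.

```python
from typing import Dict, Iterable, Tuple

# Hypothetical record shape: (operation, stream_id, metadata)
Operation = Tuple[str, str, dict]


def apply_batch(index: Dict[str, dict], batch: Iterable[Operation]) -> Dict[str, dict]:
    """Apply add/update/delete/reset operations to an in-memory index."""
    for op, stream_id, metadata in batch:
        if op == "reset":
            index.clear()  # drop everything delivered so far
        elif op == "add":
            index[stream_id] = dict(metadata)
        elif op == "update":
            # merge new fields into the existing entry, creating it if absent
            index.setdefault(stream_id, {}).update(metadata)
        elif op == "delete":
            index.pop(stream_id, None)
        else:
            raise ValueError(f"unknown operation: {op!r}")
    return index


index: Dict[str, dict] = {}
apply_batch(index, [
    ("add", "tdy_fletcher_mideast_001023", {"headline": "Peace hopes slip farther"}),
    ("update", "tdy_fletcher_mideast_001023", {"duration": "00:01:09"}),
    ("add", "placeholder_stream", {"headline": "Example"}),
    ("delete", "placeholder_stream", {}),
])
```

In a real pipeline the batch would be parsed from the file delivered over FTP once the transfer closes; that step is omitted here.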
Design [Diagram: Content Producer Program, Metadata Engine]
Development Goals • Single metadata schema interface to a database • Control development costs • Partition engineering and content development • Adapt to any “content partner” metadata • XML, CSV, Excel, Virage VDF, … • Transform “content partner” metadata to MPEG-7 via: • Custom applications (CSV, Excel) → MPEG-7 • Proprietary XML schemas → XSL → MPEG-7
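As a rough sketch of the "custom application" path for CSV input, the following converts each row into a small MPEG-7-style fragment. The column names `url` and `title`, the sample row, and the `Description` wrapper element are assumptions for illustration; only the `MediaInformation` and `CreationInformation` element names come from the case study slides.

```python
import csv
import io
import xml.etree.ElementTree as ET


def row_to_description(row: dict) -> ET.Element:
    """Build one MPEG-7-style fragment from a CSV row."""
    desc = ET.Element("Description")  # placeholder wrapper, not an MPEG-7 name
    locator = desc
    for tag in ("MediaInformation", "MediaProfile", "MediaInstance", "MediaLocator"):
        locator = ET.SubElement(locator, tag)
    ET.SubElement(locator, "MediaUri").text = row["url"]
    creation = ET.SubElement(ET.SubElement(desc, "CreationInformation"), "Creation")
    ET.SubElement(creation, "Title").text = row["title"]
    return desc


def csv_to_descriptions(text: str) -> list:
    """Convert every CSV row (header line required) into a fragment."""
    return [row_to_description(row) for row in csv.DictReader(io.StringIO(text))]


sample = "url,title\nhttp://example.com/a.asx,First clip\n"
descriptions = csv_to_descriptions(sample)
print(ET.tostring(descriptions[0], encoding="unicode"))
```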
Case Study Create XSL transformation • From: • MSNBC "Partner XML Format" • To: • MPEG-7 Description
Experimental Results
File                        Lines  Chars  Elements  Attrs
MSNBC Partner XML Example      73   1199        58     16
MPEG-7 Result                 263   4471       151     74
• XSL Stylesheet: 370 lines of lightly commented code
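Counts like those in the table can be gathered with a few lines of code. The deck does not say how its figures were produced, so this is only one plausible way to count; note that `xml.etree.ElementTree` does not report namespace declarations as attributes, which may make totals differ from other tools. The sample document is a shortened stand-in, not the real MSNBC file.

```python
import xml.etree.ElementTree as ET


def xml_stats(text: str) -> dict:
    """Count lines, characters, elements, and attributes in an XML string."""
    root = ET.fromstring(text)
    elements = list(root.iter())  # includes the root element itself
    return {
        "lines": len(text.splitlines()),
        "chars": len(text),
        "elements": len(elements),
        "attrs": sum(len(element.attrib) for element in elements),
    }


sample = (
    '<article storyorder="12" topnews="12">\n'
    "  <headline>Peace hopes slip farther</headline>\n"
    "</article>"
)
stats = xml_stats(sample)
print(stats)
```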
Discussion • Basic MPEG-7 Tools • Semantic Encoding of MSNBC Keywords into MPEG-7 Structured Annotation DS (Who, What, Where, When, Why, How) • Encoding Controlled Terms using namespaces • Encoding Streaming Media Validity with the Availability DS • Extending an MPEG-7 DS
MSNBC Video Distribution Entry • tdy_fletcher_mideast_001023 • Keywords: Israel, palestinian, Yasser Arafat • Top News Order: 12 • Peace hopes slip farther • The slim hopes for peace in the Mideast are rapidly fading, NBC’s Martin Fletcher reports Monday from the outskirts of Jerusalem. • Today’s show • Barak, Sharon talk coalition • What’s on Today • What’s on Weekend Today • What’s on Today
MSNBC <article> <article storyorder="12" pubdate="10/23/2000 8:02:00 AM" source="Today show" topnews="12"> <filename>tdy_fletcher_mideast_001023</filename> <duration>00:01:09</duration> <headline>Peace hopes slip farther</headline> <description>The slim hopes for peace in the Mideast are rapidly fading, NBC&#146;s Martin Fletcher reports Monday from the outskirts of Jerusalem.</description> <keywords>Israel, palestinian, Yasser Arafat</keywords> ...</article>
MPEG-7 link to stream <MediaInformation> <MediaProfile> <MediaInstance> <MediaLocator> <MediaUri> http://www.msnbc.com/news/asx/video/28/tdy_fletcher_mideast_001023.asx </MediaUri> </MediaLocator> </MediaInstance> </MediaProfile> </MediaInformation>
<headline> <headline>Peace hopes slip farther</headline> <CreationInformation> <Creation> <Title> <xsl:value-of select="headline"/> </Title> </Creation> </CreationInformation>
<description>, <keywords> <description>The slim hopes for peace in the Mideast ...</description> <keywords>Israel, palestinian, Yasser Arafat</keywords> <Abstract> <FreeTextAnnotation> <xsl:value-of select="description"/> </FreeTextAnnotation> <KeywordAnnotation> <Keyword>Israel</Keyword> <Keyword>Yasser Arafat</Keyword> </KeywordAnnotation> </Abstract>
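The same mapping can be sketched outside XSL. The function below copies the description into `FreeTextAnnotation` and emits one `Keyword` per comma-separated entry. Splitting on commas and keeping every entry is an assumption about the intended behaviour; the slide shows only the resulting elements, not the stylesheet logic that produced them.

```python
import xml.etree.ElementTree as ET


def article_to_abstract(article: ET.Element) -> ET.Element:
    """Map <description> and <keywords> of an article to an <Abstract>."""
    abstract = ET.Element("Abstract")
    free_text = ET.SubElement(abstract, "FreeTextAnnotation")
    free_text.text = article.findtext("description", default="")
    keyword_annotation = ET.SubElement(abstract, "KeywordAnnotation")
    for raw in article.findtext("keywords", default="").split(","):
        keyword = raw.strip()
        if keyword:  # skip empty entries such as a trailing comma
            ET.SubElement(keyword_annotation, "Keyword").text = keyword
    return abstract


article = ET.fromstring(
    "<article>"
    "<description>The slim hopes for peace in the Mideast ...</description>"
    "<keywords>Israel, palestinian, Yasser Arafat</keywords>"
    "</article>"
)
abstract = article_to_abstract(article)
keywords = [element.text for element in abstract.iter("Keyword")]
print(keywords)
```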
Enhanced <keywords> <keywords>Israel, palestinian, Yasser Arafat</keywords> <Abstract> <Who> <Name>Yasser Arafat</Name> </Who> <WhatObject> <Name>palestinian</Name> </WhatObject> <Where> <Name>Israel</Name> </Where> </Abstract>
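One simple way to reach the structured form is a lookup table that assigns each known keyword a role. The deck does not say how roles were actually assigned, so the `ROLE_BY_KEYWORD` table below is purely hypothetical, and unknown keywords are simply skipped.

```python
import xml.etree.ElementTree as ET

# Hypothetical lookup table; the deck does not describe how roles were chosen.
ROLE_BY_KEYWORD = {
    "Yasser Arafat": "Who",
    "palestinian": "WhatObject",
    "Israel": "Where",
}


def structured_annotation(keywords: str) -> ET.Element:
    """Turn a comma-separated keyword string into Who/WhatObject/Where entries."""
    abstract = ET.Element("Abstract")
    for raw in keywords.split(","):
        keyword = raw.strip()
        role = ROLE_BY_KEYWORD.get(keyword)
        if role is None:
            continue  # no known role for this keyword in this sketch
        ET.SubElement(ET.SubElement(abstract, role), "Name").text = keyword
    return abstract


abstract = structured_annotation("Israel, palestinian, Yasser Arafat")
roles = [(child.tag, child.findtext("Name")) for child in abstract]
print(roles)
```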
Encoding Controlled Terms • Singingfish.com Genres are described in one namespace (urn:sf:genre). • MSNBC Genres are described in another namespace (urn:msnbc:category)
Encoding Controlled Terms <categories> <category id="News"> <topics> <topic>International</topic> </topics> </category> </categories> <Genre href="urn:msnbc:category:{category[1]/@id}"> <Term type="NT" termId="{category[1]/topics/topic[1]}"/> </Genre> <xsl:variable name="sfCategory" select="singingfish:mapper.map(string(category[1]/@id))"/> <Genre href="urn:sf:{$sfCategory}"/>
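The two-namespace idea can be sketched as follows: the partner's category is kept under its own URN, and a mapping supplies the corresponding Singingfish term. The dictionary stands in for the `singingfish:mapper.map` extension function, whose real behaviour is not shown in the deck, and the mapped value `news` is an invented placeholder.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for singingfish:mapper.map; the value is a placeholder.
SF_GENRE_BY_MSNBC_CATEGORY = {"News": "news"}


def genres_for(categories: ET.Element) -> list:
    """Return [Singingfish Genre, MSNBC Genre] for the first category."""
    category = categories.find("category")
    category_id = category.get("id")
    topic = category.findtext("topics/topic")

    msnbc = ET.Element("Genre", href=f"urn:msnbc:category:{category_id}")
    if topic is not None:
        ET.SubElement(msnbc, "Term", type="NT", termId=topic)

    sf_term = SF_GENRE_BY_MSNBC_CATEGORY[category_id]
    sf = ET.Element("Genre", href=f"urn:sf:{sf_term}")
    return [sf, msnbc]


categories = ET.fromstring(
    '<categories><category id="News">'
    "<topics><topic>International</topic></topics>"
    "</category></categories>"
)
sf_genre, msnbc_genre = genres_for(categories)
print(ET.tostring(sf_genre, encoding="unicode"))
print(ET.tostring(msnbc_genre, encoding="unicode"))
```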
Extending an MPEG-7 DS <complexType name="PublicationType"> <complexContent> <extension base="mpeg7:DSType"> <sequence> <element name="Publisher" type="mpeg7:AgentType" minOccurs="0"/> <element name="PublicationLocation" type="mpeg7:PlaceType" minOccurs="0"/> <element name="PublicationDate" type="mpeg7:TimeType" minOccurs="0"/> <element name="Rights" type="mpeg7:RightsType" minOccurs="0"/> </sequence> </extension> </complexContent> </complexType>
Extending an MPEG-7 DS <complexType name="UsageInformationType"> <complexContent> <extension base="mpeg7:UsageInformationType"> <sequence> <element name="Publication" type="sf:PublicationType" minOccurs="0"/> </sequence> </extension> </complexContent> </complexType>
Extending an MPEG-7 DS <UsageInformation xsi:type="sf:UsageInformationType"> ... <Publication> <Publisher xsi:type="mpeg7:OrganizationType"> <NameTerm href="urn:sf:publisher:MSNBC"/> </Publisher> <PublicationLocation> <Country>us</Country> <Region>wa</Region> </PublicationLocation> <PublicationDate> <TimePoint>2000-10-23T14:20:00</TimePoint> </PublicationDate> </Publication> </UsageInformation>
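Filling `PublicationDate` requires turning the partner's `pubdate` attribute (for example "10/23/2000 8:02:00 AM") into an ISO 8601 time point. A minimal sketch, assuming the US month/day/year layout seen in the MSNBC example and making no time zone adjustment; the deck does not show how its own `TimePoint` value was derived.

```python
from datetime import datetime
import xml.etree.ElementTree as ET


def publication_date(pubdate: str) -> ET.Element:
    """Convert an MSNBC-style pubdate string into a <PublicationDate> element."""
    # Assumed input layout: month/day/year, 12-hour clock with AM/PM
    parsed = datetime.strptime(pubdate, "%m/%d/%Y %I:%M:%S %p")
    element = ET.Element("PublicationDate")
    ET.SubElement(element, "TimePoint").text = parsed.isoformat()
    return element


time_point = publication_date("10/23/2000 8:02:00 AM").findtext("TimePoint")
print(time_point)
```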
Summary • Quality search depends on quality metadata • MPEG-7 standards ease development costs • Controlled vocabularies • MPEG-7 MDS can be used to interoperate • XML Schema allows controlled extensions
Thank you singingfish.com
MPEG-7 Basics • ISO/IEC 15938 Multimedia Content Description Interface • Comprehensive set of audiovisual description tools. • Enabled by key Internet standards: • W3C: XML, XML Schema • IETF standards: URI, URN, URL for resource naming and location • Harmonized with other emerging metadata standards: • Dublin Core, MPEG-21, NewsML, SMPTE Metadata Dictionary, TV-Anytime, and more. • Text and compressed binary encodings • Both encodings have streaming add, delete, update features for delivery over real-time transports: MPEG-2, MPEG-4, IP, etc. • International Standard in October 2001 • Ballot period begins 14 March 2001
Basic elements [Diagram: Schema tools; Datatype & structures; Link & media localization; Basic DSs] • Textual Annotation (free text, structured annotation, syntactic dependency, etc.) • Controlled vocabularies • Agent, Place, Graph, etc. • Time, Duration, Media locators
Content Management & Description [Diagram: Content management (Creation & production, Media, Usage); Content description (Structural aspects, Conceptual aspects); Basic elements (Schema tools, Datatype & structures, Link & media localization, Basic DSs)] • Creation & production: Title, Creator, Creation location & date, Purpose, Classification, Genre, etc. (Author generated) • Media: Format, Coding, Instances, Identification, Transcoding Hint, etc. (Several instances) • Usage: Rights holder, Access rights, Usage Record, Financial aspects, etc. (Evolution) • Structural aspects: Viewpoint of the structure: Segments • Spatial / temporal structure • Audio, video low-level Ds • Elementary semantic information
Content Management & Description (Conceptual aspects) [Diagram: MPEG-7 MDS overview] • Viewpoint of conceptual notions • Events, objects, abstract concepts, and their relation
Navigation and Access [Diagram: MPEG-7 MDS overview, with Navigation & Access (Summary, Variation) added] • Efficient support of: discovery, browsing, navigation, visualization / sonification
Navigation and Access [Diagram: MPEG-7 MDS overview, with Navigation & Access (Summary, Variation) added] • Substitution of the original content • Adaptation to terminal, network, or user preferences
Content Organization [Diagram: MPEG-7 MDS overview, with Content organization (Model) added] • Collection & Classification: Description and organization of collection of documents • Probability Model: Statistical functions and structures to describe sample of AV content and classes of descriptors • Analytic model: Definition of cluster, classes and models to associate a semantic label to a set of data
User Interaction [Diagram: MPEG-7 MDS overview, with Content organization (Analytic Model) and User interaction (User preferences) added] • User identification and preferences: Filtering, search and browsing • Usage History
MPEG-7 DDL • XML Schema • Data type extensions • MIME type, ISO country, region, currency codes • ISO Character set codes • Revised time data types to support arbitrary fractional seconds denominator for per-frame positioning • 2001-05-01T15:23:46N11F30 (11th frame @ 30 FPS) • Type-centric approach using root abstract types • Control available global elements • Allow extension via name spaces and <extension> mechanism
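To make the per-frame time notation concrete, the sketch below splits the slide's example into a timestamp and an exact fraction of a second. The pattern is inferred from that single example (`N` followed by a count, `F` followed by fractions per second); the lexical form in the published standard may differ, so treat this as illustrative only.

```python
import re
from datetime import datetime
from fractions import Fraction

# Pattern inferred from the slide's one example, not from the standard's text.
TIME_POINT = re.compile(r"^(?P<stamp>[^N]+)N(?P<count>\d+)F(?P<per_second>\d+)$")


def parse_time_point(value: str):
    """Split e.g. '2001-05-01T15:23:46N11F30' into (datetime, Fraction of a second)."""
    match = TIME_POINT.match(value)
    if match is None:
        raise ValueError(f"unrecognised time point: {value!r}")
    stamp = datetime.strptime(match["stamp"], "%Y-%m-%dT%H:%M:%S")
    fraction = Fraction(int(match["count"]), int(match["per_second"]))
    return stamp, fraction


stamp, fraction = parse_time_point("2001-05-01T15:23:46N11F30")
print(stamp.isoformat(), fraction)
```

Keeping the offset as a `Fraction` rather than a float preserves exact frame positions for denominators such as 30.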
Basic Derivation of MPEG-7 Types <complexType name="Mpeg7RootType" abstract="true"> <complexContent> <restriction base="anyType"/> </complexContent> </complexType> <complexType name="DSType" abstract="true"> <complexContent> <extension base="mpeg7:Mpeg7RootType"> <sequence> <element name="Header" type="mpeg7:HeaderType" minOccurs="0" maxOccurs="unbounded"/> </sequence> <attribute name="id" type="ID" use="optional"/> </extension> </complexContent> </complexType>
Creation Description Scheme <complexType name="CreationType"> <complexContent> <extension base="mpeg7:DSType"> <sequence> <element name="Title" type="mpeg7:TitleType" maxOccurs="unbounded"/> … <element name="Creator" type="mpeg7:CreatorType" minOccurs="0" maxOccurs="unbounded"/> … </sequence> </extension> </complexContent> </complexType>