Video retrieval, user interaction, and digital rights management

From Multimedia Retrieval, Springer, Blanken et al.

“Multimodal” is the keyword…
• Based on a case study
  • Video recordings of Formula race cars
• Fusion of multimodal information
• Sound
  • Audio signal analysis to detect interesting events, such as when the commentator gets excited
  • At the beginning of an event, there is an overview by the commentator
  • They capture the audio signal and screen out the signal outside the voice range
  • They also look for specific words: not general voice recognition, but a search for only a handful of race-specific words

Fusion
• Audio
• Analysis of image stream
  • To catch start of race and other events
  • Used to locate time boundaries of isolatable events
• Superimposed text
  • Projected on TV screen
  • Information on the driver
  • Driver’s place in race, etc.

Audio processing
• Mix of human language, car noise, background noise, crowd cheering, horns
• Look for human voice frequency
• Short time energy (STE)
• To remove noise
• Waveform based
• Pitch: fundamental frequency (F0); the higher it is, the more excitement in the voice
• Search for phonemes
• Pause rate: to detect quantity of speech
• Keyword spotting: less semantics, but lower error rate
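A minimal sketch of two of the audio features above, short time energy and pause rate, assuming the audio is already available as a list of amplitude samples. The frame length, the silence threshold, and the synthetic test signal are illustrative assumptions, not values from the case study.

```python
import math

def short_time_energy(samples, frame_len):
    """Mean squared amplitude of each non-overlapping frame."""
    energies = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        energies.append(sum(x * x for x in frame) / frame_len)
    return energies

def pause_rate(energies, threshold):
    """Fraction of frames whose energy falls below the silence threshold."""
    if not energies:
        return 0.0
    return sum(1 for e in energies if e < threshold) / len(energies)

# Synthetic signal (assumption): four frames of a 200 Hz tone at 8 kHz
# sampling, followed by four silent frames.
RATE, FRAME = 8000, 400
tone = [math.sin(2 * math.pi * 200 * n / RATE) for n in range(4 * FRAME)]
signal = tone + [0.0] * (4 * FRAME)

energies = short_time_energy(signal, FRAME)
rate = pause_rate(energies, threshold=0.01)
print(rate)  # half of the frames are silent
```

A low pause rate together with high energy would be one rough hint that the commentator is speaking quickly and loudly; pitch (F0) estimation needs more machinery and is left out here.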

Image stream
• Searched for places where commentator raised his voice
• Searched histogram, looking for certain colors and shapes
• Tracked the changing of colors and shapes over a series of frames
• Focus on
  • Start of race
  • Passing
  • Fly-outs (sand and dust)
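The idea of tracking color changes over a series of frames can be sketched with a coarse color histogram per frame and a distance between consecutive histograms. This is an illustration only: the bin count, the L1 distance, the threshold, and the "gray track" and "sand" colors are assumptions, and shape analysis is not covered.

```python
def color_histogram(frame, bins=4):
    """Normalized histogram of quantized pixels; frame is a non-empty list of (r, g, b)."""
    step = 256 // bins
    hist = [0.0] * (bins ** 3)
    for r, g, b in frame:
        idx = (r // step) * bins * bins + (g // step) * bins + (b // step)
        hist[idx] += 1
    return [h / len(frame) for h in hist]

def histogram_distance(h1, h2):
    """L1 distance between two normalized histograms, between 0 and 2."""
    return sum(abs(a - b) for a, b in zip(h1, h2))

def detect_changes(frames, threshold=0.5):
    """Indices of frames whose histogram differs sharply from the previous frame."""
    hists = [color_histogram(f) for f in frames]
    return [i for i in range(1, len(hists))
            if histogram_distance(hists[i - 1], hists[i]) > threshold]

# Toy frames (assumption): three gray "track" frames, then two sand-colored frames.
gray_frame = [(90, 90, 90)] * 16
sand_frame = [(210, 180, 120)] * 16
frames = [gray_frame] * 3 + [sand_frame] * 2

changes = detect_changes(frames)
print(changes)  # [3]
```

A sudden shift toward sand-like colors is the kind of cue that could flag a fly-out candidate for closer inspection.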

Text
• Two classes
  • Scene text
  • Superimposed text
• The same text can span many frames, and so they count on its position being fixed to limit processing time
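One way to exploit the fixed position of superimposed text is to crop the same region from every frame and run recognition only when that region changes. The sketch below assumes frames are 2D lists of pixel values; `read_overlay`, the `box` layout, and the stand-in recognizer are hypothetical names for illustration, and the exact-equality comparison would need a tolerance on real, noisy video.

```python
def crop(frame, box):
    """Cut the (top, left, bottom, right) region out of a 2D frame."""
    top, left, bottom, right = box
    return [row[left:right] for row in frame[top:bottom]]

def read_overlay(frames, box, recognize):
    """Call `recognize` only when the fixed overlay region changes."""
    results, last_region, last_text = [], None, None
    for frame in frames:
        region = crop(frame, box)
        if region != last_region:
            last_text = recognize(region)  # the expensive step
            last_region = region
        results.append(last_text)
    return results

# Stand-in recognizer (assumption): records each call and fakes a label.
calls = []
def fake_recognize(region):
    calls.append(region)
    return "P%d" % region[0][0]

frames = [
    [[1, 1, 5], [9, 9, 9]],  # overlay shows "1"
    [[1, 1, 6], [8, 8, 8]],  # scene changed, overlay unchanged
    [[2, 2, 7], [7, 7, 7]],  # overlay changed to "2"
]
texts = read_overlay(frames, box=(0, 0, 1, 2), recognize=fake_recognize)
print(texts, len(calls))  # ['P1', 'P1', 'P2'] 2
```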

Interaction
• Ways to pose queries
• Ways to give feedback
• Ways to explore

Interaction types
• Retrieval
  • Query formulation
    • Concept-based
    • Content-based
• Concept-based
  • Keywords in natural language
  • People use different words for the same thing
  • Metadata is often missing
  • Easy for user, hard for software
• Content-based
  • Query by example paradigm
  • User provides examples
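Query by example can be sketched as a nearest-neighbor search: the user's example is turned into a feature vector, and the collection is ranked by distance to it. The feature vectors, segment names, and the choice of Euclidean distance below are assumptions for illustration; feature extraction itself is out of scope.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def query_by_example(example, collection, k=2):
    """Return the names of the k items whose features are closest to the example."""
    ranked = sorted(collection.items(), key=lambda item: euclidean(example, item[1]))
    return [name for name, _ in ranked[:k]]

# Hypothetical three-dimensional features for four video segments.
collection = {
    "start_grid": [0.9, 0.1, 0.0],
    "pit_stop": [0.2, 0.7, 0.1],
    "fly_out": [0.1, 0.2, 0.7],
    "overtake": [0.7, 0.3, 0.0],
}

matches = query_by_example([0.85, 0.15, 0.0], collection, k=2)
print(matches)  # ['start_grid', 'overtake']
```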

Dynamic query interaction
• Sliders, buttons, etc.
• Visual is the key
  • Of the query
  • Of the results
• Example system, page 299
• Interaction cycle is short

Browsing
• Links, with a feeling similar to using the web
• Browsing model
  • To get impression of search space
  • To find something when you aren’t sure what it is
• Browsing a collection of objects and browsing a single object
• Browsing keywords or namespace hierarchy
• Example on page 301

User input and relevance feedback
• Modalities
  • Visual, audio, tactile
  • Or touch screen, electronic pen, camera, mic, eye tracker, locality sensor, mouse, keyboard
  • No user guide needed
  • If it is speech only, it is difficult to process
  • Multiple modalities at once
    • Such as speech and a map for location or distance
  • Use of ambient intelligence to collect information
• Relevance feedback
  • Binary feedback
  • Weighted relevance feedback (image on page 305)
• Personalization
  • Similar to 1-to-1 marketing concept
  • User profiles are used
  • Users are not excited about providing profile info, though
  • Users are grouped into content interest groups
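Binary and weighted relevance feedback can both be illustrated with a Rocchio-style query update, a standard technique that the slides do not name and that is swapped in here as an assumption. The query moves toward items the user marked relevant and away from items marked non-relevant; binary feedback uses scores of +1 and -1, while weighted feedback uses larger or smaller magnitudes. The `alpha`, `beta`, and `gamma` values are conventional defaults, not values from the source.

```python
def refine_query(query, feedback, alpha=1.0, beta=0.75, gamma=0.25):
    """Rocchio-style update.

    feedback is a list of (vector, score) pairs: score > 0 marks a relevant
    item, score < 0 a non-relevant one, and the magnitude is its weight.
    """
    dims = len(query)
    pos, neg = [0.0] * dims, [0.0] * dims
    pos_w = neg_w = 0.0
    for vector, score in feedback:
        if score > 0:
            pos_w += score
            for i in range(dims):
                pos[i] += score * vector[i]
        elif score < 0:
            neg_w += -score
            for i in range(dims):
                neg[i] += -score * vector[i]
    refined = []
    for i in range(dims):
        value = alpha * query[i]
        if pos_w:
            value += beta * pos[i] / pos_w  # pull toward weighted mean of relevant items
        if neg_w:
            value -= gamma * neg[i] / neg_w  # push away from non-relevant items
        refined.append(max(value, 0.0))  # keep term weights non-negative
    return refined

# Binary feedback: one relevant item, one non-relevant item.
binary = refine_query([1.0, 0.0], [([0.0, 1.0], 1), ([1.0, 0.0], -1)])
# Weighted feedback: the first item counts twice as much as the second.
weighted = refine_query([1.0, 0.0], [([0.0, 1.0], 2), ([1.0, 1.0], 1)])
print(binary, weighted)
```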

Feedback
• Passive works well, like skipping songs on a feed
• Making an offer that adds to a query works sometimes, like Amazon trying to sell you similar books
• User profiles can be built automatically from a history of purchases or a clickstream
• Filtering techniques
  • Content-based: based on triples
    • Attribute – value – fit
    • Title – War and Peace – 0
  • Social-based: by putting people into groups, getting larger user samples, and putting profiles into groups
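Content-based filtering over attribute, value, fit triples can be sketched as follows. The slides give only the triple layout and one example with a fit of 0, so the interpretation here is an assumption: fit is treated as a number between 0 and 1, an item's score is the mean fit of the triples it matches, and items with no matching triple are scored 0. The profile entries other than "War and Peace" are invented for illustration.

```python
def score_item(item, profile):
    """Mean fit of the profile triples that match the item's attributes."""
    fits = [fit for attribute, value, fit in profile
            if item.get(attribute) == value]
    return sum(fits) / len(fits) if fits else 0.0

def filter_items(items, profile, threshold=0.5):
    """Keep the items whose score reaches the threshold."""
    return [item for item in items if score_item(item, profile) >= threshold]

# Profile as (attribute, value, fit) triples.
profile = [
    ("title", "War and Peace", 0.0),
    ("genre", "documentary", 0.9),
    ("genre", "racing", 0.8),
]

items = [
    {"title": "War and Peace", "genre": "documentary"},
    {"title": "Season review", "genre": "racing"},
    {"title": "Cooking show", "genre": "lifestyle"},
]

kept = filter_items(items, profile)
print([item["title"] for item in kept])  # ['Season review']
```

Social-based filtering would apply the same scoring with a profile aggregated over a group of similar users instead of one person.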

Presentation
• Must provide metadata and data in an integrated way
• Inherently multimedia in nature in query and response
• Tree maps of complex metadata or data
• Graphs to put multimedia objects together into single conceptual objects
• Starfield display
• Breaking videos into segments to aid non-linear searching
  • Providing sample frame for each segment
• Images on pages 314, 315, and 316
• Key factors in presenting multimedia data: content adaptation
  • What capabilities the device has
  • Limits of device, like size, color, formats of data
  • Must often change formats of data to fit a device
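Content adaptation can be sketched as choosing, among the available renditions of a video, the best one that a device can actually show. The `Variant` and `Device` types, their fields, and the "largest usable resolution wins" rule are assumptions made for illustration; a real system would often transcode when no stored variant fits.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Variant:
    fmt: str
    width: int
    height: int
    color: bool

@dataclass(frozen=True)
class Device:
    formats: frozenset
    max_width: int
    max_height: int
    color: bool

def pick_variant(variants, device):
    """Largest variant that fits the device's format, size, and color limits."""
    usable = [
        v for v in variants
        if v.fmt in device.formats
        and v.width <= device.max_width
        and v.height <= device.max_height
        and (device.color or not v.color)
    ]
    return max(usable, key=lambda v: v.width * v.height, default=None)

variants = [
    Variant("mp4", 1920, 1080, True),
    Variant("mp4", 640, 360, True),
    Variant("3gp", 320, 240, True),
    Variant("3gp", 176, 144, False),
]
phone = Device(frozenset({"mp4", "3gp"}), 800, 480, True)
old_handset = Device(frozenset({"3gp"}), 176, 144, False)

print(pick_variant(variants, phone))
print(pick_variant(variants, old_handset))
```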

Digital rights
• DRM (digital rights management)
  • Preventative approach
    • Encryption
    • Node locking
    • Dongle
  • Reactive approach
    • Embedding extra information in the product
    • Tracking behavior and looking for a violation
    • Sometimes called forensic tracking
    • Looking for specific watermarks, often specific to a given user
    • Makes it hard to pass content on
• Application domains
  • Legal concept: Personal Entertainment Domain (PED)
  • To keep content secure, commercially and intelligence-wise
  • Diagrams on pages 325, 326, and 331
• Sometimes the media is free and commercials are embedded
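The reactive approach, embedding a user-specific mark and later reading it back from a leaked copy, can be illustrated with a toy least-significant-bit watermark. This is a deliberately simple stand-in chosen for the sketch, not the scheme described in the book: real forensic watermarks are designed to survive re-encoding and tampering, which this one does not. The function names and the 16-bit user ID are assumptions.

```python
def embed_watermark(samples, user_id, bits=16):
    """Write the low `bits` bits of user_id into the LSBs of the first samples."""
    if len(samples) < bits:
        raise ValueError("not enough samples to hold the watermark")
    marked = list(samples)
    for i in range(bits):
        bit = (user_id >> i) & 1
        marked[i] = (marked[i] & ~1) | bit  # overwrite only the lowest bit
    return marked

def extract_watermark(samples, bits=16):
    """Rebuild the user ID from the LSBs of the first samples."""
    return sum((samples[i] & 1) << i for i in range(bits))

# Toy content (assumption): 32 integer samples, marked for user 0xBEEF.
original = [100 + 3 * n for n in range(32)]
marked = embed_watermark(original, user_id=0xBEEF)

recovered = extract_watermark(marked)
print(hex(recovered))  # 0xbeef
```

Because each copy carries a different ID, a copy found in the wild points back to the account it was issued to, which is what discourages passing content on.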
