MultimediaN

MultimediaN: cooperation and common interests with Freeband Communication. Jan Biemond, Mediamatics Dept., TU Delft

Outline • MN: Main themes, partners, projects, project interrelations • Common interest: I-Share, AWARENESS, . . . • Relevant MN projects: A TU-Delft perspective

Four MN themes 1. Multimedia analysis, fusion 2. Multimodal interaction 3. Multimedia system 4. Knowledge discovery, content enrichment

The MultimediaN partnership • UvA, CWI, TUD, TNO, TI, CTIT, Philips • IBM, LogicaCMG, V2, Waag, UU, VU, NFI, Beeld&Geluid, Ilse, DEN, DBNL, Van Dale, Compano, Thales, Roesink, NOC, Paradiso, NS, RTLi, Police, SPSS, eMAXX, Abstract Computing • International science reputation to guarantee competitive edge • Track record in co-operation and co-research with industry programs • Track record in knowledge chain spin-off, demonstrators, test beds

Some figures • 16 MEuro project • Duration 2004-2008 • # PhD students • # postdocs • # software engineers 60

MN Projects [diagram] • Fundamental projects: N1 Feature learning, N2 Interaction, N3 Databases • Integrating projects: N5 Sem. Access, N6 Personalisation, N7 VAYF • Application pilots (N9 A-PILOTS): e-Culture, i-Services, Security, Media • M Software Engineering & Exchange • Bureau

Project interrelations [diagram] • Fundamental projects: N1 Feature learning, N2 Interaction, N3 Databases • Integrating projects: N5 Sem. Access, N6 Personalisation, N7 VAYF • Application pilots (N9 A-PILOTS): e-Culture, i-Services, Security, Media • M Software Engineering & Exchange • Bureau • Philips internal link

Faculty of Electrical Engineering, Mathematics and Computer Science 1. Telecom 2. Software Technology 3. Microelectronics 4. Electrical Power Eng. 5. Mediamatics 6. Applied Mathematics • P2P systems, protocols • Distr. processing • Streaming media • I-Share • MultimediaN • Willingness to share

Project N1: Features, learning. Theo Gevers (UvA), Richard Heusdens (TUD)

Speech enhancement • Problem: Acoustical noise in mobile, digital voice communications systems: • human-to-human communication (e.g., mobile telephony) • human-to-machine communication (e.g., automatic booking services, intelligent interfaces) • Solution: • Speech enhancement as a preprocessing step, i.e. before the speech enters the speech codec/recognizer • Current enhancement algorithms yield good SNR improvement, but poor intelligibility • R&D activity: • Time-varying analysis of speech and non-stationary noise modelling • Using a priori knowledge of speech production and auditory perception
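The SNR improvement mentioned above can be made concrete with a minimal sketch. It assumes a clean reference signal is available (as in a lab evaluation), and the toy signals and the "enhanced" output are hypothetical stand-ins, not results from the project. Note that this number only measures residual noise energy; it says nothing about intelligibility, which is exactly the gap the slide points out.

```python
import math

def snr_db(clean, estimate):
    """SNR in dB of `estimate`, treating its deviation from `clean` as noise."""
    signal_power = sum(s * s for s in clean)
    noise_power = sum((e - s) ** 2 for s, e in zip(clean, estimate))
    if noise_power == 0:
        return math.inf
    return 10.0 * math.log10(signal_power / noise_power)

def snr_improvement_db(clean, noisy, enhanced):
    """Gain in SNR (dB) obtained by replacing `noisy` with `enhanced`."""
    return snr_db(clean, enhanced) - snr_db(clean, noisy)

# Toy data (assumption): a sine wave plus deterministic alternating "noise".
clean = [math.sin(2 * math.pi * 5 * n / 100) for n in range(100)]
noisy = [s + 0.5 * (-1) ** n for n, s in enumerate(clean)]
# Hypothetical enhancer output: the same noise, attenuated by a factor of 10.
enhanced = [s + 0.05 * (-1) ** n for n, s in enumerate(clean)]

improvement = snr_improvement_db(clean, noisy, enhanced)
print(f"SNR improvement: {improvement:.1f} dB")
```

Attenuating the noise amplitude by a factor of 10 reduces the noise power by a factor of 100, which corresponds to a 20 dB improvement.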

Project N2: Interaction. Adelbert Bronkhorst (TNO), Maja Pantic (TUD)

Problem • Human-human interaction: facial expression, intonation, body language, touch, … Multiple modalities (sight, sound, touch, …), extremely context-sensitive! • Human-computer interaction: clicking mouse, touching keyboard. Single modality, context-insensitive!

Goal • Multimodal context-sensitive interaction • Easy and natural interaction between humans and computers through anticipating, intelligent interfaces • In concreto: • A listening and observing interface • wearables sensing physiological data • a “mixed reality” interface for distant collaboration

Intelligent interfaces [diagram] • Visual processing, audio processing, tactile processing • Context-sensitive interpretation: who this user is, what his task is, how he feels • Context-sensitive responding

R&D activities • Sensing of • Gestures, direction of observation, facial expression • Who is speaking, intonation, style of speaking, … • Motion, position • Integration, interpretation and information transportation • State of mind, stress, context, … • Representations of data, objects, other users, …

Project N7: Video at your Fingertips. Alan Hanjalić (TUD), Jan Nesvadba (Philips Research)

Problem • Basic “infrastructure” available for digital processing of video: compression, streaming, editing, display adaptivity, … [diagram: digital video → algorithm → processed digital video] • Still needed: video processing for content extraction! Examples: news report on topic T, lecture part on X, suspicious behavior

Video Content Analysis (VCA): Data → Features → Semantics • Features (signal/data properties): color composition; shape and texture characteristics; camera and object motion (intensity / direction); speech and audio signal properties; … • Semantic gap: what is the “meaning” of the data/signal? • Semantics: news report on Euro; car chase through NY; dialog between A and B; an interview; happiness; romance; …
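To illustrate the "data → features" step, here is a minimal sketch of one low-level feature from the list above, color composition, computed as a normalized RGB histogram. The frame representation (a list of `(r, g, b)` tuples), the bin count and the L1 distance are assumptions chosen for illustration, not the project's actual feature set.

```python
def color_histogram(pixels, bins_per_channel=4):
    """Normalized joint RGB histogram of a frame given as (r, g, b) tuples in 0..255."""
    step = 256 / bins_per_channel
    hist = [0.0] * (bins_per_channel ** 3)
    for r, g, b in pixels:
        # Quantize each channel to a bin index, clamped to the last bin.
        ri = min(int(r / step), bins_per_channel - 1)
        gi = min(int(g / step), bins_per_channel - 1)
        bi = min(int(b / step), bins_per_channel - 1)
        hist[(ri * bins_per_channel + gi) * bins_per_channel + bi] += 1.0
    total = len(pixels)
    return [h / total for h in hist] if total else hist

def histogram_distance(h1, h2):
    """L1 distance between two normalized histograms, in the range [0, 2]."""
    return sum(abs(a - b) for a, b in zip(h1, h2))

# Toy frames (assumption): 16 identical pixels each.
red_frame = [(250, 10, 10)] * 16
reddish_frame = [(240, 20, 5)] * 16
blue_frame = [(10, 10, 250)] * 16

similar = histogram_distance(color_histogram(red_frame), color_histogram(reddish_frame))
different = histogram_distance(color_histogram(red_frame), color_histogram(blue_frame))
print(similar, different)
```

Such a feature describes the signal, not its meaning: two frames with similar color composition can still show entirely different events, which is the semantic gap the slide refers to.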

VCA development so far • Content extraction possibilities • Shot-boundary detection (1991 - ) • Keyframe-based video abstracts (1992 - ) • Multimodal information fusion (1995 - ) • High-level parsing (1997 - ) • Automatic annotation (indexing) (1999 - ) • Affective VCA (2001 - ) • Performance not satisfactory • Low precision and recall of extracted content (heuristics!) • Algorithmic issues insufficiently treated • High algorithmic complexity and long run-time • Context models missing
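The precision and recall figures mentioned above can be computed as in the following sketch, here applied to shot-boundary detection. The frame indices are invented for illustration, and exact-frame matching is a simplifying assumption (real evaluations often allow a small tolerance window).

```python
def precision_recall(detected, ground_truth):
    """Precision and recall of detected items against a ground-truth set."""
    detected, ground_truth = set(detected), set(ground_truth)
    true_positives = len(detected & ground_truth)
    precision = true_positives / len(detected) if detected else 0.0
    recall = true_positives / len(ground_truth) if ground_truth else 0.0
    return precision, recall

# Hypothetical frame indices of shot boundaries in one video.
ground_truth_cuts = [120, 340, 560, 910]
detected_cuts = [120, 340, 600, 910, 1000]

precision, recall = precision_recall(detected_cuts, ground_truth_cuts)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

Here three of the five detections are correct (precision 0.60) and three of the four true boundaries are found (recall 0.75).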

Goal • To develop VCA technology suited for practical use* in consumer, business, education and professional applications (* high reliability, low complexity)

R&D Activities • Robust, unconstrained face detection/recognition • Human body motion analysis • Video Content Management: parsing, pruning, abstracting, summarization and classification • Security: smart cameras: persons entering the scene, tracking purposes • Media: consumer home video applications: linking persons with names and context • Surveillance: suspicious behavior and aggression detection • Personal health care: rehabilitation at home • Media: concert video browser

N9: Application pilot: Security Multimodal fusion of sound and video for • Aggression detection • Patient observation

N9: Application pilot: Concert-video browser • VCM of recordings of pop concerts at Paradiso through Fabchannel • Modeling the experience • Boundary selection of semantically coherent temporal segments • Classifying instrumental solos, vocals, group singing, instruments [… Applause | Vocal | Speech | Instrument …]

Discussion • MultimediaN just started! • The right moment for interaction! • Contact: • Arnold Smeulders [smeulder@science.uva.nl] • Jan Biemond [J.Biemond@ewi.tudelft.nl]

