Multimedia Databases
Storage and Retrieval of Media Data
Jukka Teuhola
Dept. of Information Technology, University of Turku
Fall 2012
MMDB-1 J. Teuhola 2012

General course info
• http://staff.cs.utu.fi/kurssit/multimedia_databases/autumn_2012/
• Lectures 28 h, Tue 14-16, Fri 10-12, in classroom B2033
• Homework 6 times; every solution gives a bonus of 1 point to the final score in the examination.
• Examination:
  • 5 tasks, max 8 points each, so the total score is 0-40 points
  • Minimum accepted = 20 points, giving grade 1
  • Linear interpolation: scores 20-40 map to grades 1-5, formula: grade = 1 + (score - 20) / 5
• Preliminary knowledge: databases; data structures and algorithms

Course material
• PowerPoint slides: <course homepage>/slides
Optional reading:
• H. M. Blanken, A. P. de Vries, H. E. Blok, L. Feng (Eds.): Multimedia Retrieval, Springer, 2007.
• L. Dunckley: Multimedia Databases – An Object-Relational Approach, Addison-Wesley, 2003.
• P. Rigaux, M. Scholl, A. Voisard: Spatial Databases, with Application to GIS, Morgan Kaufmann, 2002.
• D. C. Gibbon, Z. Liu: Introduction to Video Search Engines, Springer, 2008.
• Miscellaneous articles.

Course Contents (tentative)
• Introduction
• Management of Large Objects
• Text and Document Databases
• Multidimensional Data Structures
• Spatial Databases
• Image Databases
• Video Databases
• Audio Databases
• Integration and Standardization

Main themes of the course
Structural and algorithmic database issues:
• Storage principles
• Data representation
• Queries, searching, content-based retrieval
• Indexing
NOT:
• Usage of software products
• MM authoring and content production
• MM presentation

1. Introduction: ‘Multimedia revolution’
• What is multimedia? A dataset or document containing at least two different media types.
• Multimedia and imaging are continuously growing trends.
• Enhanced quality and quantity of information, compared to plain text
• Brings dramatic improvements to human-computer interaction
• Rich and expressive way of representing, browsing and interacting with information
• “Second information revolution”
• Revolutionizes business, science, engineering, manufacturing, art, entertainment…
• Crucial issues:
  • Size can be huge
  • Speed required to satisfy audio/video transmission rates
  • Semantics: both type- and instance-level metadata

Levels of sophistication in multimedia databases
(1) MM files and archives:
• Simple browsing and retrieval
• No queries
• Supporting software: e.g. web/media server, browser, player
(2) Annotated & indexed archives:
• Search by keyword; see e.g.
  • Image archive: Gimp-Savvy
  • Audio archive: Spotify
  • Spatial db: Google Maps
  • Video archive: YouTube
(3) MM archive as part of a wider application:
• Browsing/search by keyword, plus related actions, e.g. web shop: Amazon
(4) ‘True’ MM databases:
• General queries by media content, e.g.
  • Painting search: Hermitage
  • Melody search: Musipedia

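Level (2), keyword search over an annotated archive, can be illustrated with a small inverted index. This is only a minimal sketch: the archive contents, the names `ANNOTATIONS`, `build_index` and `search`, and the AND semantics for multi-word queries are all assumptions made for illustration, not part of the course material.

```python
from collections import defaultdict

# Hypothetical annotated archive: media id -> free-text annotation.
ANNOTATIONS = {
    "img001.jpg": "sunset over lake",
    "img002.jpg": "lake with boat",
    "vid001.mp4": "boat race at sunset",
}


def build_index(annotations):
    """Map each lowercase keyword to the set of media ids annotated with it."""
    index = defaultdict(set)
    for media_id, text in annotations.items():
        for word in text.lower().split():
            index[word].add(media_id)
    return index


def search(index, query):
    """Return the ids whose annotations contain every keyword of the query."""
    words = query.lower().split()
    if not words:
        return set()
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result


INDEX = build_index(ANNOTATIONS)
print(sorted(search(INDEX, "sunset boat")))  # ['vid001.mp4']
```

Note that the media content itself is never inspected here; only the annotations are searched, which is what separates level (2) from the content-based queries of level (4).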
Multimedia data types
(a) Text
• Integrated into most multimedia applications; complements (as metadata) non-textual forms of data.
• Nowadays text is usually structured/formatted by markup (e.g. XML)
• Visual variability through fonts and layout
• The most space-efficient data type to store
(b) Audio
• Increasingly popular data type
• Different formats (WAV, CD, MP3, AU, AIFF, QT, RA, WMA, Vorbis)
• Digitized audio is rather space-consuming (tens of Kbytes per second; roughly 170 Kbytes per second for uncompressed CD quality)
• Compression is needed (e.g. MP3 compression ratio 12:1)
• More compact: synthetic music in MIDI format (Musical Instrument Digital Interface); MPEG-4 SA (Structured Audio)

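The space requirements of digitized audio follow from simple arithmetic. The sketch below assumes uncompressed PCM audio and uses the well-known CD parameters (44.1 kHz, 16 bits, stereo); the function names and the three-minute song are illustrative assumptions, and the 12:1 ratio is the MP3 figure quoted on the slide.

```python
def pcm_bytes_per_second(sample_rate_hz, bits_per_sample, channels):
    """Data rate of uncompressed PCM audio in bytes per second."""
    return sample_rate_hz * bits_per_sample * channels // 8


def compressed_size(raw_bytes, ratio):
    """Approximate size after compression at the given ratio (e.g. 12 for 12:1)."""
    return raw_bytes / ratio


cd_rate = pcm_bytes_per_second(44_100, 16, 2)    # CD quality: 176400 bytes/s
phone_rate = pcm_bytes_per_second(8_000, 8, 1)   # telephone quality: 8000 bytes/s

song_seconds = 180                               # a hypothetical three-minute song
song_raw = cd_rate * song_seconds                # 31752000 bytes, about 32 MB
song_mp3 = compressed_size(song_raw, 12)         # 2646000.0 bytes, about 2.6 MB

print(cd_rate, phone_rate, song_raw, song_mp3)
```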
Multimedia data types (cont.)
(c) Still raster images
• Black-and-white / grey-scale / color
• One high-resolution image may take several megabytes
• Large number of image formats (GIF, TIFF, JPEG, JP2, PNG, …)
• Lossy compression ratio (e.g. for JPEG) normally about 10:1
(d) Vector graphics
• 2D or 3D drawings, models, maps
• Rather space-efficient; consists of objects larger than pixels
• Parameters of (meta) objects: scaling, orientation, rotation, etc.
• Applications: CAD (computer-aided design), GIS (geographic information systems), animations, computer games
(e) Integrated documents (text & images)
• Can be generated by today’s text processing programs

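The contrast between raster images and vector graphics can be sketched in a few lines: a raster image costs storage per pixel, while a vector object stores only a handful of coordinates and is transformed by parameters such as scale and rotation. The `Polygon` class, the `raster_bytes` helper and the example dimensions are hypothetical and chosen only for illustration.

```python
import math
from dataclasses import dataclass


def raster_bytes(width, height, bits_per_pixel):
    """Storage needed for an uncompressed raster image, in bytes."""
    return width * height * bits_per_pixel // 8


@dataclass
class Polygon:
    """A vector object: a list of (x, y) vertices instead of pixels."""
    vertices: list

    def transformed(self, scale=1.0, rotation_deg=0.0):
        """Return a copy scaled about the origin and rotated counter-clockwise."""
        angle = math.radians(rotation_deg)
        cos_a, sin_a = math.cos(angle), math.sin(angle)
        return Polygon([
            (scale * (x * cos_a - y * sin_a), scale * (x * sin_a + y * cos_a))
            for x, y in self.vertices
        ])


# A 3000 x 2000 image with 24-bit color takes 18 MB uncompressed.
photo = raster_bytes(3000, 2000, 24)

# A unit square is four vertices, regardless of the size it is drawn at.
square = Polygon([(0, 0), (1, 0), (1, 1), (0, 1)])
big = square.transformed(scale=2.0, rotation_deg=90.0)

print(photo, big.vertices)
```

At the 10:1 ratio typical of lossy JPEG compression, the example photo would shrink to roughly 1.8 MB, whereas the square stays at four coordinate pairs however large it is rendered.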
Multimedia data types (cont.)
(f) Digital video
• Sequence of frames (= still images)
• Highly data-intensive
• Integrated audio (interleaved in playback time-sequencing)
• Higher compression ratio than with still images (subsequent frames resemble each other).
• Compression, transmission and decompression must each sustain 20-30 frames per second.
• Animations are less space-intensive (synthetic images, standard shapes)
• MPEG-4: Object-based representation, many special techniques
• Streaming formats: ASF (Microsoft), QuickTime (Apple), RM (RealMedia), FLV (Flash video), WebM (Google)
(g) General integrated multimedia/hypermedia presentations
• MS PowerPoint, Adobe Flash, SMIL

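Why video is called highly data-intensive becomes clear from the raw numbers. The following sketch assumes uncompressed frames at 720 x 576 pixels, 24-bit color and 25 frames per second; these parameters and the function names are illustrative assumptions, not figures from the slides.

```python
def raw_video_bytes_per_second(width, height, bits_per_pixel, fps):
    """Data rate of uncompressed video (without audio) in bytes per second."""
    bytes_per_frame = width * height * bits_per_pixel // 8
    return bytes_per_frame * fps


def frame_budget_ms(fps):
    """Time available to process one frame at the given frame rate, in ms."""
    return 1000 / fps


rate = raw_video_bytes_per_second(720, 576, 24, 25)  # 31104000 bytes/s
one_hour = rate * 3600                               # 111974400000 bytes, about 112 GB
budget = frame_budget_ms(25)                         # 40.0 ms per frame

print(rate, one_hour, budget)
```

The per-frame budget is the reason decompression speed matters as much as compression ratio: at 25 frames per second, each frame must be fetched, decoded and displayed within 40 ms.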
Sample application areas of MMDBs
(a) Educational multimedia services:
• Distance learning
• Teaching material
• Educational audio/video document archives
• Preview possibility
(b) Video-on-demand:
• Selection of movie, possibly using queries
• Preview possibility; fast-forward/rewind
• Requires high bandwidth
• Method of payment must be simple

Sample application areas of MMDBs (cont.)
(c) Audio-on-demand:
• Less bandwidth-consuming than video
• Recorded programs, music, and live net radio stations
(d) Electronic commerce:
• Online info about products: pictures, explanations, availability, etc.
• Possibility to make queries
• Online ordering systems with credit card / online bank payment.
• Examples: bookstore, travel agency
(e) Intelligent systems (‘expert systems’):
• Machine repair: Automatic assistants for different repair jobs. Manuals may be hard to read; demonstration videos explain the job better
• Medical care: Standard surgery operations
• Crime investigations: combination of surveillance & other info

Sample application areas of MMDBs (cont.)
(f) Digital libraries
• Organized collections of digital information
• Both documents and their metadata in digital form
• Versatile metadata- and content-based retrieval opportunities
• Usually accessible through the web
• The web itself & search engines may be considered some kind of (poorly organized) digital library
(g) Medical information systems
• Patient data, including X-rays, EKG curves, MRI images, ...
• Strict confidentiality
• Used for diagnosis, monitoring and research
• Automated tools: image/signal processing, pattern recognition, ...

General observation
• All multimedia applications share some common aspects and functions.
• The goal of this course is to find the domain-independent set of “core algorithms” which can be used in many applications by varying a few parameters.
• A generalized multimedia DBMS (MMDBMS) would be useful; probably as an extension to a standard DBMS.

Technology enabling multimedia
• Hardware components: High-speed processors (CPU, GPU), high-performance multimedia workstations, scanners, digital cameras, video cameras, high-resolution monitors, touch-screen monitors, high-precision printers and plotters.
• High-bandwidth networks (WAN, LAN, mobile), fiber optics, network standards
• High-capacity storage devices: hard disks, optical disks and jukeboxes, solid-state & non-volatile memories.
• Image/video processing software: Compression (JPEG, MPEG), analysis, filtering, segmenting, feature extraction.
• CAD and animation software: 2D and 3D graphics, applications in science, engineering, medicine, computer games, etc.
• Pattern recognition (characters, shapes, etc.): e.g. neural networks
• Advanced software systems: OO languages, OO databases, operating systems, multithreading, etc.

‘Definition’ of MMDB
(1) Supports the main types of MM data
(2) Can handle a very large number of MM objects
(3) Supports high-performance, high-capacity storage management: Hierarchical storage (on-line, near-line, off-line)
(4) Offers DB capabilities: Persistence, transactions, concurrency control, recovery from failures, querying with high-level declarative constructs, versioning, integrity constraints, security.
(5) Information-retrieval capabilities: Exact-match retrieval, probabilistic (best-match) retrieval, content-based retrieval, ranking of results

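The difference between exact-match and best-match retrieval in item (5) can be shown on a toy collection. This is a sketch under stated assumptions: the collection, the three-dimensional feature vectors and all function names are hypothetical, and cosine similarity is just one common choice of similarity measure, swapped in here for illustration.

```python
import math

# Hypothetical collection: object id -> (metadata, content feature vector).
COLLECTION = {
    "a": ({"type": "image"}, [1.0, 0.0, 0.0]),
    "b": ({"type": "image"}, [0.8, 0.6, 0.0]),
    "c": ({"type": "audio"}, [0.0, 1.0, 0.0]),
}


def exact_match(collection, key, value):
    """Exact-match retrieval: an object either satisfies the predicate or not."""
    return sorted(oid for oid, (meta, _) in collection.items()
                  if meta.get(key) == value)


def cosine(u, v):
    """Cosine similarity of two vectors; 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0


def best_match(collection, query_vec, k):
    """Best-match retrieval: rank all objects by similarity, return the top k."""
    scored = [(cosine(query_vec, vec), oid)
              for oid, (_, vec) in collection.items()]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [(oid, score) for score, oid in scored[:k]]


print(exact_match(COLLECTION, "type", "image"))   # ['a', 'b']
print(best_match(COLLECTION, [1.0, 0.0, 0.0], 2))
```

Exact match returns an unordered set of qualifying objects, while best match always returns a ranked list, even when no object matches the query perfectly.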
Features related to MM retrieval
Functional considerations:
• Interactive querying
• Relevance feedback
• Query refinement
• Automatic feature extraction and indexing
• Content- and context-based indexing of different media
• Single- and multidimensional indexing
Efficiency considerations:
• Clustering of media data
• Storage organization for large media objects
• Optimization of multimedia queries
• Replication, parallelism, distribution, scalability

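Relevance feedback and query refinement can be sketched with the classic Rocchio update, which moves the query vector towards results the user marked relevant and away from those marked non-relevant. The slides do not name a specific method, so Rocchio is an assumption here, as are the weights `alpha`, `beta`, `gamma`, the clipping of negative components to zero, and the example vectors.

```python
def centroid(vectors, dim):
    """Component-wise mean of the vectors; the zero vector if there are none."""
    if not vectors:
        return [0.0] * dim
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def rocchio(query, relevant, nonrelevant, alpha=1.0, beta=0.75, gamma=0.15):
    """Refine a query vector from user feedback (Rocchio update).

    new_query = alpha * query + beta * mean(relevant) - gamma * mean(nonrelevant)
    Negative components are clipped to zero, a common practical choice.
    """
    dim = len(query)
    rel = centroid(relevant, dim)
    non = centroid(nonrelevant, dim)
    return [max(0.0, alpha * q + beta * r - gamma * n)
            for q, r, n in zip(query, rel, non)]


# The user ran the query [1, 0], liked two results and rejected one.
refined = rocchio(
    query=[1.0, 0.0],
    relevant=[[1.0, 1.0], [0.0, 1.0]],
    nonrelevant=[[1.0, 0.0]],
)
print(refined)  # approximately [1.225, 0.75]
```

The refined vector now gives weight to the second feature, which the original query ignored but both relevant results shared.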
Architectural considerations
Traditional approach:
• Relational or extended relational DBMS, with support for large objects (BLOBs and CLOBs)
• Information retrieval module (content-based access of objects)
Emerging approach:
• ‘NoSQL’ databases (‘Not only SQL’)
• Improved retrieval speed for very large quantities of data.
• Restricted update types (mainly append), restricted transaction support (relaxed consistency requirements)
Ideal:
• Extensible database system with OO capabilities
• Support for queries and transactions involving MM objects
• Support for complex objects with MM subobjects

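The traditional approach, a relational DBMS with large-object support, can be tried out with Python's built-in `sqlite3` module. This is a minimal sketch only: the `media` table, its columns and the dummy payload are hypothetical, and SQLite stands in for whatever DBMS a real system would use.

```python
import sqlite3

# In-memory database; a real system would use a persistent server or file.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE media (
        id        INTEGER PRIMARY KEY,
        title     TEXT NOT NULL,
        mime_type TEXT NOT NULL,
        content   BLOB NOT NULL
    )
    """
)

# Dummy binary payload standing in for an image, audio or video file.
payload = bytes(range(256)) * 4

# The media object and its metadata are inserted in one transaction.
with conn:
    conn.execute(
        "INSERT INTO media (title, mime_type, content) VALUES (?, ?, ?)",
        ("test clip", "application/octet-stream", payload),
    )

# Metadata can be queried declaratively; the BLOB itself is opaque to SQL.
title, size, content = conn.execute(
    "SELECT title, length(content), content FROM media WHERE title = ?",
    ("test clip",),
).fetchone()

print(title, size, content == payload)
```

The query can filter on the metadata columns but not on what the BLOB depicts, which is why the slide pairs the DBMS with a separate information retrieval module for content-based access.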