Cross-media Intelligent Searching in Digital Library
Yueting Zhuang
Zhejiang University, China
Nov. 18, 2006, Egypt
Outline
1. CADAL: China digital library
2. Our vision for the next generation of digital library
3. From multimedia retrieval to cross-media retrieval
4. Retrieval of Chinese calligraphy characters: a cross-media practice
5. Building a personalized portal
6. Conclusion
ICUDL06, YT Zhuang
3rd Workshop 2004, CMU, USA
ICUDL 2005, Zhejiang University, China
1. CADAL: China Digital Library
• China-US One Million Book Digital Library Project
• A unique library resource for scholars, students, and citizens
• Contains over one million scanned books
• A big step towards the goal of creating a universal, free-to-read digital library
• Make knowledge available on the web to anyone, anytime, anywhere
http://www.cadal.zju.edu.cn
As of today, CADAL has achieved:
• 1.023 million books have been digitized, including:
  • Degree dissertations
  • Modern Chinese books
  • Traditional cultural resources
  • English books
• Supported multimedia resources:
  • Image
  • Audio
  • Video
  • 3D model
  • Chinese calligraphy
• About 200,000 clicks a day (http://www.cadal.zju.edu.cn)
• Users spread over 70 countries and regions
• 16 scanning centers in China, occupying more than 2000 square meters
Scanning books
Processing digitized books
Map labels: Changchun, Beijing, Xi'an, Nanjing, Shanghai, Chengdu, Hangzhou, Wuhan, Guangzhou
Users spread over 70 countries and regions
Service structure of CADAL:
Current services provided by CADAL:
(1) Metadata searching
• Digital resources are classified into 8 classes according to publication time and type.
• Both unified and advanced search are provided for all resources.
(2) Unified search
Choose the types of resources to search (screenshot label: China Ancient)
Search results contain each type of resource.
(3) Advanced search
• Users can choose the search scope, how results are combined, and the result style.
• A second search, full texts, and detailed information are available on the result page.
(4) Full-text search
• Full-text search uses the text produced by OCR.
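The slides do not say how the OCR text is indexed. As a rough illustration only, full-text search over OCR output is often built on an inverted index that maps each word to the pages containing it. The page identifiers and sample texts below are hypothetical, and the simple `\w+` tokenizer is an assumption that would not segment Chinese text into words.

```python
import re
from collections import defaultdict


def build_index(pages):
    """Map each lowercased word to the set of page ids whose OCR text contains it."""
    index = defaultdict(set)
    for page_id, text in pages.items():
        for word in re.findall(r"\w+", text.lower()):
            index[word].add(page_id)
    return index


def search(index, query):
    """Return the page ids that contain every word of the query (AND semantics)."""
    words = re.findall(r"\w+", query.lower())
    if not words:
        return set()
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result


# Hypothetical OCR output, keyed by "book:page".
pages = {
    "book1:p1": "Digital library of Chinese calligraphy",
    "book2:p7": "Video retrieval in a digital library",
}
index = build_index(pages)
print(search(index, "digital library"))
print(search(index, "calligraphy"))
```

Because the index stores page ids rather than whole books, a hit can link straight to the scanned page on which the words occur.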
2. Our Vision for the Next Generation of Digital Library
• Support multimodal sources
• Enable cross-media retrieval
• Typical features of existing DLs:
  • books are indexed by title, author, keywords…
  • users query books by entering keywords
  • mostly only text information is returned
  • multimodal data is not fully supported
• What does the next generation of DL look like?
Extension of the concept of “Book”
• The key to our vision of the next generation of digital library is extending the concept of “book”
• A book is regarded not only as written symbols on paper, but also as any type of multimedia “item”, such as:
  • A video clip
  • An audio clip
  • A painting
  • ……
So in the next generation of DL, a “book” can be multimodal:
• We can find a general data structure to represent multimodal “books”
Diagram labels: scenery image, Chinese calligraphy, video fragment, audio clips, ……; feature analysis; knowledge mining; a general data representation for multimodal data
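The slides do not define this general data structure. One possible shape, sketched here purely as an assumption, is a record that stores the modality, the extracted features, and links to related items of other modalities. All class names, field names, and sample values (`MediaItem`, `Library`, `color_histogram`, the URIs) are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class MediaItem:
    """One multimodal "book": any media object plus its extracted features."""
    item_id: str
    modality: str                      # e.g. "text", "image", "audio", "video"
    uri: str                           # where the raw data lives
    features: Dict[str, List[float]] = field(default_factory=dict)
    keywords: List[str] = field(default_factory=list)
    related: List[str] = field(default_factory=list)  # ids of linked items


@dataclass
class Library:
    items: Dict[str, MediaItem] = field(default_factory=dict)

    def add(self, item: MediaItem) -> None:
        self.items[item.item_id] = item

    def link(self, a: str, b: str) -> None:
        """Record a bidirectional cross-media relation between two items."""
        if b not in self.items[a].related:
            self.items[a].related.append(b)
        if a not in self.items[b].related:
            self.items[b].related.append(a)

    def related_items(self, item_id: str, modality: Optional[str] = None) -> List[MediaItem]:
        """Items linked to item_id, optionally restricted to one modality."""
        linked = [self.items[i] for i in self.items[item_id].related]
        return [m for m in linked if modality is None or m.modality == modality]


library = Library()
library.add(MediaItem("img1", "image", "panda.jpg", {"color_histogram": [0.4, 0.6]}, ["giant panda"]))
library.add(MediaItem("aud1", "audio", "panda.wav", keywords=["giant panda"]))
library.link("img1", "aud1")
print([m.item_id for m in library.related_items("img1", modality="audio")])
```

Keeping features in a dictionary keyed by feature name lets each modality carry its own descriptors without changing the record type.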
Supporting multimodal data is an important trend in multimedia retrieval:
• We get multimodal information from the real world; can we also get multimodal data from the digital world, especially from a digital library?
Diagram labels: real world; digital world; texts, image, audio, video, ……; multimodal; ?
Cross-media retrieval
• After extending the concept of “book”, retrieval must be extended as well.
• We call this “cross-media retrieval”.
Scenario: a simple example of cross-media retrieval
• Starting query: “Giant Panda” image
• Starting query: “Giant Panda” text (textual description of the giant panda: “the Panda is a kind of cat which ……”)
• Starting query: “Giant Panda” audio
• A user can start a query from any type of media, and relevant multimedia data is returned.
Diagram labels: “Cross-media” (three times, between the starting queries)
Cross-media retrieval is a useful way to access multimodal data:
• Cross-media retrieval can be regarded as a simulation of the real world; it helps us get multimodal data in a more flexible and more informative way.
Diagram labels: texts, image, audio, video, ……, each marked “available”
What does cross-media retrieval need to do?
• Submit a query example: it can be an image, audio, or keywords…
Diagram labels: user; query interface; cross-media search engine; knowledge base; raw data (texts, image, audio, video); multimodal representation & index; query results: texts, images, audios…
3. From Multimedia Retrieval to Cross-media Retrieval
(1) Image retrieval: content-based
Searching images
Screenshot labels: query example, positive example, negative example, relevance feedback
Multimedia retrieval
(2) Image retrieval: text-based
Screenshot label: query text
Multimedia retrieval
(3) Motion retrieval
• Given a query example of motion data, we can find similar motion data in the database.
Multimedia retrieval
(4) Audio retrieval: content-based
System framework
Diagram labels: user; submit audio query example; content-based audio search engine; audio depository; return audio results; user judges the returned results; relevance feedback: adjust query center, adjust feature weight
Multimedia retrieval
Audio retrieval: key techniques
• extract auditory features from audio clips in the compressed domain
• cluster the auditory features with fuzzy clustering
• represent each audio clip by its cluster centers
• retrieve similar audio by cluster center matching
• introduce relevance feedback techniques
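The clustering and matching steps above can be sketched as follows. This is a minimal illustration under stated assumptions, not the system's implementation: plain k-means stands in for the fuzzy clustering named on the slide, the two-dimensional frame features are made up, and the function names and the symmetric nearest-center distance are hypothetical choices.

```python
import math


def kmeans(points, k, iterations=20):
    """Plain k-means (a stand-in for fuzzy clustering); returns k cluster centers."""
    step = len(points) // k            # assumes len(points) >= k
    centers = [list(points[i * step]) for i in range(k)]  # deterministic init
    for _ in range(iterations):
        groups = [[] for _ in centers]
        for p in points:
            nearest = min(range(k), key=lambda i: math.dist(p, centers[i]))
            groups[nearest].append(p)
        for i, group in enumerate(groups):
            if group:                  # keep the old center if a cluster is empty
                centers[i] = [sum(col) / len(group) for col in zip(*group)]
    return centers


def clip_distance(centers_a, centers_b):
    """Symmetric average distance from each center to its nearest counterpart."""
    def one_way(src, dst):
        return sum(min(math.dist(c, d) for d in dst) for c in src) / len(src)
    return 0.5 * (one_way(centers_a, centers_b) + one_way(centers_b, centers_a))


def retrieve(query_frames, database, k=2):
    """Rank database clips by cluster-center distance to the query clip."""
    query_centers = kmeans(query_frames, k)
    scored = [(clip_distance(query_centers, kmeans(frames, k)), name)
              for name, frames in database.items()]
    return [name for _, name in sorted(scored)]


# Made-up per-frame feature vectors for two stored clips and one query clip.
database = {
    "speech": [[0.0, 0.1], [0.1, 0.0], [1.0, 1.1], [1.1, 1.0]],
    "music": [[5.0, 5.1], [5.1, 5.0], [9.0, 9.1], [9.1, 9.0]],
}
query = [[0.05, 0.05], [0.0, 0.0], [1.05, 1.05], [1.0, 1.0]]
print(retrieve(query, database))
```

Comparing a handful of centers instead of every frame keeps matching cheap, which is the practical appeal of representing a clip by its cluster centers.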
Multimedia retrieval
Audio retrieval: an example
Screenshot labels: query example, feature weight, weight adjusting, relevance feedback
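The slides name two relevance feedback operations, adjusting the query center and adjusting the feature weights, without giving formulas. The sketch below is one common way to realize them and is an assumption, not the system's method: a Rocchio-style query update, and weights set inversely to the spread of each feature across the positive examples. The parameter values and function names are hypothetical.

```python
def mean(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    return [sum(col) / len(vectors) for col in zip(*vectors)]


def move_query(query, positives, negatives, alpha=1.0, beta=0.75, gamma=0.25):
    """Rocchio-style update: pull the query toward positives, away from negatives."""
    pos = mean(positives) if positives else [0.0] * len(query)
    neg = mean(negatives) if negatives else [0.0] * len(query)
    return [alpha * q + beta * p - gamma * n for q, p, n in zip(query, pos, neg)]


def reweight(positives, eps=1e-6):
    """Weight each feature by how closely the positive examples agree on it."""
    center = mean(positives)
    raw = []
    for i, c in enumerate(center):
        variance = sum((v[i] - c) ** 2 for v in positives) / len(positives)
        raw.append(1.0 / (variance ** 0.5 + eps))
    total = sum(raw)
    return [w / total for w in raw]    # normalized so the weights sum to 1


def weighted_distance(a, b, weights):
    """Weighted Euclidean distance used to re-rank results after feedback."""
    return sum(w * (x - y) ** 2 for w, x, y in zip(weights, a, b)) ** 0.5


new_query = move_query([0.0, 0.0], positives=[[2.0, 2.0]], negatives=[[-2.0, -2.0]])
weights = reweight([[1.0, 0.0], [1.0, 4.0]])
print(new_query, weights)
```

In this sketch a feature on which all positive examples agree (the first one above) dominates the distance on the next round, while a feature that varies widely among them is nearly ignored.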
Multimedia retrieval
(5) Video retrieval: overview
• Unlike text resources, video is unstructured:
  • rich in visual content;
  • poor in semantic understanding;
• The challenging issues:
  • summarization & structuring;
  • video mining
Multimedia retrieval
(5) Video retrieval: key techniques
• Video structuring:
  • construct a video table of contents (VTOC)
  • make the video physically structured
• Video summarization:
  • help the user quickly grasp the content of video clips
  • support video browsing
• Video encoding/compression
Video structuring
Diagram labels: video stream, shot boundary detection, shot, key frame extraction, key frame, grouping, group, scene construction, scene, video concept clustering, table of contents, temporal features, spatial features
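The first two stages in the diagram, shot boundary detection and key frame extraction, can be illustrated with a deliberately simple sketch. It assumes frames are flat lists of 8-bit grayscale pixels, declares a boundary when the histogram difference between consecutive frames exceeds a threshold, and takes the middle frame of each shot as its key frame. The slides do not state which detector or key frame rule was used, so the method, threshold, and names are all assumptions.

```python
def histogram(frame, bins=4, max_value=256):
    """Normalized intensity histogram of a frame given as a flat list of pixels."""
    counts = [0] * bins
    for pixel in frame:
        counts[pixel * bins // max_value] += 1
    return [c / len(frame) for c in counts]


def shot_boundaries(frames, threshold=0.5):
    """Indices where a new shot starts, by consecutive histogram difference."""
    if not frames:
        return []
    boundaries = []
    previous = histogram(frames[0])
    for index in range(1, len(frames)):
        current = histogram(frames[index])
        difference = sum(abs(a - b) for a, b in zip(previous, current))
        if difference > threshold:
            boundaries.append(index)
        previous = current
    return boundaries


def key_frames(frames, boundaries):
    """Pick the middle frame of every shot as its key frame."""
    starts = [0] + boundaries
    ends = boundaries + [len(frames)]
    return [(start + end - 1) // 2 for start, end in zip(starts, ends)]


dark = [10, 20, 30, 40]
bright = [200, 210, 220, 230]
frames = [dark, dark, dark, bright, bright]
cuts = shot_boundaries(frames)
print(cuts, key_frames(frames, cuts))
```

A fixed threshold like this only catches abrupt cuts; gradual transitions such as fades would need a different test.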
Video summary: video content mining
• original video (redundant) → video content mining → summarized video (concise and informative)
• Find meaningful patterns to support efficient video browsing
Video summary: an example
• Two news videos are segmented into 6 video shots (the following are the key frames). Their total length is 3 minutes.
• After video summarization, the video is 3 seconds long and consists of the 3 key frames below.
Similar video shots are clustered together
Diagram labels: original video, video shot, video shot clustering result
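As a rough illustration of grouping similar shots, the sketch below assigns each shot's key frame feature to the nearest existing cluster if it lies within a distance threshold, and otherwise starts a new cluster. This greedy scheme, the two-dimensional features, and the threshold are assumptions made for the example; the slides do not say which clustering algorithm was used.

```python
import math


def cluster_shots(features, threshold):
    """Greedy clustering of shots; returns lists of shot indices per cluster."""
    clusters = []                      # each: {"center": [...], "members": [...]}
    for index, feature in enumerate(features):
        best = None
        for cluster in clusters:
            d = math.dist(feature, cluster["center"])
            if d <= threshold and (best is None or d < best[0]):
                best = (d, cluster)
        if best is None:
            clusters.append({"center": list(feature), "members": [index]})
        else:
            cluster = best[1]
            cluster["members"].append(index)
            n = len(cluster["members"])
            # Running mean keeps the center at the average of its members.
            cluster["center"] = [c + (x - c) / n
                                 for c, x in zip(cluster["center"], feature)]
    return [cluster["members"] for cluster in clusters]


# Made-up key frame features for five shots (e.g. anchor shots vs. field shots).
shot_features = [[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [0.0, 0.1], [5.1, 5.0]]
groups = cluster_shots(shot_features, threshold=1.0)
summary = [members[0] for members in groups]   # one representative shot per cluster
print(groups, summary)
```

Keeping one representative per cluster is one simple way to turn the clustering result into a shorter summary.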
Video retrieval: video browse
Video browse: summary and key frames
Multimedia retrieval
(6) 3D model retrieval: overview
• Measure the similarity between 3D models by their shape.
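The slides do not specify the shape descriptor. One simple option, shown here only as an assumption, is a histogram of pairwise vertex distances normalized by their mean, in the spirit of shape-distribution descriptors. It is unaffected by translation, rotation, and uniform scaling. Using mesh vertices rather than sampled surface points is a simplification, and all names are hypothetical.

```python
import itertools
import math


def shape_descriptor(vertices, bins=8, max_ratio=3.0):
    """Histogram of pairwise vertex distances, each divided by the mean distance."""
    distances = [math.dist(a, b) for a, b in itertools.combinations(vertices, 2)]
    mean = sum(distances) / len(distances)
    histogram = [0] * bins
    for d in distances:
        index = min(int(d / mean / max_ratio * bins), bins - 1)
        histogram[index] += 1
    return [count / len(distances) for count in histogram]


def shape_distance(model_a, model_b):
    """L1 distance between descriptors; 0 means identical distance statistics."""
    return sum(abs(x - y) for x, y in
               zip(shape_descriptor(model_a), shape_descriptor(model_b)))


def box(width, height, depth):
    """Corner vertices of an axis-aligned box (a stand-in for a real 3D model)."""
    return [(x, y, z) for x in (0.0, width) for y in (0.0, height) for z in (0.0, depth)]


cube = box(1.0, 1.0, 1.0)
big_cube = box(2.0, 2.0, 2.0)
stick = box(4.0, 1.0, 1.0)
print(shape_distance(cube, big_cube), shape_distance(cube, stick))
```

Retrieval then amounts to ranking the stored models by `shape_distance` to the query model.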
Multimedia retrieval
(6) 3D model retrieval: an example
Screenshot label: query example
As shown above, multimedia retrieval is generally content-based X retrieval (CBXR).
Towards cross-media retrieval
• Motivation: we can provide a more flexible and efficient way to access multimodal data. We call it cross-media retrieval.
Diagram labels: CBXR (image retrieval, audio retrieval, video retrieval, motion retrieval, 3D model retrieval, ……); intelligent integration; cross-media retrieval
Support multimodal sources
• smooth integration of multimodal data;
• query media objects by examples of different modalities;
• Challenging issues:
  • texts, images, audio, etc. are represented with different features
  • these features are heterogeneous
  • cross-media similarity can’t be measured by content features
  • there is a semantic gap between low-level features and semantics