ITR Collaborative: Compressed Search and Retrieval for Very Large Text and Image Repositories
Amar Mukherjee, School of Electrical Engineering and Computer Science, University of Central Florida
Don Adjeroh, Computer Science and Electrical Engineering, West Virginia University, Morgantown
Tim Bell, Department of Computer Science, University of Canterbury, New Zealand
Award #s: IIS-0312724, IIS-0312484
Duration: 09/01/2003 – 08/31/2006
October 2004
Objectives, Approach & Broader Impact
[Figure: BWT compression pipeline: input sequence → BWT → BWT output → MTF → VLC → compressed output]
• Research Objectives
  • Search & retrieval for compressed text
  • Search & retrieval for lossless compressed images
  • Search-aware compression
• Approach
  • Keep data compressed for as long as possible
• The Broader Impact
  • Explosive growth of text and image data
  • Efficient search & retrieval for text and image repositories
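The first two stages of the BWT → MTF → VLC pipeline named on this slide can be sketched as follows. This is a minimal illustration, not the project's implementation: the function names `bwt` and `mtf_encode`, the `$` end-of-string sentinel, and the naive sorted-rotations construction are all assumptions made for the example, and the final VLC (entropy coding) stage is omitted.

```python
def bwt(text, eos="$"):
    """Naive Burrows-Wheeler transform using sorted rotations.

    Assumes `eos` does not occur in `text` and sorts before every other
    character. Practical coders use suffix arrays instead of rotations.
    """
    assert eos not in text
    s = text + eos
    rotations = sorted(s[i:] + s[:i] for i in range(len(s)))
    # The transform is the last column of the sorted rotation matrix.
    return "".join(rot[-1] for rot in rotations)


def mtf_encode(data, alphabet):
    """Move-to-front coding: emit each symbol's current rank, then move it to rank 0."""
    table = list(alphabet)
    ranks = []
    for ch in data:
        idx = table.index(ch)
        ranks.append(idx)
        table.insert(0, table.pop(idx))
    return ranks


bwt_out = bwt("banana")
ranks = mtf_encode(bwt_out, sorted(set(bwt_out)))
print(bwt_out, ranks)
```

BWT tends to group identical symbols together, so the MTF stage produces many small ranks, which a variable-length code can then represent compactly.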
Significant Results
• Text Part
  • Algorithms for approximate matching on BWT text
    • QGREP-DFA
  • Searching on LZW-compressed text
  • Improved LZW algorithm for compressed text retrieval
    • MLZW (2-pass; random access; partial decoding)
• Image Part
  • Searching on context-based predictive-coded images
    • Search on JPEG-LS, CALIC, L-JPEG
    • Search-aware predictive image coding
  • BWT-based compressed shape matching
  • 2D-BWT
    • For compression
    • For image search
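To give a flavor of what searching directly on BWT text involves, here is a minimal sketch of exact pattern counting by backward search over a BWT string. It is the generic FM-index-style technique, swapped in for illustration; it is not the approximate-matching or QGREP-DFA algorithms listed above. The function name `count_occurrences` and the `$` sentinel are assumptions, and the occurrence table is built naively rather than in compressed form.

```python
from collections import Counter


def count_occurrences(bwt_str, pattern):
    """Count occurrences of `pattern` in the original text, given only its BWT."""
    counts = Counter(bwt_str)

    # first[c]: number of characters in the text that sort strictly before c.
    first, total = {}, 0
    for ch in sorted(counts):
        first[ch] = total
        total += counts[ch]

    # occ[c][i]: number of occurrences of c in bwt_str[:i] (naive full table).
    occ = {ch: [0] * (len(bwt_str) + 1) for ch in counts}
    for i, ch in enumerate(bwt_str):
        for c in occ:
            occ[c][i + 1] = occ[c][i] + (1 if c == ch else 0)

    # Narrow the range [lo, hi) of sorted rotations that start with the
    # current suffix of the pattern, processing the pattern right to left.
    lo, hi = 0, len(bwt_str)
    for ch in reversed(pattern):
        if ch not in first:
            return 0
        lo = first[ch] + occ[ch][lo]
        hi = first[ch] + occ[ch][hi]
        if lo >= hi:
            return 0
    return hi - lo


# "annb$aa" is the BWT of "banana" with "$" as the end-of-string sentinel.
print(count_occurrences("annb$aa", "ana"))
```

The text itself is never reconstructed: each pattern character costs two table lookups, so the work depends on the pattern length rather than the text length once the tables exist.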
Publications
• Publications from the Project
  • N Zhang, A Mukherjee, D Adjeroh, & T Bell, "Approximate pattern matching using the Burrows Wheeler Transform", Proc. IEEE Data Compression Conference (2003), p. 458.
  • Nan Zhang, Tao Tao, Ravi Vijaya Satya, & Amar Mukherjee, "Modified LZW Algorithm for Efficient Compressed Text Retrieval", Proc. Int'l Conf. on Info. Tech.: Coding and Computing (2004), p. 224.
  • Tao Tao & Amar Mukherjee, "LZW Based Compressed Pattern Matching", Proc. IEEE Data Compression Conference (2004), p. 568.
  • Tao Tao & Amar Mukherjee, "Compressed Pattern Matching for Predictive Lossless Image Encoding", Proc. International Conference on Distributed Multimedia Systems (2004), p. 120.
  • N Zhang, A Mukherjee, D Adjeroh, & T Bell, "Pattern matching on BWT text: Inexact pattern matching" (manuscript, to be submitted).
• Related Project
  • NSF IDM: Compressed Domain Search for Text and Images by Sorted Contexts, 2002-2005 (same PIs)