Balancing query expressive power with compression efficiency for real-time, one-pass stream querying. Utilizes semantic compression techniques to improve querying of compressed tables and data streams. Research project funded by NASA.
NASA Space Communications Symposium
Semantic Data Compression Techniques for Mobile Computing and Stream Data
Principal Investigators: G. Ozsoyoglu, Z.M. Ozsoyoglu
Task Number: NAG3-2578
Case Western Reserve University
September 18, 2002
Semantic Data Compression • Project Overview
Start date: 8/1/2001. End date: 3/31/2003.
Querying Compressed Tables: Designing compression-aware query languages. Compromise between query expressive power and compression efficiency.
Querying Compressed Data Streams: Real-time, one-pass-only stream querying and compression efficiency.
Semantic Data Compression • Enterprise Relevance and Impact
Enterprise Relevance: Table and stream data occur frequently in computer networks, distributed mobile networks, and telecommunication networks such as the Earth Science Enterprise, Space Science Enterprise, Mars Network, and Space-Based Internets of NASA. Compression and querying of stream data is directly applicable to NASA projects.
Impact: Databases will be compressed on a “query-need” basis. Query engines will be aware of the compression employed and perform efficient querying.
Semantic Data Compression • Milestones - Technical Accomplishments and Schedules
A large number of compression techniques exist.
Syntactic compression: Compress byte strings.
Semantic compression: Employ data semantics in approximating data; answer queries with a guaranteed upper bound on the error of approximation.
Semantic compression techniques:
Representative tuples and outliers (row-wise relationships).
Classification and regression trees (column-wise relationships).
Employ attribute domain information.
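The "representative tuples and outliers" idea with a guaranteed error bound can be illustrated with a small sketch. This is not the project's algorithm: the greedy assignment, the per-attribute tolerance `eps`, and the names `compress` and `decompress` are all assumptions made for illustration. Each row is replaced by a pointer to a representative that differs from it by at most `eps` on every attribute; rows that fit no existing representative are stored verbatim and act as outliers.

```python
from typing import List, Tuple

Row = Tuple[float, ...]

def compress(rows: List[Row], eps: float) -> Tuple[List[Row], List[int]]:
    """Greedy sketch (hypothetical): map each row to the first representative
    that is within eps on every attribute; otherwise store the row itself
    as a new representative (an outlier if nothing else ever maps to it)."""
    representatives: List[Row] = []
    assignment: List[int] = []
    for row in rows:
        for i, rep in enumerate(representatives):
            if all(abs(a - b) <= eps for a, b in zip(row, rep)):
                assignment.append(i)
                break
        else:
            # No representative is close enough: keep the row exactly.
            representatives.append(row)
            assignment.append(len(representatives) - 1)
    return representatives, assignment

def decompress(representatives: List[Row], assignment: List[int]) -> List[Row]:
    """Rebuild an approximate table; every value is within eps of the original."""
    return [representatives[i] for i in assignment]
```

Because every stored value is within `eps` of the original, an aggregate such as the average of a column, computed on the approximate table, is also within `eps` of the true answer, which is the kind of guaranteed upper bound the slide refers to.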
Given a compressed database DB and query Q, evaluate Q on DB:
Without decompressing DB; decompress output. Best for existing query engines; low compression ratio.
By first decompressing selected relations/columns. Cost: Rewriting tables before Q evaluation.
By decompressing tuple components (selectively) during query evaluation. Cost: On a per-query basis. Requires query engine changes, fast random decompression.
Algebraic Laws: Commutativity. Op(DeCmp(T)) =? DeCmp(Op(T))
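The commutativity question Op(DeCmp(T)) =? DeCmp(Op(T)) can be made concrete with a minimal sketch. It uses order-preserving dictionary encoding, a lossless scheme swapped in here purely for illustration and not the semantic compression studied in the project; all function names are hypothetical. For this encoding, equality and less-than selections can run on the codes, and only the output is decompressed.

```python
from bisect import bisect_left
from typing import List, Tuple

def dict_encode(column: List[str]) -> Tuple[List[str], List[int]]:
    """Order-preserving dictionary encoding: codes follow the sort order of values."""
    dictionary = sorted(set(column))
    code_of = {value: code for code, value in enumerate(dictionary)}
    return dictionary, [code_of[value] for value in column]

def decode(dictionary: List[str], codes: List[int]) -> List[str]:
    """DeCmp: map codes back to their original values."""
    return [dictionary[code] for code in codes]

def select_eq(dictionary: List[str], codes: List[int], value: str) -> List[int]:
    """Op on compressed data: keep codes equal to the code of `value`."""
    pos = bisect_left(dictionary, value)
    if pos == len(dictionary) or dictionary[pos] != value:
        return []  # value does not occur in the column
    return [code for code in codes if code == pos]

def select_lt(dictionary: List[str], codes: List[int], value: str) -> List[int]:
    """Op on compressed data: keep codes whose value sorts strictly before `value`."""
    bound = bisect_left(dictionary, value)  # codes below bound decode to values < value
    return [code for code in codes if code < bound]
```

Here `decode(d, select_eq(d, codes, v))` returns the same rows as filtering the decoded column, so selection commutes with decompression. Under a lossy semantic scheme the two sides may differ, which is why the slide poses the law as a question.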
Semantic Data Compression • Funding Issues
This is a research initiation project with two-year funding of $35,869. There are no funding issues.