Wei-Hsin Lee June 2008 Shared-Dictionary Compression over HTTP (SDCH)
Why do we care? • Speeding up Google and the Web • The faster the Web is, the more useful it is. • The faster Google web search is, the more searches people do. • Lots of users still suffer from slow networks, for example in developing countries.
Reduce transmission time • Reducing payload size is the key. • Gzip works well for compressing each individual response. • What about common data shared by a group of pages (inter-response redundancy), or pages that change frequently but only slightly? • Transmit the data that is common to each response only once. • Thereafter, send only the parts of the response that differ.
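The idea of sending shared data once and then only the differences can be sketched with Python's standard zlib module, which accepts a preset dictionary through its zdict parameter. This is only a stand-in for illustration: it is not the SDCH encoding, and the dictionary contents, page markup, and function names below are all assumptions.

```python
import zlib

# Hypothetical markup that every page in a group shares (the "dictionary").
# In the scenario on the slide, the client would download this once.
SHARED_DICTIONARY = (
    b"<html><head><title>Example Search</title>"
    b"<link rel='stylesheet' href='/static/site.css'></head>"
    b"<body><div id='header'>Example Search</div><div id='results'>"
)


def compress_alone(payload: bytes) -> bytes:
    """Compress one response on its own, as per-response Gzip/Deflate would."""
    return zlib.compress(payload, 9)


def compress_with_dictionary(payload: bytes, dictionary: bytes) -> bytes:
    """Compress a response against data the client is assumed to hold already."""
    compressor = zlib.compressobj(level=9, zdict=dictionary)
    return compressor.compress(payload) + compressor.flush()


def decompress_with_dictionary(blob: bytes, dictionary: bytes) -> bytes:
    """Client side: rebuild the response using the same shared dictionary."""
    decompressor = zlib.decompressobj(zdict=dictionary)
    return decompressor.decompress(blob) + decompressor.flush()


# A response that repeats the shared markup and adds a small unique part.
page = SHARED_DICTIONARY + b"<p>Result 1 for 'sdch'</p></div></body></html>"

alone = compress_alone(page)
shared = compress_with_dictionary(page, SHARED_DICTIONARY)

# The round trip must reproduce the page exactly.
assert decompress_with_dictionary(shared, SHARED_DICTIONARY) == page
print(len(page), len(alone), len(shared))
```

Because the shared markup is already on the client, the second encoding only has to describe the unique part of the page plus short references into the dictionary.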
Why not RFC 3229? • RFC 3229 “Delta Compression in HTTP” • Good for saving bandwidth • But • Too many states for the server to track • The number of possible states for www.google.com/search is larger than the number of all possible search results. • Only applicable to the same URL • Discourages aggressive caching. • No benefit for similar pages that don’t share a URL.
Shared-Dictionary Compression over HTTP (SDCH) • An addition to HTTP • Small set of states (dictionaries) shared between client and server. • Dictionaries are scoped by domain name and path, just like cookies, so a single dictionary can apply to multiple URLs.
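A minimal sketch of cookie-style scoping follows, to show how one dictionary could cover many URLs. The matching rules here (suffix match on the host, prefix match on the path at a segment boundary) are assumptions modeled on how cookies commonly behave; the slides do not spell out the exact rules, and DictionaryScope and applies_to are hypothetical names.

```python
from dataclasses import dataclass
from urllib.parse import urlsplit


@dataclass(frozen=True)
class DictionaryScope:
    """Hypothetical record of where a dictionary may be used."""
    domain: str  # e.g. "example.com"
    path: str    # e.g. "/search"


def applies_to(scope: DictionaryScope, url: str) -> bool:
    """Return True if the dictionary's scope covers the given URL.

    Assumed rules, borrowed from typical cookie matching:
    - the host equals the domain or is a subdomain of it;
    - the request path equals the scope path or lies beneath it.
    """
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    domain = scope.domain.lower().lstrip(".")
    domain_ok = host == domain or host.endswith("." + domain)

    request_path = parts.path or "/"
    # Compare on a segment boundary so "/search" does not match "/searching".
    prefix = scope.path if scope.path.endswith("/") else scope.path + "/"
    path_ok = request_path == scope.path or request_path.startswith(prefix)

    return domain_ok and path_ok


scope = DictionaryScope(domain="example.com", path="/search")
for candidate in (
    "http://www.example.com/search?q=sdch",
    "http://www.example.com/search/images",
    "http://www.example.com/searching",
    "http://evil-example.com/search",
):
    print(candidate, applies_to(scope, candidate))
```

Scoping by domain and path rather than by exact URL is what lets every query under a path such as /search reuse the same dictionary.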
SDCH protocol details • SDCH defines • How the client informs the server of its capabilities and state. • How the server should respond when the client is SDCH capable. • How dictionaries get loaded into the client. • Implements the VCDIFF (RFC 3284) differential compression format with enhancements • Interleave instructions with data so that each network packet can be decoded as it arrives. (chunked encoding) • Checksum to ensure data integrity
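The two enhancements listed above, interleaving and a checksum, can be illustrated with a toy decoder. This is not the VCDIFF wire format: the COPY/ADD tuples, the decode_stream function, and the choice of Adler-32 as the checksum are all assumptions made for the sketch. The point it shows is that when each instruction travels together with its data, the decoder can emit output as soon as an instruction arrives and verify integrity at the end.

```python
import zlib
from typing import Iterable, Iterator, Tuple, Union

# Toy instructions (hypothetical, not the VCDIFF encoding):
#   ("COPY", offset, length)  copy bytes out of the shared dictionary
#   ("ADD", data)             literal bytes carried inline with the instruction
Instruction = Union[Tuple[str, int, int], Tuple[str, bytes]]


def decode_stream(
    dictionary: bytes,
    instructions: Iterable[Instruction],
    expected_checksum: int,
) -> Iterator[bytes]:
    """Yield decoded output piece by piece, then verify a running checksum."""
    checksum = zlib.adler32(b"")  # initial value of a running Adler-32
    for instruction in instructions:
        if instruction[0] == "COPY":
            _, offset, length = instruction
            if offset < 0 or length < 0 or offset + length > len(dictionary):
                raise ValueError("COPY reaches outside the dictionary")
            piece = dictionary[offset:offset + length]
        elif instruction[0] == "ADD":
            piece = instruction[1]
        else:
            raise ValueError(f"unknown instruction: {instruction[0]!r}")
        checksum = zlib.adler32(piece, checksum)
        yield piece  # usable immediately, before later instructions arrive
    if checksum != expected_checksum:
        raise ValueError("checksum mismatch: decoded data is corrupt")


dictionary = b"<html><body>Hello, </body></html>"
target = b"<html><body>Hello, world</body></html>"
split = dictionary.index(b"</body>")  # end of the shared prefix

delta = [
    ("COPY", 0, split),
    ("ADD", b"world"),
    ("COPY", split, len(dictionary) - split),
]

decoded = b"".join(decode_stream(dictionary, delta, zlib.adler32(target)))
assert decoded == target
```

Because decode_stream is a generator, a caller could feed it instructions from a network socket and render each piece as it is produced, which is the property the interleaving bullet describes.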
Other details • Complements Gzip or Deflate. • Should be applied before Gzip • Lab results • About 40 percent more data reduction than Gzip alone on Google search. • Faster Google search results, especially under low-bandwidth and high-latency conditions. • Working on the best way to get this out to users.
Your help counts! • Please join the group • http://groups.google.com/group/SDCH • The protocol spec and the encoder/decoder code will be there soon. • Getting your hands dirty is even better! • Make your web site use SDCH. • Make Squid or Apache web servers SDCH capable.
Don’t forget to join the group. http://groups.google.com/group/SDCH