
Massively Distributed Database Systems Broadcasting - Data on air

Massively Distributed Database Systems: Broadcasting - Data on air. Spring 2014, Ki-Joune Li, http://isel.cs.pusan.ac.kr/~lik, Pusan National University. Why Broadcasting? Simple. Data access pattern: mostly asymmetric. Scalability: well suited to massively distributed environments.



  1. Massively Distributed Database Systems: Broadcasting - Data on air • Spring 2014 • Ki-Joune Li • http://isel.cs.pusan.ac.kr/~lik • Pusan National University

  2. Why Broadcasting? • Simple • Data Access Pattern: mostly asymmetric • Scalability – well suited to massively distributed environments • Examples • DMB • TPEG

  3. TPEG – Transport Protocol Experts Group • A protocol for broadcasting traffic information

  4. TPEG – Message format

  5. TPEG Service Contents Example

  6. TPEG Service

  7. Air Update – Map Data Update

  8. Basic Idea – Broadcast Disks

  9. Key papers and documents • S. Acharya et al., “Broadcast Disks: Data Management for Asymmetric Communication Environments”, ACM SIGMOD 1995, pp.199-210 • T. Imielinski, S. Viswanathan, and B.R. Badrinath, “Data on Air: Organization and Access”, IEEE TKDE Vol.9 No.3, 1997, pp.353-372 • J. Xu et al., “Energy Efficient Indexing for Querying Location Dependent Data in Mobile Broadcasting Environments”, ICDE 2003, pp.239-250 • B. Zheng et al., “Spatial Queries in Wireless Broadcast Systems”, Wireless Networks, Vol.10, pp.723-736, 2004 • tisa.org, TPEG, http://www.tisa.org/assets/Uploads/Public/TISA14001TPEGWhatisitallabout2014.pdf

  10. Paper #1 – Broadcast Disks, SIGMOD 1995

  11. Key Ideas • Broadcasting as a disk • How to organize broadcast message • Flat Message as a disk • Message with different frequencies as multiple disks • Two Issues • How to organize message – Server Side • How to maintain cache – Client Side

  12. Message Format • Given three data items A, B, and C to broadcast with different access probabilities, three formats are possible • Flat format • Skewed format • Multiple disks format

  13. Performance Measures • What is the goal? • To minimize the average waiting time (expected delay) • Example
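
The expected delay on slide 13 can be sketched numerically. The snippet below is a minimal illustration, not the slide's own example: the three layouts ("ABC", "AABC", "ABAC") and the access probabilities are assumptions chosen to stand in for the flat, skewed, and multiple-disks formats, and `expected_delay` is a hypothetical helper name. It assumes every item occupies one slot and that requests arrive at a uniformly random time.

```python
from collections import defaultdict

def expected_delay(program, probs):
    """Expected wait, in broadcast slots, until the requested item starts.

    Assumes requests arrive at a uniformly random time within the cycle
    and that every item occupies exactly one slot.
    """
    cycle = len(program)
    positions = defaultdict(list)
    for slot, item in enumerate(program):
        positions[item].append(slot)

    total = 0.0
    for item, p in probs.items():
        slots = positions[item]
        if not slots:
            raise ValueError(f"{item!r} is never broadcast")
        # Gaps between consecutive broadcasts of the item, wrapping around the cycle.
        gaps = [((slots[(k + 1) % len(slots)] - slots[k]) % cycle) or cycle
                for k in range(len(slots))]
        # A request falls into a gap of length g with probability g/cycle
        # and then waits g/2 on average.
        total += p * sum(g * g for g in gaps) / (2 * cycle)
    return total

# Assumed access probabilities: A is the hot item.
hot_a = {"A": 0.8, "B": 0.1, "C": 0.1}
flat = expected_delay("ABC", hot_a)      # every item once per cycle
skewed = expected_delay("AABC", hot_a)   # A repeated back to back
multi = expected_delay("ABAC", hot_a)    # A repeated, evenly spaced
```

With these assumed probabilities the flat layout waits 1.5 slots on average, the skewed layout 1.4, and the evenly spaced layout 1.2, which is why spreading the repeats of a hot item is preferable to clustering them.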

  14. Message Formatting Method - Server • Algorithm • 1. Sort and classify pages by access probability • 2. Determine relative frequency of each disk (page) • 3. Partition each disk into a set of chunks • 4. Define the message format with multiple disks • Example • Relative frequencies F(T1)=1, F(T2)=2, F(T3)=4 • LCM=4 minor cycles • Major Cycle=S*LCM • Length(T3)/LCM=2 • 4 pages/cycle
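
The four steps above can be sketched as follows. This is a minimal sketch under stated assumptions: pages are already sorted into disks (steps 1 and 2 are given as input), the disk contents and frequencies in the usage lines are illustrative rather than taken from the slide, and a disk whose size is not divisible by its chunk count is rejected instead of padded. `build_broadcast` is a hypothetical name.

```python
from functools import reduce
from math import gcd

def build_broadcast(disks, rel_freqs):
    """Interleave chunks of several 'disks' into one major cycle.

    disks:     list of page lists, one per disk
    rel_freqs: relative broadcast frequency of each disk (positive integers)
    """
    # Number of minor cycles = LCM of the relative frequencies.
    max_chunks = reduce(lambda a, b: a * b // gcd(a, b), rel_freqs)

    # Step 3: split each disk into max_chunks / freq equally sized chunks.
    chunked = []
    for pages, freq in zip(disks, rel_freqs):
        num_chunks = max_chunks // freq
        if len(pages) % num_chunks:
            raise ValueError("disk size must be a multiple of its chunk count")
        size = len(pages) // num_chunks
        chunked.append([pages[k * size:(k + 1) * size] for k in range(num_chunks)])

    # Step 4: each minor cycle carries the next chunk of every disk.
    program = []
    for minor in range(max_chunks):
        for chunks in chunked:
            program.extend(chunks[minor % len(chunks)])
    return program

# Assumed example: a 1-page hot disk, a 2-page disk, and an 8-page cold disk,
# broadcast with relative frequencies 4, 2, and 1.
disks = [[1], [2, 3], [4, 5, 6, 7, 8, 9, 10, 11]]
program = build_broadcast(disks, [4, 2, 1])
```

In this assumed configuration each minor cycle holds 4 pages (1 + 1 + 2) and the major cycle holds 16, with the hot page repeated in every minor cycle.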

  15. Caching Policy at Client • Replacement Policy • Not LRU • Point 1: Caching the hottest pages is problematic. If the server considers a page hot, it broadcasts that page frequently, so caching it is not really necessary • Point 2: The server's policy is to minimize the average delay over all clients, which != local demands

  16. Caching Policy at Client • For a given item A, we need to consider • Broadcasting frequency (X) and • Local access probability (P) • Replacement in terms of • PIX (P/X) instead of LRU: evict the cached page with the lowest P/X
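
A small sketch of PIX-style replacement follows. It assumes the client knows P and X for every page in advance, and it makes one design choice that the slide does not specify: a newly fetched page is cached only if its P/X exceeds that of the lowest-ranked cached page. `PixCache` and the numbers in the usage lines are hypothetical.

```python
class PixCache:
    """Client cache that ranks pages by P/X (access probability / broadcast frequency)."""

    def __init__(self, capacity, access_prob, bcast_freq):
        self.capacity = capacity
        self.access_prob = access_prob   # P: local access probability per page
        self.bcast_freq = bcast_freq     # X: broadcast frequency per page
        self.pages = set()

    def pix(self, page):
        return self.access_prob[page] / self.bcast_freq[page]

    def access(self, page):
        """Return True on a cache hit, False when the page came from the broadcast."""
        if page in self.pages:
            return True
        if len(self.pages) < self.capacity:
            self.pages.add(page)
            return False
        # Cache is full: the eviction candidate is the page with the lowest P/X.
        victim = min(self.pages, key=self.pix)
        if self.pix(page) > self.pix(victim):
            self.pages.remove(victim)
            self.pages.add(page)
        return False

# Assumed values: A is accessed most often but is also broadcast 4x as often.
cache = PixCache(
    capacity=2,
    access_prob={"A": 0.5, "B": 0.3, "C": 0.2},
    bcast_freq={"A": 4, "B": 1, "C": 1},
)
for page in ["A", "B", "C"]:
    cache.access(page)
cached = sorted(cache.pages)
```

Here A has the highest access probability but the lowest P/X (0.125 versus 0.3 and 0.2), so it is the page that gets evicted, which matches Point 1 on slide 15: a frequently broadcast page is cheap to refetch.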

  17. Paper #2 – Data on Air: Organization and Access, TKDE 9(3), 1997

  18. Key Ideas • Disk Access – Disk Access Time • Two different measures • Latency and • Energy Consumption • Data Access Time in Data on Air • Tuning Time: Amount of time spent by a client listening to the channel → Power Consumption • Latency: Time elapsed from the moment a client requests data to the moment the download completes • Tuning time + Latency → Data Access Time

  19. Broadcast data format • A bcast is a sequence of buckets • Bucket fields: Bucket ID, idxptr, Bcastptr, Bucket type • Without an index, the client needs a full scan of the bcast • Issue • How to organize the index and where to place it • For reducing tuning time and latency

  20. Data Access with Index • 1. Client joins here • 2. Wait until the index arrives • 3. Wait until data bucket arrives • 4. Read data
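
The four access steps can be simulated to see how an index trades latency for tuning time. This is a rough sketch under simplifying assumptions: the whole index fits in one bucket at the start of the bcast, time is counted in bucket slots, and the `Bucket` fields only loosely follow the format on slide 19 (the Bcastptr field is omitted). All names are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Bucket:
    bucket_id: int
    kind: str                        # "index" or "data"
    idx_ptr: int                     # slots until the next index bucket (0 if this is it)
    key: Optional[str] = None        # data buckets: key of the stored item
    directory: Optional[dict] = None # index bucket: key -> position in the bcast

def build_bcast(keys):
    """One index bucket at position 0, followed by one data bucket per key."""
    cycle = len(keys) + 1
    directory = {k: pos for pos, k in enumerate(keys, start=1)}
    bcast = [Bucket(0, "index", 0, directory=directory)]
    for pos, k in enumerate(keys, start=1):
        bcast.append(Bucket(pos, "data", cycle - pos, key=k))
    return bcast

def access_with_index(bcast, join_at, key):
    """Return (tuning_time, latency) in bucket slots."""
    cycle = len(bcast)
    first = bcast[join_at % cycle]          # 1. join and read one bucket
    tuning = 1
    index_time = join_at + first.idx_ptr    # 2. doze until the index arrives
    index = bcast[index_time % cycle]
    if first.idx_ptr > 0:
        tuning += 1                         # reading the index bucket
    data_pos = index.directory[key]
    data_time = index_time + (data_pos - index_time) % cycle  # 3. doze until the data
    tuning += 1                             # 4. read the data bucket
    return tuning, data_time + 1 - join_at

def scan_without_index(bcast, join_at, key):
    """Listen to every bucket until the key shows up."""
    cycle = len(bcast)
    for step in range(cycle):
        if bcast[(join_at + step) % cycle].key == key:
            return step + 1, step + 1
    raise KeyError(key)

bcast = build_bcast([f"k{i}" for i in range(1, 9)])   # 8 data buckets, cycle of 9
indexed = access_with_index(bcast, join_at=1, key="k3")
scanned = scan_without_index(bcast, join_at=1, key="k3")
```

In this toy run the indexed client listens to 3 buckets but waits 12 slots, while the scanning client gets the item after 3 slots in this lucky case but would listen continuously in the worst case. The index bounds tuning time at the cost of extra latency.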

  21. Where to place Index • No Index • Single Index • (1,m) Index → What's the difference? • (1,m) indexing broadcasts the index m times per bcast, which shortens the wait for the next index and may improve performance
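
The effect of m can be explored with a simple latency model. The model below is an assumption, not something stated on the slide: the expected latency is taken as the wait for the next index, 0.5 * (Index + Data/m), plus the wait for the data within a bcast whose length has grown to Data + m * Index, 0.5 * (m * Index + Data). Sizes are in buckets, and the function names are hypothetical.

```python
from math import sqrt

def latency_1m(data, index, m):
    """Expected latency (in buckets) under the assumed (1,m) model."""
    probe_wait = 0.5 * (index + data / m)    # wait for the next index copy
    bcast_wait = 0.5 * (m * index + data)    # wait for the data after reading the index
    return probe_wait + bcast_wait

def best_m(data, index):
    """Brute-force search for the m that minimizes the modeled latency."""
    return min(range(1, data + 1), key=lambda m: latency_1m(data, index, m))

# Assumed sizes: 1000 data buckets and a 10-bucket index.
single = latency_1m(1000, 10, 1)      # single index per bcast
tuned = latency_1m(1000, 10, 10)      # index repeated 10 times
m_star = best_m(1000, 10)
closed_form = sqrt(1000 / 10)         # minimizer of the model: sqrt(Data / Index)
```

Under this model a single index gives a latency of 1010 buckets and m = 10 gives 605. Repeating the index more often than sqrt(Data/Index) times makes the bcast longer without shortening the wait enough to compensate.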

  22. How to organize • Full Duplication vs. Relevant Duplication

  23. No replication

  24. Entire Path Replication

  25. Distributed Index
