
Towards an Application-Aware Multicast Communication Framework for Computational Grids


Presentation Transcript


  1. Towards an Application-Aware Multicast Communication Framework for Computational Grids M. MAIMOUR, C. PHAM RESO/LIP, UCB Lyon ASIAN'02, Hanoi Dec 5th, 2002

  2. Computational grids [figure: the grid as seen by the application and the user, from Dorian Arnold: Netsolve Happenings]

  3. The current usage of grids
  • Mostly:
    • Database accesses, sharing, replications (DataGrid, Encyclopedia of Life Project…)
    • Distributed data mining (seti@home…)
    • Data and code transfer, massively parallel job submissions (task-farm computing)
  • Few:
    • Distributed applications (MPI…)
    • Interactive applications (DIS, HLA…), remote visualization
  WHY?

  4. WHY? End-to-end (E2E) performances are not here yet:
  • People forgot the networking side of grids: Gbit/s links do not mean E2E performance!
  • Computing resources and network resources are logically separated
  • Not scalable! Unable to adapt to new technologies and uses

  5. Visions for a grid: FROM DUMB LINKS CONNECTING COMPUTING RESOURCES TO COLLABORATIVE RESOURCES. The network can work together with the applications to provide in-network processing functions.

  6. Application-Aware Infrastructure on Grids [figure: campus/corporate sources and lab clusters attach at 100Base-TX, through active routers hosting application-aware components, to a Gbit/s core network reaching computing centers and an Internet Data Center]

  7. Application-Aware Components (AAC)
  • Based on programmable active nodes/routers
  • Customized computations on packets [figure: active code A1, A2 applied to data in transit]
  • Standardized execution environment and programming interface

  8. Interoperability with legacy routers [figure: end hosts and active nodes stack an active layer (AL) above TCP/UDP/IP, while legacy routers in between perform traditional IP routing; similar to tunnelling]

  9. Deploying new services
  • Collective/gather operations
  • Interest management, filtering (DIS, HLA)
  • On-the-fly flow adaptation (compression, layering…) for remote displays
  • Intelligent directory services
  • Distributed, hierarchical security system
  • Distributed Logistical Storage
  • Custom QoS policy

  10. Ex: Collective operations, MAX computation [figure: each AAC applies "if x < a then x = a" to the values arriving from its downstream links, so only the running maximum is forwarded upstream]
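  To make the mechanism concrete, here is a minimal Java sketch of the max reduction an AAC could perform, assuming the AAC knows how many downstream links feed it. The class and method names (MaxAggregator, onValue, forwardUpstream) are invented for illustration; this is not the TAMANOIR or DyRAM API.

```java
// Minimal sketch of the in-network MAX reduction performed at an AAC.
public class MaxAggregator {
    private final int expected;          // contributions awaited from downstream links
    private int received = 0;
    private int a = Integer.MIN_VALUE;   // running maximum, "a" as on the slide

    public MaxAggregator(int expected) {
        this.expected = expected;
    }

    /** Called for each value x arriving on a downstream link. */
    public synchronized void onValue(int x) {
        if (x < a) x = a;                // "if x < a then x = a": keep the larger value
        a = x;
        if (++received == expected) {
            forwardUpstream(a);          // one packet goes upstream instead of N
        }
    }

    private void forwardUpstream(int max) {
        System.out.println("partial MAX forwarded upstream: " + max);
    }
}
```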

  11. Ex: Wide-area interactive simulations [figure: a flight simulator with a human in the loop, an airport simulator and a flight traffic generator exchange events across the Internet/grid; AACs apply specific filters (e.g. an "only very close events" filter) and flow adaptation toward a remote display]
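  As an illustration of such interest management, the sketch below drops simulation events that fall outside a subscriber's interest radius, in the spirit of the "only very close events" filter. The event representation and all names are assumptions made for this sketch.

```java
// Hedged sketch of a proximity-based interest filter running on an AAC.
public class ProximityFilter {
    private final double cx, cy, radius;   // subscriber's area of interest

    public ProximityFilter(double cx, double cy, double radius) {
        this.cx = cx; this.cy = cy; this.radius = radius;
    }

    /** @return true if the event at (eventX, eventY) should be forwarded downstream. */
    public boolean accept(double eventX, double eventY) {
        double dx = eventX - cx, dy = eventY - cy;
        return dx * dx + dy * dy <= radius * radius;  // keep only "very close" events
    }
}
```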

  12. Deploying reliable multipoint data distribution services
  • For:
    • Database accesses, sharing, replications
    • Data and code transfer, massively parallel job submissions (task-farm computing)
    • Distributed applications (MPI…)
    • Interactive applications (DIS, HLA…)
  • Desired features: scalable, low latencies

  13. Deploying reliable multipoint data distribution services (cont.) [figure: without multicast, the sender transmits a separate copy of the data to every receiver]

  14. Deploying reliable multipoint data distribution services (cont.) [figure: with IP multicast, the sender transmits the data once and the network duplicates packets toward the receivers]

  15. DyRAM: a protocol with modular services for achieving reliability, scalability and low latencies
  • Subcast of repair packets
  • Global NACK suppression
  • Early packet loss detection
  • Local recoveries
  • Dynamic replier election
  • Accurate congestion control
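  One way to picture this modularity is as a set of services reacting to packet events, with the router dispatching each packet to every installed service. The interface and pipeline below are invented for this sketch and are not DyRAM's actual API.

```java
// Hypothetical composition of DyRAM-like modular services on an AAC.
import java.util.List;

public interface RouterService {
    default void onData(long seq) {}     // data packet from the upstream link
    default void onNack(long seq) {}     // NACK from a downstream link
    default void onRepair(long seq) {}   // repair packet being subcast
}

class ServicePipeline {
    private final List<RouterService> services;

    ServicePipeline(List<RouterService> services) {
        this.services = services;
    }

    // Every packet event is offered to every installed service.
    void data(long seq)   { services.forEach(s -> s.onData(seq)); }
    void nack(long seq)   { services.forEach(s -> s.onNack(seq)); }
    void repair(long seq) { services.forEach(s -> s.onRepair(seq)); }
}
```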

  16. Ex: Global NACK suppression [figure: several receivers NACK the lost packet data4; the AAC aggregates the NACK4s so that only one NACK is forwarded to the source]
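  A possible shape for the suppression logic, assuming one aggregated NACK per lost sequence number suffices while a repair is pending; the names (NackSuppressor, forwardToSource) are illustrative.

```java
// Sketch of the global NACK suppression service at an AAC.
import java.util.HashSet;
import java.util.Set;

public class NackSuppressor {
    // sequence numbers for which a NACK has already been forwarded upstream
    private final Set<Long> pending = new HashSet<>();

    /** Called for every NACK arriving from a downstream link. */
    public synchronized void onNack(long seq) {
        if (pending.add(seq)) {
            forwardToSource(seq);   // first NACK for this packet: forward it
        }
        // subsequent NACKs for the same seq are silently suppressed
    }

    /** Called once the repair for seq has been subcast downstream. */
    public synchronized void onRepair(long seq) {
        pending.remove(seq);        // allow a new NACK round if the repair is lost too
    }

    private void forwardToSource(long seq) {
        System.out.println("NACK" + seq + " forwarded to the source");
    }
}
```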

  17. Ex: Early lost packet detection. The repair latency can be reduced if the lost packet is requested as soon as possible. [figure: the router sees data3 then data5, detects the missing data4 and sends NACK4 toward the source itself; the receivers' later NACK4s are ignored]
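  A sketch of the detection rule, under the assumption that packets arrive in order on the upstream link so a sequence gap implies a loss; names are illustrative, not DyRAM's implementation.

```java
// Sketch of the early packet loss detection (EPLD) service: the router
// tracks the next expected sequence number and emits a NACK itself as
// soon as it observes a gap, instead of waiting for receivers' NACKs.
public class EarlyLossDetector {
    private long nextExpected = 0;

    /** Called for every data packet forwarded by the router. */
    public synchronized void onData(long seq) {
        if (seq > nextExpected) {
            // gap: packets nextExpected .. seq-1 are presumed lost
            for (long missing = nextExpected; missing < seq; missing++) {
                sendNackToSource(missing);
            }
        }
        if (seq >= nextExpected) nextExpected = seq + 1;
    }

    private void sendNackToSource(long seq) {
        System.out.println("router-generated NACK" + seq);
    }
}
```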

  18. Deploying reliable multipoint data distribution services. The AAC associated with the source can perform early processing on packets. For instance, the DyRAM protocol uses the subcast and loss detection services to reduce the end-to-end latency. In DyRAM, any receiver can be designated as a replier for a lost packet; the election service is performed by the upstream AAC on a per-packet basis. Having dynamic repliers allows for more scalability, as caching within routers is not required. An AAC associated with a tail link performs NACK aggregation, subcasting and the per-packet election of the replier. [figure: the application-aware infrastructure of slide 6, with active routers between the source, the core network and the computing centers]
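  The per-packet election could look like the following sketch: among the downstream links, any link that did not NACK the packet still holds a copy and can be elected as replier. The selection rule here (first eligible link) is a placeholder, not DyRAM's actual criterion.

```java
// Sketch of the dynamic replier election performed by an upstream AAC.
import java.util.BitSet;

public class ReplierElection {
    /**
     * @param nackers  bit i is set if downstream link i NACKed this packet
     * @param numLinks number of downstream links on this AAC
     * @return the link elected as replier, or -1 if every link lost the packet
     */
    public int electReplier(BitSet nackers, int numLinks) {
        for (int link = 0; link < numLinks; link++) {
            if (!nackers.get(link)) {
                return link;  // this receiver has the packet and can retransmit it
            }
        }
        return -1;            // no local copy: the NACK must travel upstream
    }
}
```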

  19. Local recovery & replier election. Local recoveries reduce the end-to-end delay, especially for high loss rates and a large number of receivers. [plot: 4 receivers/group, #grp = 6…24, p = 0.25]

  20. Local recovery & replier election. As the group size increases, doing the recoveries from the receivers greatly reduces the bandwidth consumption. [plot: 48 receivers distributed in g groups, #grp = 2…24]

  21. Early Packet Loss Service. EPLD is very beneficial to DyRAM. [plot: 4 receivers/group, #grp = 6…24, p = 0.25]

  22. DyRAM implementation testbed configuration
  • TAMANOIR active execution environment
  • Java 1.3.1 and a Linux 2.4 kernel
  • A set of PC receivers and 2 PC-based routers (Pentium II 400 MHz, 512 KB cache, 128 MB RAM)
  • Data packets are 4 KBytes

  23. The data path [figure: at the sender, the FTP payload (S, @IP, data) is encapsulated in an ANEP packet and redirected from the FTP port to the Tamanoir port over UDP/IP; the TAMANOIR node processes the packet and forwards it along]
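  A hedged sketch of the sender side of this data path: wrap the payload in an ANEP-style header and send it over UDP to the Tamanoir port so the active node can intercept and process it. The header layout is a simplified stand-in, not the exact ANEP wire format, and the port number and type ID are placeholders.

```java
// Simplified ANEP-style encapsulation of an application payload over UDP.
import java.net.DatagramPacket;
import java.net.DatagramSocket;
import java.net.InetAddress;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

public class AnepSender {
    static final int TAMANOIR_PORT = 9999;   // placeholder for the Tamanoir port

    public static void main(String[] args) throws Exception {
        byte[] payload = "data".getBytes(StandardCharsets.US_ASCII);
        ByteBuffer buf = ByteBuffer.allocate(8 + payload.length);
        buf.put((byte) 1);          // version (simplified header, not real ANEP)
        buf.put((byte) 0);          // flags
        buf.putShort((short) 42);   // type ID selecting the active service (assumed)
        buf.putInt(payload.length); // payload length
        buf.put(payload);

        try (DatagramSocket sock = new DatagramSocket()) {
            InetAddress node = InetAddress.getByName("127.0.0.1");
            sock.send(new DatagramPacket(buf.array(), buf.position(), node, TAMANOIR_PORT));
        }
    }
}
```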

  24. Cost of Data Packet Services [bar chart of per-packet processing costs; the figures are detailed on the next slide]

  25. Cost of Data Packet Services
  • NACK: 135 μs
  • DP: 20 μs if no sequence gap, 12 ms-17 ms otherwise (only 256 μs without timer setting)
  • Repair: 123 μs

  26. Cost of Replier Election [chart of per-NACK election costs; the figures are detailed on the next slide]

  27. Cost of Replier Election. The election is performed on the fly; its cost depends on the number of downstream links, ranging from 0.1 to 1 ms for 5 to 25 links per router.

  28. Conclusions (1)
  • Grids can be more than end-host computing resources interconnected with network links
  • High-bandwidth links are not enough to provide E2E performance for distributed, interactive applications
  • Application-aware components can be deployed to host high-value services
  • In-network processing functions can make grids more responsive to applications' needs

  29. Conclusions (2)
  • The paper shows how an efficient multipoint service can be deployed on an application-aware infrastructure
  • Simulations and experiments show that low latencies can be obtained through the combination and collaboration of light, simple services
