
Towards a Realistic Scheduling Model

Explore a comprehensive discussion on implementing parallel systems, taskgraph scheduling techniques, inter-processor communication, and the importance of abstraction in scheduling static computations. Dive into scheduling algorithms, communication bottlenecks, and classic communication models.

rstamey

Presentation Transcript


  1. Towards a Realistic Scheduling Model • Oliver Sinnen, Leonel Sousa, Frode Eika Sandnes • IEEE TPDS, Vol. 17, No. 3, pp. 263-275, 2006.

  2. Parallel processing is the oldest discipline in computer science – yet the general problem is far from solved

  3. Why is parallel processing difficult? • "The more cooks, the more mess" (Norwegian proverb: "Jo flere kokker jo mere søl") • Partitioning and transforming problems • Load balancing • Inter-processor communication • Granularity • Architecture

  4. Implementing parallel systems • Manually: MPI, PVM, Linda • Automatically: parallelising compilers (Fortran), static scheduling

  5. Taskgraph scheduling: Representing static computations

  6. Modelling computations • A = B + C • Data dependencies: A depends on B and C • Valid sequences: CBA, BCA • Invalid sequences: ABC, ACB, CAB, BAC
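
The valid and invalid sequences on this slide can be checked mechanically. The following is a minimal sketch, assuming the only dependencies are that A needs B and C; the names `DEPENDS_ON` and `is_valid` are illustrative and do not come from the presentation.

```python
from itertools import permutations

# Assumed from the slide: A = B + C, so A depends on both B and C.
DEPENDS_ON = {"A": {"B", "C"}, "B": set(), "C": set()}

def is_valid(sequence):
    """Return True if every task appears after all tasks it depends on."""
    seen = set()
    for task in sequence:
        if not DEPENDS_ON[task] <= seen:
            return False
        seen.add(task)
    return True

valid = sorted("".join(p) for p in permutations("ABC") if is_valid(p))
invalid = sorted("".join(p) for p in permutations("ABC") if not is_valid(p))
print(valid)    # ['BCA', 'CBA']
print(invalid)  # ['ABC', 'ACB', 'BAC', 'CAB']
```

Brute-force enumeration is only practical for tiny graphs like this one; it is used here purely to reproduce the slide's lists.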

  7. Another example • A = (B - C) / D • F = B + G • Data dependencies: A depends on B, C and D; F depends on B and G

  8. Scheduling

  9. Static taskgraph scheduling techniques • The scheduling process [Figure: a taskgraph with tasks A to E and communication edges c1 to c5 is allocated to processors p1 and p2, giving a schedule along a time axis]

  10. Topological sorting • Topological sorting orders the vertices of a graph such that the precedence constraints are not violated • All valid schedules represent a topological sort • Scheduling algorithms differ in how they topologically sort the graph
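
One common way to produce such an ordering is Kahn's algorithm, sketched below. This is an illustration rather than the method used in the presentation; the graph encodes the earlier example A = (B - C) / D, F = B + G, and the function name `topological_sort` is an assumption.

```python
from collections import deque

def topological_sort(successors):
    """Kahn's algorithm: repeatedly emit a vertex with no unprocessed predecessors."""
    in_degree = {v: 0 for v in successors}
    for targets in successors.values():
        for t in targets:
            in_degree[t] += 1
    ready = deque(v for v in successors if in_degree[v] == 0)
    order = []
    while ready:
        v = ready.popleft()          # the choice made here is where algorithms differ
        order.append(v)
        for t in successors[v]:
            in_degree[t] -= 1
            if in_degree[t] == 0:
                ready.append(t)
    if len(order) != len(successors):
        raise ValueError("graph contains a cycle")
    return order

# Assumed graph for A = (B - C) / D and F = B + G (edges point from input to result).
graph = {"B": ["A", "F"], "C": ["A"], "D": ["A"], "G": ["F"], "A": [], "F": []}
order = topological_sort(graph)
print(order)  # ['B', 'C', 'D', 'G', 'A', 'F']
```

The `ready` queue here is first-in, first-out. Replacing it with a priority queue keyed on some task priority would give a different, equally valid topological order, which is the sense in which scheduling algorithms differ.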

  11. The importance of abstraction • Abstraction is important to preserve generality • Too specific: float sum = 0; for (int i = 0; i < 8; i++) { sum += a[i]; } • General and flexible: float sum = sumArray(a);

  12. Communication

  13. Communication is a major bottleneck • The cost ratio of computation to communication is typically between 1:50 and 1:10,000 • Communication cost is not very dependent on data size • The interconnection network topology affects the overall time

  14. Scheduling work prior to 1995 • Assumptions: • Zero interprocessor communication costs • Fully connected processor interconnection networks

  15. Amounts of data transfer • Public transport is a good thing?

  16. Data size is not a major factor • Multiple single messages: connect, send, connect, send, connect, send • Single compound message: connect, send, send, send
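
The difference between the two patterns can be made concrete with a rough cost model. The sketch below assumes a fixed setup cost per connection and a small cost per send; the function `transfer_cost` and the numbers 100 and 1 are invented for illustration and are not measurements from the presentation.

```python
def transfer_cost(connections, sends, setup_cost=100.0, send_cost=1.0):
    """Total cost when every connection pays a fixed setup and every send a small per-message cost."""
    return connections * setup_cost + sends * send_cost

separate = transfer_cost(connections=3, sends=3)  # connect, send, repeated three times
compound = transfer_cost(connections=1, sends=3)  # connect once, then send three times
print(separate, compound)  # 303.0 103.0
```

Under these assumed costs the total is dominated by the number of connections, not by the amount of data sent.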

  17. Interconnection topology

  18. Fully connected

  19. The ring • To send something from here ... to here
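
The effect of topology on distance can be sketched by counting the links a message has to cross. The code below assumes a bidirectional ring with nodes numbered 0 to n-1, which the slide does not specify; `ring_hops` and `fully_connected_hops` are illustrative names.

```python
def ring_hops(src, dst, n):
    """Minimum number of links between two nodes on a bidirectional ring of n nodes."""
    d = abs(src - dst) % n
    return min(d, n - d)

def fully_connected_hops(src, dst):
    """Every pair of distinct nodes shares a direct link."""
    return 0 if src == dst else 1

n = 8
worst_ring = max(ring_hops(0, j, n) for j in range(n))
print(worst_ring)                  # 4
print(fully_connected_hops(0, 4))  # 1
```

On a ring the worst-case distance grows with the number of nodes, whereas in a fully connected network it stays at one link.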

  20. Interprocessor communication • Zero vs non-zero communication overheads • Direct links vs connecting nodes [Figure: shared memory, bus-based multiprocessor with P1 to P4 and RAM on a common bus; distributed memory, mesh multiprocessor with P11 to P44 in a 4 x 4 grid]

  21. Avoiding communication overheads

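  22. Duplication [Figure: task a (cost 1) feeds tasks b and c (cost 1 each) over edges of cost 1. Allocated to p1 and p2 without duplication, the tasks occupy time slots t=1, t=2 and t=3; with a duplicated on both processors, b and c both run at t=2]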

  23. When considering communication overheads

  24. Classic communication model: Assumptions • Local communications have zero communication costs • Communication is conducted by a dedicated subsystem • Communications can be performed concurrently • The network is fully connected
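
Under these assumptions the length of a schedule can be computed directly. The sketch below is a simplified illustration, not the authors' formulation: it assumes a fork-shaped taskgraph in which task a feeds b and c, with unit computation and communication costs, and `makespan`, `WEIGHTS` and `EDGES` are hypothetical names. Listing a task on more than one processor models duplication.

```python
WEIGHTS = {"a": 1, "b": 1, "c": 1}          # assumed computation costs
EDGES = {("a", "b"): 1, ("a", "c"): 1}      # assumed communication costs

def makespan(weights, edges, placements):
    """Schedule length under the classic model.

    placements is a list of (task, processor) pairs in precedence order.
    """
    preds = {}
    for (src, dst), cost in edges.items():
        preds.setdefault(dst, []).append((src, cost))
    finish = {}      # (task, processor) -> finish time
    proc_free = {}   # processor -> time it becomes idle
    for task, proc in placements:
        ready = 0
        for src, cost in preds.get(task, []):
            # Earliest arrival over all copies of the predecessor placed so far:
            # local data costs nothing, remote data pays the edge cost.
            ready = max(ready, min(
                t + (0 if q == proc else cost)
                for (s, q), t in finish.items() if s == src))
        start = max(ready, proc_free.get(proc, 0))
        finish[(task, proc)] = proc_free[proc] = start + weights[task]
    return max(finish.values())

one_proc = makespan(WEIGHTS, EDGES, [("a", "p1"), ("b", "p1"), ("c", "p1")])
two_procs = makespan(WEIGHTS, EDGES, [("a", "p1"), ("b", "p1"), ("c", "p2")])
duplicated = makespan(WEIGHTS, EDGES,
                      [("a", "p1"), ("a", "p2"), ("b", "p1"), ("c", "p2")])
print(one_proc, two_procs, duplicated)  # 3 3 2
```

With these assumed costs, moving c to a second processor gains nothing because the communication delay cancels the parallelism, while duplicating a on both processors removes the remote communication and shortens the schedule.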

  25. Implications • Network contention (not modelled) • Tasks compete for communication resources • Contention can be modelled: • Different types of edges • Switch vertices (in addition to processor vertices)

  26. Processor involvement in communication I • Two-sided involvement (TCP/IP PC cluster)

  27. Processor involvement in communication II • One-sided involvement (shared memory, Cray T3E)

  28. Processor involvement in communication III • Third-party involvement (dedicated DMA hardware, Meiko CS-2)

  29. Problems • All classic scheduling models assume third-party involvement • Very little hardware is equipped with dedicated support for third-party involvement • Estimated finish times for tasks are hugely inaccurate • Scheduling algorithms are very sub-optimal
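
A toy calculation suggests why finish-time estimates suffer when the processor itself has to take part in communication. The sketch below is a deliberate simplification and not the model from the paper: it assumes a processor runs a chain of tasks, each followed by one outgoing message, and that an involved processor is fully occupied for the whole duration of each message. The function name and the 1:50 cost figures are illustrative.

```python
def processor_busy_time(num_tasks, task_cost, comm_cost, processor_involved):
    """Time a processor is occupied when each task is followed by one outgoing message."""
    per_task = task_cost + (comm_cost if processor_involved else 0)
    return num_tasks * per_task

# Assumed 1:50 computation-to-communication ratio, four tasks.
third_party = processor_busy_time(4, 1, 50, processor_involved=False)
involved = processor_busy_time(4, 1, 50, processor_involved=True)
print(third_party, involved)  # 4 204
```

A scheduler that assumes third-party involvement would expect the processor to be free after 4 time units, while under this simplified involvement assumption it stays busy for 204.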

  30. Even more problems

  31. Results [Figure: results for T3E-900, bobcat and Sun E3500]

  32. The End
