
Overview


Presentation Transcript


  1. Overview • Why multiprocessors? • The structure of multiprocessors. • Elements of multiprocessors: • Processing elements. • Memory. • Interconnect.

  2. Why multiprocessing? • True parallelism: • Task level. • Data level. • May be necessary to meet real-time requirements.

  3. Multiprocessing and real time • Faster-rate processes are isolated on processors. • Specialized memory system as well. • Slower-rate processes are shared on a processor (or processor pool). [Figure: the print engine runs on its own CPU and memory, while file read, rendering, etc. share another CPU and memory.]

  4. Heterogeneous multiprocessors • Will often have a heterogeneous structure. • Different types of PEs. • Specialized memory structure. • Specialized interconnect.

  5. Multiprocessor system-on-chip • Multiple processors. • CPUs, DSPs, etc. • Hardwired blocks. • Mixed-signal. • Custom memory system. • Lots of software.

  6. System-on-chip applications • Sophisticated markets: • High volume. • Demanding performance, power requirements. • Strict price restrictions. • Often standards-driven. • Examples: • Communications. • Multimedia. • Networking.

  7. Terminology • PE: processing element. • Interconnection network: may require more than one clock cycle to transfer data. • Message: address+data packet.
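
A minimal C sketch of the last term above, assuming an invented layout: a message pairs a destination address with a data payload (the struct and field names are illustrative, not from the slides).

    /* Illustrative message layout: destination address plus data payload. */
    #include <stdint.h>

    struct message {
        uint32_t dest_addr;   /* address of the target memory block or PE */
        uint32_t payload[4];  /* data carried across the network */
    };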

  8. Generic multiprocessor [Figure: two organizations. Shared memory: the PEs access a shared pool of memory blocks through an interconnection network. Message passing: each PE has its own local memory, and PEs exchange messages through an interconnection network.]

  9. Shared memory vs. message passing • Shared memory and message passing are functionally equivalent. • Different programming models: • Shared memory more like uniprocessor. • Message passing good for streaming. • May have different implementation costs: • Interconnection network.
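
A hedged C sketch of the two programming models for a producer handing one value to a consumer; shared_buf, ready, and msg_send are invented names, not an API from the slides.

    #include <stdint.h>

    /* Shared-memory style: both PEs touch the same locations. */
    volatile uint32_t shared_buf;
    volatile int ready;

    void producer_shared(uint32_t v) {
        shared_buf = v;    /* write the value into shared memory */
        ready = 1;         /* consumer polls 'ready' before reading */
    }

    /* Message-passing style: the value travels explicitly through the
       interconnect; msg_send() stands in for whatever primitive the
       platform provides. */
    void msg_send(int dest_pe, const void *data, unsigned len);

    void producer_msg(uint32_t v) {
        msg_send(1, &v, sizeof v);   /* send to PE 1 */
    }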

  10. Shared memory implementation • Memory blocks are in address space. • Memory interface sends messages through network to addressed memory block.

  11. Message passing implementation • Program provides processor address, data/parameters. • Usually through an API. • Packet interface appears as an I/O device. • Packet routed through network to interface. • Recipient must decode parameters to determine how to handle the message.
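
A sketch of how such an API call might drive a memory-mapped packet interface. The base address, register offsets, and names are invented for illustration; a real interface would differ.

    #include <stdint.h>
    #include <stddef.h>

    /* Hypothetical memory-mapped registers of the packet interface. */
    #define PKT_IF_BASE  0x40000000u
    #define PKT_DEST     (*(volatile uint32_t *)(PKT_IF_BASE + 0x0))
    #define PKT_DATA     (*(volatile uint32_t *)(PKT_IF_BASE + 0x4))
    #define PKT_SEND     (*(volatile uint32_t *)(PKT_IF_BASE + 0x8))

    /* The program supplies the destination processor and the parameters;
       the interface routes the resulting packet(s) through the network. */
    void pkt_send(uint32_t dest_pe, const uint32_t *words, size_t n) {
        PKT_DEST = dest_pe;
        for (size_t i = 0; i < n; i++)
            PKT_DATA = words[i];   /* stream payload into the interface */
        PKT_SEND = 1;              /* start transmission */
    }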

  12. Processing element selection • What tasks run on what PEs? • Some tasks may be duplicated (e.g., HDTV motion estimation). • Some processors may run different tasks. • How does the load change? • Static vs. dynamic task allocation.

  13. Matching PEs to tasks • Factors: • Word size. • Operand types. • Performance. • Energy/power consumption. • Hardwired function units: • Performance. • Interface.

  14. Task allocation • Tasks may be created at: • Design time (video encoder). • Run time (user interface). • Tasks may be assigned to processing elements at: • Design time (predictable load). • Run time (varying load).
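
A small C sketch contrasting the two allocation times; the task names and the least-loaded heuristic are assumptions for illustration only.

    #define NUM_PES 4

    /* Design-time allocation: the task-to-PE mapping is a fixed table. */
    static const int static_assignment[] = {
        0,   /* task 0: video encoder  -> PE 0 */
        1,   /* task 1: audio codec    -> PE 1 */
        2,   /* task 2: user interface -> PE 2 */
    };

    /* Run-time allocation: when a task is created, pick the least-loaded PE. */
    static unsigned pe_load[NUM_PES];

    int assign_dynamic(void) {
        int best = 0;
        for (int pe = 1; pe < NUM_PES; pe++)
            if (pe_load[pe] < pe_load[best])
                best = pe;
        pe_load[best]++;       /* account for the new task */
        return best;
    }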

  15. Memory system design • Uniform vs. heterogeneous memory system. • Power consumption. • Cost. • Programming difficulty. • Caches: • Memory consistency.

  16. Parallel memory systems • True concurrency: several memory blocks can operate simultaneously. [Figure: PEs connected through an interconnection network to several memory blocks.]

  17. Cache consistency • Problem: caches hide memory updates. • Solution: have caches snoop changes. [Figure: two PEs, each with its own cache, sharing memories across the network.]
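
A C sketch of the problem on a platform without snooping hardware, where software must flush and invalidate caches by hand; cache_flush() and cache_invalidate() are placeholders for platform-specific calls, not a real API.

    #include <stdint.h>
    #include <stddef.h>

    /* Placeholders for platform cache-maintenance operations. */
    void cache_flush(const void *addr, size_t len);
    void cache_invalidate(const void *addr, size_t len);

    volatile uint32_t frame_count;   /* shared between two PEs */

    void writer_pe(void) {
        frame_count++;   /* without coherence, the update hides in this PE's cache */
        cache_flush((const void *)&frame_count, sizeof frame_count);
    }

    uint32_t reader_pe(void) {
        cache_invalidate((const void *)&frame_count, sizeof frame_count);
        return frame_count;   /* now reads the fresh value from memory */
    }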

  18. Cache consistency and tasks • Traditional scientific computing maps a single task onto multiple PEs. • Embedded computing maps different tasks onto multiple PEs. • May be producer/consumer. • Not all of the memory may need to be consistent.

  19. Network topologies • Major choices: • Bus. • Crossbar. • Buffered crossbar. • Mesh. • Application-specific.

  20. Bus network • Advantages: • Well-understood. • Easy to program. • Many standards. • Disadvantages: • Contention. • Significant capacitive load.

  21. Crossbar • Advantages: • No contention. • Simple design. • Disadvantages: • Not feasible for large numbers of ports.

  22. Buffered crossbar • Advantages: • Smaller than crossbar. • Can achieve high utilization. • Disadvantages: • Requires scheduling.

  23. Mesh • Advantages: • Well-understood. • Regular architecture. • Disadvantages: • Poor utilization.

  24. Application-specific • Advantages: • Higher utilization. • Lower power. • Disadvantages: • Must be designed. • Must carefully allocate data.

  25. TI OMAP • OMAP 5910: • Targets communications, multimedia. • Multiprocessor with DSP, RISC. [Figure: OMAP 5910 block diagram: C55x DSP and ARM9 CPU linked by an MPU interface and bridge, with MMU, I/O, system DMA controller, and memory controller.]

  26. RTOS for multiprocessors • Issues: • Multiprocessor communication primitives. • Scheduling policies. • Task scheduling is considerably harder with true concurrency.
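
A sketch of one possible communication primitive: a spinlock-protected message queue in shared memory. spin_lock()/spin_unlock() stand in for an atomic operation the hardware or RTOS would supply, and all names are invented for illustration.

    #include <stdint.h>

    #define QSIZE 8

    struct ipc_queue {
        volatile uint32_t lock;        /* spinlock word in shared memory */
        volatile uint32_t head, tail;
        uint32_t slots[QSIZE];
    };

    /* Stand-ins for an atomic test-and-set provided by hardware or the RTOS. */
    void spin_lock(volatile uint32_t *l);
    void spin_unlock(volatile uint32_t *l);

    int ipc_send(struct ipc_queue *q, uint32_t msg) {
        int ok = 0;
        spin_lock(&q->lock);
        if ((q->head + 1) % QSIZE != q->tail) {   /* room in the queue? */
            q->slots[q->head] = msg;
            q->head = (q->head + 1) % QSIZE;
            ok = 1;
        }
        spin_unlock(&q->lock);
        return ok;
    }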

  27. Distributed system performance • Longest-path algorithms don’t work under preemption. • Several algorithms unroll the schedule to the length of the least common multiple of the periods: • produces a very long schedule; • doesn’t work for non-fixed periods. • Schedules based on upper bounds may give inaccurate results.
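
A small C sketch of why unrolling gets long: the unrolled schedule must cover the hyperperiod, the least common multiple of the task periods. With the periods from the later example (150, 70, 110), the hyperperiod is already 11550 time units.

    /* Hyperperiod = LCM of the task periods; an unrolled schedule spans it. */
    static unsigned gcd(unsigned a, unsigned b) {
        while (b) { unsigned t = a % b; a = b; b = t; }
        return a;
    }

    static unsigned lcm(unsigned a, unsigned b) {
        return a / gcd(a, b) * b;
    }

    unsigned hyperperiod(const unsigned *periods, unsigned n) {
        unsigned h = 1;
        for (unsigned i = 0; i < n; i++)
            h = lcm(h, periods[i]);
        return h;
    }
    /* hyperperiod((unsigned[]){150, 70, 110}, 3) == 11550 */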

  28. Data dependencies help • P3 cannot preempt both P1 and P2. • P1 cannot preempt P2. [Figure: task graph relating P1, P2, and P3.]

  29. Preemptive execution hurts • Worst combination of events for P5’s response time: • P2 of higher priority; • P2 initiated before P4; • causes P5 to wait for P2 and P3. • Independent tasks can interfere, so longest-path algorithms can’t be used. [Figure: processes P1 through P5 mapped onto processors M1, M2, and M3.]

  30. Period shifting example • Process CPU times: P1 = 30, P2 = 10, P3 = 30, P4 = 20. • Task periods: t1 = 150, t2 = 70, t3 = 110. • P2 delayed on CPU 1; data dependency delays P3; priority delays P4. Worst-case t3 delay is 80, not 50. [Figure: Gantt chart of the schedule, with P1 and P2 on CPU 1 and P3 and P4 on CPU 2.]
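
One way to read these numbers (an assumption; the slide does not spell out the accounting): in isolation P3 followed by P4 takes 30 + 20 = 50 time units, but when P1 (30 units) delays P2 on CPU 1, the P2-to-P3 data dependency shifts P3 by those 30 units and the lower-priority P4 slips with it, giving 50 + 30 = 80.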
