Presentation Transcript


  1. Operating Systems Should Manage Accelerators Sankaralingam Panneerselvam Michael M. Swift Computer Sciences Department University of Wisconsin, Madison, WI HotPar 2012

  2. Accelerators • Specialized execution units for common tasks • Data-parallel tasks, encryption, video encoding, XML parsing, network processing, etc. • Tasks are offloaded to accelerators • High performance, energy efficiency, or both [Figure: example accelerators - APU, GPU, DySER, crypto accelerator, IBM wire-speed processor] HotPar 2012

  3. Why accelerators? • Moore's law and Dennard scaling • Single core to multi-core • Dark silicon: more transistors but a limited power budget [Esmaeilzadeh et al. - ISCA '11] • Accelerators • Trade off area for specialized logic • Performance up by 250x and energy efficiency by 500x [Hameed et al. - ISCA '10] • Heterogeneous architectures HotPar 2012

  4. Heterogeneity everywhere • Focus on diverse heterogeneous systems • Processing elements with different properties in the same system • Our work • How can OS support this architecture? • Abstract heterogeneity • Enable flexible task execution • Multiplex shared accelerators among applications HotPar 2012

  5. Outline • Motivation • Classification of accelerators • Challenges • Our model - Rinnegan • Conclusion HotPar 2012

  6. IBM wire-speed processor [Figure: chip diagram showing CPU cores, the accelerator complex, and the memory controller] HotPar 2012

  7. Classification of accelerators • Accelerators are classified by how they are accessed • Acceleration devices • Co-processors • Asymmetric cores HotPar 2012

  8. Classification 1) Applications must deal with different classes of accelerators that have different access mechanisms 2) Resources must be shared among multiple applications HotPar 2012

  9. Challenges • Task invocation • Execution on different processing elements • Virtualization • Sharing across multiple applications • Scheduling • Time multiplexing the accelerators HotPar 2012

  10. Task invocation • Current systems decide statically where to execute a task • Problems where the system can help the programmer • Data granularity • Power budget limitations • Presence/availability of an accelerator [Figure: an AES task dispatched to a crypto accelerator] HotPar 2012
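
    The placement decision this slide describes can be made concrete. Below is a minimal C++ sketch, assuming a hypothetical descriptor (AcceleratorInfo) and helper (choose_target) invented here rather than taken from Rinnegan, of how data granularity, the power budget, and accelerator availability could feed a single decision:

        #include <cstddef>
        #include <cstdio>

        enum class Target { CpuAesNi, CryptoAccelerator };

        struct AcceleratorInfo {
            bool   present;           // is a crypto accelerator in this system?
            bool   available;         // is it currently free enough to accept work?
            double power_headroom_w;  // watts left in the system power budget
            double offload_cost_w;    // estimated extra draw of one offloaded task
        };

        // Pick an execution target for an encryption request of `bytes` length.
        // Small requests stay on the CPU (offload latency dominates); large
        // requests go to the accelerator only if it exists, is available, and
        // fits in the remaining power budget.
        Target choose_target(std::size_t bytes, const AcceleratorInfo& acc) {
            constexpr std::size_t kOffloadThreshold = 64 * 1024;  // tuning knob
            if (bytes < kOffloadThreshold)                 return Target::CpuAesNi;
            if (!acc.present || !acc.available)            return Target::CpuAesNi;
            if (acc.offload_cost_w > acc.power_headroom_w) return Target::CpuAesNi;
            return Target::CryptoAccelerator;
        }

        int main() {
            AcceleratorInfo acc{true, true, /*headroom*/ 5.0, /*cost*/ 2.0};
            bool offload = choose_target(1 << 20, acc) == Target::CryptoAccelerator;
            std::printf("%s\n", offload ? "offload to crypto accelerator"
                                        : "run on CPU with AES instructions");
            return 0;
        }

    The threshold constant marks where offload overheads are amortized; in a real system the power numbers would come from the monitor rather than being hard-coded.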

  11. Virtualization • Data addressing for accelerators • Accelerators need to understand virtual addresses • OS must provide virtual-address translations • Process preemption after launching a computation • System must be aware of computation completion [Figure: a process's virtual memory shared between CPU cores and accelerator units (crypto, XML)] HotPar 2012
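
    As a toy illustration of what "accelerators need to understand virtual addresses" amounts to, the sketch below shows a per-process translation table that an agent could maintain for an accelerator without its own MMU. The class and its methods are assumptions made for this example (a real system would rely on an IOMMU or shared page tables), not Rinnegan's mechanism:

        #include <cstdint>
        #include <optional>
        #include <unordered_map>

        // Toy per-process translation table an agent might keep for an
        // accelerator that cannot walk page tables itself.
        class AcceleratorAddressSpace {
        public:
            static constexpr std::uint64_t kPageSize = 4096;

            // The OS installs a mapping when the process registers a buffer.
            void map(std::uint64_t virt_page, std::uint64_t phys_page) {
                table_[virt_page] = phys_page;
            }

            // The accelerator (via its agent) translates before touching memory.
            std::optional<std::uint64_t> translate(std::uint64_t vaddr) const {
                auto it = table_.find(vaddr / kPageSize);
                if (it == table_.end()) return std::nullopt;  // fault: ask the OS
                return it->second * kPageSize + vaddr % kPageSize;
            }

        private:
            std::unordered_map<std::uint64_t, std::uint64_t> table_;
        };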

  12. Scheduling • Scheduling needed to prioritize access to shared accelerators • User-mode access to accelerators complicates scheduling decisions • OS cannot interpose on every request HotPar 2012

  13. Outline • Motivation • Classification of Accelerators • Challenges • Our model - Rinnegan • Conclusion HotPar 2012

  14. Rinnegan • Goals • Applications leverage the heterogeneity • The OS enforces system-wide policies • Rinnegan provides support for different classes of accelerators • Flexible task execution through the accelerator stub • An accelerator agent for safe multiplexing • System policy enforced through the accelerator monitor HotPar 2012

  15. Accelerator stub • Entry point for accelerator invocation and a dispatcher • Abstracts different processing elements • Binding: selecting the best implementation for the task • Static: during compilation • Early: at process start • Late: at call time [Figure: a process calls the stub, which either encrypts through AES instructions or offloads to the crypto accelerator] HotPar 2012
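
    A minimal sketch of the late-binding case, with placeholder implementations (encrypt_aesni, encrypt_crypto_accel) and a made-up availability query; none of these names are Rinnegan's API. The point is only that the stub is the single entry point and chooses an implementation at each call:

        #include <cstdint>
        #include <vector>

        using Buffer = std::vector<std::uint8_t>;

        // Placeholders: a real stub would use AES instructions on the CPU or
        // submit the buffer to the crypto accelerator's agent.
        Buffer encrypt_aesni(const Buffer& in)        { return in; }
        Buffer encrypt_crypto_accel(const Buffer& in) { return in; }

        // Hypothetical query answered by the OS/agent at call time.
        bool crypto_accel_available() { return false; }

        // Late binding: the application always calls crypt_stub(); the stub
        // decides at each invocation which processing element runs the task.
        Buffer crypt_stub(const Buffer& input) {
            if (crypto_accel_available())
                return encrypt_crypto_accel(input);  // offload to crypto accelerator
            return encrypt_aesni(input);             // encrypt through AES instructions
        }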

  16. Accelerator agent • Manages an accelerator • Roles of an agent • Virtualize the accelerator • Forward requests to the accelerator • Implement scheduling decisions • Expose accounting information to the OS [Figure: user-space processes call stubs; kernel-space accelerator agents (device driver, thread migration, resource allocator) sit in front of the hardware (accelerator device, asymmetric core); a process can also communicate directly with the accelerator] HotPar 2012
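
    The agent's four roles read naturally as an interface. The sketch below is illustrative only: the class and method names are invented, and the real agent would live in kernel space (for example inside a device driver or next to the scheduler) rather than as a user-level C++ class:

        #include <cstddef>
        #include <cstdint>

        struct TaskRequest { int process_id; const void* data; std::size_t len; };
        struct UsageStats  { std::uint64_t busy_cycles; std::uint64_t queued_tasks; };

        class AcceleratorAgent {
        public:
            virtual ~AcceleratorAgent() = default;
            // Virtualize: give each process its own context/handle on the device.
            virtual int  open_context(int process_id) = 0;
            // Forward: push a request from a stub down to the hardware (through a
            // device driver or thread migration, depending on the accelerator class).
            virtual void submit(int context, const TaskRequest& req) = 0;
            // Schedule: apply the share or priority the OS assigned to this context.
            virtual void set_share(int context, double fraction) = 0;
            // Account: expose utilization so the OS and the monitor can decide.
            virtual UsageStats stats() const = 0;
        };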

  17. Accelerator monitor • Monitors information such as resource utilization • Responsible for system-wide energy and performance goals • Acts as an online modeling tool • Helps decide where to schedule a task [Figure: the accelerator monitor sits in kernel space alongside the accelerator agent, between the user-space stub and the accelerator hardware] HotPar 2012
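
    A small sketch of the monitor's bookkeeping, under assumed data structures and an assumed 80% utilization threshold (neither comes from the paper): agents report load, and stubs ask whether an offload fits the current utilization and power picture:

        #include <string>
        #include <unordered_map>

        struct DeviceLoad {
            double utilization;  // 0.0 .. 1.0, reported by the device's agent
            double watts;        // current power-draw estimate
        };

        class AcceleratorMonitor {
        public:
            // Agents push periodic updates here.
            void report(const std::string& device, DeviceLoad load) {
                loads_[device] = load;
            }
            // Simple online model: recommend offload only if the device has spare
            // capacity and the whole system stays under its power cap.
            bool recommend_offload(const std::string& device, double extra_watts,
                                   double system_power_cap, double system_watts) const {
                auto it = loads_.find(device);
                if (it == loads_.end()) return false;  // unknown device: stay on CPU
                return it->second.utilization < 0.8 &&
                       system_watts + extra_watts <= system_power_cap;
            }
        private:
            std::unordered_map<std::string, DeviceLoad> loads_;
        };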

  18. Programming model • Based on task-level parallelism • Task creation and execution • Write: different implementations of the task • Launch: similar to a function invocation • Fetching the result: synchronous and asynchronous invocation are both supported • Example: Intel TBB/Cilk

        application() {
            task = cryptStub(inputData, outputData);
            ...
            waitFor(task);
        }

        TaskHandle cryptStub(input, output) {
            if (factor1)                    // e.g., AES instructions are the better fit
                output = implmn_AES(input);
            else if (factor2)               // e.g., the crypto accelerator is the better fit
                output = implmn_CACC(input);
        }

  HotPar 2012
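
    For a compilable analogue of the pseudocode above, the sketch below uses std::future as a stand-in for the task handle and std::async as a stand-in for Rinnegan's launch mechanism; the availability check and both implementations are placeholders, not the paper's runtime:

        #include <future>
        #include <string>

        // Placeholders standing in for the two code paths.
        std::string implmn_AES(const std::string& in)  { return in; }  // AES instructions
        std::string implmn_CACC(const std::string& in) { return in; }  // crypto accelerator
        bool crypto_accel_available() { return false; }                // assumed query

        // Launch: the stub returns a handle immediately (asynchronous invocation).
        std::future<std::string> cryptStub(const std::string& input) {
            return std::async(std::launch::async, [input] {
                return crypto_accel_available() ? implmn_CACC(input)
                                                : implmn_AES(input);
            });
        }

        int main() {
            auto task = cryptStub("plaintext");
            // ... the application can overlap other work here ...
            std::string output = task.get();  // fetching the result: waitFor(task)
            return output.empty();
        }

    Here task.get() plays the role of waitFor(task): the caller overlaps other work and only blocks when the result is actually needed.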

  19. Summary [Figure: overall architecture - a user-space process invokes an accelerator through its stub; in kernel space the accelerator agent and accelerator monitor cooperate with other kernel components and the CPU cores; the process may also communicate directly with the accelerator] HotPar 2012

  20. Related work • Task invocation based on data granularity • Merge, Harmony • Generating device-specific implementations of the same task • GPUOcelot • Resource scheduling: sharing between multiple applications • PTask, Pegasus HotPar 2012

  21. Conclusions • Accelerators will be common in future systems • Rinnegan embraces accelerators through its stub, agent, and monitor components • Other challenges • Data movement between devices • Power measurement HotPar 2012

  22. Thank You HotPar 2012
