
Progress Report




  1. Progress Report 2013/11/07

  2. Outline • Further studies about heterogeneous multiprocessing other than ARM • Cache miss issue • Discussion on task scheduling

  3. Manufacturers Other than ARM • Qualcomm • aSMP (Asynchronous Symmetric Multi-Processing) • Krait: • Per-core DCVS (Dynamic Clock and Voltage Scaling). • A core that is not being used can be completely power-collapsed independently. • Reduces the need for hypervisors or more complex software management of disparate cores.

  4. Manufacturers Other than ARM • Nvidia • vSMP (Variable Symmetric Multiprocessing) • Tegra 3 • 4 high-performance Cortex-A9 main processors + 1 energy-efficient Cortex-A9 companion processor. • The companion processor and the main processors cannot be active simultaneously. • The main processors have to run at the same frequency.

  5. HSA Foundation

  6. Cache Miss Issue • “For each switch between big (A15) and LITTLE (A7), the L2 cache is cleaned, which causes memory access overhead.”

  7. Cache Miss Issue • Unless a whole cluster (all A15 or all A7) is shut down, cleaning the L2 cache on every switch between A15 and A7 is strange. [Diagram: two clusters of mixed A15/A7 cores, each core with its own L1 cache and each cluster sharing one L2 cache.]

  8. Task Scheduling • Take the loading of each task into consideration. • For a given task, assume its behavior is: • Computation ops: n time units. • Memory ops: 1 time unit. • Different core frequencies give different loadings: • F = 1: loading = n/(n+1) • F = 2: loading = n/(n+2) • F = 4: loading = n/(n+4)
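The loading formula on this slide can be sketched in Python. The model assumes (as the slide implies) that computation time scales down with frequency F while the single memory time unit does not, giving loading = n/(n+F); the function name and the sample value of n are hypothetical.

```python
def loading(n, f):
    """Loading of a task with n computation time units and 1 memory
    time unit on a core at relative frequency f.

    At frequency f, computation shrinks to n/f time units while the
    memory op still takes 1 time unit, so:
        loading = (n/f) / (n/f + 1) = n / (n + f)
    """
    return n / (n + f)

# Reproduce the slide's three cases for a task with n = 8:
n = 8
print(loading(n, 1))  # n/(n+1) = 8/9
print(loading(n, 2))  # n/(n+2) = 8/10
print(loading(n, 4))  # n/(n+4) = 8/12
```

Doubling the frequency does not halve the loading, because the memory operation is a fixed cost; this is why the slide tracks loading per frequency rather than assuming linear scaling.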

  9. Single Core • For a given set of tasks and their behaviors, find the minimum frequency such that the loading of the core is 100%. • Lower frequency: loading = 100%, but performance decreases. • Higher frequency: loading < 100%, but more (dynamic) power is consumed.
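A minimal sketch of the single-core frequency search, assuming the core's total loading is the sum of per-task loadings n/(n+F) and that frequencies come from a discrete candidate set; the function names, candidate frequencies, and sample task set are all assumptions, not from the slides.

```python
def total_loading(tasks, f):
    """Total core loading at frequency f, for tasks given as their
    computation-unit counts n (each task also has 1 memory unit)."""
    return sum(n / (n + f) for n in tasks)

def min_frequency(tasks, candidates=(1, 2, 4, 8, 16)):
    """Lowest candidate frequency keeping the core's loading <= 100%.
    Any lower frequency saturates the core (performance drops); any
    higher one leaves idle cycles and wastes dynamic power."""
    for f in sorted(candidates):
        if total_loading(tasks, f) <= 1.0:
            return f
    return max(candidates)  # even the top frequency is overloaded

tasks = [2, 3]  # two tasks: 2 and 3 computation units respectively
print(min_frequency(tasks))  # 4: f=1 gives 1.42, f=2 gives 1.10, f=4 gives 0.76
```

This makes the slide's trade-off concrete: the chosen frequency is the one where loading first dips to or below 100%.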

  10. Scheduling on HMP • According to each core's capability, assign processes in the runqueue to cores. • Each core applies DVFS/DCVS individually. • However, this does not apply to big.LITTLE. • Each (pair of) cores is homogeneous.

  11. big.LITTLE Core Scheduling • Assume that we have n pairs of big.LITTLE cores. • Initially, all pairs use the LITTLE core. • Assume we know the following information about a task Tk: • Task deadline. • Estimated execution time on the big core. • Estimated execution time on the LITTLE core.

  12. Heuristic Mentioned Last Time • First, we define an “urgency” U to indicate the priority of a task. • For task Tk: • If 0 < Uk ≤ 1, then task Tk can be finished before its deadline on a LITTLE core. • If Uk > 1, then task Tk can’t be finished before its deadline on a LITTLE core.
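The slide describes what the urgency values mean but does not give the formula. One definition consistent with it, offered purely as an assumption, is the ratio of the estimated LITTLE-core execution time to the time remaining until the deadline: U ≤ 1 exactly when the task still fits on a LITTLE core.

```python
def urgency(exec_time_little, deadline, now=0.0):
    """Hypothetical urgency: U = exec_time_LITTLE / time_to_deadline.

    U <= 1 means the task can still finish on a LITTLE core before its
    deadline; U > 1 means it cannot, matching the slide's reading.
    """
    time_left = deadline - now
    if time_left <= 0:
        return float("inf")  # deadline already passed
    return exec_time_little / time_left

print(urgency(5.0, 10.0))   # 0.5: finishes comfortably on LITTLE
print(urgency(12.0, 10.0))  # 1.2: needs a big core
```

Any monotone function of slack would serve the same role; the ratio form is used here only because it lands on the slide's 0/1 boundary directly.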

  13. Core Switching • Switch one LITTLE core to a big core if there exists a task Tk with urgency Uk > 1. • Find all tasks {Tj with Uj > 0.8} and assign these tasks to big cores. • Switch big cores back to LITTLE cores if there is no task with urgency greater than 0.8.
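The three switching rules above can be sketched as one decision function. The 1.0 and 0.8 thresholds come from the slide; the task representation (a mapping from task id to urgency) and the returned plan structure are assumptions for illustration.

```python
def plan_switches(urgencies, threshold=0.8):
    """Apply the slide's core-switching heuristic.

    urgencies: dict mapping task id -> urgency U.
    Rule 1: switch a LITTLE core to big if any task has U > 1.
    Rule 2: every task with U > 0.8 is assigned to big cores.
    Rule 3: switch big cores back to LITTLE when no task exceeds 0.8.
    """
    need_big = any(u > 1.0 for u in urgencies.values())
    big_tasks = [t for t, u in urgencies.items() if u > threshold]
    return {
        "switch_to_big": need_big,
        "tasks_for_big": big_tasks,
        "switch_back_to_little": not big_tasks,
    }

plan = plan_switches({"T1": 0.5, "T2": 0.9, "T3": 1.2})
print(plan["switch_to_big"])   # True: T3 has U > 1
print(plan["tasks_for_big"])   # T2 and T3 exceed the 0.8 threshold
```

Note the hysteresis in the heuristic: a big core is powered up only past U = 1 but is not released until all tasks drop below 0.8, which avoids rapid back-and-forth switching (and the L2-cleaning overhead from slide 6) near the boundary.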

  14. Discussion
