
Implement high-level parallel API in JDK


Presentation Transcript


  1. Richard Ning – Enterprise Developer 1st June 2013 Implement high-level parallel API in JDK

  2. Important Disclaimers THE INFORMATION CONTAINED IN THIS PRESENTATION IS PROVIDED FOR INFORMATIONAL PURPOSES ONLY. WHILST EFFORTS WERE MADE TO VERIFY THE COMPLETENESS AND ACCURACY OF THE INFORMATION CONTAINED IN THIS PRESENTATION, IT IS PROVIDED “AS IS”, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED. ALL PERFORMANCE DATA INCLUDED IN THIS PRESENTATION HAVE BEEN GATHERED IN A CONTROLLED ENVIRONMENT. YOUR OWN TEST RESULTS MAY VARY BASED ON HARDWARE, SOFTWARE OR INFRASTRUCTURE DIFFERENCES. ALL DATA INCLUDED IN THIS PRESENTATION ARE MEANT TO BE USED ONLY AS A GUIDE. IN ADDITION, THE INFORMATION CONTAINED IN THIS PRESENTATION IS BASED ON IBM’S CURRENT PRODUCT PLANS AND STRATEGY, WHICH ARE SUBJECT TO CHANGE BY IBM WITHOUT NOTICE. IBM AND ITS AFFILIATED COMPANIES SHALL NOT BE RESPONSIBLE FOR ANY DAMAGES ARISING OUT OF THE USE OF, OR OTHERWISE RELATED TO, THIS PRESENTATION OR ANY OTHER DOCUMENTATION. NOTHING CONTAINED IN THIS PRESENTATION IS INTENDED TO, OR SHALL HAVE THE EFFECT OF, CREATING ANY WARRANTY OR REPRESENTATION FROM IBM, ITS AFFILIATED COMPANIES OR ITS OR THEIR SUPPLIERS AND/OR LICENSORS.

  3. Introduction to the speaker: developing enterprise application software since 1999 (C++, Java). Recent work focus: IBM JDK development. Contact: huaningnh@gmail.com

  4. What should you get from this talk? By the end of this session, you should be able to: • Understand how parallel computing works on multi-core processors • Understand the implementation of a high-level parallel API in the JDK

  5. Agenda: 1. Introduction: multi-threading, multi-core, parallel computing 2. Case study 3. Other high-level parallel API 4. Roadmap

  6. Introduction: multi-threading, parallel computing, multi-core computers

  7. Case study Execute the same task for every element in a loop • Use multi-threading for the execution
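The per-element threading idea from this slide can be sketched in plain Java. The class and method names here are illustrative, not from the talk; the point is only to show the naive thread-per-element shape that the following slides critique:

```java
import java.util.Arrays;

// Naive approach: spawn one thread per loop element.
// Illustrative only; as the next slides show, this rarely pays off.
public class NaivePerElement {
    public static int[] squareAll(int[] data) throws InterruptedException {
        int[] out = new int[data.length];
        Thread[] threads = new Thread[data.length];
        for (int i = 0; i < data.length; i++) {
            final int idx = i;
            // One dedicated thread per element: maximum overhead, no reuse.
            threads[i] = new Thread(() -> out[idx] = data[idx] * data[idx]);
            threads[i].start();
        }
        for (Thread t : threads) t.join();  // wait for every worker to finish
        return out;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(Arrays.toString(squareAll(new int[]{1, 2, 3, 4})));
    }
}
```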

  8. Can it improve performance?

  9. Multi-threading on a computer with one core [Diagram: single-core CPU timeline; threads t1 and t2 interleave, with only one thread running at any moment]

  10. CPU usage is already 100% with a single thread, and stays 100% with multi-threading • Can't improve performance • Performance even decreases because of the extra thread-switching overhead • A multi-threading (parallel) API is useless here

  11. Multi-threading on a computer with a multi-core CPU [Diagram: CPU with multiple cores]

  12. Each thread runs separately on its own core [Diagram: timeline; threads t1–t4 run in parallel on cores 1–4]

  13. Raw thread vs. Executor: any improvement? Disadvantages of raw threads: • Users need to create and manage them • Not flexible: the number of threads is hard to configure; with more threads than cores, resources are consumed by thread context switching and performance can even decrease; with fewer threads than cores, some cores are wasted • No balancing: the computation can't be allocated to every core equally

  14. Separate thread creation from task execution • Use a thread pool to reuse threads
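With the standard java.util.concurrent executor framework, the thread-pool approach described above might look like this; the sum-of-squares task is an invented example, not one from the slides:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Thread creation is separated from task execution: the pool owns the
// threads, and submitted tasks reuse them. Pool size matches the core count.
public class PoolExample {
    public static long sumOfSquares(int n) throws Exception {
        int cores = Runtime.getRuntime().availableProcessors();
        ExecutorService pool = Executors.newFixedThreadPool(cores);
        try {
            List<Future<Long>> futures = new ArrayList<>();
            for (int i = 1; i <= n; i++) {
                final long v = i;
                futures.add(pool.submit(() -> v * v));  // task runs on a pooled thread
            }
            long sum = 0;
            for (Future<Long> f : futures) sum += f.get();  // collect results
            return sum;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(sumOfSquares(10)); // 385
    }
}
```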

  15. A high-level API: concurrent_for
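The slides do not show a signature for concurrent_for, so the following is a hypothetical sketch of what such an API could look like: the caller supplies only a [begin, end) range and a per-index action, and the helper splits the range into contiguous chunks across a fixed pool. The names `concurrentFor` and `doubled` are assumptions for illustration:

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.function.IntConsumer;

// Hypothetical concurrent_for: apply 'body' to every index in [begin, end),
// splitting the range into one chunk per available core.
public class ConcurrentFor {
    public static void concurrentFor(int begin, int end, IntConsumer body)
            throws InterruptedException {
        int threads = Runtime.getRuntime().availableProcessors();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        int chunk = Math.max(1, (end - begin + threads - 1) / threads);
        for (int lo = begin; lo < end; lo += chunk) {
            final int from = lo, to = Math.min(lo + chunk, end);
            // Each task handles one contiguous sub-range of indices.
            pool.execute(() -> { for (int i = from; i < to; i++) body.accept(i); });
        }
        pool.shutdown();
        pool.awaitTermination(1, TimeUnit.MINUTES);  // wait for all chunks
    }

    // Small demo helper: fills out[i] = i * 2 concurrently.
    static int[] doubled(int n) throws InterruptedException {
        int[] out = new int[n];
        concurrentFor(0, n, i -> out[i] = i * 2);
        return out;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(java.util.Arrays.toString(doubled(8)));
    }
}
```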

  16. The API is easy to use: users only need to supply the task and the data range, and don't need to care about how they are executed. However, it still has disadvantages. • A task executes one entry at a time, which isn't efficient • A task is tied to one thread, which isn't flexible • The number of threads in the thread pool isn't aligned to the core number

  17. [Diagram: a 4-core CPU, a thread pool with n threads, and m tasks. Overloading: n >> 4, far more threads than cores. Not flexible: m > n, more tasks than threads]

  18. Align the thread number to the core number: use a fixed thread pool [Diagram: 4-core CPU and a thread pool with 4 threads; thread number = core number]

  19. Task division: another task-division strategy, ForkJoinPool (divide and conquer): 1. Divide a big task into small tasks recursively 2. Execute the same operation on every small task 3. Join the results of the small tasks [Diagram: a task forks into subtasks, whose results are then joined]
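The divide-and-conquer strategy above is what the JDK's fork/join framework (java.util.concurrent, since Java 7) provides. A standard RecursiveTask that sums an array shows all three steps: recursive division, the same operation on every small task, and joining the partial results. The threshold value is an arbitrary choice for illustration:

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

// Divide and conquer with ForkJoinPool: split the range until it is small,
// compute leaves directly, and join partial sums on the way back up.
public class SumTask extends RecursiveTask<Long> {
    private static final int THRESHOLD = 1_000;  // leaf size, chosen arbitrarily
    private final long[] data;
    private final int lo, hi;

    SumTask(long[] data, int lo, int hi) { this.data = data; this.lo = lo; this.hi = hi; }

    @Override protected Long compute() {
        if (hi - lo <= THRESHOLD) {              // small enough: compute directly
            long s = 0;
            for (int i = lo; i < hi; i++) s += data[i];
            return s;
        }
        int mid = (lo + hi) >>> 1;               // divide the range in half
        SumTask left = new SumTask(data, lo, mid);
        SumTask right = new SumTask(data, mid, hi);
        left.fork();                             // run the left half asynchronously
        return right.compute() + left.join();    // compute right here, then join left
    }

    public static long parallelSum(long[] data) {
        return new ForkJoinPool().invoke(new SumTask(data, 0, data.length));
    }

    // Helper: the array {1, 2, ..., n}.
    static long[] oneTo(int n) {
        long[] a = new long[n];
        for (int i = 0; i < n; i++) a[i] = i + 1;
        return a;
    }

    public static void main(String[] args) {
        System.out.println(parallelSum(oneTo(10_000))); // 50005000
    }
}
```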

  20. Better suited to divide-and-conquer problems. The task-division strategy comes from the user and isn't configured according to runtime conditions. The previous issues (thread oversubscription, starvation, load imbalance) still exist

  21. New parallel API based on a task scheduler

  22. Initial status [Diagram: 4-core CPU, a thread pool with 4 threads, 20 tasks] • Tasks are allocated equally: thread 1 queues tasks 1–5, thread 2 tasks 6–10, thread 3 tasks 11–15, thread 4 tasks 16–20 • One thread per core • Every thread maintains its own task queue of affiliated tasks

  23. Unbalanced loading [Diagram: thread 1's queue still holds tasks 2–5 while the other queues hold only task 10 and task 15]

  24. Balancing the load by task stealing and adding new tasks [Diagram: idle threads steal tasks from thread 1's queue, and newly added tasks 21 and 22 are distributed across the queues]
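The rebalancing pictured on these slides matches how the JDK's ForkJoinPool behaves: each worker thread keeps its own task deque, and idle workers steal tasks from busy ones. A small demonstration, with deliberately uneven task sizes so some queues drain faster than others; the result is the same no matter which worker ran which task:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.Future;

// ForkJoinPool is a work-stealing pool: uneven tasks get rebalanced across
// workers automatically, so no core sits idle while others still have work.
public class StealingDemo {
    public static long totalWork() throws InterruptedException, ExecutionException {
        ForkJoinPool pool = new ForkJoinPool();
        List<Callable<Long>> tasks = new ArrayList<>();
        for (int i = 1; i <= 20; i++) {
            final long size = i * 1_000L;        // deliberately uneven task sizes
            tasks.add(() -> {
                long s = 0;
                for (long j = 0; j < size; j++) s += 1;  // busy work of 'size' steps
                return s;
            });
        }
        long total = 0;
        for (Future<Long> f : pool.invokeAll(tasks)) total += f.get();
        pool.shutdown();
        return total;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(totalWork()); // 210000
    }
}
```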

  25. Parallel API with the new working mechanism: concurrent_for • Range: the range of the data set, [0, n) • Strategy: the strategy for dividing the range: automatic, or static with a granularity • Task: the task that executes the same operation over the range

  26. Other high-level parallel APIs • concurrent_while: data can be added to the set while it is being processed concurrently • concurrent_reduce: uses divide/join-based tasks to return a calculation result • concurrent_sort: sorts a data set concurrently • Math calculation, for example multiplying one matrix by another: given int[5][10] matrix1 and int[10][5] matrix2, int[5][5] matrix3 = matrix1 * matrix2 becomes int[5][5] matrix3 = concurrent_multiply(matrix1, matrix2)
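The matrix example above can be sketched with a row-per-task decomposition: every row of the result matrix is an independent task submitted to a fixed pool. The name `concurrentMultiply` mirrors the slide's concurrent_multiply and is an illustration, not a real JDK method:

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Concurrent matrix multiply: rows of the result are computed in parallel,
// one task per row, on a pool sized to the core count.
public class ConcurrentMultiply {
    public static int[][] concurrentMultiply(int[][] a, int[][] b)
            throws InterruptedException {
        int n = a.length, m = b[0].length, k = b.length;
        int[][] c = new int[n][m];
        ExecutorService pool = Executors.newFixedThreadPool(
                Runtime.getRuntime().availableProcessors());
        for (int i = 0; i < n; i++) {
            final int row = i;
            pool.execute(() -> {                 // each task fills one result row
                for (int j = 0; j < m; j++) {
                    int s = 0;
                    for (int p = 0; p < k; p++) s += a[row][p] * b[p][j];
                    c[row][j] = s;
                }
            });
        }
        pool.shutdown();
        pool.awaitTermination(1, TimeUnit.MINUTES);  // wait for all rows
        return c;
    }

    public static void main(String[] args) throws InterruptedException {
        int[][] a = {{1, 2}, {3, 4}};
        int[][] b = {{5, 6}, {7, 8}};
        System.out.println(java.util.Arrays.deepToString(concurrentMultiply(a, b)));
        // [[19, 22], [43, 50]]
    }
}
```

Rows are disjoint, so the tasks never write to the same cells and no extra synchronization is needed beyond awaiting pool termination.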

  27. In cases like these, we can generally achieve a performance improvement through parallel computing on multi-core hardware.

  28. Roadmap: implement a high-level parallel API in the JDK based on the new task scheduler • High performance • Correct • Scalable • Portable

  29. Review of Objectives Now that you’ve completed this session, you are able to: • Understand what parallel computing is and what it is good for • Understand the design of the new task-based parallel API

  30. Q & A

  31. Thanks!
