A Machine Learning Approach for Thread Mapping on Transactional Memory Applications. Source: 18th International Conference on High Performance Computing. Authors: Castro, M.; Goes, L.F.W.; Ribeiro, C.P.; Cole, M.; Cintra, M.; Mehaut, J. 100062512 張光瑜, 100065801 談得聖
Outline • Introduction • Software Transactional Memory (STM) • Thread Mapping • Machine Learning • The ID3 Algorithm • Result & Conclusion & Future Works
Part 1 Introduction
Transaction • A sequence of instructions that performs a single logical function. • The concept was originally used in database systems.
Software Transactional Memory (STM) • The programmer writes code as transactions. • STM libraries guarantee that each transaction executes atomically and in isolation, regardless of potential data races.
Cache Memory • When data in main memory is used, it is copied into the cache. • From the application's perspective, the cache can be viewed as a way to share data efficiently.
Thread Mapping • For example, you can put threads that communicate often on cores that share some level of cache. • Doing so avoids the high latency of accessing main memory.
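The placement idea above can be illustrated with a minimal sketch. The slides name "Compact" and "Scatter" among the mapping strategies but do not define them, so the definitions used here (fill one cache group first versus spread across groups) are an assumption based on common usage. The function names and the eight-core topology are hypothetical.

```python
from itertools import chain, zip_longest


def compact_mapping(cache_groups, n_threads):
    """Place threads on neighbouring cores, filling one cache group before the next."""
    cores = list(chain.from_iterable(cache_groups))
    if n_threads > len(cores):
        raise ValueError("more threads than cores")
    return cores[:n_threads]


def scatter_mapping(cache_groups, n_threads):
    """Spread threads across cache groups, taking one core from each group in turn."""
    interleaved = [
        core
        for tier in zip_longest(*cache_groups)
        for core in tier
        if core is not None  # groups may have different sizes
    ]
    if n_threads > len(interleaved):
        raise ValueError("more threads than cores")
    return interleaved[:n_threads]


# Hypothetical machine: two groups of four cores, each group sharing a cache.
groups = [[0, 1, 2, 3], [4, 5, 6, 7]]
print(compact_mapping(groups, 4))  # [0, 1, 2, 3]
print(scatter_mapping(groups, 4))  # [0, 4, 1, 5]
```

Each returned list gives the core for thread 0, thread 1, and so on. On Linux, a thread could then be pinned to its core with a call such as `os.sched_setaffinity`; the sketch only computes the assignment.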
Thread Mapping (cont’d) • Many strategies exist for mapping. • However, there is no single one that provides good performance for all different applications and platforms.
The Goal • Given an application, predict which mapping strategy is the best.
Part 2 Machine Learning
Machine Learning • Static phase: • The goal is to build up a predictor. • Here, the predictor is a decision tree. • Dynamic phase: • Use the predictor to decide which mapping strategy is going to be used.
Static Phase • Three steps: • Application profiling • Data pre-processing • Learning process
Application Profiling • Features: • Category A: the interaction between the application and the STM system. • Transaction time ratio • Abort ratio • Category B: STM mechanisms • Conflict detection: eager, lazy • Resolution strategy: suicide, backoff
Application Profiling (cont’d) • Features: • Category C: the interaction between the application and the platform • Last Level Cache Miss Ratio • Target Variable T: thread mapping strategies • Linux, Compact, Scatter, Round-Robin
Data Pre-Processing • Since we are building a decision tree, the features must be categorical or discrete. • Features in categories A & C are converted into: • Low (0.0; 0.33) • Medium (0.33; 0.66) • High (0.66; 1.0)
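The conversion above can be sketched as a small binning function. The slide does not say which bin the boundary values 0.33 and 0.66 belong to, so assigning them to the upper bin is an assumption. The feature names and values in the example are made up for illustration.

```python
def discretize(ratio):
    """Map a ratio in [0, 1] to 'Low', 'Medium' or 'High'."""
    if not 0.0 <= ratio <= 1.0:
        raise ValueError("ratio must lie in [0, 1]")
    if ratio < 0.33:   # assumption: 0.33 itself counts as Medium
        return "Low"
    if ratio < 0.66:   # assumption: 0.66 itself counts as High
        return "Medium"
    return "High"


# Hypothetical profile of one application (values are illustrative only).
profile = {"tx_time_ratio": 0.12, "abort_ratio": 0.50, "llc_miss_ratio": 0.90}
categorical = {name: discretize(value) for name, value in profile.items()}
print(categorical)
```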
Part 3 The ID3 Algorithm
ID3 (Iterative Dichotomiser 3) • Quinlan (1979) • Based on Shannon's (1949) information theory
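ID3 (Iterative Dichotomiser 3) • Information theory: if an event has k possible outcomes with corresponding probabilities p_i, the information (entropy) obtained once the event occurs is: $I = -\sum_{i=1}^{k} p_i \log_2 p_i$ • Example 1: let k = 4 and p1 = p2 = p3 = p4 = 0.25; then I = -(0.25 × log2(0.25) × 4) = 2 • Example 2: let k = 4 and p1 = 0, p2 = 0.5, p3 = 0, p4 = 0.5; then I = -(0.5 × log2(0.5) × 2) = 1 (outcomes with probability 0 contribute nothing)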
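The two examples on this slide can be checked with a short sketch. The function name `entropy` is illustrative, and skipping zero-probability outcomes follows the usual convention that 0 × log2(0) is treated as 0.

```python
import math


def entropy(probabilities):
    """Shannon entropy in bits; outcomes with probability 0 are skipped."""
    return -sum(p * math.log2(p) for p in probabilities if p > 0)


print(entropy([0.25, 0.25, 0.25, 0.25]))  # 2.0
print(entropy([0, 0.5, 0, 0.5]))          # 1.0
```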
ID3 (Iterative Dichotomiser 3) • Calculate the entropy of every attribute using the data set • Split the set into subsets using the attribute for which entropy is minimum (or, equivalently, information gain is maximum) • Make a decision tree node containing that attribute • Recurse on subsets using remaining attributes
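Stop condition: • Every record in the subset already belongs to the same class. • No new attribute can be found to split the subset any further. • The subset contains no records left to process.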
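The ID3 steps and the three stop conditions can be put together in a minimal sketch. This is not the authors' implementation. The tree representation (a leaf label, or a tuple of an attribute and its branches) and the four training rows are made up purely for illustration and do not come from the paper.

```python
import math
from collections import Counter


def entropy(labels):
    """Entropy (in bits) of the class distribution in a list of labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def majority(labels):
    """Most frequent label in the list."""
    return Counter(labels).most_common(1)[0][0]


def id3(rows, attributes, target, default=None):
    """Return a leaf label or a node (attribute, {value: subtree})."""
    if not rows:                     # stop: no records left
        return default
    labels = [row[target] for row in rows]
    if len(set(labels)) == 1:        # stop: all records share one class
        return labels[0]
    if not attributes:               # stop: no attribute left to split on
        return majority(labels)

    def weighted_entropy(attribute):
        # Entropy remaining after splitting on this attribute.
        parts = {}
        for row in rows:
            parts.setdefault(row[attribute], []).append(row[target])
        return sum(len(p) / len(rows) * entropy(p) for p in parts.values())

    # Minimum remaining entropy is equivalent to maximum information gain.
    best = min(attributes, key=weighted_entropy)
    remaining = [a for a in attributes if a != best]
    branches = {}
    for value in sorted({row[best] for row in rows}):
        subset = [row for row in rows if row[best] == value]
        branches[value] = id3(subset, remaining, target, majority(labels))
    return (best, branches)


# Made-up training rows (illustrative only, not data from the paper).
rows = [
    {"abort_ratio": "Low", "llc_miss_ratio": "Low", "mapping": "Compact"},
    {"abort_ratio": "Low", "llc_miss_ratio": "High", "mapping": "Compact"},
    {"abort_ratio": "High", "llc_miss_ratio": "Low", "mapping": "Scatter"},
    {"abort_ratio": "High", "llc_miss_ratio": "High", "mapping": "Scatter"},
]
tree = id3(rows, ["abort_ratio", "llc_miss_ratio"], "mapping")
print(tree)  # ('abort_ratio', {'High': 'Scatter', 'Low': 'Compact'})
```

Because branches are created only for attribute values that occur in the data, the "no records left" case is reached here only when the function is called with an empty data set.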
Cross-Validation • Leave-one-out • The accuracies on SMP-24 and SMP-16 were 86% and 72%, respectively.
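Leave-one-out cross-validation trains on all samples but one, tests on the held-out sample, and repeats this for every sample. A minimal sketch follows. The `train` and `predict` callables are hypothetical placeholders for any learner, and the majority-class learner and four samples in the example are illustrative only.

```python
from collections import Counter


def leave_one_out_accuracy(samples, train, predict):
    """samples: list of (features, label) pairs.

    train(list of pairs) returns a model; predict(model, features) returns a label.
    """
    if len(samples) < 2:
        raise ValueError("need at least two samples")
    correct = 0
    for i, (features, label) in enumerate(samples):
        training_set = samples[:i] + samples[i + 1:]  # hold out sample i
        model = train(training_set)
        if predict(model, features) == label:
            correct += 1
    return correct / len(samples)


# Illustrative learner: always predict the majority class of the training set.
def train_majority(pairs):
    return Counter(label for _, label in pairs).most_common(1)[0][0]


def predict_majority(model, features):
    return model


samples = [("a", "X"), ("b", "X"), ("c", "X"), ("d", "Y")]
print(leave_one_out_accuracy(samples, train_majority, predict_majority))  # 0.75
```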
Prediction • The dynamic phase. • Three steps: • The application starts running with the default thread mapping and is profiled during an initial warm-up interval. • The profiled data is then used to decide on a mapping strategy. • The mapping strategy is changed accordingly.
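The three steps of the dynamic phase can be sketched as follows, assuming the decision tree is stored as nested `(attribute, {value: subtree})` tuples with strategy names as leaves. The tree below is hypothetical and is not the tree learned in the paper. `profile_warmup` and `apply_mapping` are placeholder callables standing in for the profiler and the mechanism that changes the mapping.

```python
def predict_mapping(tree, features):
    """Walk the tree until a leaf (a strategy name) is reached."""
    node = tree
    while isinstance(node, tuple):
        attribute, branches = node
        node = branches[features[attribute]]
    return node


def dynamic_phase(profile_warmup, tree, apply_mapping):
    features = profile_warmup()                 # step 1: profile during warm-up
    strategy = predict_mapping(tree, features)  # step 2: decide on a strategy
    apply_mapping(strategy)                     # step 3: change the mapping
    return strategy


# Hypothetical tree, for illustration only.
toy_tree = (
    "abort_ratio",
    {
        "Low": "Compact",
        "Medium": (
            "llc_miss_ratio",
            {"Low": "Linux", "Medium": "Round-Robin", "High": "Scatter"},
        ),
        "High": "Scatter",
    },
)

applied = []
chosen = dynamic_phase(
    lambda: {"abort_ratio": "Medium", "llc_miss_ratio": "High"},  # fake profile
    toy_tree,
    applied.append,  # records the strategy instead of re-pinning threads
)
print(chosen)  # Scatter
```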
Part 4 Result & Conclusion & Future Works
Result & Conclusion • Performance improves by 11.35% and 18.46% compared with the worst case, and by 3.21% and 6.37% over the Linux strategy. • The ML-based approach is within 1% of the oracle performance.
Future work • Add more features to improve accuracy. • Integrate the approach into an existing STM system so that it runs automatically. • Try other algorithms, such as neural networks or support vector machines.