Sampling Profiler for Rotor as Part of an Optimizing Compilation System
Sofia Chilingarova, St. Petersburg, Russia
Prof. Vladimir O. Safonov, St. Petersburg, Russia
Agenda
• Problem Statement
• Rotor Sampling Profiler Implementation
• Results
Problem Statement
• Rotor does not implement optimizations in its JIT compiler
• Implementing optimizations requires runtime profiling
• A sampling-based profiler is the best option: fairly complete information at low cost
Typical Optimizing Dynamic Compilation Subsystem Architecture
[Diagram. Labels: IL (bytecode, CIL); Base Compiler/Interpreter; Executable Code; Multilevel Optimizing Compiler; Profiling Data; Compilation Queue; Profiler; Data; Controller; Methods list; Profiling plan]
Rotor Sampling Profiler Implementation
• Goals
• Profiling Subsystem Architecture
• Data Storage Structure
• Self-Tuning
• Integration with Rotor
Goals
• Estimate the call frequency of individual methods
• Construct a call graph
• Achieve a reasonably low cost:
  • small total profiling overhead
  • no long suspensions of user threads
• Make good use of existing Rotor facilities
Profiling Subsystem Architecture
[Diagram. Labels: SSCLI Threads; Profiler marks managed threads; buffer; Profiler Marking-Thread; local queue; raw samples data; Manager-Thread; Global queue; Data Storage]
Data Storage Structure: previous approaches
• A bunch of samples
• DCG (Dynamic Call Graph)
• PCCT (Partial Call Context Tree)
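To make the difference between the structures named on this slide concrete, here is a minimal sketch that builds a DCG and a PCCT from the same raw samples. It is an illustration only, not the Rotor profiler's storage code: the names `build_dcg`, `build_pcct`, `CctNode`, and `max_depth`, and the sample data, are all assumptions.

```python
from collections import Counter


def build_dcg(samples):
    """Dynamic call graph: one weight per (caller, callee) edge."""
    edges = Counter()
    for stack in samples:  # each stack lists the outermost caller first
        for caller, callee in zip(stack, stack[1:]):
            edges[(caller, callee)] += 1
    return edges


class CctNode:
    """Node of a call context tree (hypothetical helper type)."""

    def __init__(self, method):
        self.method = method
        self.count = 0      # samples that ended exactly at this context
        self.children = {}  # callee name -> CctNode


def build_pcct(samples, max_depth=3):
    """Partial call context tree: only the top max_depth frames are kept."""
    root = CctNode("<root>")
    for stack in samples:
        node = root
        for method in stack[-max_depth:]:
            node = node.children.setdefault(method, CctNode(method))
        node.count += 1
    return root


# Made-up samples for illustration.
samples = [
    ["Main", "A", "B"],
    ["Main", "A", "B"],
    ["Main", "C", "B"],
]
dcg = build_dcg(samples)
pcct = build_pcct(samples, max_depth=2)
```

The DCG keeps only caller/callee pairs, while the PCCT keeps a bounded chain of callers for each sampled method, so `B` called from `A` and `B` called from `C` end up in different tree nodes.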
Self-Tuning
• When taking a sample: if a frame already marked “visited” is encountered, the stack walk stops there
• Such a sample is marked with a “visited” mark
• When processing samples: if a marked sample contains data for only one frame (the topmost frame), a special “repetitions” counter is incremented
• After a fixed number of samples has been processed, the profiling interval is tuned based on the value of the “repetitions” counter
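The self-tuning steps above can be sketched as follows. This is a simplified model under stated assumptions, not the actual Rotor code: `Frame`, `take_sample`, and `IntervalTuner` are hypothetical names, and the batch size, the thresholds, and the rule "many repetitions lengthen the interval, few shorten it" are assumptions, since the slide does not say how the interval is adjusted.

```python
class Frame:
    """One stack frame; `visited` is the mark left by an earlier sample."""

    def __init__(self, method):
        self.method = method
        self.visited = False


def take_sample(stack):
    """Walk from the topmost frame down, stopping at the first visited frame.

    `stack[-1]` is the topmost frame. Returns (methods, marked), where
    `marked` is True when the walk was cut short by a visited frame.
    """
    methods = []
    for frame in reversed(stack):
        methods.append(frame.method)
        if frame.visited:
            return methods, True
        frame.visited = True
    return methods, False


class IntervalTuner:
    """Adjusts the sampling interval after every `batch_size` samples."""

    def __init__(self, interval_ms, batch_size, high=0.5, low=0.1):
        self.interval_ms = interval_ms
        self.batch_size = batch_size
        self.high = high  # assumed threshold: too many repetitions
        self.low = low    # assumed threshold: very few repetitions
        self.processed = 0
        self.repetitions = 0

    def process(self, methods, marked):
        self.processed += 1
        # A marked sample with a single frame means the thread was still in
        # the same method invocation as at the previous sample.
        if marked and len(methods) == 1:
            self.repetitions += 1
        if self.processed == self.batch_size:
            ratio = self.repetitions / self.batch_size
            if ratio > self.high:
                self.interval_ms *= 2  # assumed rule: sample less often
            elif ratio < self.low:
                self.interval_ms = max(1, self.interval_ms // 2)
            self.processed = 0
            self.repetitions = 0


# A thread that stays inside Work() across three consecutive samples.
stack = [Frame("Main"), Frame("Work")]
first = take_sample(stack)   # full walk, nothing visited yet
second = take_sample(stack)  # stops at the topmost frame
third = take_sample(stack)   # stops at the topmost frame again

tuner = IntervalTuner(interval_ms=50, batch_size=3)
for methods, marked in (first, second, third):
    tuner.process(methods, marked)
```

In this run two of the three samples are repetitions, so under the assumed rule the interval doubles from 50 ms to 100 ms. Stopping the walk at a visited frame also keeps each sample cheap, because frames below it were already recorded by an earlier sample.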
Integration with Rotor
• Threads are stopped at “safe points” to take a profile sample
  • Just as they are stopped for GC or debugging
• The built-in SSCLI “Stack Walk” mechanism is used to collect managed stack samples
• Internal SSCLI VM hash tables and synchronization locks are used to store and maintain profile data
Results: testing environment
• Tests from the Rotor test suite were used: sscli\tests\bcl\threadsafety
  • Many threads execute the same code
• Measures used:
  • statistical correlation of the total call counters of individual methods
  • Arnold & Ryder’s Tree Overlap Percentage Measure
• Self-tuning was turned off for simplicity of measurement
  • The best results were nevertheless obtained with the interval that had been set automatically (100 ms)
• Values are averaged over 10 consecutive runs
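As an illustration of the two measures listed above, the sketch below compares an exact profile with a sampled one. It assumes the overlap percentage is the sum, over entries present in both profiles, of the smaller of the two normalized weights, which is how the measure is commonly described; the exact tree variant used in the tests is not given on the slide. The function names and all numbers are made up.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    dev_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    dev_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (dev_x * dev_y)


def overlap_percentage(profile_a, profile_b):
    """Sum of min(share in A, share in B) over common entries, in percent."""
    total_a = sum(profile_a.values())
    total_b = sum(profile_b.values())
    common = set(profile_a) & set(profile_b)
    return 100.0 * sum(
        min(profile_a[k] / total_a, profile_b[k] / total_b) for k in common
    )


# Made-up counters: exact call counts vs. counts estimated from samples.
exact = {"A": 600, "B": 300, "C": 100}
sampled = {"A": 55, "B": 35, "C": 10}

methods = sorted(exact)
r = pearson([exact[m] for m in methods], [sampled[m] for m in methods])
overlap = overlap_percentage(exact, sampled)  # about 95 for this data
```

The dictionary keys can be method names, call graph edges, or call paths in a tree; the overlap is computed the same way in each case.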
Questions
Author: Sofia Chilingarova, e-mail: sofie-chil@hotmail.ru