GCC ICI (Interactive Compilation Interface). Grigori Fursin, ALCHEMY Group, INRIA Futurs, France. January 2007. Funded by the HiPEAC network. Outline: Introduction and Motivation; Iterative Interactive Compiler Framework; Interactive Compilation Interface (ICI); Tools and Experiments.
GCC ICI (Interactive Compilation Interface) Grigori Fursin ALCHEMY Group INRIA Futurs France January, 2007 Funded by HiPEAC network
Outline • Introduction and Motivation • Iterative Interactive Compiler Framework • Interactive Compilation Interface (ICI) • Tools and Experiments • Conclusions and Future Work
Motivation
• Current compilers fail to deliver the best performance on modern processors due to
    • rapidly evolving hardware
    • simplistic hardware models
    • fixed black-box optimization heuristics
    • inability to fine-tune applications
    • lack of run-time information
• Different research compilers or transformation tools
    • are rewritten from scratch to “clean” the internals and understand their behavior (time-consuming)
    • duplicate many internals of other compilers unnecessarily
    • are often incompatible with each other and non-portable
    • usually support a limited number of languages
    • still often have ambiguous and non-portable optimization heuristics
Goals
• Instead of developing a new compiler or new transformation tools, modify current popular (non-research) rigid compilers into simpler, transparent, open transformation toolsets with externally tunable optimization heuristics through a standardized Interactive Compilation Interface (ICI)
• Control only the decision process at global and local levels, and avoid revealing the whole intermediate compiler representation, so that the compiler can keep evolving transparently
• Narrow down the optimization space by suggesting only legal transformations
• Enable an iterative recompilation algorithm to apply sequences of transformations
• Treat the current optimization heuristic as a black box and progressively adapt it to a given program and a given architecture
• Allow life-long, whole-program optimization research with optimization knowledge reuse
Current Compilers
[Diagram. Labels: Application; Source-to-source transformers; Compiler; Optimization heuristic with Sub-heuristic 1, Sub-heuristic 2, Sub-heuristic i, Sub-heuristic j, Sub-heuristic k; Decision for transformation 1 / Perform transformation 1; Decision for transformation i / Perform transformation i; Binary-to-binary transformers; Binary.]
Iterative Interactive Compiler Framework
[Diagram. Labels: Application; Iterative Interactive Compiler; Rigid compiler optimization heuristic (“black box”); Decision for transformation 1 / Perform transformation 1 / ICI1; Decision for transformation 2 / Perform transformation 2 / ICI2; Decision for transformation i / Perform transformation i / ICIi; External compiler drivers; Program Optimization Database; Binary.]
Interactive Compilation Interface
[Diagram. Labels: Application; Iterative Interactive Compiler; Analysis, decision and parameters for decision for optimization; Apply transformation; Executable. Write mode: saved decisions and parameters for transformations go to an external output transformation file or socket communication. Read mode: modified decisions and parameters for transformations come from an external input transformation file or socket communication. Read/Write mode: both.]
Interactive Compilation Interface
• Invoking ICI
• Through the command line:
    • Write mode:
        gcc -fici-generate -ftree-loop-linear -funroll-loops *.c
    • Read/Write mode:
        gcc -fici-generate -fici-use -ftree-loop-linear -funroll-loops *.c
• Through environment variables (to enable transparent continuous optimizations):
    • Write mode:
        export GCC_ICI_GEN=1
        make
    • Read/Write mode:
        export GCC_ICI_GEN=1
        export GCC_ICI_USE=1
        export GCC_ICI_OPTS="-ftree-loop-linear -funroll-loops"
        make
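The environment-variable route above could be wrapped in a small external driver. The following is a minimal sketch, not part of GCC ICI: the function names `ici_environment` and `build`, the mode names, and the default `make` command are assumptions made for illustration. Only the variable names GCC_ICI_GEN, GCC_ICI_USE and GCC_ICI_OPTS come from the slide.

```python
import os
import subprocess


def ici_environment(mode, opts=None, base=None):
    """Return an environment dict with the ICI variables from the slide set.

    mode: "write" or "read_write" (mode names chosen for this sketch).
    opts: optional list of compiler flags to pass through GCC_ICI_OPTS.
    base: environment to start from; defaults to the current process environment.
    """
    if mode not in ("write", "read_write"):
        raise ValueError("mode must be 'write' or 'read_write'")
    env = dict(os.environ if base is None else base)
    env["GCC_ICI_GEN"] = "1"      # both modes write decisions out
    if mode == "read_write":
        env["GCC_ICI_USE"] = "1"  # additionally read decisions back in
    if opts:
        env["GCC_ICI_OPTS"] = " ".join(opts)
    return env


def build(mode, opts=None, command=("make",)):
    """Run the build command (assumed here to be make) under the ICI environment."""
    return subprocess.run(list(command), env=ici_environment(mode, opts), check=True)
```

For example, `build("read_write", ["-ftree-loop-linear", "-funroll-loops"])` would correspond to the Read/Write recipe, provided an ICI-enabled gcc is the compiler that make invokes.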
Current Implementation
• Supported optimizations:
    • global:
        • program phase reordering
    • local:
        • loop interchange
        • loop peeling
        • loop unrolling
    • more optimizations soon …
• External output transformation XML file:

    <?xml version="1.0"?>
    <compiler_ici>
     <file_name="swim.f">
      <transformation name="unroll_and_peel">
       <function>calc1</function>
       <loop_number>4</loop_number>
       <depth>1</depth>
       <decision>4</decision>
       <factor>7</factor>
      </transformation>
      <transformation name="unroll_and_peel">
       <function>calc1</function>
       <loop_number>3</loop_number>
       <depth>1</depth>
       <decision>4</decision>
       <factor>7</factor>
      </transformation>
      …
     </file_name>
    </compiler_ici>

• Based on PathScale ICI (2004-2006), which supports:
    • inlining
    • array padding (global/local)
    • loop fusion/fission
    • loop interchange
    • loop blocking
    • loop unrolling
    • register tiling
    • prefetching
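An external driver has to read and edit this transformation file. The sketch below shows one way to do that with Python's standard `xml.etree.ElementTree`. It rests on a loud assumption: the wrapper element is written as `<file name="swim.f">` so that the sample is well-formed XML, which differs from the `<file_name="swim.f">` spelling on the slide. The helper names `read_decisions` and `set_factor` are hypothetical.

```python
import xml.etree.ElementTree as ET

# ASSUMPTION: <file name="..."> is used here so the sample parses as XML;
# the slide shows <file_name="swim.f">, whose exact on-disk form is not known.
SAMPLE = """<?xml version="1.0"?>
<compiler_ici>
  <file name="swim.f">
    <transformation name="unroll_and_peel">
      <function>calc1</function>
      <loop_number>4</loop_number>
      <depth>1</depth>
      <decision>4</decision>
      <factor>7</factor>
    </transformation>
  </file>
</compiler_ici>
"""


def read_decisions(xml_text):
    """Return one dict per <transformation> element found in the file."""
    root = ET.fromstring(xml_text)
    decisions = []
    for file_el in root.iter("file"):
        for t in file_el.iter("transformation"):
            decisions.append({
                "file": file_el.get("name"),
                "transformation": t.get("name"),
                "function": t.findtext("function"),
                "loop_number": int(t.findtext("loop_number")),
                "depth": int(t.findtext("depth")),
                "decision": int(t.findtext("decision")),
                "factor": int(t.findtext("factor")),
            })
    return decisions


def set_factor(xml_text, function, loop_number, factor):
    """Return new XML text with the factor of one loop replaced."""
    root = ET.fromstring(xml_text)
    for t in root.iter("transformation"):
        same_function = t.findtext("function") == function
        same_loop = int(t.findtext("loop_number")) == loop_number
        if same_function and same_loop:
            t.find("factor").text = str(factor)
    return ET.tostring(root, encoding="unicode")
```

A driver could call `set_factor(SAMPLE, "calc1", 4, 2)` and feed the result back to the compiler as the input transformation file.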
Iterative Recompilation Algorithm
Iterative recompilation algorithm to apply sequences of transformations:

    clear transformation_file_out.xml
    set PATHSCALE_ICI_W to 1
    compile program (write transformation_file_out.xml)
    set PATHSCALE_ICI_R to 1
    _label_recompile:
    copy transformation_file_out.xml to transformation_file_in.xml
    modify transformation_file_in.xml if needed
    compile program (read transformation_file_in.xml, write transformation_file_out.xml)
    if transformation_file_in.xml not the same as transformation_file_out.xml
        go to _label_recompile
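The loop above stops once the compiler writes back exactly the decisions it was given. A minimal Python sketch of that fixed-point iteration follows. The compiler is replaced by a stand-in callable and decisions are plain dicts instead of XML files; `iterate_recompilation`, `fake_compile` and the `max_rounds` safety limit are all assumptions added for illustration.

```python
def iterate_recompilation(compile_program, modify, max_rounds=20):
    """Recompile until the decisions read in equal the decisions written out.

    compile_program(decisions_in) -> decisions_out; decisions_in is None for the
    first, write-only compilation. modify(decisions) -> edited decisions.
    Returns (final decisions, number of read/write rounds).
    """
    decisions_out = compile_program(None)              # write mode
    for rounds in range(1, max_rounds + 1):
        decisions_in = modify(dict(decisions_out))     # copy out to in, then edit
        decisions_out = compile_program(decisions_in)  # read/write mode
        if decisions_in == decisions_out:
            return decisions_out, rounds
    raise RuntimeError("decisions did not stabilise within max_rounds")


def fake_compile(decisions_in):
    """Stand-in compiler: unrolling loop 'a' by 4 or more changes its choice for 'b'."""
    decisions = {"a": 1, "b": 1} if decisions_in is None else dict(decisions_in)
    if decisions["a"] >= 4:
        decisions["b"] = 2
    return decisions


final, rounds = iterate_recompilation(fake_compile, lambda d: {**d, "a": 4})
print(final, rounds)
```

The toy compiler shows why more than one round can be needed: forcing one decision can change another decision the compiler makes, so the output only matches the input on the following round.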
GCC Instrumentation (Phase Reordering)
gcc/passes.c

    #include "fici.h"

    void execute_pass_list (…)
    {
      …
      /* GCC ICI */
      if (flag_ici_use)
        {
          int i;
          for (i = 0; i < fici_pass_count (type); i++)
            execute_one_pass (pass_list[fici_reorder_pass_number (type, i)]);
        }
      else
        execute_pass_list (pass, new_type);
      …
      do
        {
          if (flag_ici_generate)
            fici_reorder_add_pass (type, pass->name, pass->index);
          if (execute_one_pass (pass) && pass->sub)
            execute_pass_list (pass->sub, type);
          pass = pass->next;
        }
      while (pass);
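The idea behind this instrumentation can be modelled outside GCC: in generate mode each executed pass is recorded by name and index, and in use mode an externally supplied list of indices decides the execution order. The Python sketch below is only a toy model under that reading. Its function signature, the `PASSES` table and the integer "program" are invented for illustration, and unlike the excerpt it records passes in both modes.

```python
def execute_pass_list(passes, program, use_order=None, generated=None):
    """Run passes over `program`, optionally reordered and/or recorded.

    passes:    list of (name, function) pairs in the default order.
    use_order: optional list of indices into `passes` (models flag_ici_use).
    generated: optional list that receives (name, index) records
               (models flag_ici_generate).
    """
    if use_order is None:
        use_order = range(len(passes))
    for index in use_order:
        name, run = passes[index]
        if generated is not None:
            generated.append((name, index))
        program = run(program)
    return program


# Two toy "passes" acting on an integer stand-in for a program.
PASSES = [("double", lambda x: x * 2), ("increment", lambda x: x + 1)]

log = []
default_result = execute_pass_list(PASSES, 3, generated=log)  # (3 * 2) + 1
reordered_result = execute_pass_list(PASSES, 3, use_order=[1, 0])  # (3 + 1) * 2
```

The two results differ, which is the point of phase reordering: the same set of passes applied in a different order can produce different code.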
GCC Instrumentation (Transformations)
gcc/loop-unroll.c

    #include "fici.h"

    static void
    decide_unrolling_and_peeling (struct loops *loops, int flags)
    {
      …
      decide_unroll_constant_iterations (loop, flags);
      if (loop->lpt_decision.decision == LPT_NONE)
        decide_unroll_runtime_iterations (loop, flags);
      if (loop->lpt_decision.decision == LPT_NONE)
        decide_unroll_stupid (loop, flags);
      if (loop->lpt_decision.decision == LPT_NONE)
        decide_peel_simple (loop, flags);

      /* GCC ICI */
      if (flag_ici_use)
        fici_unroll_in (get_name (current_function_decl), loop->num, loop->depth,
                        &(loop->lpt_decision.decision), &(loop->lpt_decision.times));
      if (flag_ici_generate)
        fici_unroll_out (get_name (current_function_decl), loop->num, loop->depth,
                         &(loop->lpt_decision.decision), &(loop->lpt_decision.times));

      loop = next;
    }
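The order of the two hooks matters: the compiler's own heuristic runs first, the external decision (if any) then overrides it, and only afterwards is the decision written out, so the output file reflects what is actually applied. A small Python model of that ordering is sketched below. The dict-based interface and the heuristic callable are assumptions for illustration and do not mirror the real `fici_unroll_in` / `fici_unroll_out` signatures.

```python
def decide_unrolling_and_peeling(loops, heuristic, decisions_in=None, decisions_out=None):
    """Model of the instrumented decision routine.

    loops:         iterable of (function, loop_number, depth) keys.
    heuristic:     callable key -> (decision, times); the compiler's own choice.
    decisions_in:  optional dict of external overrides (models the "in" hook).
    decisions_out: optional dict that receives final decisions (models the "out" hook).
    """
    applied = {}
    for key in loops:
        decision, times = heuristic(key)                # compiler heuristic first
        if decisions_in is not None and key in decisions_in:
            decision, times = decisions_in[key]         # external override
        if decisions_out is not None:
            decisions_out[key] = (decision, times)      # record what is applied
        applied[key] = (decision, times)
    return applied
```

Loops without an external entry keep the heuristic's decision, which matches the goal of treating the existing heuristic as a black box and changing it only where the driver intervenes.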
GCC Instrumentation (Features)
gcc/tree-loop-linear.c

    #include "fici.h"

    linear_transform_loops (struct loops *loops)
    {
      …
      if (flag_api_generate)
        {
          dump_file_tmp = dump_file;
          dump_flags_tmp = dump_flags;
          dump_file = fici_features_group_start_out (FICI_FGR_LOOP_DEPS);
          dump_flags = TDF_DETAILS | TDF_STATS;
          fapi2_features_start_dump_tmp ();
        }
      …
    }

Reuse GCC dump information and progressively clean it
Using Framework
Porting from the PathScale Continuous Optimization Framework (2003-present) to GCC, or developing:
• Continuous iterative optimization driver with run-time adaptation at function, loop or instruction level, using a low-overhead phase detection technique
• Driver to continuously collect all possible optimization parameters
• Driver to automatically and continuously rebuild the compiler optimization heuristic and adapt it to a specific architecture, using statistical methods and collective optimization knowledge reuse among different programs and architectures
• Prototype framework to replace a model-based compiler heuristic with an automatically learned one by connecting ICI with WEKA, an open-source machine learning software package
Iterative Continuous Optimizations
[Diagram. Labels: application; source-to-source transformations; current compilers; binary; execution; binary-to-binary transformations.]
Iterative Continuous Adaptive Optimizations
[Diagram. Labels: application; source-to-source transformations; Iterative Interactive Compiler; Program Transformation Database; binary; execution; Iterative Optimizations / Machine Learning; binary-to-binary transformations.]
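ML to Remove Compiler Heuristic
[Diagram. Training: application 1 … application N are each compiled with GCC ICI, yielding transformations, features and execution time, which are used for building a model with WEKA. Prediction: a new application is compiled with GCC ICI to extract features; the model supplies transformations, which are applied through GCC ICI to obtain the best execution time.]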
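The training and prediction flow in this diagram can be illustrated with a deliberately simple stand-in for the WEKA model: a one-nearest-neighbour lookup that returns the transformations that worked best for the most similar previously seen program. This is not the model used in the talk, and the feature vectors and unroll factors below are invented placeholders rather than measured data.

```python
import math


def predict_transformations(training, features):
    """Return the transformations of the training program closest in feature space.

    training: list of (feature_vector, best_transformations) pairs gathered from
              earlier iterative compilation runs.
    features: feature vector of the new program (same length as the training ones).
    """
    if not training:
        raise ValueError("training set is empty")
    _, best = min(training, key=lambda row: math.dist(row[0], features))
    return best


# Invented placeholder data: (program features, best transformations found).
TRAINING = [
    ((10.0, 2.0), {"unroll_factor": 8}),
    ((200.0, 1.0), {"unroll_factor": 2}),
]

suggestion = predict_transformations(TRAINING, (12.0, 2.0))
```

In the framework described here, the suggestion would be written to the input transformation file and applied through ICI in read mode instead of the compiler's built-in heuristic.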
Conclusions
• We demonstrate a simple, practical and non-intrusive way to turn current rigid compilers into a powerful interactive transformation toolset with an Interactive Compilation Interface that allows compiler optimization decisions to be biased externally
• We avoid the pitfalls of rigidifying the compiler internals while granting access to features rich enough to make performance-critical decisions
• We considerably reduce the optimization search space by analyzing and applying only legal transformations
• We develop tools for continuous, collective, life-long optimization and knowledge reuse across different programs and architectures
• We use the framework in EU projects to automatically adapt and optimize programs for performance, code size, power consumption, multiple ISAs, etc.
Future work
• Porting ICI to GCC in collaboration with IBM, NXP (Philips), STMicro, ARC, multiple universities within the HiPEAC network of excellence, and within the EU-funded projects MilePost, SARC and GGCC
• Adding more transformations and enabling phase reordering at function level in GCC
• Unifying optimization naming conventions to enable portability and knowledge reuse, so that optimization heuristics can be built automatically
• Implementing a run-time adaptation technique to select different program versions at run-time depending on program behavior
• Finishing the framework for practical continuous life-long whole-program optimizations with statistical or machine learning techniques
• Porting ICI to JIT compilers (Jikes, .NET) to unify run-time optimizations
• Would you like to participate? http://sourceforge.net/projects/gcc-ici
Questions?
Software development web-site for GCC ICI: http://sourceforge.net/projects/gcc-ici
Thanks to Sebastian Pop, Cupertino Miranda and Hamid Daoud for help with GCC modifications
Collaborations and Support: IBM, NXP (Philips), STMicro, ARC, CAPS, Universities within HiPEAC
This work is funded by HiPEAC: http://www.hipeac.net
Contact e-mail: grigori.fursin@inria.fr
More information: http://fursin.net/research_desc.html