ROSE Compiler Framework
What is ROSE, How Does It Work, and Why Does It Add Value?
May 8, 2017, V1.7
Ever Wonder…
• Are there bugs in our code? Will it work correctly?
• Are there trap doors and extra things we don’t know about?
• Is our code susceptible to compromise?
• Is the code’s performance optimized? Will it work on the new platforms?
• What exactly is in the firmware of our little black box that controls almost everything?
What is ROSE? The Quick Answer
• ROSE is to software what a Find and Replace function is to a word processor, except that ROSE is much more.
• Find and Replace allows a word processor user to quickly find syntax (words or phrases) and optionally replace it with other syntax (words or phrases).
Find and Replace Value
• Find and Replace adds value by automating document editing, eliminating the mistakes of manual editing, and increasing productivity.
• For example, a form letter can be personalized for many clients instead of being rewritten from scratch or edited manually each time.
Sometimes Analyzing Syntax Can Help Us
“Has saved me from sending embarrassing emails numerous times. On the other hand, pig face, jerk, and nitwit sailed through without raising a single chili pepper, as did turkey and damn.”
Syntax: the arrangement of words and phrases to create well-formed sentences in a language.
Syntax versus Semantics
• A Find and Replace function would need to understand word meaning and context to interpret meaning.
• Native language has basic classifications of syntax:
  • Nouns
  • Pronouns
  • Verbs
  • Adjectives
  • Adverbs
  • Prepositions
  • Conjunctions
Semantics: the meaning of the syntax in a language.
Remember Reed-Kellogg Sentence Graphing? But we still do not understand the meaning of the words, just their classification.
Sentences Can Be Syntactically Correct and Not Make Sense
• She fed the dream to my absence.
• Beware of Buffalo buffalo, buffalo, for they may buffalo you.
• Colorless green ideas sleep furiously.
These sentences are all syntactically correct!
Differences Between Natural Languages and Computer Languages
• A syntax ambiguity in a natural language (such as English) may cause the reader to pause to understand the intention of the author.
• Computer languages have very strict syntax rules.
• A syntax mistake in a computer language will cause the compiler to issue an error or a warning or, worse yet, to create code that is not correct (a bug).
[Figure labels: Equivalence, Assignment]
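A classic C and C++ case of a syntax slip that still compiles is writing assignment (`=`) where equivalence (`==`) was meant. The sketch below is only an illustration; the function names are hypothetical and do not come from the slides.

```cpp
#include <cassert>

// Intended behaviour: accept only when the entered code equals the expected one.
bool codes_match(int entered, int expected) {
    return entered == expected;      // equivalence: compares, changes nothing
}

// One '=' dropped. The code is still valid C++, so it compiles. The extra
// parentheses are only here to silence the warning most compilers give.
bool codes_match_buggy(int entered, int expected) {
    if ((entered = expected)) {      // assignment: copies, then tests for non-zero
        return true;
    }
    return false;
}

// Small self-check showing how the two versions differ.
void compare_behaviour() {
    assert(codes_match(1234, 1234));
    assert(!codes_match(1111, 1234));
    assert(codes_match_buggy(1111, 1234));   // the wrong code is accepted
}
```

Both functions are syntactically correct; only the second one has the wrong meaning, which is the kind of problem a purely textual tool cannot see.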
Language Translators
[Diagram: “Will it rain today?” (understandable to an English speaker) → Language Translator → “Pleuvra-t-il aujourd’hui?” (understandable to a French speaker)]
C++ Source Code Compiler
Calculate the value of n!, where n is a positive integer from 1 to 8. Example: 6! = 1 × 2 × 3 × 4 × 5 × 6 = 720
[Diagram: C++ (understandable to a software engineer) → C++ Compiler → Assembler (understandable to a computer)]
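The C++ listing from this slide did not survive in the transcript. As a stand-in, here is a minimal sketch of the kind of program the slide describes, not the author's original code; the names `factorial` and `factorial_table` are assumptions.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Computes n! for 1 <= n <= 8 (8! = 40320 fits easily in 32 bits).
std::uint32_t factorial(int n) {
    assert(n >= 1 && n <= 8);
    std::uint32_t result = 1;
    for (int i = 2; i <= n; ++i) {
        result *= static_cast<std::uint32_t>(i);
    }
    return result;
}

// Builds the text a small program might print: one "n! = value" line per n.
std::string factorial_table() {
    std::string out;
    for (int n = 1; n <= 8; ++n) {
        out += std::to_string(n) + "! = " + std::to_string(factorial(n)) + "\n";
    }
    return out;
}
```

A C++ compiler turns source like this into assembler and then machine code; calling `factorial(6)` gives 720, matching the slide's example.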
Compiler Definition
• A computer program that translates a program written in a high-level language into another language, usually machine language.
ROSE – What It Does
• ROSE, being a compiler, has the ability to understand both the syntax and the semantic information contained in computer languages such as C, C++, Java, PHP, Python, OpenMP, and Fortran.
• It does this by creating an Abstract Syntax Tree (AST), which is analogous to the Reed-Kellogg sentence graph.
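To make the AST idea concrete, here is a toy tree for the statement `x = y + 2 * z`. This is a sketch of the general concept only; the `Node` type and helper functions are hypothetical and are not ROSE's actual IR classes.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// One node of a toy abstract syntax tree.
struct Node {
    std::string kind;   // syntactic class, e.g. "assign", "add", "var"
    std::string text;   // spelling for leaves, empty for interior nodes
    std::vector<std::unique_ptr<Node>> children;
};

std::unique_ptr<Node> leaf(std::string kind, std::string text) {
    auto n = std::make_unique<Node>();
    n->kind = std::move(kind);
    n->text = std::move(text);
    return n;
}

std::unique_ptr<Node> binary(std::string kind, std::unique_ptr<Node> lhs,
                             std::unique_ptr<Node> rhs) {
    assert(lhs && rhs);
    auto n = std::make_unique<Node>();
    n->kind = std::move(kind);
    n->children.push_back(std::move(lhs));
    n->children.push_back(std::move(rhs));
    return n;
}

// Hand-built tree for `x = y + 2 * z`; a real front end would parse it.
std::unique_ptr<Node> example_tree() {
    return binary("assign", leaf("var", "x"),
                  binary("add", leaf("var", "y"),
                         binary("mul", leaf("int", "2"), leaf("var", "z"))));
}

// Renders the tree as a nested list so its structure is visible.
std::string render(const Node& n) {
    if (n.children.empty()) return n.kind + ":" + n.text;
    std::string out = "(" + n.kind;
    for (const auto& child : n.children) out += " " + render(*child);
    return out + ")";
}
```

Rendering the example tree gives `(assign var:x (add var:y (mul int:2 var:z)))`. Like a sentence graph, the tree records that the multiplication groups before the addition, which the flat text alone does not show.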
ROSE – What It Does
• So ROSE is much more powerful than a Find and Replace function and a sentence graph.
• ROSE is a Find, Understand, and Rewrite function.
• ROSE allows users to create tools to interrogate the syntactic and semantic structure of source code, using the ROSE simplified intermediate representation (IR).
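The gap between "Find and Replace" and "Find, Understand, and Rewrite" shows up even in a one-line rename. The sketch below contrasts a plain textual replace with a rename that at least understands identifier boundaries. It is a toy under stated assumptions: both function names are hypothetical, and a real ROSE-based tool would work on the IR with full scope and type information rather than on tokens.

```cpp
#include <cassert>
#include <cctype>
#include <string>

// Plain Find and Replace: substitutes every occurrence, with no understanding.
std::string replace_all(std::string text, const std::string& from,
                        const std::string& to) {
    assert(!from.empty());
    for (std::size_t pos = text.find(from); pos != std::string::npos;
         pos = text.find(from, pos + to.size())) {
        text.replace(pos, from.size(), to);
    }
    return text;
}

bool is_ident_char(char c) {
    return std::isalnum(static_cast<unsigned char>(c)) != 0 || c == '_';
}

// Token-aware rename: only whole identifiers equal to `from` are rewritten.
// It still ignores strings, comments, and scopes, which an AST would handle.
std::string rename_identifier(const std::string& code, const std::string& from,
                              const std::string& to) {
    std::string out;
    std::size_t i = 0;
    while (i < code.size()) {
        if (is_ident_char(code[i])) {
            std::size_t j = i;
            while (j < code.size() && is_ident_char(code[j])) ++j;
            const std::string word = code.substr(i, j - i);
            out += (word == from) ? to : word;
            i = j;
        } else {
            out += code[i++];
        }
    }
    return out;
}
```

Renaming `n` to `count` in `int n = n + 1;` with `replace_all` yields `icountt count = count + 1;`, which no longer compiles, while `rename_identifier` yields `int count = count + 1;`.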
ROSE Flow Diagram
[Diagram: Source Code → Simplified IR → User Analysis → rewritten source code. ROSE stops the compiling process to let the user code perform analysis, then outputs rewritten source code.]
ROSE Secret Sauces
• Simplified IR
• Source Code Output
• Binary Analysis
Dan Quinlan, Creator of ROSE
If Word Processors Had ROSE, They Could Impose Tactfulness
ROSE Framework with a Harshness Reduction Algorithm
Input sentence: “If you are late again you are fired”
Output sentence: “I've noticed you've had trouble getting to work on time. What can I do to help?”
If Word Processors Had ROSE, They Could Impose Simplicity
ROSE Framework with a Frankness Algorithm
Input sentence: “Thank you for trusting me with some of your responsibilities. I'm sorry that I can't help you this time because of my workload. Is there anything I could help you with next week, when I have more time?”
Output sentence: “No.”
If Word Processors Had ROSE, They Could Impose Common Sense
ROSE Framework with a Legalese Interpretation Algorithm
Input paragraph: This SOFTWARE PRODUCT is provided by COLOSELTRON SOFTWARE "as is" and "with all faults." COLOSELTRON SOFTWARE makes no representations or warranties of any kind concerning the safety, suitability, lack of viruses, inaccuracies, typographical errors, or other harmful components of this SOFTWARE PRODUCT. There are inherent dangers in the use of any software, and you are solely responsible for determining whether this SOFTWARE PRODUCT is compatible with your equipment and other software installed on your equipment. You are also solely responsible for the protection of your equipment and backup of your data, and COLOSELTRON SOFTWARE will not be liable for any damages you may suffer in connection with using, modifying, or distributing this SOFTWARE PRODUCT.
Output sentence: There are bugs in this code.
If Word Processors Had ROSE, They Could Impose Directness
ROSE Framework with a Hemingway Style Algorithm
Input: Tonight I would like to toast our new Software Director who we are so fortunate to welcome to ACME SOFTWARE. At her former company, COLOSELTRON SOFTWARE, she was able to roll out agile methodologies and completely rewrite their software disclaimer to avoid litigation.
Output: “I drink to make other people more interesting.” (Ernest Hemingway)
ROSE Framework Operates on Computer Code
[Diagram: Input (source or binary code) → ROSE Framework running code that operates on syntax and semantics (IR) → Output (transformed source code or report)]
• Input source code: C, C++, Java, Fortran, Python, PHP, OpenMP, or binary.
• Source code output is unique to ROSE.
• Binary code (or machine language) is what compilers (and then assemblers) turn source code into so computers can understand it. It is also known as firmware when inside little black boxes.
ROSE Built Tools Optimize Code for Platforms
• Optimize code across multiple processors: Existing Physics Code → ROSE Framework (MPI Transform Algorithm) → MPI Optimized Physics Code
• Optimize code for multiple threads in a processor: Existing Physics Code → ROSE Framework (OpenMP Transform Algorithm) → OpenMP Optimized Physics Code
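The slides do not show what such a transform emits. As a rough illustration only, and not actual ROSE output, a source-to-source OpenMP transform could take a serial loop and produce the same loop annotated with a directive. The kernel and function names below are assumptions chosen for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// "Existing physics code": a serial update y[i] = a * x[i] + y[i].
void axpy_serial(double a, const std::vector<double>& x, std::vector<double>& y) {
    assert(x.size() == y.size());
    for (std::size_t i = 0; i < x.size(); ++i) {
        y[i] = a * x[i] + y[i];
    }
}

// What a transformed version might look like: same loop, plus an OpenMP
// directive. Each iteration touches a different element, so the iterations
// can safely run on separate threads. The guard keeps the code warning-free
// when it is built without OpenMP support, in which case it runs serially.
void axpy_openmp(double a, const std::vector<double>& x, std::vector<double>& y) {
    assert(x.size() == y.size());
    const std::size_t n = x.size();
#ifdef _OPENMP
    #pragma omp parallel for
#endif
    for (std::size_t i = 0; i < n; ++i) {
        y[i] = a * x[i] + y[i];
    }
}
```

The value of doing this with a compiler framework rather than by hand is that the tool can check that the loop iterations really are independent before inserting the directive.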
ROSE Built Tools Optimize Code for Performance
• Optimize code across different mesh types: Existing Physics Code → ROSE Framework (Mesh Transform Algorithm) → Mesh Optimized Physics Code
• Optimize code for different solver types: Existing Physics Code → ROSE Framework (Solver Transform Algorithm) → Solver Optimized Physics Code
ROSE Built Tools Can Translate or Visualize Code
• Code translation: ROSE Framework (Code Translation Algorithm) working between COBOL, Ada, Jovial and C, C++, C#; verify translated source code correctness
• Code visualization: Source or Binary Code → ROSE Framework (Visualization Transformation) → Code Visualization Reports
ROSE Built Tools Find and Repair Potential Vulnerabilities
• Detection of potential code vulnerabilities: Source or Binary Code → ROSE Framework (Vulnerability Detection Checkers) → Potential Vulnerabilities Report
• Detection and repair of potential code vulnerabilities: Source or Binary Code → ROSE Framework (Vulnerability Detection Checkers and Patches) → Repaired Binary or Source Code and Report
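To give a feel for what a detection checker reports, here is a deliberately simple sketch that flags calls to a few C library functions commonly associated with buffer overflows. It scans tokens rather than an AST, so it is far weaker than a ROSE-built checker; the `Finding` type, the function names, and the list of risky functions are all assumptions for illustration.

```cpp
#include <cassert>
#include <cctype>
#include <set>
#include <string>
#include <vector>

struct Finding {
    int line;               // 1-based line number of the call
    std::string function;   // name of the risky function
};

// Reports every identifier from the risky list that is followed by '('.
std::vector<Finding> find_risky_calls(const std::string& code) {
    static const std::set<std::string> risky = {"gets", "strcpy", "sprintf"};
    std::vector<Finding> findings;
    int line = 1;
    std::size_t i = 0;
    while (i < code.size()) {
        const unsigned char c = static_cast<unsigned char>(code[i]);
        if (c == '\n') {
            ++line;
            ++i;
        } else if (std::isalpha(c) || c == '_') {
            std::size_t j = i;
            while (j < code.size() &&
                   (std::isalnum(static_cast<unsigned char>(code[j])) || code[j] == '_')) {
                ++j;
            }
            const std::string word = code.substr(i, j - i);
            std::size_t k = j;
            while (k < code.size() && code[k] == ' ') ++k;
            if (k < code.size() && code[k] == '(' && risky.count(word) > 0) {
                findings.push_back({line, word});
            }
            i = j;
        } else {
            ++i;
        }
    }
    return findings;
}

// Self-check on a three-line snippet: only the gets() call is reported.
void check_example() {
    const auto f = find_risky_calls("char b[8];\ngets(b);\nputs(b);\n");
    assert(f.size() == 1 && f[0].line == 2 && f[0].function == "gets");
}
```

A repair tool would go one step further and rewrite the flagged call, for example to a bounded alternative, and then emit the patched source.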
ROSE Built Tools Correctness and Bug Seeding
• Specification to code correctness: C Code and LTL Specification Language → ROSE Framework (CodeThorn) → Proof of Correctness Report (The RERS Challenge, July 12, 2017)
• Seed vulnerabilities to measure tool effectiveness: C, C++ Test Code → ROSE Framework (Vulnerability Seeding Transform) → Seeded Test Code
ROSE Built Tools Analyze Binary Code
• Binary code analysis: Existing Binary Code → ROSE Framework (Binary Code Analysis Detection Algorithm) → Binary Code Analysis. Detection targets: back doors, obfuscation, dead code, Big 5, changes, input condition enforcement.
• Unit tester: Source Code → ROSE Framework (Unit Tester, Stubs) → Instrumented Source Code Units
ROSE and Clang/LLVM
• We use Clang (the high-level IR) as the frontend for OpenCL support and can alternatively use it as the frontend for C language support (if not too many GNU extensions are required).
• We can generate LLVM IR (the low-level IR), and a recently started contract with Rice aims to make that more robust and to address Fortran 77.
• We can generate LLVM IR from binaries (using the instruction semantics).
• We support the same plugin mechanism as Clang/LLVM to add passes over the AST.
• We support the LLVM compiler as a backend within source-to-source work. Specifically, ROSE can emulate the LLVM compiler (by version number) and handle the LLVM-specific C and C++ language extensions outside of the C and C++ standards.
• ROSE can be compiled using LLVM, and it is regularly tested with LLVM in our ROSE Matrix Testing.
1) ROSE Helping LLVM Produce Source Codes
[Diagram: existing Clang/LLVM compiler (Input Source Code, Clang Parser, Clang IR, LLVM IR, Binary Code) and existing ROSE compiler (EDG Parser, EDG IR, ROSE IR, Source Code), connected by a Source to Source Transform]
2) LLVM Helping ROSE Generate Binaries
[Diagram: existing ROSE compiler (Input Source Code, EDG Parser, EDG IR, ROSE IR, Source Code) and existing Clang/LLVM compiler (Clang Parser, Clang IR, LLVM IR, Binary Code), connected by a ROSE to LLVM Transform]
3) ROSE Helping LLVM Analyze Binary Codes
[Diagram: existing ROSE compiler (Input Binary Code, ROSE Disassembler, ROSE IR) and existing Clang/LLVM compiler (Clang Parser, Clang IR, LLVM IR, Binary Code), connected by a Binary to LLVM IR Transform]
4) LLVM and ROSE Share the Same Plug In API for IR Analysis
[Diagram: existing Clang/LLVM compiler (Clang Parser, Clang IR, LLVM IR, Binary Code) and existing ROSE compiler (Input Source Code, EDG Parser, EDG IR, ROSE IR, Source Code), both served by an Analyzer or Checker]
5) ROSE Can Use LLVM as a Back End Compiler
[Diagram: existing ROSE compiler (Input Source Code, EDG Parser, EDG IR, ROSE IR, Source Transform, Source Code) and existing Clang/LLVM compiler (Clang Parser, Clang IR, LLVM IR, Binary Code)]
6) ROSE Source Code Can Be Compiled by Clang/LLVM
[Diagram: ROSE Source (existing ROSE compiler) as input to the existing Clang/LLVM compiler (Clang Parser, Clang IR, LLVM IR, Binary Code)]
ROSE Built Tools Versus Commercial Tools
• Commercial tools generally work well within their markets.
• Commercial tool makers generally supply tools for large markets:
  • Windows
  • Java, C++, C, C# (current versions)
  • Browser based
• Commercial tool makers take six months or more to add new capabilities (e.g., C++11, C++14, C++17).
• Commercial tools depend largely on a priori data.
• The user community is aware of commercial tool capabilities and shortcomings.
ROSE Built Tools
• Extend to areas that commercial tools miss:
  • Latest C++, C, and Java standards (v11, v14, future)
  • Fortran, Ada, binaries
• Are quickly adaptable to detect new threats
• Can produce source code
• Can add capability to commercial tools
• Are tailored to the specific needs of users
ROSE Summary
• Adds value by:
  • Automating source and binary analysis
  • Transforming source code to new source code
  • Addressing areas not covered by commercial tools
  • Enhancing existing commercial tools as an add-on
• Expandable to future specialized analysis and translation requirements