Linking Syntactic and Semantic Models of Java Source Code within a Program Transformation System


Presentation Transcript


  1. Linking Syntactic and Semantic Models of Java Source Code within a Program Transformation System V. Winter, J. Guerrero, A. James, C. Reinke

  2. Outline • Introduction • Motivation: The need for static analysis • Why transformation systems are interesting in this setting • Creating a rule in PMD • Creating a rule in Sextant • GPS-Traverse • Overview • Example: Constructing a call-graph • Technical details of GPS-Traverse

  3. Source-Code Analysis • Is heavily employed across the public and private sectors, including: • the top 5 commercial banks • 5 of the top 7 computer software companies • 3 of the top 5 commercial aerospace and defense industry leaders • the 3 largest armed services of the US • 3 of the leading 4 accounting firms • 2 of the top 3 insurance companies

  4. Source-Code Analysis • It has been argued that source-code analysis can play an important role in software assurance within an Agile development process • The FDA is recommending (and may eventually mandate) the use of static-analysis tools for the development of medical device software. • GrammaTech’s CodeSonar is a static-analysis tool that the FDA is currently using to investigate failures in recalled medical devices.

  5. Static-Analysis Tools • Are frequently rule-based • Utilize a variety of software models (e.g., AST, call graph, control-flow graph) • In an OO implementation, involve traversals of object structures using the visitor pattern • Make use of pattern recognition (e.g., matching) • May transform source code (e.g., inserting markers/annotations to control analysis) • Query software models • Aggregate information

  6. Creating a Rule in PMD: Avoid using while-loops without curly braces

  7. Creating a Rule in PMD • Step 1: Figure out what to look for. In this case we want to capture the convention that while-loops must use braces. • Construct a compilation unit containing an instance of the syntactic property you want to detect.
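
A compilation unit of the kind Step 1 asks for might look like the sketch below. The names `Example`, `bar`, `baz`, and `buz.doSomething` are taken from the AST printed on the AST Generation slide; the `Buz` helper class, the field types, and the loop-termination logic are assumptions added only so that the file compiles and the loop ends.

```java
// Hypothetical helper so that buz.doSomething() resolves; not part of the slides.
class Buz {
    private final Example owner;
    int calls = 0;

    Buz(Example owner) {
        this.owner = owner;
    }

    void doSomething() {
        calls++;
        owner.baz = calls < 3; // stop the loop after three calls
    }
}

class Example {
    boolean baz = true;
    Buz buz = new Buz(this);

    void bar() {
        // The construct the rule should flag: a while-loop whose body is not a block.
        while (baz)
            buz.doSomething();
    }

    public static void main(String[] args) {
        Example e = new Example();
        e.bar();
        System.out.println("doSomething was called " + e.buz.calls + " times");
    }
}
```

Only the `while` statement inside `bar` matters to the rule; everything else is scaffolding.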

  8. AST Generation • PMD uses JavaCC to generate an AST (Abstract Syntax Tree) corresponding to the source code.

        CompilationUnit
         TypeDeclaration
          ClassDeclaration:(package private)
           UnmodifiedClassDeclaration(Example)
            ClassBody
             ClassBodyDeclaration
              MethodDeclaration:(package private)
               ResultType
               MethodDeclarator(bar)
                FormalParameters
               Block
                BlockStatement
                 Statement
                  WhileStatement
                   Expression
                    PrimaryExpression
                     PrimaryPrefix
                      Name:baz
                   Statement
                    StatementExpression:null
                     PrimaryExpression
                      PrimaryPrefix
                       Name:buz.doSomething
                      PrimarySuffix
                       Arguments

  9. Pattern Selection • Select and generalize the smallest portion of the AST from the previous slide that contains the pattern you are interested in. • Make sure you discriminate good patterns from bad patterns (e.g., blocks versus no blocks). • Consult the Java grammar as needed.

  10. Create Rule

  11. Create Pattern Matcher
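
The matching step can be sketched in plain Java as follows. This is an illustration under stated assumptions, not PMD's API: `Node` and `WhileBraceCheck` are hypothetical stand-ins for PMD's generated AST classes and rule base class, and the node kinds are the strings that appear in the AST on the AST Generation slide. The check itself follows that tree: the second child of a `WhileStatement` is its body `Statement`, and the loop uses braces only if that statement's first child is a `Block`.

```java
import java.util.ArrayList;
import java.util.List;

// Minimal stand-in for an AST node; PMD's real node classes are NOT used here.
class Node {
    final String kind;
    final List<Node> children = new ArrayList<>();

    Node(String kind, Node... kids) {
        this.kind = kind;
        for (Node k : kids) {
            children.add(k);
        }
    }
}

class WhileBraceCheck {
    // Collects every WhileStatement whose body Statement does not wrap a Block.
    static List<Node> findViolations(Node root) {
        List<Node> violations = new ArrayList<>();
        visit(root, violations);
        return violations;
    }

    private static void visit(Node n, List<Node> violations) {
        if (n.kind.equals("WhileStatement") && !bodyIsBlock(n)) {
            violations.add(n);
        }
        for (Node child : n.children) {
            visit(child, violations); // keep descending: loops can be nested
        }
    }

    // Child 0 is the loop Expression, child 1 the body Statement.
    private static boolean bodyIsBlock(Node whileStatement) {
        if (whileStatement.children.size() < 2) {
            return false;
        }
        Node body = whileStatement.children.get(1);
        return !body.children.isEmpty() && body.children.get(0).kind.equals("Block");
    }

    public static void main(String[] args) {
        Node withoutBraces = new Node("WhileStatement",
                new Node("Expression"),
                new Node("Statement", new Node("StatementExpression")));
        Node withBraces = new Node("WhileStatement",
                new Node("Expression"),
                new Node("Statement", new Node("Block")));
        Node root = new Node("Block", withoutBraces, withBraces);
        System.out.println("violations: " + findViolations(root).size());
    }
}
```

In a visitor-based tool the `visit` method would be a callback invoked per node type; the explicit recursion here plays that role.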

  12. Add Rule to Ruleset • Add the newly created rule to the PMD ruleset:

        <?xml version="1.0"?>
        <ruleset name="My custom rules"
            xmlns="http://pmd.sf.net/ruleset/1.0.0"
            xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
            xsi:schemaLocation="http://pmd.sf.net/ruleset/1.0.0 http://pmd.sf.net/ruleset_xml_schema.xsd"
            xsi:noNamespaceSchemaLocation="http://pmd.sf.net/ruleset_xml_schema.xsd">
          <rule name="WhileLoopsMustUseBracesRule"
              message="Avoid using 'while' statements without curly braces"
              class="WhileLoopsMustUseBracesRule">
            <description>
              Avoid using 'while' statements without using curly braces
            </description>
            <priority>3</priority>
            <example>
        <![CDATA[
        public void doSomething() {
          while (true)
            x++;
        }
        ]]>
            </example>
          </rule>
        </ruleset>

  13. Creating a Rule in Sextant: Avoid using while-loops without curly braces

  14. Create Basic Rule Pattern

        strategy WhileLoopsMustUseBracesRule:
          Statement[:] while( <Expression>_1 ) <Statement>_1 [:]
          →
          Statement[:] while( <Expression>_1 ) <Statement>_1 [:]

  15. Add Specific Pattern Constraint

        strategy WhileLoopsMustUseBracesRule:
          Statement[:] while( <Expression>_1 ) <Statement>_1 [:]
          →
          Statement[:] while( <Expression>_1 ) <Statement>_1 [:]
          if { not(<Statement>_1 = Statement[:] <Block>_1 [:]) }

  16. Add Metric/Action

        strategy WhileLoopsMustUseBracesRule:
          Statement[:] while( <Expression>_1 ) <Statement>_1 [:]
          →
          Statement[:] while( <Expression>_1 ) <Statement>_1 [:]
          if { not(<Statement>_1 = Statement[:] <Block>_1 [:])
               andalso sml.addViolation(<Statement>_1) }

  17. Observations • Primitive operations in transformation systems include: • Parsing • Matching • Traversal • The software models that transformation systems typically operate on are terms: either concrete or abstract syntax trees. • This makes the foundational framework of transformation systems well suited for rule-based source-code analysis systems, especially systems whose rules have syntax-based specifications.

  18. Semantic Rules: Use equals() instead of == to compare objects

  19. Java’s Integer Cache • Some rules require semantic analysis • The implementation of such rules requires the ability to query semantic models (i.e., software models other than an AST)
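
The following short sketch shows why this rule cannot be checked on syntax alone. Whether `a == b` is a defect depends on the types of `a` and `b`: for boxed `Integer` operands it compares references, and the Java Language Specification only guarantees shared boxed instances for values from -128 to 127. The class and method names are illustrative and do not come from the slides.

```java
class IntegerCacheDemo {
    // Reference comparison: the kind of expression the rule should flag for objects.
    static boolean sameReference(Integer a, Integer b) {
        return a == b;
    }

    // Value comparison: what the rule recommends.
    static boolean sameValue(Integer a, Integer b) {
        return a.equals(b);
    }

    public static void main(String[] args) {
        Integer small1 = 127, small2 = 127; // inside the guaranteed cache range
        Integer large1 = 128, large2 = 128; // outside it; identity is not guaranteed

        System.out.println("127 == 127      : " + sameReference(small1, small2));
        System.out.println("128 == 128      : " + sameReference(large1, large2));
        System.out.println("128 equals 128  : " + sameValue(large1, large2));
    }
}
```

On a default JVM configuration the second line usually reports `false`, which is exactly the surprise the rule is meant to prevent; the result can differ if the cache size has been enlarged.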

  20. GPS-Traverse: Linking Syntactic and Semantic Models within a Transformation System

  21. GPS-Traverse • GPS-Traverse • enables contextual information to be transparently tracked during transformation. • is a collection of transformations whose purpose is to associate terms with the contexts in which they are defined • This association is based on: • Structural properties • Nested classes • Local classes • Anonymous classes • Frame variables currently in scope • Generic variables currently in scope

  22. Nested Classes

  23. Fields versus Local Variables

  24. Generic Types versus Standard Types
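
The three slide titles above name situations in which the same identifier denotes different things depending on where it occurs. The Java sketch below is an illustration of those situations only; all class, field, and method names are hypothetical and are not taken from the slides.

```java
class Item { }

class Outer {
    String x = "Outer.x";

    class Inner {                     // nested class: a new context inside Outer
        String x = "Inner.x";         // field that hides Outer.x

        String names() {
            String x = "local x";     // local variable that shadows the field Inner.x
            return x + " | " + this.x + " | " + Outer.this.x;
        }
    }
}

class ContextDemo {
    // Here "Item" is a generic type variable that shadows the class Item.
    static <Item> String describeGeneric(Item value) {
        return "generic: " + value;
    }

    // Here "Item" is the ordinary class declared at the top of the file.
    static String describeStandard(Item value) {
        return "standard: " + value.getClass().getSimpleName();
    }

    public static void main(String[] args) {
        Outer.Inner inner = new Outer().new Inner();
        System.out.println(inner.names());
        System.out.println(describeGeneric(42));
        System.out.println(describeStandard(new Item()));
    }
}
```

An analysis that sees only the term `x` or `Item` cannot tell these cases apart; it also needs the enclosing context, which is the association GPS-Traverse is described as maintaining.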

  25. In Summary… • GPS-Traverse: term → context • In turn, a tuple of the form (term, context) provides the basis for a variety of semantic analysis functions • One particularly useful analysis function is resolution

  26. Resolution • Resolution is a semantic analysis function that operates on terms denoting references • The resolution function used by Java is highly complex and involves: • Static evaluation • Type analysis • Overloading, overriding, shadowing • Generic analysis • Local analysis • Visibility – public, protected, package private, private • Subtyping • Imports: single-type, on-demand, and static
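
Two of the ingredients listed above, overloading and overriding, can be seen in a few lines of Java. The sketch is illustrative only (all names are hypothetical): it shows that the declaration a call refers to is chosen from the static types at the call site, while the body that finally runs for an overridden method depends on the run-time object.

```java
class ResolutionDemo {
    static String describe(Object o) {
        return "describe(Object)";
    }

    static String describe(String s) {
        return "describe(String)";
    }

    static class Base {
        String name() {
            return "Base.name";
        }
    }

    static class Derived extends Base {
        @Override
        String name() {
            return "Derived.name";
        }
    }

    public static void main(String[] args) {
        Object asObject = "text";
        // Overloading: chosen at compile time from the static type of the argument.
        System.out.println(describe(asObject));
        System.out.println(describe("text"));

        Base b = new Derived();
        // Overriding: the call refers to the declaration Base.name(),
        // but the body that runs is selected from the run-time class.
        System.out.println(b.name());
    }
}
```

A static resolution function has to reproduce the first kind of decision exactly as the compiler makes it, which is why type analysis, subtyping, and visibility all appear in the list.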

  27. Uses of Resolution • Resolution is a prerequisite for a variety of software-based analysis and manipulation activities such as: • Bootstrapping semantic models • Software metrics • API usage analysis • Refactoring • Slicing • Migration – a well-formed complement of slicing • Join point recognition • Resolution-informed transformation is well suited for many of these activities • Finally, resolution-informed transformation can also play a key role in the construction of semantic models of software, such as the call graph of a software system

  28. Example: Call Graph

  29. Technical Details: Bascinet, the TL System, and Sextant

  30. Bascinet • A NetBeans-based IDE supporting the development of TL programs • Syntax-directed editors for TL, ML, and EBNF files • Code folding for both TL and ML • Hyperlinks from MLton compiler output to ML source code • Integrated with third-party visualization tools such as Cytoscape, GraphViz, and TreeMap • Solves some key system-level problems: • Discrete concurrent (forgetful) application of a transformation to a file hierarchy: { transformation } x { file1, file2, … } • Continuous sequential (stateful) application of a transformation to a file hierarchy: state1 = transformation( state0, file1 ), state2 = transformation( state1, file2 )

  31. The TL System • Input: GLR Parser • Output: Abstract Prettyprinter • TL – A language for specifying higher-order transformation • First-order matching on concrete syntax trees • First-order and higher-order generic traversals • Standard combinators plus special-purpose combinators • Modular • Partially type-checked • ML – A functional programming language tightly integrated with TL • Computation is expressed in terms of modules written in TL and ML.

  32. TL • The terms being manipulated are concrete syntax trees • The computational unit is the conditional rewrite rule: term_lhs → term_rhs if { condition } • Rules (also called strategies) can be bound to identifiers: r: term_lhs → term_rhs if { condition } • Strategies can be constructed by composing rules using a variety of combinators: r1 <+ r2, r1 <; r2 • Strategies can be applied to terms using traversals and iterators: TDL myStrategy myTerm
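
The idea of building strategies from rules and combinators can be approximated in Java as shown below. This is a loose analogue, not TL: it assumes that `<+` behaves as a left-biased choice and `<;` as left-to-right sequential composition, and it models a rule that does not apply as an empty `Optional`. TL's actual treatment of non-applicable rules may differ, and every name in the sketch is hypothetical.

```java
import java.util.Optional;
import java.util.function.Function;
import java.util.function.Predicate;

class Strategies {
    // A strategy either rewrites a term or reports that it does not apply.
    interface Strategy<T> extends Function<T, Optional<T>> { }

    // Analogue of a conditional rewrite rule: lhs → rhs if { condition }.
    static <T> Strategy<T> rule(Predicate<T> matches, Function<T, T> rewrite) {
        return t -> matches.test(t) ? Optional.of(rewrite.apply(t)) : Optional.empty();
    }

    // Assumed analogue of r1 <+ r2: try r1, and fall back to r2 only if r1 does not apply.
    static <T> Strategy<T> choice(Strategy<T> r1, Strategy<T> r2) {
        return t -> {
            Optional<T> first = r1.apply(t);
            return first.isPresent() ? first : r2.apply(t);
        };
    }

    // Assumed analogue of r1 <; r2: apply r1, then apply r2 to its result.
    static <T> Strategy<T> sequence(Strategy<T> r1, Strategy<T> r2) {
        return t -> r1.apply(t).flatMap(r2);
    }

    public static void main(String[] args) {
        Strategy<Integer> halveEven = rule(n -> n % 2 == 0, n -> n / 2);
        Strategy<Integer> decrementOdd = rule(n -> n % 2 != 0, n -> n - 1);

        System.out.println(choice(halveEven, decrementOdd).apply(7));
        System.out.println(sequence(halveEven, decrementOdd).apply(6));
    }
}
```

A traversal such as `TDL` would then apply one of these strategies at every node of a term; that part is omitted here to keep the sketch short.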

  33. import_closed GPS.Locator

        module CyclomaticComplexity

          strategy initialize: ...
          strategy outputResults: ...

          strategy collectMetrics:
            TDL( GPS.Locator.enter <; ccAnalysis <; GPS.Locator.exit )

          strategy ccAnalysis: MethodCC <+ ConstructorCC

          strategy MethodCC: ...
          strategy ConstructorCC: ...

        end // module

  34. GPS-Traverse • Transformationally maintains a semantic model which can be queried in a variety of ways: • getContextKey • getEnclosingContextKey • currentContextType • enclosingContextType • withinContextType • inMethod • isGeneric • isLocalGeneric • isVar

  35. strategy CallGraph:
          <SelectorOptExpression>_methodCall
          →
          <SelectorOptExpression>_methodCall
          if {
            isMethodCall <SelectorOptExpression>_methodCall
            andalso sml.GPS_inMethod()
            andalso <key>_methodContext = sml.GPS_getContextKey() // semantic query
            andalso <key>_calledMethod = sml.resolve( <key>_methodContext, <SelectorOptExpression>_methodCall )
            andalso sml.outputPP( <key>_methodContext )
            andalso sml.output(" calls ")
            andalso sml.outputPP( <key>_calledMethod )
          }

        strategy isMethodCall:
          // basic call
          SelectorOptExpression[:] <TypeArgsOpt>_1 <Id>_1 <Arguments>_1 [:]
          →
          SelectorOptExpression[:] <TypeArgsOpt>_1 <Id>_1 <Arguments>_1 [:]
          <+
          // embedded call
          ...
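
The shape of this strategy, a traversal that tracks the enclosing method and emits one "caller calls callee" line per method call, can be imitated in Java on a toy term structure. The sketch below is an analogy under stated assumptions, not Sextant or TL code: `Term`, `CallGraphSketch`, and `resolve` are hypothetical, the context stack stands in for what `GPS_getContextKey` provides, and `resolve` is a placeholder that returns the call name unchanged instead of performing real Java resolution.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

class CallGraphSketch {
    // Toy term: kind is "Class", "Method", or "Call".
    static final class Term {
        final String kind;
        final String name;
        final List<Term> children = new ArrayList<>();

        Term(String kind, String name, Term... kids) {
            this.kind = kind;
            this.name = name;
            for (Term k : kids) {
                children.add(k);
            }
        }
    }

    private final Deque<String> context = new ArrayDeque<>(); // keys of enclosing methods
    final List<String> edges = new ArrayList<>();

    void traverse(Term t) {
        boolean entersMethod = t.kind.equals("Method");
        if (entersMethod) {
            context.push(t.name);                 // context is entered before children are visited
        }
        if (t.kind.equals("Call") && !context.isEmpty()) { // only calls made inside a method
            String caller = context.peek();
            edges.add(caller + " calls " + resolve(caller, t.name));
        }
        for (Term child : t.children) {
            traverse(child);
        }
        if (entersMethod) {
            context.pop();                        // context is left after children are visited
        }
    }

    // Placeholder only: real resolution would use types, imports, overloading, and so on.
    static String resolve(String contextKey, String callName) {
        return callName;
    }

    public static void main(String[] args) {
        Term program = new Term("Class", "A",
                new Term("Method", "A.foo", new Term("Call", "B.bar"), new Term("Call", "A.baz")),
                new Term("Method", "A.baz", new Term("Call", "B.bar")));
        CallGraphSketch sketch = new CallGraphSketch();
        sketch.traverse(program);
        for (String edge : sketch.edges) {
            System.out.println(edge);
        }
    }
}
```

The push and pop around the child visits correspond to bracketing an analysis with enter and exit steps, as the `collectMetrics` strategy does with `GPS.Locator.enter` and `GPS.Locator.exit`.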

  36. The End Questions?
