Program Transformation

Program Representation
• Fundamental issue in re-engineering
• Provides means to generate abstractions
• Provides input to a computational model for analyzing and reasoning about programs
• Provides means for translation and normalization of programs
COSC6431

Key questions
• What are the strengths and weaknesses of various representations of programs?
• What levels of abstraction are useful?

Abstract Syntax Trees
• A translation of the source text in terms of operands and operators
• Omits superficial details, such as comments and whitespace
• Retains all the information needed to generate further abstractions

AST production
• Four necessary elements to produce an AST:
  • Lexical analyzer (turns input strings into tokens)
  • Grammar (turns tokens into a parse tree)
  • Domain model (defines the nodes and arcs allowable in the AST)
  • Linker (annotates the AST with global information, e.g. data types, scoping, etc.)

AST example
• Input string: 1 + /* two */ 2
• Parse tree: a + node with the children 1 and 2
• AST (without global info): an Add node whose arcs arg1 and arg2 lead to int nodes holding 1 and 2

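The first three production elements can be sketched in a few lines of Python for the input string above. This is only an illustration: the `Int` and `Add` classes stand in for a domain model, the one-line grammar `expr ::= int ('+' int)*` is an assumption made for this example, and the linker step is left out because a single expression has no global information to attach.

```python
import re
from dataclasses import dataclass

# Hypothetical domain model: the only node kinds this sketch allows.
@dataclass
class Int:
    value: int

@dataclass
class Add:
    arg1: object
    arg2: object

# Whitespace and /* ... */ comments match without a named group, so they are dropped.
TOKEN = re.compile(r"\s+|/\*.*?\*/|(?P<int>\d+)|(?P<plus>\+)", re.DOTALL)

def tokenize(text):
    """Lexical analysis: return (kind, lexeme) pairs, skipping comments and whitespace."""
    pos, tokens = 0, []
    while pos < len(text):
        m = TOKEN.match(text, pos)
        if m is None:
            raise SyntaxError(f"unexpected character at {pos}: {text[pos]!r}")
        if m.lastgroup:
            tokens.append((m.lastgroup, m.group()))
        pos = m.end()
    return tokens

def parse(tokens):
    """Assumed grammar: expr ::= int ('+' int)*, built left-associatively."""
    def operand(i):
        if i >= len(tokens) or tokens[i][0] != "int":
            raise SyntaxError("expected an integer")
        return Int(int(tokens[i][1])), i + 1

    node, i = operand(0)
    while i < len(tokens):
        if tokens[i][0] != "plus":
            raise SyntaxError("expected '+'")
        right, i = operand(i + 1)
        node = Add(node, right)
    return node

tree = parse(tokenize("1 + /* two */ 2"))
print(tree)  # Add(arg1=Int(value=1), arg2=Int(value=2))
```

The comment never reaches the parser, which is why the resulting tree carries no trace of it.
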
Program Transformation
• A program is a structured object with semantics
• Structure allows us to transform a program
• Semantics allow us to compare programs and decide on the validity of transformations

Program Transformation
• The act of changing one program into another (from a source language to a target language)
• Used in many areas of software engineering:
  • Compiler construction
  • Software visualization
  • Documentation generation
  • Automatic software renovation

Application examples
• Converting to a new language dialect
• Migrating from a procedural language to an object-oriented one, e.g. C to C++
• Adding code comments
• Requirement upgrading, e.g. using 4 digits for years instead of 2 (Y2K)
• Structural improvements, e.g. changing GOTOs to control structures
• Pretty printing

Simple program transformation
• Modify all arithmetic expressions to reduce the number of parentheses using the formula: (a+b)*c = a*c + b*c
• Example: x := (2+5)*3 becomes x := 2*3 + 5*3

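As a rough illustration of the same transformation on a real AST, the sketch below uses Python's standard `ast` module (Python 3.9 or later for `ast.unparse`). The class name `Distribute` and the helper `distribute` are made up for this example, and Python's `=` stands in for the `:=` of the slide.

```python
import ast

class Distribute(ast.NodeTransformer):
    """Rewrite (a + b) * c into a * c + b * c, working bottom-up."""

    def visit_BinOp(self, node):
        self.generic_visit(node)  # transform the operands first
        if (isinstance(node.op, ast.Mult)
                and isinstance(node.left, ast.BinOp)
                and isinstance(node.left.op, ast.Add)):
            a, b, c = node.left.left, node.left.right, node.right
            return ast.BinOp(
                left=ast.BinOp(left=a, op=ast.Mult(), right=c),
                op=ast.Add(),
                right=ast.BinOp(left=b, op=ast.Mult(), right=c),
            )
        return node

def distribute(source):
    """Parse, transform, and print the program back as source text."""
    tree = Distribute().visit(ast.parse(source))
    ast.fix_missing_locations(tree)
    return ast.unparse(tree)

print(distribute("x = (2 + 5) * 3"))  # x = 2 * 3 + 5 * 3
```

The rewrite duplicates `c`, so it only preserves semantics when evaluating `c` has no side effects. This is the kind of validity question that program semantics has to answer.
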
Transformation tools
• There are many transformation tools
• Program-Transformation.org lists about 90 of them
• Most are based on term rewriting
• Other solutions use functional programming, lambda calculus, etc.

Term rewriting
• The process of simplifying symbolic expressions (terms) by means of a Rewrite System, i.e. a set of Rewrite Rules
• A Rewrite Rule is of the form lhs → rhs, where lhs and rhs are term patterns

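Example Rewrite System
    0 + x → x
    s(x) + y → s(x + y)
    (x + y) + z → x + (y + z)
Under these rewrite rules, the term ((s(s(a)) + s(b)) + c) will be rewritten as s(s(a + s(b + c))). Moving the remaining s outward, to s(s(s(a + (b + c)))), would need a further rule such as x + s(y) → s(x + y).
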
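The three rules can be tried out with a small Python sketch. The encoding is an assumption made for illustration: constants and variables are strings, `s(x)` is the tuple `("s", x)`, and `x + y` is `("+", x, y)`. The strategy (normalize the subterms first, then the root) is one choice among several.

```python
def step(t):
    """Apply one rule at the root of t, or return None if no rule matches."""
    if isinstance(t, tuple) and t[0] == "+":
        _, left, right = t
        if left == "0":                               # 0 + x -> x
            return right
        if isinstance(left, tuple) and left[0] == "s":  # s(x) + y -> s(x + y)
            return ("s", ("+", left[1], right))
        if isinstance(left, tuple) and left[0] == "+":  # (x + y) + z -> x + (y + z)
            return ("+", left[1], ("+", left[2], right))
    return None

def normalize(t):
    """Rewrite innermost-first until no rule applies anywhere in the term."""
    if isinstance(t, tuple):
        t = (t[0],) + tuple(normalize(child) for child in t[1:])
    reduced = step(t)
    return t if reduced is None else normalize(reduced)

def show(t, nested=False):
    """Print a term, parenthesizing a sum only when it is an operand of +."""
    if isinstance(t, str):
        return t
    if t[0] == "s":
        return f"s({show(t[1])})"
    text = f"{show(t[1], True)} + {show(t[2], True)}"
    return f"({text})" if nested else text

term = ("+", ("+", ("s", ("s", "a")), ("s", "b")), "c")
print(show(normalize(term)))  # s(s(a + s(b + c)))
```

With only the three listed rules the sketch stops at `s(s(a + s(b + c)))`, because none of them moves an `s` out of the right operand of `+`.
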
TXL
• A generalized source-to-source translation system
• Uses a context-free grammar to describe the structures to be transformed
• Rule specification uses a by-example style
• Has been used to process billions of lines of code for Y2K purposes

TXL programs
• TXL programs consist of two parts:
  • Grammar for the input language
  • Transformation rules
• Let’s look at some examples…

Calculator.Txl - Grammar

    % Part I. Syntax specification
    define program
        [expression]
    end define

    define expression
        [term]
      | [expression] [addop] [term]
    end define

    define term
        [primary]
      | [term] [mulop] [primary]
    end define

    define primary
        [number]
      | ( [expression] )
    end define

    define addop
        '+
      | '-
    end define

    define mulop
        '*
      | '/
    end define

Calculator.Txl - Rules

    % Part 2. Transformation rules
    rule main
        replace [expression]
            E [expression]
        construct NewE [expression]
            E [resolveAddition] [resolveSubtraction] [resolveMultiplication]
              [resolveDivision] [resolveParentheses]
        where not
            NewE [= E]
        by
            NewE
    end rule

    rule resolveAddition
        replace [expression]
            N1 [number] + N2 [number]
        by
            N1 [+ N2]
    end rule

    rule resolveSubtraction …
    rule resolveMultiplication …
    rule resolveDivision …

    rule resolveParentheses
        replace [primary]
            ( N [number] )
        by
            N
    end rule

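The control structure of the main rule (apply the resolve rules, and stop once the result equals the input) can be imitated in Python. This is a loose analogue and not a model of TXL's actual matching: expressions are assumed to be nested tuples such as `("+", 2, 5)` and `("paren", e)`, and the names `one_pass` and `rewrite_to_fixed_point` are invented for this sketch.

```python
import operator

OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul, "/": operator.truediv}

def one_pass(e):
    """Rewrite only what is directly reducible in e as it stands:
    an operator applied to two numbers, or parentheses around one number."""
    if not isinstance(e, tuple):
        return e                                   # a bare number
    if e[0] == "paren":
        inner = e[1]
        return inner if not isinstance(inner, tuple) else ("paren", one_pass(inner))
    op, left, right = e
    if not isinstance(left, tuple) and not isinstance(right, tuple):
        return OPS[op](left, right)                # like N1 [+ N2]
    return (op, one_pass(left), one_pass(right))

def rewrite_to_fixed_point(e):
    """Repeat passes until one changes nothing, like the guard 'where not NewE [= E]'."""
    while True:
        new_e = one_pass(e)
        if new_e == e:
            return e
        e = new_e

# (2 + 5) * 3  ->  (7) * 3  ->  7 * 3  ->  21
print(rewrite_to_fixed_point(("*", ("paren", ("+", 2, 5)), 3)))  # 21
```

Each pass enables the next one: the parentheses can only be dropped after the addition inside them has been resolved, which is why the loop needs several iterations.
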
DotProduct.Txl

    % Form the dot product of two vectors,
    % e.g., (1 2 3).(3 2 1) => 10
    define program
        ( [repeat number] ) . ( [repeat number] )
      | [number]
    end define

    rule main
        replace [program]
            ( V1 [repeat number] ) . ( V2 [repeat number] )
        construct Zero [number]
            0
        by
            Zero [addDotProduct V1 V2]
    end rule

    rule addDotProduct V1 [repeat number] V2 [repeat number]
        deconstruct V1
            First1 [number] Rest1 [repeat number]
        deconstruct V2
            First2 [number] Rest2 [repeat number]
        construct ProductOfFirsts [number]
            First1 [* First2]
        replace [number]
            N [number]
        by
            N [+ ProductOfFirsts] [addDotProduct Rest1 Rest2]
    end rule

Sort.Txl

    % Sort.Txl - simple numeric bubble sort
    define program
        [repeat number]
    end define

    rule main
        replace [repeat number]
            N1 [number] N2 [number] Rest [repeat number]
        where
            N1 [> N2]
        by
            N2 N1 Rest
    end rule

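The sorting rule is a single rewrite applied until it no longer matches. A Python sketch of that idea follows, on the assumption that the pattern `N1 N2 Rest` can match at any position in the sequence; the function names are hypothetical.

```python
def rewrite_once(numbers):
    """Swap the first adjacent pair with N1 > N2; return None if no pair matches."""
    for i in range(len(numbers) - 1):
        if numbers[i] > numbers[i + 1]:
            return numbers[:i] + [numbers[i + 1], numbers[i]] + numbers[i + 2:]
    return None

def sort_by_rewriting(numbers):
    """Apply the rewrite until it no longer matches; the result is sorted."""
    current = list(numbers)
    while (rewritten := rewrite_once(current)) is not None:
        current = rewritten
    return current

print(sort_by_rewriting([3, 1, 2]))  # [1, 2, 3]
```

The loop terminates because every swap removes exactly one inversion, and a sequence with no out-of-order adjacent pair is sorted.
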
Other TXL constructs

    compounds
        -> :=
    end compounds

    keys
        var procedure exists inout out
    end keys

    function isAnAssignmentTo X [id]
        match [statement]
            X := Y [expression]
    end function

www.txl.ca
• Guided Tour
• Many examples
• Reference manual
• Download TXL for many platforms

Example uses
• HTML pretty printing of source code
• Language-to-language translation
• Design recovery from source
• Improvement of security problems
• Program instrumentation and measurement
• Logical formula simplification and interpretation