A Rewriting Approach to the Design and Evolution of Object-Oriented Languages

Mark Hills and Grigore Roşu
Formal Systems Laboratory, Department of Computer Science, University of Illinois Urbana-Champaign

KOOL Configuration Structure

Program Representation
• Program configuration represented as term rewriting-style terms
• Term representing a program consists of program instructions and state
• State represented as a "soup" (multiset) of nested state components
• Nesting allows grouping of related state components
• Nesting also allows duplication where needed, which is useful for representing multiple threads (each thread has its own program instructions, local environment, etc.) and snapshots of the current state (coroutines, exceptions, etc.)

A Simple Term
This term represents a statement somewhere in the configuration. E, S, and S' are variables over expressions and statements. Variables are universally quantified (E is any expression).

    stmt(if E then S else S' fi)

A More Complex Term
More complex terms generally need to mention enough context to include all needed parts of the configuration term. This term represents a memory lookup operation on location L, involving both the current computation and the memory.

    t(control(k(llookup(L) -> K) CS) TS) mem(Mem)

Rewriting Logic Semantics
• Programming language semantics are represented as transitions between terms
• The proper transition at each step is selected by matching part of the configuration term against the transition
• When matching, non-variable parts of the transition must match part of the current configuration term exactly; variables can match a subterm of the same sort, such as an expression or statement
• Matching requires enough context to incorporate all needed state
• The entire term need not be matched at each step, which makes transitions more modular (a transition doesn't mention what it doesn't need)
• Each transition is defined using a rule or an equation
• Equations are usually used to represent non-competing actions
• Rules usually represent potentially competing actions (race conditions)
• Execution steps are kept in a computation (a continuation-like structure); operations are added to the left of -> to make them evaluate first, and data needed to complete an operation is saved in computation items
• Stacks are used to track the state needed to restore from jumps, such as exceptions or loop break/continue

Equations
Equations define actions which, in concurrent programs, will not compete. The following equations are used for a standard if-then-else statement. The first indicates that, to correctly choose the branch, we need to evaluate the guard expression first, saving the branches for later. The second and third pick the appropriate branch based on whether the guard is true or false.

    eq stmt(if E then S else S' fi) = exp(E) -> if(S,S') .
    eq val(primBool(true)) -> if(S,S') = stmt(S) .
    eq val(primBool(false)) -> if(S,S') = stmt(S') .

Rules
Rules define actions which could potentially compete in concurrent programs. The following rule is used to look up a location in memory, with memory shared by all threads. Memory is defined as a partial map, meaning that not all locations L need have an associated value V; this is why a conditional rule is used, verifying that the value at L is not undefined.

    crl t(control(k(llookup(L) -> K) CS) TS) mem(Mem)
     => t(control(k(val(V) -> K) CS) TS) mem(Mem)
     if V := Mem[L] /\ V =/= undefined .

Reading Equations and Rules
Both equations and rules can be unconditional (eq, rl) or conditional (ceq, crl), with = separating the left and right sides of equations and => separating the left and right sides of rules. By convention, variables are in caps. When a subterm matches the left-hand side of a rule or equation, the variables are bound to the matching values, and the subterm is then replaced with the right-hand side under those bindings.
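The way the equations and the lookup rule above rewrite the head of a computation can be sketched outside Maude. The following is a minimal illustration in Python, not part of the KOOL definition: the tuple encoding of terms, the tag "ifstmt", and the function names apply_equation and apply_lookup_rule are all assumptions made for this example, and only the k(...) computation and mem(...) map are modeled (the surrounding t, control, CS, and TS context is omitted).

```python
# Terms are tuples such as ("stmt", body), ("exp", e), ("if", s1, s2),
# ("val", v) and ("llookup", loc). A computation k(...) is a Python list
# whose first element is the next item to evaluate (the left end of ->).

def apply_equation(k):
    """Try the three if-then-else equations at the head of computation k.

    Returns the rewritten computation, or None if no equation matches.
    """
    if not k:
        return None
    head, rest = k[0], k[1:]
    # eq stmt(if E then S else S' fi) = exp(E) -> if(S,S') .
    if head[0] == "stmt" and isinstance(head[1], tuple) and head[1][0] == "ifstmt":
        _, e, s1, s2 = head[1]
        return [("exp", e), ("if", s1, s2)] + rest
    # eq val(primBool(true))  -> if(S,S') = stmt(S) .
    # eq val(primBool(false)) -> if(S,S') = stmt(S') .
    if head[0] == "val" and rest and rest[0][0] == "if":
        _, s1, s2 = rest[0]
        if head[1] == ("primBool", True):
            return [("stmt", s1)] + rest[1:]
        if head[1] == ("primBool", False):
            return [("stmt", s2)] + rest[1:]
    return None


def apply_lookup_rule(k, mem):
    """Conditional lookup rule: llookup(L) -> K becomes val(V) -> K.

    mem is a dict standing in for the partial map Mem; a missing key plays
    the role of 'undefined', so the rule does not fire in that case.
    Returns (new_k, mem), or None if the rule or its condition does not match.
    """
    if k and k[0][0] == "llookup":
        v = mem.get(k[0][1])
        if v is not None:
            return [("val", v)] + k[1:], mem
    return None


# The guard is pushed in front of the saved branches.
k1 = apply_equation([("stmt", ("ifstmt", "E", "S", "S'")), ("stmt", "next")])
print(k1)  # [('exp', 'E'), ('if', 'S', "S'"), ('stmt', 'next')]

# A false guard selects the else branch.
k2 = apply_equation([("val", ("primBool", False)), ("if", "S", "S'"), ("stmt", "next")])
print(k2)  # [('stmt', "S'"), ('stmt', 'next')]

# Lookup succeeds at a defined location and does not fire at an undefined one.
mem = {0: 42}
hit = apply_lookup_rule([("llookup", 0), ("stmt", "next")], mem)
miss = apply_lookup_rule([("llookup", 7), ("stmt", "next")], mem)
print(hit)   # ([('val', 42), ('stmt', 'next')], {0: 42})
print(miss)  # None
```

As in the Maude rule, the lookup leaves the memory unchanged and only replaces the head of the computation; the rest of the computation (K) is carried along untouched.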
Program Evaluation and Analysis
• Programs are evaluated and analyzed directly in the language semantics; this currently uses Maude, with support for other tools being examined
• Standard evaluation is program execution, which yields the program result (output)
• Analysis runs yield results based on the type of analysis, for instance:
  • success or failure of type checking,
  • results of state space search,
  • model checking property verification or a counterexample
• Evaluation takes advantage of term rewriting
  • equations are oriented, rewriting from left to right
  • rules maintain their left-to-right direction
  • both are turned into standard term rewriting rules
• Analysis takes advantage of the logical nature of rewriting logic
  • equations identify equal terms, forming equivalence classes of terms
  • rules move between equivalence classes
  • this provides a method of analysis over equivalence classes, reducing the state space
• More advanced analysis (type checking, pre- and post-condition checking, abstract interpretation) can be treated as evaluation using customized semantics; this can alter the concept of what values are (types, abstract values, etc.) and how the language is evaluated

Processing KOOL
[Diagram: KOOL processing pipeline]

Processing Other Languages
The above diagram shows the process used by KOOL, but a similar process is used for other languages. In all cases, the source is merged with any preludes; if the prelude is defined in Maude, this happens after parsing. A parser turns the input program into an AST and then into a Maude term, which is processed using the language semantics. The results of execution or analysis can be shown directly to the user or processed further if needed.
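The state space reduction mentioned under Program Evaluation and Analysis (equations identify equal terms, rules move between equivalence classes) can be illustrated with a toy search. This is a minimal sketch in Python under strong simplifying assumptions, not a model of Maude's implementation: each thread is a tuple of abstract steps, where "e" stands for a step defined by an equation and "r" for a step defined by a rule, and the names normalize, successors, and reachable are invented for the example.

```python
from collections import deque

def normalize(state):
    """Apply equation steps exhaustively: drop leading "e" items in each thread."""
    out = []
    for thread in state:
        i = 0
        while i < len(thread) and thread[i] == "e":
            i += 1
        out.append(thread[i:])
    return tuple(out)

def successors(state, equations_as_steps):
    """Yield the states reachable in one step from state.

    If equations_as_steps is True, every step ("e" or "r") is treated as a
    separate transition. Otherwise only "r" steps are transitions, and the
    result is normalized so equal terms collapse into one representative.
    """
    for i, thread in enumerate(state):
        if not thread:
            continue
        if thread[0] == "r" or equations_as_steps:
            nxt = state[:i] + (thread[1:],) + state[i + 1:]
            yield nxt if equations_as_steps else normalize(nxt)

def reachable(initial, equations_as_steps):
    """Breadth-first search over all interleavings; returns the set of states."""
    start = initial if equations_as_steps else normalize(initial)
    seen = {start}
    queue = deque([start])
    while queue:
        state = queue.popleft()
        for nxt in successors(state, equations_as_steps):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen

# Two threads, each with two equation steps followed by one rule step.
initial = (("e", "e", "r"), ("e", "e", "r"))
print(len(reachable(initial, equations_as_steps=True)))   # 16
print(len(reachable(initial, equations_as_steps=False)))  # 4
```

In this toy setting, treating every step as a transition explores 16 interleaved states, while searching over equivalence classes explores 4, because only the competing "r" steps cause branching.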
Language Definitions and Modularity
• Multiple languages, including Java, Beta, and Scheme, have been defined
• Portions of many others (Smalltalk, Python, etc.) have been defined as class projects
• Language definitions are modular
  • Feature syntax is defined with one module per feature or feature group (all arithmetic operations may be in one module, for instance)
  • Infrastructure operations that modify the underlying state are generally separate from semantics
  • Semantics rules are given per feature, with different modules for different features and different semantics (evaluation vs. typing)
• Modular definitions should allow for some feature reuse and should make languages easier to experiment with; they also allow definitions to scale to large, complex languages
• Actively investigating ways to improve modularity, allowing more flexible module systems to be developed for defining languages
• Investigating improved tool support; tools will allow languages to be assembled from components, executions to be animated to improve understanding of semantics, and unit tests to be specified to allow for testing of languages and language features
• Investigating integration with theorem provers, with a focus on standard language-theoretic proofs (progress and preservation proofs in type systems, for instance)

Java: Language and Bytecode Analysis
• Java definition aimed at supporting analysis and verification of Java programs
• Supports most language features as of JDK 1.4, including concurrency features
• Does not yet support 1.5-specific features such as generics
• Support provided for both the Java language and JVM bytecode
• Limitation: the language layer works on source code, meaning library source must be available
• Extending the language-layer definition to full support of JDK 1.4 and Java 5 is an ongoing project

Beta: Patterns and Experimentation
• Beta definition aimed at program execution and language experimentation
• Supports most Beta language features, including virtual patterns and alternation
• Still under active development, with the goal of supporting the entire language, including fragments
• Provides basic support for analysis of concurrent programs, including limited model checking support
• Includes an experimental extension for super calls, to complement the existing use of inner calls
• Existing implementation makes use of the Maude parser; moving to a more standard parser, allowing existing Beta programs to be parsed without modification

Related Publications
• M. Hills and G. Rosu. KOOL: An Application of Rewriting Logic to Language Prototyping and Analysis. In Proceedings of RTA'07, LNCS 4533, pp 246-256.
• M. Hills and G. Rosu. On Formal Analysis of OO Languages using Rewriting Logic: Designing for Performance. In Proceedings of FMOODS'07, LNCS 4468, pp 107-121.
• J. Meseguer and G. Rosu. The Rewriting Logic Semantics Project. J. of TCS, Volume 373(3), pp 213-227, 2007.
• M. Hills, T. Serbanuta and G. Rosu. A Rewrite Framework for Language Definitions and for Generation of Efficient Interpreters. In Proceedings of WRLA'06, ENTCS 176(4), pp 215-231.
• G. Rosu. K: a Rewrite Logic Framework for Language Design, Semantics, Analysis and Implementation. Tech Report UIUCDCS-R-2006-2802, Department of Computer Science, UIUC, 2006.
• F. Chen, M. Hills and G. Rosu. A Rewrite Logic Approach to Semantic Definition, Design and Analysis of Object-Oriented Languages. Tech Report UIUCDCS-R-2006-2702, Department of Computer Science, UIUC, 2006.
• A. Farzan, J. Meseguer and G. Rosu. Formal JVM Code Analysis in JavaFAN. In Proceedings of AMAST'04, LNCS 3116, pp 132-147.
• A. Farzan, F. Chen, J. Meseguer and G. Rosu. Formal Analysis of Java Programs in JavaFAN. In Proceedings of CAV'04, LNCS 3114, pp 501-505.
For Further Information
• The FSL website has all language definitions and related publications: http://fsl.cs.uiuc.edu
• KOOL definition at http://fsl.cs.uiuc.edu/kool
• Java definition at http://fsl.cs.uiuc.edu/java
• Beta definition at http://fsl.cs.uiuc.edu/beta
• Classroom notes on using this technique are available at http://fsl.cs.uiuc.edu/index.php/Grigore_Rosu#classes
• Contact Mark Hills (mhills@cs.uiuc.edu) or Grigore Rosu (grosu@cs.uiuc.edu) (we're both here at OOPSLA!)