The TXL Programming Language (2)
Mariano Ceccato
ITC-Irst, Istituto per la ricerca Scientifica e Tecnologica
ceccato@itc.it

The three phases of TXL
Input text ----> Parse ----> Parse tree ----> Transform ----> Transformed parse tree ----> Unparse ----> Output text
Example:
• Input text: "blue fish"
• Parse tree: [words] ( [word] blue, [words] ( [word] fish, [empty] ) )
• Transformed parse tree: [words] ( [word] marlin, [empty] )
• Output text: "marlin"

Anatomy of a TXL program
• Base grammar: defines the lexical forms (tokens or terminals) and the syntactic forms (non-terminals).
• Grammar overrides (optional): extend or modify non-terminals of the base grammar.
• Transformation rules: the ruleset defines the set of transformation rules and functions.

Anatomy of a TXL program
Example:
• Base grammar (the Expr grammar):
    include "Expr.Grammar"
• Grammar overrides, written inline:
    redefine expr
        … | exp([number], [number])
  or included from a file:
    include "Expr-exp.Grammar"
• Transformation rules:
    rule main
    rule one
    rule two

Specifying Lexical Forms
• Lexical forms specify how the input is partitioned into tokens.
• Predefined defaults include identifiers [id] (e.g. ABC, rt789), integer and float numbers [number] (e.g. 123, 123.23, 3e22), and strings [string] (e.g. "hi there").
• The tokens statement gives a regular expression for each class of token in the input language.
Example:
    tokens
        hexnumber "0[xX][\dABCDEFabcdef]+"
    end tokens

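To see which strings the hexnumber pattern accepts, the same expression can be tried in Python. This is only a sketch: it assumes that Python's `re` syntax reads this particular pattern the same way TXL does, and the helper names `is_hexnumber` and `find_hexnumbers` are invented for the illustration.

```python
import re

# Assumed Python counterpart of the TXL pattern "0[xX][\dABCDEFabcdef]+".
HEXNUMBER = re.compile(r"0[xX][\dABCDEFabcdef]+")


def is_hexnumber(text):
    """Return True when the whole string is a single hexnumber token."""
    return HEXNUMBER.fullmatch(text) is not None


def find_hexnumbers(text):
    """Return every hexnumber token found in the text, left to right."""
    return HEXNUMBER.findall(text)


if __name__ == "__main__":
    print(is_hexnumber("0x1F"))
    print(find_hexnumbers("a 0x1F b 0Xab"))
```

The `+` operator requires at least one hexadecimal digit after the prefix, so a bare `0x` is not a hexnumber under this pattern.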
Specifying Lexical Forms (cont'd)
    tokens
        name "regular expression"
    end tokens
Regular expression:
• Any single character (other than [ and ]) that is not preceded by a \ or # simply represents itself.
• Single-character patterns, e.g. \d (a digit), \a (an alphabetic character).
• Regular expression operators: [PQR] (any one of), (PQR) (sequence of), P*, P+, P?.

Specifying Lexical Forms (cont'd)
    keys
        procedure repeat 'program
    end keys
    compounds
        := >= <=
    end compounds
    comments
        /* */
        //
    end comments
• The keys statement specifies that certain identifiers are to be treated as unique special symbols.
• The compounds statement specifies character sequences to be treated as a single terminal.
• The comments statement specifies the commenting conventions of the input language. By default, comments are ignored by TXL.

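The combined effect of the three statements can be sketched with a small Python tokenizer. It is not how TXL is implemented; it only assumes that keys turn matching identifiers into a separate token class, that compounds are matched before single characters, and that comments are skipped. The names `KEYS`, `COMPOUNDS` and `tokenize` are invented for the illustration.

```python
import re

KEYS = {"procedure", "repeat", "program"}   # from the keys statement
COMPOUNDS = [":=", ">=", "<="]              # from the compounds statement

TOKEN = re.compile(
    r"(?P<comment>/\*.*?\*/|//[^\n]*)"      # /* ... */ and // ... comments
    r"|(?P<compound>" + "|".join(re.escape(c) for c in COMPOUNDS) + r")"
    r"|(?P<number>\d+)"
    r"|(?P<id>[A-Za-z_]\w*)"
    r"|(?P<space>\s+)"
    r"|(?P<char>.)",                        # any other single character
    re.DOTALL,
)


def tokenize(text):
    """Return (kind, value) pairs, skipping comments and white space."""
    tokens = []
    for match in TOKEN.finditer(text):
        kind, value = match.lastgroup, match.group()
        if kind in ("comment", "space"):
            continue                        # comments are ignored by default
        if kind == "id" and value in KEYS:
            kind = "key"                    # keys are not ordinary identifiers
        tokens.append((kind, value))
    return tokens


if __name__ == "__main__":
    print(tokenize("x := y >= 3 // note\nrepeat /* c */ z"))
```

Because the compound alternatives come before the single-character fallback, `:=` arrives as one terminal rather than as `:` followed by `=`.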
Specifying Syntactic Forms
• The general form of a non-terminal definition is:
    define name
        alternative1
      | alternative2
        …
      | alternativeN
    end define
• Each alternative is any sequence of terminals and non-terminals (non-terminals are enclosed in square brackets).
• The special type [program] describes the structure of the entire input.

Specifying Syntactic Forms (cont'd)
• Extended BNF-like sequence notation:
    [repeat X]    sequence of zero or more (X*)
    [list X]      comma-separated list
    [opt X]       optional (zero or one)
• The recursive definition
    define statements
        [statement]
      | [statement] [statements]
    end define
  … is equivalent to
    define statements
        [repeat statement]
    end define
  except that [repeat statement] also accepts the empty sequence; the exact one-or-more counterpart is [repeat statement+].

Specifying Syntactic Forms (cont'd)
    keys
        procedure begin 'end int bool
    end keys
    define proc
        'procedure [id] [formalParameters]
        'begin
            [body]
        'end
    end define
    define formalParameters
        '( [list formalParameter+] ')
      | [empty]
    end define
    define formalParameter
        [id] ': [type]
    end define
    define type
        'int | 'bool
    end define

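Ambiguity
• TXL resolves ambiguities by choosing the first alternative of each non-terminal that can match the input.
Example: T-language
    define T
        [number]
      | ([T])
      | + [T]
      | + + [T]
    end define
The input ++2 has two possible parse trees:
    T ( + T ( + T ( 2 ) ) )     using the alternative + [T] twice
    T ( ++ T ( 2 ) )            using the alternative + + [T]
Since + [T] is listed before + + [T], the first tree is the one chosen.
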
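The first-alternative policy can be sketched as a small ordered-choice parser in Python. It is a simplification, not TXL's parser: it has no full backtracking, it treats every `+` as a separate token, and the names `tokenize`, `parse_T` and `parse` are invented for the illustration.

```python
def tokenize(text):
    """Split the input into single-character symbols and multi-digit numbers."""
    tokens, i = [], 0
    while i < len(text):
        if text[i].isspace():
            i += 1
        elif text[i].isdigit():
            j = i
            while j < len(text) and text[j].isdigit():
                j += 1
            tokens.append(text[i:j])
            i = j
        else:
            tokens.append(text[i])
            i += 1
    return tokens


def parse_T(tokens, pos=0):
    """Try the alternatives of T in order; return (tree, next_pos) or None."""
    # Alternative 1: [number]
    if pos < len(tokens) and tokens[pos].isdigit():
        return ("T", tokens[pos]), pos + 1
    # Alternative 2: ( [T] )
    if pos < len(tokens) and tokens[pos] == "(":
        inner = parse_T(tokens, pos + 1)
        if inner and inner[1] < len(tokens) and tokens[inner[1]] == ")":
            return ("T", "(", inner[0], ")"), inner[1] + 1
    # Alternative 3: + [T]
    if pos < len(tokens) and tokens[pos] == "+":
        inner = parse_T(tokens, pos + 1)
        if inner:
            return ("T", "+", inner[0]), inner[1]
    # Alternative 4: + + [T], reached only when alternative 3 has failed
    if tokens[pos:pos + 2] == ["+", "+"]:
        inner = parse_T(tokens, pos + 2)
        if inner:
            return ("T", "+", "+", inner[0]), inner[1]
    return None


def parse(text):
    """Parse the whole input as a T, or return None if it is not one."""
    tokens = tokenize(text)
    result = parse_T(tokens)
    if result is None or result[1] != len(tokens):
        return None
    return result[0]


if __name__ == "__main__":
    print(parse("++2"))
```

For `++2` the sketch builds the nested `+ [T]` tree; in this grammar the fourth alternative can never win, because whenever it could match, the third one matches first.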
Transformation rules
• TXL has two kinds of transformation rules, rules and functions, which are distinguished by whether they transform only one occurrence of their pattern (functions) or many (rules).
• Rules search their scope for the first instance of their target type that matches their pattern, transform it, and then reapply to the entire scope until no more matches are found.
• Functions do not search; they attempt to match their entire scope against their pattern, and transform it if it matches.

Rules and functions
    function 2To42
        replace [number]
            2
        by
            42
    end function
  2 ----> 42
  3 2 6 2 78 4 2 (no match, unchanged)
Rules search for the pattern!
    rule 2To42
        replace [number]
            2
        by
            42
    end rule
  2 ----> 42
  3 2 6 2 78 4 2 ----> 3 42 6 42 78 4 42

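The difference between the two can be modelled in Python, with a scope represented either as a single number or as a list of numbers. This is only a sketch of the behaviour described on the slides, not of TXL's tree matching, and the names `function_2to42` and `rule_2to42` are invented for the illustration.

```python
def function_2to42(scope):
    """TXL-style function: match the whole scope only, without searching."""
    return 42 if scope == 2 else scope


def rule_2to42(scope):
    """TXL-style rule: rewrite the first match, then search again until none is left."""
    if not isinstance(scope, list):
        return function_2to42(scope)
    result = list(scope)
    while 2 in result:                      # search the scope for a match
        result[result.index(2)] = 42        # rewrite the first one found
    return result


if __name__ == "__main__":
    numbers = [3, 2, 6, 2, 78, 4, 2]
    print(function_2to42(2))
    print(function_2to42(numbers))
    print(rule_2to42(numbers))
```

The function leaves the sequence untouched because the sequence as a whole is not the number 2, while the rule keeps searching until every 2 has been replaced.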
Searching functions
    function 2To42
        replace * [number]
            2
        by
            42
    end function
  2 ----> 42
  3 2 6 2 78 4 2 ----> 3 42 6 2 78 4 2
Note: the only change is the * after replace. A searching function finds the first match in its scope and transforms it once.

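A searching function sits between a plain function and a rule: it searches, but rewrites only once. A minimal Python model, assuming the scope is a single number or a list of numbers (the name `searching_function_2to42` is invented for the illustration):

```python
def searching_function_2to42(scope):
    """Model of 'replace * [number]': rewrite the first match in the scope, once."""
    if not isinstance(scope, list):
        return 42 if scope == 2 else scope
    result = list(scope)
    if 2 in result:
        result[result.index(2)] = 42        # first match only, no reapplication
    return result


if __name__ == "__main__":
    print(searching_function_2to42([3, 2, 6, 2, 78, 4, 2]))
```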
Syntax of rules and functions
Simplified, and given in TXL itself:
    'rule [ruleid] [repeat formalArgument]
        [repeat construct_deconstruct_where]
        'replace [type]
            [pattern]
        [repeat construct_deconstruct_where]
        'by
            [replacement]
    'end rule
The same holds for functions!
N.B. If the where-condition is false, the rule cannot be applied and the result is the unchanged input AST.

Built-in functions
    rule resolveAdd
        replace [expr]
            N1 [number] + N2 [number]
        by
            N1 [add N2]
    end rule
    function add
        …
    end function
… is equivalent to the version that uses the built-in function [+]:
    rule resolveAdd
        replace [expr]
            N1 [number] + N2 [number]
        by
            N1 [+ N2]
    end rule

Built-in functions (cont'd)
    rule sort
        replace [repeat number]
            N1 [number] N2 [number] Rest [repeat number]
        where
            N1 [> N2]
        by
            N2 N1 Rest
    end rule
  22 4 2 15 1 ----> … ----> 1 2 4 15 22

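The sort rule amounts to repeatedly swapping an adjacent pair that is out of order until no such pair is left. A Python sketch of that rewriting loop, assuming the rule always picks the leftmost out-of-order pair (the name `sort_rule` is invented for the illustration):

```python
def sort_rule(numbers):
    """Swap the first out-of-order adjacent pair, then search again from the start."""
    result = list(numbers)
    while True:
        for i in range(len(result) - 1):
            if result[i] > result[i + 1]:                            # where N1 [> N2]
                result[i], result[i + 1] = result[i + 1], result[i]  # by N2 N1 Rest
                break                                                # reapply to the whole scope
        else:
            return result                                            # no match left: the rule stops


if __name__ == "__main__":
    print(sort_rule([22, 4, 2, 15, 1]))
```

Every swap removes one inversion, so the loop always terminates, and it stops exactly when the sequence is in ascending order.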
Recursive functions
    function fact
        replace [number]
            n [number]
        construct nMinusOne [number]
            n [- 1]
        where
            n [> 1]
        construct factMinusOne [number]
            nMinusOne [fact]
        by
            n [* factMinusOne]
    end function

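The same recursion written in Python makes the role of the where-condition explicit: when it fails, the function does not apply and its input comes back unchanged, which is what ends the recursion. This is a sketch that mirrors the TXL function step by step; the Python names are chosen for the illustration.

```python
def fact(n):
    """Mirror of the TXL function: a failed where-condition returns the input unchanged."""
    n_minus_one = n - 1                     # construct nMinusOne: n [- 1]
    if not n > 1:                           # where n [> 1]
        return n                            # the function does not apply
    fact_minus_one = fact(n_minus_one)      # construct factMinusOne: nMinusOne [fact]
    return n * fact_minus_one               # by n [* factMinusOne]


if __name__ == "__main__":
    print(fact(5))
```

One consequence of this design is that an input of 0 is returned as 0 rather than 1, because the condition fails and nothing is rewritten.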
Using rule parameters
    rule resolveConstants
        replace [repeat statement]
            'const C [id] = V [expr]
            RestOfScope [repeat statement]
        by
            RestOfScope [replaceByValue C V]
    end rule
    rule replaceByValue ConstName [id] Value [expr]
        replace [primary]
            ConstName
        by
            (Value)
    end rule
Example:
    Const Pi = 3.14;
    Area := r*r*Pi;
  ---->
    Area := r*r*3.14;

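How the two rules cooperate can be sketched in Python, with each statement modelled as a list of tokens. The representation and the names `replace_by_value` and `resolve_constants` are assumptions made for the illustration; statement terminators are left out, and the value is substituted in parentheses because the replacement on the slide is (Value).

```python
def replace_by_value(statements, const_name, value):
    """Replace every occurrence of the constant name by its parenthesized value."""
    return [
        ["(" + value + ")" if token == const_name else token for token in statement]
        for statement in statements
    ]


def resolve_constants(statements):
    """Drop each 'const C = V' declaration and substitute V for C in what follows it."""
    result = list(statements)
    i = 0
    while i < len(result):
        statement = result[i]
        if len(statement) == 4 and statement[0] == "const" and statement[2] == "=":
            name, value = statement[1], statement[3]
            # The declaration is removed; only the statements after it are rewritten.
            result[i:] = replace_by_value(result[i + 1:], name, value)
        else:
            i += 1
    return result


if __name__ == "__main__":
    program = [
        ["const", "Pi", "=", "3.14"],
        ["Area", ":=", "r", "*", "r", "*", "Pi"],
    ]
    print(resolve_constants(program))
```

Passing the constant's name and value as parameters is what lets the inner substitution be reused for every declaration found by the outer rule.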
Exercises
• Implement the T-language (see the Ambiguity slide).
• Implement Calculator.txl.
• Add the exponential, i.e. Exp(x, n), to the expr grammar, and compute it:
  - syntactically, e.g. Exp(2, 3) ----> 2*2*2
  - semantically, by means of a recursive function that replaces Exp(x, n) with its value.

Homework
• Implement a simple version of a "commands language" in which a command can be:
  - an assignment, i.e. [id] := [expr];
  - a declaration, i.e. const [id] = [number];
• Implement transformation rules (see the Using rule parameters slide) that replace the identifiers in assignments with their values.
Example:
    Const Pi = 3.14;
    Area := r*r*Pi;
  ---->
    Area := r*r*3.14;