Transformation of C code to Matlab/Simulink models
Approach based on parsing
Vilmos Zsombori
4.01.2005, Shanghai
Motivation
• Considerations:
  • The amount of source code to be transformed
  • Transformation patterns
  • Existence of third-party parser generators that automatically generate a customized parser from a user-defined grammar
• Prerequisites of automatic transformation:
  • Customized lexical analyzer
  • Customized syntactic analyzer
  • Implementation of output actions
• Benefits:
  • The tool would greatly reduce the transformation time
  • The automatic transformation is not susceptible to "human" errors
  • The actions are attached to language structures; consequently the transformation is uniform, based on patterns and modeling guidelines
  • In case of library upgrades, name changes or pattern changes, updating the model requires far less time than the manual approach
Approach
Key concept
• Based on lexical and syntactic analysis
• Automatic parser generation
• Action specification according to the transformation patterns and the modeling guidelines
General architecture
• cpp.jj: grammar specification, parser specification, action specification
• Synthesis (JavaCC)
• CPPParser.java and C2Model.java: customized parser and code generator
Lexical and Syntactic analysis (I)
Lexical analysis (tokens: regular expressions)
    SKIP : { " " | "\n" | <"/*" … "*/"> }
    TOKEN : {
        <#DECIMAL_LITERAL: ["1"-"9"] (["0"-"9"])*>
      | <STRING_LITERAL: "\"" … "\"">
      …
    }
    TOKEN : {
        <CONTINUE: "continue"> | <VOLATILE: "volatile"> | <TYPEDEF: "typedef"> | <IF: "if"> | <DO: "do"> …
    }
    TOKEN : {
        <IDENTIFIER: <LETTER> (<LETTER> | <DIGIT>)*>
      | <#LETTER: ["$","A"-"Z","_","a"-"z"]>
      | <#DIGIT: ["0"-"9"]>
    }
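To make the token definitions above concrete, here is a minimal sketch that imitates them with java.util.regex instead of a JavaCC-generated token manager. The class name TokenSketch, the kind labels and the catch-all OTHER kind are assumptions made for illustration; only the five keywords listed on the slide are included, and the word boundaries stand in for JavaCC's longest-match rule.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration class; it is not one of the files JavaCC generates.
class TokenSketch {
    // One alternative per token kind, tried in this order at each position.
    // Keywords carry \b so that "iffy" is read as an identifier, not as IF.
    private static final Pattern TOKEN = Pattern.compile(
          "(?<SKIP>\\s+|/\\*.*?\\*/)"
        + "|(?<KEYWORD>\\b(?:continue|volatile|typedef|if|do)\\b)"
        + "|(?<IDENTIFIER>[$A-Za-z_][$A-Za-z_0-9]*)"
        + "|(?<DECIMAL>[1-9][0-9]*)"
        + "|(?<OTHER>.)",          // assumption: any other character is its own token
        Pattern.DOTALL);

    private static final String[] KINDS = {"KEYWORD", "IDENTIFIER", "DECIMAL", "OTHER"};

    // Returns the tokens as "KIND:text"; whitespace and comments are skipped.
    static List<String> tokenize(String source) {
        List<String> tokens = new ArrayList<>();
        Matcher matcher = TOKEN.matcher(source);
        while (matcher.find()) {
            if (matcher.group("SKIP") != null) {
                continue;
            }
            for (String kind : KINDS) {
                if (matcher.group(kind) != null) {
                    tokens.add(kind + ":" + matcher.group());
                    break;
                }
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("if (a1) /* c */ do 42"));
    }
}
```

Because the OTHER alternative accepts any single character, every position in the input is matched by exactly one alternative, so nothing is silently dropped except what SKIP covers.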
Lexical and Syntactic analysis (II)
Syntactic analysis (C grammar)
    translation_unit
        : external_declaration
        | translation_unit external_declaration
        ;
    external_declaration
        : function_definition
        | declaration
        ;
    function_definition
        : declaration_specifiers declarator declaration_list compound_statement
        | declaration_specifiers declarator compound_statement
        | declarator declaration_list compound_statement
        | declarator compound_statement
        ;
    selection_statement
        : IF '(' expression ')' statement
        | IF '(' expression ')' statement ELSE statement
        | SWITCH '(' expression ')' statement
        ;
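The rules above are written in the left-recursive style used by Yacc. JavaCC generates top-down parsers, which cannot use left recursion directly, so such rules are typically rewritten as loops and the two IF alternatives are merged into one with an optional ELSE. The following hand-written sketch shows that shape for a heavily simplified subset; the class SelectionSketch, the single-token "expression" and the "name ;" statement are assumptions made for illustration, not the grammar in cpp.jj.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical recursive-descent recognizer for a tiny subset of the grammar.
class SelectionSketch {
    private final List<String> tokens;
    private int pos = 0;

    SelectionSketch(String... tokens) {
        this.tokens = Arrays.asList(tokens);
    }

    private String peek() {
        return pos < tokens.size() ? tokens.get(pos) : "<EOF>";
    }

    private String next() {
        String token = peek();
        pos++;
        return token;
    }

    private void expect(String expected) {
        String actual = next();
        if (!actual.equals(expected)) {
            throw new IllegalStateException("expected " + expected + " but found " + actual);
        }
    }

    // The left-recursive list rule becomes a loop over its items.
    String statements() {
        StringBuilder out = new StringBuilder();
        while (pos < tokens.size()) {
            out.append(statement());
        }
        return out.toString();
    }

    // statement: selection_statement | NAME ';'   (simplified)
    String statement() {
        if (peek().equals("if")) {
            return selection();
        }
        String name = next();
        expect(";");
        return name + ";";
    }

    // IF '(' expression ')' statement [ ELSE statement ]
    String selection() {
        expect("if");
        expect("(");
        String condition = next();   // simplified: the expression is one token
        expect(")");
        String thenPart = statement();
        if (peek().equals("else")) {
            pos++;                   // an ELSE binds to the nearest open IF
            return "if(" + condition + "){" + thenPart + "}else{" + statement() + "}";
        }
        return "if(" + condition + "){" + thenPart + "}";
    }

    public static void main(String[] args) {
        System.out.println(
            new SelectionSketch("if", "(", "a", ")", "x", ";", "else", "y", ";").statements());
    }
}
```

The returned string is only a bracketed trace of the recognized structure; a real parser would build tree nodes at the same points.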
Parser generation (I)
• Based on lexical and syntactic descriptions
• Available parser generators:
  • Lex + Yacc
    • Generates C code for the parser
    • Tool freely available for Linux but not for Windows
    • Needs a C compiler (gcc)
  • JavaCC
    • Generates Java code for the parser
    • Needs a Java compiler, and a Java Virtual Machine at runtime
    • Both the JDK and JavaCC are freely available
    • The generated parser is platform independent
• The current work is based on JavaCC
Parser generation (II)
How JavaCC works
• Input: JavaCC source cpp.jj, processed by the JavaCC compiler
• Lexical analyzer: CPPParserTokens.java, CPPParserConstants.java, TokenManagerError.java
• Syntax analyzer: CPPParser.java, ParseException.java
Action specification (I)
• The actions are attached to the syntactic structures
• Based on:
  • Transformation patterns:
    • Function: void sign_init(void) { … }
    • If-then-else structure: if ( expression ) { statement1 } else { statement2 }
  • Modeling guidelines
    • Data flow, naming, colors, …
Action specification (II)
• Block construction is possible through the Matlab script language, e.g. for the if-then-else pattern:
    add_block(…) { blocks for the logical expression }
    add_block('built-in/Subsystem', '{prefix}/statement1')
    add_block('built-in/Subsystem', '{prefix}/statement2')
    add_block('built-in/Logical Operator', '…/not', 'Operation', 'NOT')
    add_line('{prefix}', '{logical exprn out}/1', 'statement1/Enable')
    add_line('{prefix}', '{logical exprn out}/1', 'not/1')
    add_line('{prefix}', 'not/1', 'statement2/Enable')
• Implementation of the actions:
  • Java class hierarchy: Node class subtypes
  • Each class implements the process() method in a specific way, wrapping the output action; e.g. EqualityExpression:
    String process(PrintWriter outputStream, String prefix) {
        String rightHandSide = getChild(0).process(outputStream, prefix); // process the right-hand side
        String leftHandSide = getChild(1).process(outputStream, prefix);  // process the left-hand side
        // create the comparison block
        outputStream.println("add_block('built-in/Relational Operator', '" + prefix + "/eq', 'Operator', '==')");
        // interconnect the blocks
        outputStream.println("add_line('" + prefix + "', '" + rightHandSide + "/1', 'eq/1')");
        outputStream.println("add_line('" + prefix + "', '" + leftHandSide + "/1', 'eq/2')");
        return "eq/1"; // return the output port
    }
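The process() idea can be tried out in isolation. The sketch below is a trimmed-down stand-in, not the C2Model sources: the ActionSketch wrapper, the Identifier leaf (which assumes that a variable being read becomes an Inport block) and the emit() helper are hypothetical, and in this sketch process() returns a complete output port such as a/1, which the caller wires up unchanged.

```java
import java.io.PrintWriter;
import java.io.StringWriter;
import java.util.ArrayList;
import java.util.List;

// Hypothetical, trimmed-down stand-in for a Node hierarchy with output actions.
class ActionSketch {

    abstract static class Node {
        private final List<Node> children = new ArrayList<>();

        Node add(Node child) {
            children.add(child);
            return this;
        }

        Node getChild(int index) {
            return children.get(index);
        }

        // Emits the Matlab commands for this subtree and returns its output port.
        abstract String process(PrintWriter out, String prefix);
    }

    // Assumption: a variable that is read becomes an Inport block named after it.
    static class Identifier extends Node {
        private final String name;

        Identifier(String name) {
            this.name = name;
        }

        @Override
        String process(PrintWriter out, String prefix) {
            out.println("add_block('built-in/Inport', '" + prefix + "/" + name + "')");
            return name + "/1";
        }
    }

    static class EqualityExpression extends Node {
        @Override
        String process(PrintWriter out, String prefix) {
            // Children first, so their blocks exist before they are wired up.
            String firstPort = getChild(0).process(out, prefix);
            String secondPort = getChild(1).process(out, prefix);
            out.println("add_block('built-in/Relational Operator', '" + prefix + "/eq', 'Operator', '==')");
            out.println("add_line('" + prefix + "', '" + firstPort + "', 'eq/1')");
            out.println("add_line('" + prefix + "', '" + secondPort + "', 'eq/2')");
            return "eq/1";
        }
    }

    // Runs the actions of a tree and collects the generated script as text.
    static String emit(Node root, String prefix) {
        StringWriter buffer = new StringWriter();
        PrintWriter out = new PrintWriter(buffer);
        root.process(out, prefix);
        out.flush();
        return buffer.toString();
    }

    public static void main(String[] args) {
        Node equality = new EqualityExpression().add(new Identifier("a")).add(new Identifier("b"));
        System.out.print(emit(equality, "test/fcn3"));
    }
}
```

Each node only knows how to emit its own block and how to connect its children, which is what keeps the transformation uniform across the whole source file.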
The translation process
Process overview
    sign.c (source code) -> C2Model.jar (customized parser and code generator) -> sign.m (Matlab script) -> Matlab -> sign.mdl (Simulink model)
C2Model.jar: customized parser and code generator
• Builds up:
  • a parse tree from the source code, according to the lexical/syntactic definitions
    • consists of Node class subtypes
    • the nodes wrap the appropriate output actions
  • symbol tables, attached to Scopes:
    • type, variable, function and port tables
    • efficient: implemented with hash maps
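One plausible way to attach hash-map symbol tables to scopes is sketched below. It is an illustration under assumptions, not the C2Model implementation: the names ScopeSketch, Scope and Table, the string-valued entries, and the rule that a lookup falls back to the enclosing scope are all hypothetical.

```java
import java.util.EnumMap;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of per-scope symbol tables backed by hash maps.
class ScopeSketch {

    // The four table kinds named on the slide.
    enum Table { TYPE, VARIABLE, FUNCTION, PORT }

    static class Scope {
        private final Scope parent;   // null for the outermost scope
        private final Map<Table, Map<String, String>> tables = new EnumMap<>(Table.class);

        Scope(Scope parent) {
            this.parent = parent;
            for (Table table : Table.values()) {
                tables.put(table, new HashMap<>());
            }
        }

        // Records a symbol in this scope; 'info' is a free-form description here.
        void declare(Table table, String name, String info) {
            tables.get(table).put(name, info);
        }

        // Assumption: an unknown name is searched for in the enclosing scopes.
        String lookup(Table table, String name) {
            for (Scope scope = this; scope != null; scope = scope.parent) {
                String info = scope.tables.get(table).get(name);
                if (info != null) {
                    return info;
                }
            }
            return null;
        }
    }

    public static void main(String[] args) {
        Scope global = new Scope(null);
        global.declare(Table.VARIABLE, "a", "int");
        Scope function = new Scope(global);
        function.declare(Table.PORT, "u", "Inport");
        System.out.println(function.lookup(Table.VARIABLE, "a"));
        System.out.println(function.lookup(Table.PORT, "u"));
    }
}
```

Each lookup costs one hash access per enclosing scope, so the cost depends on nesting depth rather than on the number of symbols.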
The translation process: example
Source C file (test.c) -> C2Model.jar -> Parse tree
Source C file (test.c):
    typedef struct { int i, j; } stru;
    void fcn1(void) { a = b; }
    int fcn2(int u, int v) {
        if ((a<b) && (c+d <= e))
            stru.i = u + v;
    }
    int fcn3(int h) {
        if (a == b) { u = c + d; }
        else { v = r + t; }
    }
Parse tree:
    TranslationUnit [4]
      …
      ExternalDeclaration [1]
        FunctionDefinition [3]
          DeclarationSpecifiers [1]
            TypeSpecifier [0]
          Declarator [1]
            DirectDeclarator [1] token: fcn1
              …
          CompoundStatement [1]
            StatementList [1]
              Statement [1]
                ExpressionStatement [1]
                  Expression [1]
                    AssignmentExpression [3]
                      UnaryExpression [1]
                        PostfixExpression [1]
                          PrimaryExpression [0] token: a
                      AssignmentOperator [0]
                      AssignmentExpression [1]
                        …
                          PrimaryExpression [0] token: b
      …
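The "Name [n]" notation in the parse tree lists each node with its number of children. A small sketch of a tree that prints itself this way follows; ParseTreeSketch and TreeNode are hypothetical names, and the right-hand side of the assignment is collapsed to a single level because the intermediate nodes are elided on the slide.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of a parse tree that prints "Kind [childCount]" per node.
class ParseTreeSketch {

    static class TreeNode {
        private final String kind;
        private final String token;   // null unless the node carries a token
        private final List<TreeNode> children = new ArrayList<>();

        TreeNode(String kind) {
            this(kind, null);
        }

        TreeNode(String kind, String token) {
            this.kind = kind;
            this.token = token;
        }

        TreeNode add(TreeNode child) {
            children.add(child);
            return this;
        }

        String dump() {
            StringBuilder out = new StringBuilder();
            dump(out, 0);
            return out.toString();
        }

        // Depth-first walk; two spaces of indentation per tree level.
        private void dump(StringBuilder out, int depth) {
            for (int i = 0; i < depth; i++) {
                out.append("  ");
            }
            out.append(kind).append(" [").append(children.size()).append("]");
            if (token != null) {
                out.append(" token: ").append(token);
            }
            out.append('\n');
            for (TreeNode child : children) {
                child.dump(out, depth + 1);
            }
        }
    }

    // Builds the subtree for "target = source" (right-hand side simplified).
    static TreeNode assignment(String target, String source) {
        return new TreeNode("AssignmentExpression")
            .add(new TreeNode("UnaryExpression")
                .add(new TreeNode("PostfixExpression")
                    .add(new TreeNode("PrimaryExpression", target))))
            .add(new TreeNode("AssignmentOperator"))
            .add(new TreeNode("AssignmentExpression")
                .add(new TreeNode("PrimaryExpression", source)));
    }

    public static void main(String[] args) {
        System.out.print(assignment("a", "b").dump());
    }
}
```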
The translation process: example (ctd.)
Parse tree -> C2Model.jar -> Matlab script (test.m) -> Matlab -> Simulink model
Matlab script (test.m):
    new_system('test')
    add_block('built-in/Subsystem','test/fcn1')
    add_block('built-in/Outport','test/fcn1/a')
    set_param('test/fcn1/a', …)
    add_block('built-in/Inport','test/fcn1/b')
    set_param('test/fcn1/b', …)
    add_line('test/fcn1', 'b/1', 'a/1')
    …
    save_system('test')
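For the simplest case, a function whose body is a single assignment such as fcn1, the shape of the generated script can be reproduced with a few lines of string building. This is only a sketch: ScriptSketch and assignmentModel are hypothetical names, and the set_param calls are left out because their arguments are elided in the script above.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: builds the script lines for "void f(void) { target = source; }".
class ScriptSketch {

    static List<String> assignmentModel(String model, String function, String target, String source) {
        String subsystem = model + "/" + function;
        List<String> script = new ArrayList<>();
        script.add("new_system('" + model + "')");
        // One subsystem per C function.
        script.add("add_block('built-in/Subsystem','" + subsystem + "')");
        // The written variable becomes an Outport, the read variable an Inport.
        script.add("add_block('built-in/Outport','" + subsystem + "/" + target + "')");
        script.add("add_block('built-in/Inport','" + subsystem + "/" + source + "')");
        // The assignment itself is a line from the source port to the target port.
        script.add("add_line('" + subsystem + "', '" + source + "/1', '" + target + "/1')");
        script.add("save_system('" + model + "')");
        return script;
    }

    public static void main(String[] args) {
        for (String line : assignmentModel("test", "fcn1", "a", "b")) {
            System.out.println(line);
        }
    }
}
```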
Achievements and open questions
• The lexical and syntactic definitions are complete
• The symbol tables for each scope are constructed correctly
  • Type, variable, function and port tables
• The actions are implemented for:
  • All binary and n-ary operations
  • Assignment and conditional expressions
  • Selection statement, jump statement, iteration statement
• Bus systems are created for structures
• Bus selectors are used for operations on structure members
• Operational on the entire sign.c
• Open questions:
  • Handling of function calls
  • Handling of arrays
Conclusion
• A new automated translation method has been explored, based on parsing and a third-party compiler generator, which transforms source code written in C/C++ into Matlab/Simulink models.
• This yields fast and error-free operation, and has proven capable of handling large amounts of source code without human intervention.
• Although there are some issues concerning the organization of the output models (localization and positioning), results at this stage are encouraging.
• By switching the implemented actions, the parser can be adapted to any "bit-by-bit transformation".
• However, the focus of this approach is the pure source code, not the logic and the functionality.
• This is why it cannot address the simplification and re-engineering issues, which are among the essential objectives of the entire project.
• Therefore the developed tool can only assist the transformation work.