Languages and Compiler Design II
Program Interpretation
Material provided by Prof. Jingke Li
Stolen with pride and modified by Herb Mayer
PSU Spring 2010, rev.: 5/13/2010
CS322
Agenda
• Interpreters
• Low-Level Interpretation
• MINI Data Structure
• Fetch-Execute Cycle
• MINI Function Calls
• Interpretation Issues
• Program Loading
• Simulated Memory Cell
• Simulated Stack
• Simulate Arithmetic
Interpreters
Definition: An interpreter reads a source program and executes it directly.
• Advantages:
  • Easier to write than a compiler
  • Portable, and “executes” programs before the HW is available
• Drawback:
  • Slower than running compiled code
Levels of Interpretation
• Low-Level Language (including IR code) interpretation:
  • Input is a sequence of simple instructions
  • Single name-space, although functions may have local data
• High-Level Language interpretation:
  • Program structure can be complex: declarations, data types, modules, exceptions, etc.
  • Multiple name-spaces, with variables
Interpreters, Cont’d
• Compiler:
  • Front-End: reads source, scans, and parses
    • Parser generates errors if applicable, builds symbol table, and generates IR or direct code; code may be incomplete
    • IR generation may have a target “in mind”, or may be target-independent and use a shared interface
    • Shared IR can be used for multiple source language mappings
  • Back-End: optimizes IR, or a peep-hole optimizer improves code
    • Completes unresolved addresses; can be done by a post-pass, e.g. during a separate link step
• Interpreter:
  • Front-End: reads source, scans, and parses; like the compiler FE
    • Parser generates errors if applicable, builds symbol table, and either executes statements directly or emits IR
    • IR is then executed directly, generally with little or no optimization
Interpreters, Cont’d
• Simulator:
  • A simulator is a SW implementation of an architecture's HW interface
  • Is separate from the Front-End; acts like a HW machine
  • Requires a separate FE that has mapped source to IR
  • That IR is the machine architecture
  • Needs to model:
    • Memory, with portions being stack, code, heap, static space
    • Memory must be usable as int, float, char[4], etc.
    • Memory should model bytes or words, per architecture spec.
    • So the SW implementation defines each memory cell as a union
  • Simulator has HW resources, e.g.: sp, fp, ip, condition code
    • fp AKA bp, ip AKA pc, etc.
  • Simulates a GPR register file, or a single accumulator
  • Must simulate special registers, e.g. Y register for imult, idiv
  • Register should be the same type as a memory cell: union of int, float
Interpreters, Cont’d
• Your project 4 assignment is really the implementation of a simulator for IR
• Any simulator may actually separate code, data, stack, and heap space, or some combination
  • or keep all in one common data space, like real machine memory
• The MINI simulator does not need to allow for memory (and stack and registers) to be a union of floats, ints, chars
  • since you only use integer types
• Your project 5 assignment is really a code generator (compiler back-end) that maps IR to SPARC assembly code
Low-Level Interpretation
The interpreter manages a simple memory model, has an instruction space code[] and a program counter pc, and executes the fetch-execute cycle:
    while ( ( next_op = code[ pc++ ].opcode ) != halt_op ) {
        execute( next_op );
    } //end while
Example: Interpreting an assembly language for a machine with only one register, the accumulator:
MINI Data Structure
    public class Instruction {
        // opcodes
        public static final byte LOAD = 0, STORE = 1, MOVE = 2, ADD = 3,
                                 SUB = 4, JUMP = 5, JUMPZ = 6, HALT = 7;
        public byte op;     // one of the opcodes above
        public short d;     // operand: data address, constant, or jump target
    } //end Instruction

    public class State {
        public static final short CODESIZE = 4096, DATASIZE = 4096;
        // machine status; HALTED is named differently from opcode HALT
        public static final byte RUNNING = 0, HALTED = 1, FAILED = 2;
        public Instruction[] code = new Instruction[ CODESIZE ];
        public short[] data = new short[ DATASIZE ];
        public short PC, ACC;
        public byte status;
    } //end State

Fetch-Execute Cycle
    public class Interpreter extends State {
        public void load() {...} // load the IR program; start address?

        public void go() {
            PC = 0; ACC = 0; status = RUNNING;
            do {
                Instruction inst = code[ PC++ ];    // fetch
                short d = inst.d;
                switch( inst.op ) {                 // execute
                    case Instruction.LOAD:  ACC = data[ d ];          break;
                    case Instruction.STORE: data[ d ] = ACC;          break;
                    case Instruction.MOVE:  ACC = d;                  break;
                    case Instruction.ADD:   ACC += data[ d ];         break;
                    case Instruction.SUB:   ACC -= data[ d ];         break;
                    case Instruction.JUMP:  PC = d;                   break;
                    case Instruction.JUMPZ: if ( ACC == 0 ) PC = d;   break;
                    case Instruction.HALT:  status = HALTED;          break;
                    default:                status = FAILED;
                } //end switch
            } while ( status == RUNNING ); // or instruction != HALT
        } //end go
    } //end Interpreter
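To see the cycle run end to end, here is a minimal C port of the accumulator machine together with a small test program. It is a sketch, not the course's reference code: the C type and function names, the reduced array sizes, the PC bounds guard, and the demo program `sum_to_five` are all assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { LOAD, STORE, MOVE, ADD, SUB, JUMP, JUMPZ, HALT };  /* opcodes */
enum { RUNNING, HALTED, FAILED };                         /* status  */
enum { CODESIZE = 64, DATASIZE = 64 };  /* assumption: small demo sizes */

typedef struct { uint8_t op; int16_t d; } Instruction;

typedef struct {
    Instruction code[CODESIZE];
    int16_t     data[DATASIZE];
    int16_t     pc, acc;
    int         status;
} State;

/* Fetch-execute cycle: runs until HALT or an illegal opcode or PC. */
static void go(State *s)
{
    s->pc = 0; s->acc = 0; s->status = RUNNING;
    do {
        if (s->pc < 0 || s->pc >= CODESIZE) {  /* guard added in this sketch */
            s->status = FAILED;
            break;
        }
        Instruction inst = s->code[s->pc++];   /* fetch */
        int16_t d = inst.d;
        switch (inst.op) {                     /* execute */
        case LOAD:  s->acc = s->data[d];                      break;
        case STORE: s->data[d] = s->acc;                      break;
        case MOVE:  s->acc = d;                               break;
        case ADD:   s->acc = (int16_t)(s->acc + s->data[d]);  break;
        case SUB:   s->acc = (int16_t)(s->acc - s->data[d]);  break;
        case JUMP:  s->pc = d;                                break;
        case JUMPZ: if (s->acc == 0) s->pc = d;               break;
        case HALT:  s->status = HALTED;                       break;
        default:    s->status = FAILED;                       break;
        }
    } while (s->status == RUNNING);
}

/* Hypothetical demo program: data[0] = 5 + 4 + 3 + 2 + 1. */
static int16_t sum_to_five(void)
{
    static const Instruction prog[] = {
        {MOVE, 5}, {STORE, 1},              /*  0: n = 5             */
        {MOVE, 1}, {STORE, 2},              /*  2: one = 1           */
        {LOAD, 1}, {JUMPZ, 13},             /*  4: if n == 0 goto 13 */
        {LOAD, 0}, {ADD, 1}, {STORE, 0},    /*  6: sum += n          */
        {LOAD, 1}, {SUB, 2}, {STORE, 1},    /*  9: n -= one          */
        {JUMP, 4},                          /* 12: repeat            */
        {HALT, 0}                           /* 13: stop              */
    };
    size_t n = sizeof prog / sizeof prog[0];
    State s = {0};
    assert(n <= CODESIZE);
    for (size_t i = 0; i < n; i++) s.code[i] = prog[i];
    go(&s);
    return s.status == HALTED ? s.data[0] : -1;
}
```

Calling `sum_to_five()` loads the 14-instruction program at address 0, runs it to HALT, and returns the sum left in `data[0]`.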
MINI Function Calls
    void interpFunc( FUNC f ) {
        sp = sp - f.varCnt - f.argCnt - 1;   // allocate an AR on the stack
        do {                                 // the main interpretation loop
            Instruction inst = code[ PC++ ];
            short d = inst.d;
            switch( inst.op ) {
                ...                          // other cases
                case CALL:
                    for( int i = 0; i < args.size(); i++ )
                        mem[ sp+i+1 ] = <arg i>;
                    mem[ sp ] = fp;          // like we saw on x86
                    fp = sp;                 // also like x86 sample
                    interpFunc( g );
                    sp = fp;
                    fp = mem[ sp ];
                    ...
            } //end switch
        } while ( status == RUNNING );
        sp = sp + f.varCnt + f.argCnt + 1;   // de-allocate stack frame
    } //end interpFunc
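The activation-record bookkeeping can be tried out in isolation. The sketch below is a simplified C model, not the MINI interpreter itself: it assumes a downward-growing stack in a single `mem[]` array and the frame layout [saved fp][args][locals], and the helper names `enter_frame`, `leave_frame`, `arg`, and `sim_factorial` are hypothetical.

```c
#include <assert.h>

enum { MEMSIZE = 256 };      /* assumption: small simulated memory */

static int mem[MEMSIZE];
static int sp = MEMSIZE;     /* stack grows toward lower addresses */
static int fp = MEMSIZE;

/* Allocate an activation record, copy the arguments in, and link it
 * to the caller's frame through the saved fp at mem[sp]. */
static void enter_frame(const int *args, int arg_cnt, int var_cnt)
{
    int size = arg_cnt + var_cnt + 1;   /* +1 for the saved fp */
    assert(sp - size >= 0);             /* simulated stack overflow check */
    sp -= size;
    for (int i = 0; i < arg_cnt; i++)
        mem[sp + i + 1] = args[i];
    mem[sp] = fp;                       /* save caller's frame pointer */
    fp = sp;                            /* new frame starts here */
}

/* Restore the caller's frame pointer and release the record. */
static void leave_frame(int arg_cnt, int var_cnt)
{
    sp = fp;
    fp = mem[sp];
    sp += arg_cnt + var_cnt + 1;
}

/* Read argument i of the current frame. */
static int arg(int i)
{
    return mem[fp + 1 + i];
}

/* Hypothetical callee: every recursive call gets its own frame. */
static int sim_factorial(int n)
{
    enter_frame(&n, 1, 0);
    int v = arg(0);
    int result = (v <= 1) ? 1 : v * sim_factorial(v - 1);
    leave_frame(1, 0);
    return result;
}
```

After `sim_factorial(5)` returns, both `sp` and `fp` are back at `MEMSIZE`, which is a quick check that allocation and de-allocation are balanced.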
Implementation Issues
• Maintain environment information: associate storage with variables, temps, etc., and track their value changes:
  • variables: store in a symbol table
  • temps: store in a register list
  • arguments: store in an argument list
  • constants: store in the symbol table
• Follow the control flow of the source program:
  • Simple statements: just follow through
  • Jumps: need to find the target of a jump
  • Function calls: need to find the function body
• Need to keep the entire source code around, and maintain a table of jump-targets, so that the interpreter can follow a jump to its target
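One way to build the table of jump-targets is a single pass over the stored program that records the index of every labeled instruction. The C sketch below assumes a hypothetical in-memory IR (`IrLine`) with symbolic labels; the real MINI IR format may differ.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum { MAX_LABELS = 16 };    /* assumption: small fixed-size table */

/* Hypothetical IR line: optional label, opcode name, optional target. */
typedef struct {
    const char *label;       /* label defined at this instruction, or NULL */
    const char *op;
    const char *target;      /* label this instruction jumps to, or NULL */
} IrLine;

typedef struct { const char *name; int index; } LabelEntry;
typedef struct { LabelEntry entry[MAX_LABELS]; int count; } LabelTable;

/* Pass 1: record the instruction index of every label.
 * Returns 0 on success, -1 if the table is full. */
static int build_label_table(LabelTable *t, const IrLine *prog, int n)
{
    assert(t != NULL && prog != NULL);
    t->count = 0;
    for (int i = 0; i < n; i++) {
        if (prog[i].label == NULL) continue;
        if (t->count == MAX_LABELS) return -1;
        t->entry[t->count].name = prog[i].label;
        t->entry[t->count].index = i;
        t->count++;
    }
    return 0;
}

/* Resolve a label to its instruction index, or -1 if undefined. */
static int lookup_label(const LabelTable *t, const char *name)
{
    for (int i = 0; i < t->count; i++)
        if (strcmp(t->entry[i].name, name) == 0)
            return t->entry[i].index;
    return -1;
}

/* Demo: where does the "jumpz L1" at index 2 lead? */
static int demo_jump_target(void)
{
    static const IrLine prog[] = {
        { NULL, "move",  NULL },
        { "L0", "sub",   NULL },
        { NULL, "jumpz", "L1" },
        { NULL, "jump",  "L0" },
        { "L1", "halt",  NULL },
    };
    LabelTable t;
    if (build_label_table(&t, prog, 5) != 0) return -1;
    return lookup_label(&t, prog[2].target);
}
```

With the table in place, executing a jump is just an assignment of the looked-up index to the program counter. A linear search is enough for a sketch; a larger interpreter would more likely use a hash table.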
Program Loading
• In an interpreter, any operation can be “executed” directly, or stored in IR form to allow for branches, calls, jumps, and iteration; for MINI, store all code
• A simulator requires the FE to emit IR, and then “executes” each instruction
• IR is generated by the FE and then loaded into the code[] data structure, if FE and simulator are separate passes or phases
• During such a load, the total code size becomes known
• The first instruction to execute also needs to be known; it is the first instruction of the main() method
• Its address can be loaded separately, in addition to the IR stream, or be assumed to be 0
• Load is trivial if all work is done in a single object program
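A loader for a separate-pass setup might look like the sketch below. The textual IR format (whitespace-separated "opcode operand" pairs) is an assumption chosen for the example, as is the function name `load_program`; the entry point is assumed to be address 0.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

enum { CODESIZE = 4096 };

typedef struct { int op; int d; } Instruction;

static Instruction code[CODESIZE];

/* Load a textual IR stream of "opcode operand" pairs into code[].
 * Returns the number of instructions loaded, which is the total code
 * size, or -1 if the program does not fit. Execution is assumed to
 * start at code[0]. */
static int load_program(const char *text)
{
    int count = 0, op, d, used;
    assert(text != NULL);
    /* %n stores how many characters this call consumed, so the next
     * call can continue where this one stopped. */
    while (sscanf(text, "%d %d%n", &op, &d, &used) == 2) {
        if (count == CODESIZE) return -1;
        code[count].op = op;
        code[count].d = d;
        count++;
        text += used;
    }
    return count;
}
```

For example, `load_program("2 5\n1 0\n7 0\n")` stores three instructions and returns 3.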
Simulated Memory Cell
    /* If the type of ``f'' changes from double to float, then all other
     * run-time quantities of floating-point type must be changed as
     * well. This includes a large number of things, including:
     *
     *   formats for printf( "%lf" ) in simulator
     */
    typedef union if_tp {
        int    i;
        double f;
    } if_union_tp;

    typedef int Mem_Range; /* should be range 0 .. MAX_MINI_PLUS_MEMORY - 1.
                            * Do NOT make unsigned; must stay int.
                            * May sometimes become negative.
                            */

    #define NIL_ADDRESS (-1) /* 0 is valid address on MINI_PLUS machine */
Simulated Stack
    /* The Simulated Stack
     *
     * Stack s[] consists of MAX_MINI_PLUS_MEMORY elements of
     * unions, can hold int or float. Chars, bools etc, in
     * MINI_PLUS Run Time System can be allocated 1 element/word
     */
    if_union_tp s[ MAX_MINI_PLUS_MEMORY ];

Example: s[ some_address ].i = some_integer_value;

    /* The Simulated Register File
     *
     * Virtual Machine has MAX_REG_NUM temporaries, which on
     * real machine are registers. Large number here avoids register
     * allocation task, which simplifies simulation work.
     */
    if_union_tp r[ MAX_REG_NUM ];

Example: f_val = r[ integer_in_range_of_register_file ].f;
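The following C sketch shows the union cells in use: an integer is loaded from the simulated stack into a register, converted to floating point by an explicit operation, and stored back. The sizes and the helper names `load_i`, `i2f`, `store_f`, and `demo_cells` are assumptions for illustration. The conversion has to be an explicit instruction, since reading member `.f` of a cell that was written through `.i` reinterprets the bits rather than converting the value.

```c
#include <assert.h>

enum { MAX_MEM = 64, MAX_REG = 8 };   /* assumption: small demo sizes */

typedef union if_tp {
    int    i;
    double f;
} if_union_tp;

static if_union_tp s[MAX_MEM];   /* simulated stack / memory */
static if_union_tp r[MAX_REG];   /* simulated register file  */

/* Load an integer from memory cell addr into register reg. */
static void load_i(int reg, int addr)
{
    assert(reg >= 0 && reg < MAX_REG && addr >= 0 && addr < MAX_MEM);
    r[reg].i = s[addr].i;
}

/* Convert the integer in register src to floating point in dst. */
static void i2f(int dst, int src)
{
    assert(dst >= 0 && dst < MAX_REG && src >= 0 && src < MAX_REG);
    r[dst].f = (double) r[src].i;
}

/* Store the floating-point value of register reg to memory cell addr. */
static void store_f(int addr, int reg)
{
    assert(reg >= 0 && reg < MAX_REG && addr >= 0 && addr < MAX_MEM);
    s[addr].f = r[reg].f;
}

/* Demo: s[1] = (double) s[0] / 2.0 */
static double demo_cells(void)
{
    s[0].i = 7;
    load_i(1, 0);
    i2f(2, 1);
    r[2].f = r[2].f / 2.0;
    store_f(1, 2);
    return s[1].f;
}
```

The simulator, not the union, is responsible for remembering which member of a cell is currently valid; in an IR with typed opcodes the opcode carries that information.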
Simulate Execution Unit
    #define IS_LEGAL( pc ) ( ( (pc) != NIL_ADDRESS ) && ( (pc) <= next_q ) )

    /* Current # of cycles can be modified. E.g., a memory reference in a load
     * or store instruction adds cycles. Also, an indirect operation
     * adds a cycle, as does a processor blocked by memory access contention.
     */
    PRIVATE void interpret_MINI_PLUS( VOID )
    { /* interpret_MINI_PLUS */
        unsigned tot_instructions = 0;
        unsigned tot_cycles = 0;
        Quad_Range previous_pc;
        MINI_IR_class pc;    // Next MINI_PLUS IR being executed
Simulate Execution Unit, Cont’d
        do { /* for every instruction, at least the HALT instruction */
            previous_pc = pc;
            quad = q[ pc ];
            cycles = cycle_count[ quad.op ];
            ++ic;    /* count instructions executed */
            ++pc;    /* may be overridden by control-flow instruction; default is next quad */
            switch ( quad.op ) {
                case q_multi:
                case q_negi:
                case q_expoi:
                case q_subi:
                case q_ipred:
                case q_modi:
                case q_f2i:
                case q_remi:
                    integer_operation( quad );
                    break;
                . . . lots of other operations . . .
                case q_halt:    // done
                    break;
                default:
                    MINI_PLUS_ERR( "forgot quad class d(%d)\n", quad.op );
            } /*end switch*/
            /* Done with this instruction; count cycles used.
             * Then proceed with next MINI_PLUS IR instruction AKA quad.
             */
            seq_cycles += cycles;
        } while ( ( quad.op != q_halt ) && IS_LEGAL( pc ) );
        run_time_statistics( tot_instructions, tot_cycles, seq_cycles, max_processors );
    } /*end interpret_MINI_PLUS*/
Simulate Integer Exponentiation
    /* Determine overflow. Handle sign of ``base'' separately from computation
     * by dealing with the magnitude of ``base'' alone; in the end, determine
     * sign of base and return result of proper sign. If along the way the sign
     * is flipped, produce run-time error, because overflow is detected.
     *
     * Accesses global: cycles
     *
     * Uses functions: RTC to check run-time errors
     *
     */
    PRIVATE int ipower( int base, int expo )
Simulate Integer Exponentiation
    PRIVATE int ipower( int base, int expo )
    { /* ipower */
        int i;
        int result = 1;
        BOOL negative = BFALSE;

        RTC( ( expo < 0 ), NEG_EXPONENT_EXC );
        if ( base == 0 ) {
            return 0;
        }else if ( base == 1 ) {
            return 1;
        }else if ( base < 0 ) {
            negative = BTRUE;
            base = -base;
        } /*end if*/
        /* base will be positive */
        for ( i = 0; i < expo; i++ ) {
            result *= base;
            RTC( ( result < 0 ), INT_OVERFLOW_EXC );
        } /*end for*/
        cycles += expo;
        if ( ( expo % 2 ) && ( negative ) ) {
            return -result;
        }else{
            return result;
        } /*end if*/
    } /*end ipower*/
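Testing for a flipped sign after the multiplication relies on signed overflow wrapping around, which C leaves undefined, and even where it does wrap the test can miss cases: with a 32-bit int, 65536 * 65536 may wrap to 0 rather than to a negative value. A variant that checks before each multiplication avoids both problems. The sketch below is an alternative, not the course's ipower: the name `checked_ipower` and its error-return convention are hypothetical, it omits the RTC macro and the cycle accounting, and it assumes `long long` is wider than `int`.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Computes base raised to expo. Returns true and stores the result in
 * *out, or returns false for a negative exponent or on overflow.
 * Assumption: long long is wider than int, so -(long long)INT_MIN fits. */
static bool checked_ipower(int base, int expo, int *out)
{
    assert(out != NULL);
    if (expo < 0) return false;
    if (base == 0) { *out = 0; return true; }  /* follows the slide, even for expo == 0 */

    bool negative = (base < 0) && (expo % 2 != 0);
    long long mag = (base < 0) ? -(long long) base : (long long) base;
    /* A negative result may reach INT_MIN, one past INT_MAX in magnitude. */
    long long limit = negative ? (long long) INT_MAX + 1 : (long long) INT_MAX;
    long long result = 1;

    if (mag > 1) {   /* a magnitude of 1 stays 1 for every exponent */
        for (int i = 0; i < expo; i++) {
            if (result > limit / mag) return false;  /* product would exceed limit */
            result *= mag;
        }
    }
    *out = (int) (negative ? -result : result);
    return true;
}
```

Because the check `result > limit / mag` happens before the multiplication, no intermediate value ever leaves the representable range, and for a magnitude of at least 2 the loop ends after at most a few dozen iterations regardless of how large `expo` is.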
Simulate Boolean Operations
    PRIVATE void boolean_operation( q_node_struct_tp q )
    { /* boolean_operation */
        BOOL I = BFALSE; // need indirection? If so, classification .cl says: pointer

        if ( q.res.cl == a_reg ) {
            switch ( q.op ) {
                case q_and:
                    r[ q.RES_REG ].i = ival( q.arg1 ) && ival( q.arg2 );
                    break;
                case q_not:
                    r[ q.RES_REG ].i = !ival( q.arg1 );
                    break;
                case q_or:
                    r[ q.RES_REG ].i = ival( q.arg1 ) || ival( q.arg2 );
                    break;
                case q_xor:
                    r[ q.RES_REG ].i = ival( q.arg1 ) != ival( q.arg2 );
                    break;
                default:
                    AVM_ERR( "wrong boolean_operation temp %d\n", q.op );
            } /*end switch*/
        }else{
            /* must be a variable */
            I = q.res.cl == a_ptr;
            switch ( q.op ) {
                case q_and:
                    iwrite_mem( I, q, ival( q.arg1 ) && ival( q.arg2 ) );
                    break;
                case q_not:
                    iwrite_mem( I, q, !ival( q.arg1 ) );
                    break;
                case q_or:
                    iwrite_mem( I, q, ival( q.arg1 ) || ival( q.arg2 ) );
                    break;
                case q_xor:
                    iwrite_mem( I, q, ival( q.arg1 ) != ival( q.arg2 ) );
                    break;
                default:
                    AVM_ERR( "wrong boolean_operation variable %d\n", q.op );
            } /*end switch*/
        } /*end if*/
    } /*end boolean_operation*/