UNIT 1: INTRODUCTION TO COMPILERS • By: Namratha Nayak • www.Bookspar.com | Website for Students | VTU - Notes - Question Papers
TOPICS • Language Processors • Structure of a Compiler • Evolution of Programming Languages • Science of Building a Compiler • Applications of Compiler Technology • Programming Language Basics
Language Processors • COMPILER • Reads a program in the source language and translates it into the target language • Source language – high-level language like C, C++ • Target language – object code of the target machine • Reports any errors detected in the source program during translation
Language Processors • INTERPRETER • Directly executes the operations specified in the source program on inputs supplied by the user • Target program is not produced as output of translation • Executes the source program statement by statement • Usually gives better error diagnostics than a compiler • Target program produced by a compiler is usually much faster than an interpreter at mapping inputs to outputs
Language Processors • EXAMPLE • Java language processors combine compilation and interpretation • Source program is first compiled into bytecodes • Bytecodes are then interpreted by a virtual machine • Just-in-time compilers • Translate bytecodes into machine language immediately before they run the intermediate program to process the input
Language Processors • A Language-Processing System
Language Processors • Preprocessor • Source program may be divided into modules stored in separate files • Accomplishes the task of collecting the source program • Can delete comments, include other files, expand macros • Assembler • Compiler may produce an assembly-language program as its output • Assembly language is a symbolic form of the machine language • Assembler produces relocatable machine code as output
Language Processors • Linker/Loader • Relocatable Code • Not ready to execute • Memory references are made relative to an undetermined starting address in memory • Relocatable machine code may have to be linked with other object files • Linker • Resolves external memory addresses • Code in one file referring to a location in another file • Loader • Resolves all relocatable addresses relative to a given starting address • Puts all the executable object files together into memory for execution
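To make the division of labour between linker and loader concrete, here is a minimal sketch in Python. The module layout (a `defs` dictionary and a `code` list of `(opcode, operand)` pairs), the opcode names, and the rule that every integer operand is a module-relative address are all assumptions made for illustration; real object-file formats are far richer.

```python
def link_and_load(modules, base):
    """Combine relocatable modules into one memory image starting at `base`.

    Each module is a dict (hypothetical format):
      "defs": {symbol: offset within the module}
      "code": list of (opcode, operand); a str operand is an external
              symbol, an int operand is a module-relative address.
    """
    # Pass 1: place modules one after another and record symbol addresses.
    symbols, starts, offset = {}, [], 0
    for mod in modules:
        starts.append(offset)
        for name, local in mod["defs"].items():
            symbols[name] = base + offset + local
        offset += len(mod["code"])

    # Pass 2: patch every operand.
    image = []
    for mod, start in zip(modules, starts):
        for op, operand in mod["code"]:
            if isinstance(operand, str):      # external reference (linker)
                operand = symbols[operand]
            elif operand is not None:         # relocatable address (loader)
                operand = base + start + operand
            image.append((op, operand))
    return image


main_mod = {"defs": {"main": 0}, "code": [("CALL", "helper"), ("JMP", 0)]}
lib_mod = {"defs": {"helper": 0}, "code": [("RET", None)]}
image = link_and_load([main_mod, lib_mod], base=100)
```

With a starting address of 100, the external reference to `helper` and the module-relative jump target both become absolute addresses in the final image.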
The Structure of a Compiler • Analysis Phase • Breaks up the source program into its constituent pieces (tokens) • Imposes a grammatical structure on them • Creates an intermediate representation of the source program • If the source program is syntactically incorrect or semantically wrong • Provides informative messages to the user • Symbol Table • Stores the information collected about the source program • Given to the synthesis phase along with the intermediate representation
The Structure of a Compiler • Synthesis Phase • Constructs the desired target program from • Intermediate representation • Information in symbol table • Back end of the compiler • Analysis phase is called front end of the compiler • Compilation process is a sequence of phases • Each phase transforms one representation of source program into another • Several phases may be grouped together • Symbol table is used by all the phases of the compiler
The Structure of a Compiler
Lexical Analysis • Lexical Analyzer • Reads the stream of characters in the source program • Groups the characters into meaningful sequences – lexemes • For each lexeme, a token of the form <token-name, attribute-value> is produced as output • Token-name: abstract symbol used during syntax analysis • Attribute-value: points to an entry in the symbol table for this token • Information from the symbol table is needed for semantic analysis and code generation • Consider the following assignment statement
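The statement itself did not survive on this slide. The sketch below assumes it is the usual textbook-style example `position = initial + rate * 60`, which is consistent with the names and the constant 60 mentioned on later slides. The token names, the regular expressions, and the choice of a 1-based symbol-table index as the attribute value are illustrative assumptions, not a fixed convention.

```python
import re

# Hypothetical token specification: (token-name, regular expression).
TOKEN_SPEC = [
    ("number", r"\d+"),
    ("id", r"[A-Za-z_]\w*"),
    ("op", r"[=+\-*/]"),
    ("skip", r"\s+"),
]
MASTER = re.compile("|".join(f"(?P<{name}>{pat})" for name, pat in TOKEN_SPEC))


def tokenize(source):
    """Return (tokens, symbol_table) for a one-line source string."""
    symbol_table = []          # identifier lexemes, in order of first use
    tokens = []
    pos = 0
    while pos < len(source):
        m = MASTER.match(source, pos)
        if m is None:
            raise SyntaxError(f"unexpected character {source[pos]!r}")
        kind, lexeme = m.lastgroup, m.group()
        pos = m.end()
        if kind == "skip":
            continue           # whitespace produces no token
        if kind == "id":
            if lexeme not in symbol_table:
                symbol_table.append(lexeme)
            # Attribute value: 1-based index of the symbol-table entry.
            tokens.append(("id", symbol_table.index(lexeme) + 1))
        elif kind == "number":
            tokens.append(("number", int(lexeme)))
        else:
            tokens.append((lexeme, None))   # operators need no attribute
    return tokens, symbol_table


tokens, table = tokenize("position = initial + rate * 60")
```

Here `position`, `initial` and `rate` become `("id", 1)`, `("id", 2)` and `("id", 3)`, while the operators are tokens without an attribute.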
Syntax Analysis • Parsing • Parser uses the tokens to create a tree-like intermediate representation • Depicts the grammatical structure of the token stream • Syntax tree is one such representation • Interior node – operation • Children - arguments of the operation • Other phases use this syntax tree to help analyze source program and generate target program
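As an illustration of how a parser can turn a token stream into a syntax tree, here is a small recursive-descent sketch. The grammar (a single assignment whose right side uses only `+` and `*`), the use of plain strings as tokens, and the nested-tuple tree shape are assumptions chosen to keep the example short.

```python
def parse_assignment(tokens):
    """Parse `name = expr`, where expr uses + and * with usual precedence.

    Returns nested tuples (operator, left, right); leaves are token strings.
    """
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        tok = tokens[pos]
        pos += 1
        return tok

    def factor():                  # an identifier or a number
        return advance()

    def term():                    # factor { * factor }
        node = factor()
        while peek() == "*":
            advance()
            node = ("*", node, factor())
        return node

    def expr():                    # term { + term }
        node = term()
        while peek() == "+":
            advance()
            node = ("+", node, term())
        return node

    target = advance()
    if advance() != "=":
        raise SyntaxError("expected '='")
    tree = ("=", target, expr())
    if peek() is not None:
        raise SyntaxError(f"unexpected token {peek()!r}")
    return tree


tree = parse_assignment(["position", "=", "initial", "+", "rate", "*", "60"])
```

Each interior node holds an operation and its children are the arguments, so the multiplication ends up below the addition, reflecting its higher precedence.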
Semantic Analysis • Semantic Analyzer • Checks the source program for semantic consistency with the language definition using: • Syntax tree • Information in symbol table • Gathers type information and saves it in the syntax tree or symbol table • Type Checking • Checks that each operator has matching operands • Ex: Report an error if a floating-point number is used as the index of an array • Coercions or type conversions • Binary arithmetic operator may be applied to a pair of integers or a pair of floating-point numbers • If applied to a floating-point number and an integer, the compiler may convert the integer to a floating-point number
Semantic Analysis • Semantic Analyzer • For our assignment statement • position, rate and initial are floating-point numbers • Lexeme 60 is an integer • Type checker finds that * is applied to floating-point ‘rate’ and integer ‘60’ • Integer is converted to floating-point
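A minimal sketch of this coercion step is shown below. It assumes the expression is held as nested tuples, that only the types `int` and `float` exist, and that the conversion is recorded as a unary `inttofloat` node; the function name `check` and the tree shape are hypothetical.

```python
def check(node, types):
    """Type-check an arithmetic expression tree; return (typed_tree, type).

    `types` maps variable names to "int" or "float". Where an operator
    mixes the two, the integer operand is wrapped in an inttofloat node.
    """
    if isinstance(node, int):
        return node, "int"
    if isinstance(node, float):
        return node, "float"
    if isinstance(node, str):               # a variable: look up its type
        return node, types[node]
    op, left, right = node
    left, left_type = check(left, types)
    right, right_type = check(right, types)
    if left_type != right_type:             # int mixed with float: coerce
        if left_type == "int":
            left = ("inttofloat", left)
        else:
            right = ("inttofloat", right)
        left_type = "float"
    return (op, left, right), left_type


types = {"initial": "float", "rate": "float"}
typed, result_type = check(("+", "initial", ("*", "rate", 60)), types)
```

The integer 60 is the only operand that needs converting, so exactly one `inttofloat` node appears in the result.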
Intermediate Code Generation • After syntax and semantic analysis, many compilers generate an explicit low-level, machine-like intermediate representation • This intermediate representation should have two properties: • Should be easy to produce • Should be easy to translate into the target machine language • Three-address code • Sequence of assembly-like instructions with at most three operands per instruction • Each operand can act like a register
Intermediate Code Generation • Points to be noted about three-address instructions are: • Each assignment instruction has at most one operator on the right side • Compiler must generate a temporary name to hold the value computed by a three-address instruction • Some instructions have fewer than three operands
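The points above can be sketched as a small tree walk that emits one instruction per operator and invents a temporary for each result. The tuple-based tree, the names `id1` to `id3`, and the textual instruction format are assumptions for illustration only.

```python
import itertools


def gen_three_address(tree):
    """Flatten ("=", target, expr) into a list of three-address strings."""
    code = []
    temps = (f"t{i}" for i in itertools.count(1))   # t1, t2, t3, ...

    def emit(node):
        if not isinstance(node, tuple):
            return str(node)                 # a name or a constant
        if node[0] == "inttofloat":          # unary: fewer than 3 operands
            arg = emit(node[1])
            temp = next(temps)
            code.append(f"{temp} = inttofloat({arg})")
            return temp
        op, left, right = node
        left_name = emit(left)
        right_name = emit(right)
        temp = next(temps)                   # holds this operator's result
        code.append(f"{temp} = {left_name} {op} {right_name}")
        return temp

    _, target, expr = tree
    code.append(f"{target} = {emit(expr)}")
    return code


tree = ("=", "id1", ("+", "id2", ("*", "id3", ("inttofloat", 60))))
code = gen_three_address(tree)
```

Every emitted instruction has at most one operator on its right side, and the final copy into `id1` has only two operands.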
Code Optimization • Attempts to improve the intermediate code so that better target code results • Faster code, shorter code or target code that consumes less power • Optimizer can deduce that • Conversion of 60 from int to float can be done once at compile time • So, the inttofloat operation can be eliminated by replacing 60 with 60.0 • t3 is used only once, to transmit its value to id1, so the copy can also be eliminated
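The two deductions above can be sketched as two passes over a list of instruction tuples. The `(dest, op, arg1, arg2)` format and the `copy` opcode are assumptions, and the second pass relies on the stated fact that the temporary is used only once; a real optimizer would verify that with data-flow analysis.

```python
def optimize(code):
    """Improve a list of (dest, op, arg1, arg2) three-address tuples."""
    # Pass 1: perform inttofloat on integer constants at compile time.
    constants = {}
    folded = []
    for dest, op, a, b in code:
        a = constants.get(a, a)
        b = constants.get(b, b)
        if op == "inttofloat" and isinstance(a, int):
            constants[dest] = float(a)       # remember, emit nothing
            continue
        folded.append((dest, op, a, b))

    # Pass 2: merge "t = x op y" with an immediately following "d = t".
    # Assumes t has no other use (true for the example below).
    result = []
    for instr in folded:
        dest, op, a, b = instr
        if op == "copy" and result and result[-1][0] == a:
            prev = result.pop()
            result.append((dest,) + prev[1:])
        else:
            result.append(instr)
    return result


before = [
    ("t1", "inttofloat", 60, None),
    ("t2", "*", "id3", "t1"),
    ("t3", "+", "id2", "t2"),
    ("id1", "copy", "t3", None),
]
after = optimize(before)
```

Four instructions shrink to two: the conversion disappears into the constant 60.0 and the copy through `t3` is merged into the addition.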
Code Generation • Code Generator • Takes intermediate representation as input • Maps it into target language • If target language is machine code • Registers or memory locations are selected for each of the variables used • Intermediate instructions are translated into sequences of machine instructions performing the same task • Assignment of registers to hold variables is a crucial aspect • First operand of each instruction specifies a destination
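As a rough illustration, the sketch below maps instruction tuples to assembly-like text for an imaginary register machine. The mnemonics `LDF`, `MULF`, `ADDF` and `STF`, the register names, and the naive policy of reusing the first operand's register as the destination are all assumptions; careful register assignment is exactly the part a real code generator has to work hard on.

```python
OPCODES = {"+": "ADDF", "*": "MULF"}      # hypothetical floating-point ops


def gen_code(code, variables):
    """Translate (dest, op, a, b) tuples into assembly-like strings.

    `variables` is the set of names that live in memory; every other
    destination is treated as a temporary that stays in a register.
    """
    asm = []
    reg_of = {}                            # temporary name -> register
    free = [f"R{i}" for i in range(1, 9)]  # naive pool of eight registers

    def operand(x):
        if isinstance(x, float):
            return f"#{x}"                 # immediate constant
        if x in reg_of:
            return reg_of[x]               # temporary already in a register
        reg = free.pop(0)
        asm.append(f"LDF {reg}, {x}")      # load a memory variable
        return reg

    for dest, op, a, b in code:
        ra, rb = operand(a), operand(b)
        # The first operand of each instruction names the destination.
        target = ra if ra.startswith("R") else free.pop(0)
        asm.append(f"{OPCODES[op]} {target}, {ra}, {rb}")
        if dest in variables:
            asm.append(f"STF {dest}, {target}")
        else:
            reg_of[dest] = target
    return asm


asm = gen_code(
    [("t2", "*", "id3", 60.0), ("id1", "+", "id2", "t2")],
    variables={"id1", "id2", "id3"},
)
```

Each intermediate instruction becomes a short sequence of loads, an arithmetic instruction, and (for memory variables) a store.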
Symbol-Table Management • Essential function of Compiler • Record variable names used in source program • Collect information about storage allocated for a name • Type • Scope – where in the program the value may be used • In case of procedure names, • Number and type of its argument • Method of passing each argument • Type returned • Symbol Table • Data structure containing a record for each variable name with fields for attributes
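One possible shape for such a record-per-name table is sketched below. The field names, the string encoding of types and scopes, and the `insert`/`lookup` interface are assumptions; production compilers typically use hash tables with scope handling built in.

```python
from dataclasses import dataclass, field


@dataclass
class Entry:
    name: str
    type: str                    # for a procedure: the type it returns
    scope: str                   # where in the program the name is usable
    # For procedures only: one (type, passing method) pair per argument.
    params: list = field(default_factory=list)


class SymbolTable:
    def __init__(self):
        self._entries = {}       # name -> Entry

    def insert(self, entry):
        if entry.name in self._entries:
            raise KeyError(f"{entry.name} is already declared")
        self._entries[entry.name] = entry

    def lookup(self, name):
        """Return the entry for `name`, or None if it was never declared."""
        return self._entries.get(name)


table = SymbolTable()
table.insert(Entry("rate", "float", "global"))
table.insert(Entry("area", "float", "global",
                   params=[("float", "by-value"), ("float", "by-value")]))
```

A dictionary gives fast storing and retrieval by name, which matters because every phase consults the table.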
Compiler-Construction Tools • Commonly used compiler-construction tools • Parser Generators • Scanner Generators • Syntax-directed translation engines • Code-generator Generators • Data-flow analysis engines • Compiler-construction Toolkits
Evolution of Programming Languages • Move to Higher-Level Languages • Development of mnemonic assembly languages in the 1950s • Classification of Languages • Generation • First-generation: machine languages • Second-generation: assembly languages • Third-generation: C, C++, C#, Java • Fourth-generation: SQL, Postscript • Fifth-generation: Prolog • Imperative and Declarative • Imperative: how a computation is to be done • Declarative: what computation is to be done • Object-oriented Languages • Scripting Languages
Evolution of Programming Languages • Impact on Compilers • Advances in programming languages placed new demands on compiler writers • Devise algorithms and representations to support new features • Performance of a computer system is dependent on compiler technology • Good software-engineering techniques are essential for creating and evolving modern language processors • Compiler writers must evaluate tradeoffs about • What problems to deal with • What heuristics to use to approach the problem of generating efficient code
Science of Building a Compiler • Modeling in Compiler Design and Implementation • Study of compilers is a study of how • To design the right mathematical models and • Choose the right algorithms • Finite-state machines and regular expressions • Useful for describing the lexical units of a program (keywords, identifiers) • Used to describe the algorithms used to recognize those units • Context-free Grammars • Describe syntactic structure of PL • Nesting of parentheses, control constructs
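To show how a finite-state machine recognizes a lexical unit, here is a sketch of a recognizer for identifiers of the form "letter followed by letters or digits". The state names and the decision to treat the underscore as a letter are assumptions borrowed from C-like languages.

```python
def is_identifier(text):
    """Simulate a small finite-state machine for letter (letter | digit)*."""
    state = "start"
    for ch in text:
        if state == "start":
            # The first character must be a letter (or underscore).
            state = "in_id" if (ch.isalpha() or ch == "_") else "dead"
        elif state == "in_id":
            # Later characters may be letters, digits or underscores.
            state = "in_id" if (ch.isalnum() or ch == "_") else "dead"
        else:
            return False         # the dead state can never be left
    return state == "in_id"      # "in_id" is the only accepting state
```

The same set of strings is described by a regular expression such as `[A-Za-z_][A-Za-z0-9_]*` for ASCII input, which is how scanner generators are usually told what an identifier looks like.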
Science of Building a Compiler • Science of Code Optimization • “Optimization” – an attempt to produce code that is more efficient than the obvious code • Processor architectures have become more complex • Important to formulate the right problem to solve • Need a good understanding of the programs • Compiler optimizations must meet the following design objectives • Optimization must be correct, i.e., preserve the meaning of the compiled program • Optimization must improve the performance of many programs • Compilation time must be kept reasonable • Engineering effort required must be manageable
Applications of Compiler Technology • Implementation of high-level programming languages • High-level programming language defines a programming abstraction • Low-level languages give more control over the computation and can produce more efficient code • But they are harder to write, less portable, more prone to errors and harder to maintain • Example: register keyword in C • Common programming languages (C, Fortran, Cobol) support • User-defined aggregate data types (arrays, structures) and high-level control flow (loops, procedure invocations) • Data-flow optimizations • Analyze flow of data through the program and remove redundancies • Key ideas behind object-oriented languages are • Data Abstraction • Inheritance of properties
Applications of Compiler Technology • Implementation of high-level programming languages • Java has features that make programming easier • Type-safe – an object cannot be used as an object of an unrelated type • Array accesses are checked to ensure that they lie within the bounds • Built-in garbage-collection facility • Compiler optimizations have been developed to reduce the run-time overhead of these features, e.g. by eliminating unnecessary range checks
Applications of Compiler Technology • Optimizations for Computer Architectures • Parallelism • Instruction level: multiple operations are executed simultaneously • Processor level: different threads of the same application run on different processors • Memory hierarchies • Consist of several levels of storage with different speeds and sizes • Average memory access time is reduced • Using registers effectively is probably the single most important problem in optimizing a program • Caches and physical memories are managed by the hardware • Compilers can improve their effectiveness by changing the layout of data or the order of instructions accessing the data
Applications of Compiler Technology • Design of new Computer Architectures • CISC (Complex Instruction-Set Computer) • Complex instructions intended to make assembly programming easier • Include complex memory addressing modes • Compiler optimizations can often reduce these instructions to a small number of simpler operations • RISC (Reduced Instruction-Set Computer) • Examples: PowerPC, SPARC, MIPS, Alpha and PA-RISC • Specialized Architectures • Data flow machines, vector machines, VLIW, SIMD, systolic arrays • Have made their way into the designs of embedded machines • Entire systems can fit on a single chip • Compiler technology helps to evaluate architectural designs
Applications of Compiler Technology • Program Translations • Binary Translation • Translate binary code of one machine to that of another • Allow machine to run programs compiled for another instruction set • Used to increase the availability of software for their machines • Can provide backward compatibility • Hardware synthesis • Hardware designs are described in high-level hardware description languages like Verilog and VHDL • Described at register transfer level (RTL) • Variables represent registers • Expressions represent combinational logic • Tools translate RTL descriptions into gates, which are then mapped to transistors and eventually to physical layout
Applications of Compiler Technology • Program Translations • Database Query Interpreters • Languages are useful in other applications • Query languages like SQL are used to search databases • Queries consist of predicates containing relational and boolean operators • Can be interpreted or compiled into commands to search a database • Compiled Simulation • Simulation • Technique used in scientific and engineering disciplines • Used to understand a phenomenon or validate a design • Inputs include a description of the design and specific input parameters for that run
Applications of Compiler Technology • Software Productivity Tools • Testing is a primary technique for locating errors in a program • Use data flow analysis to locate errors statically • Problem of finding all program errors is undecidable • Ways in which program analysis has improved software productivity • Type Checking • Catch inconsistencies in the program • Operation applied to wrong type of object • Parameters to a procedure do not match the signature • Go beyond finding type errors by analyzing flow of data • If pointer is assigned null and then dereferenced, the program is clearly in error
Applications of Compiler Technology • Software Productivity Tools • Bounds Checking • Many security breaches are caused by buffer overflows in programs written in C • Data-flow analysis can be used to locate buffer overflows • Failing to identify a buffer overflow may compromise the security of the system • Memory-management tools • Automatic memory management (garbage collection) removes memory-management errors such as memory leaks • Tools have been developed to help programmers find memory-management errors • Purify – dynamically catches memory-management errors as they occur
Programming Language Basics • The Static/Dynamic Distinction • Which decisions can the compiler make about a program? • Static Policy – Language uses a policy that allows the compiler to decide an issue, i.e., at compile time • Dynamic Policy – Policy that allows a decision to be made only when we execute the program, i.e., at run time • Scope of Declarations • Scope of a declaration of x is the region of the program in which uses of x refer to this declaration • Static or Lexical Scope: Used if it is possible to determine the scope of a declaration by looking only at the program • Dynamic Scope: As the program runs, the same use of x could refer to any of several different declarations of x • Example: public static int x;
Programming Language Basics • Environments and States • Changes that occur as the program runs may • Affect the values of the data elements • Affect the interpretation of names for that data • Association of names with locations in memory (the store) and then with values is described as a two-stage mapping • Environment – Mapping from names to locations in the store • State – Mapping from locations in the store to their values. It maps l-values to their corresponding r-values
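The two-stage mapping can be modelled with two dictionaries, as in the sketch below. The variable names and the numeric addresses are made up for illustration.

```python
# Environment: names -> locations in the store (addresses are made up).
environment = {"x": 1000, "y": 1004}

# State: locations -> values currently held there.
state = {1000: 3, 1004: 7}


def l_value(name):
    """Location denoted by a name (first stage of the mapping)."""
    return environment[name]


def r_value(name):
    """Value currently stored at the name's location (both stages)."""
    return state[l_value(name)]


def assign(name, value):
    """An assignment changes the state; the environment is untouched."""
    state[l_value(name)] = value


old_environment = dict(environment)
assign("x", 42)
```

After the assignment, `x` still maps to the same location, but that location now holds a different value.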
Programming Language Basics • Environments and States • Example • Exceptions to environment and state mappings • Static versus dynamic binding of names to locations • Static versus dynamic binding of locations to values
Programming Language Basics • Static Scope and Block Structure • Scope rules for C – based on program structure • Scope of a declaration – determined by the location of its appearance • Languages like C++, C# and Java also provide explicit control over scopes – public, private and protected • Static scope rules for a language with blocks – a block is a grouping of declarations and statements • C static scope policy is as follows: • C program is a sequence of top-level declarations of variables and functions • Functions may have variable declarations within them, the scope of which is restricted to the function in which they appear • Scope of a top-level declaration of a name x consists of the entire program that follows
Programming Language Basics • Static Scope and Block Structure • The syntax of blocks in C is given by • A block is a type of statement and can appear anywhere that other statements can appear • A block is a sequence of declarations followed by a sequence of statements, all surrounded by braces • Block structure – nesting property of blocks • Static scope rule for variable declarations is as follows: • If declaration D of name x belongs to block B, • Then the scope of D is all of B, except for any blocks B' nested to any depth within B in which x is redeclared
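The static scope rule for nested blocks can be sketched as a chain of per-block tables searched from the innermost block outward. The class name `Scope` and its `declare`/`resolve` methods are hypothetical.

```python
class Scope:
    """Declarations of one block, linked to the enclosing block."""

    def __init__(self, parent=None):
        self.names = {}          # declarations made directly in this block
        self.parent = parent     # enclosing block, or None at top level

    def declare(self, name, info):
        self.names[name] = info

    def resolve(self, name):
        """Find the innermost enclosing declaration of `name`."""
        scope = self
        while scope is not None:
            if name in scope.names:
                return scope.names[name]
            scope = scope.parent
        raise NameError(f"{name} is not declared")


outer = Scope()                  # block B
outer.declare("x", "x declared in B")
outer.declare("y", "y declared in B")

inner = Scope(parent=outer)      # block B' nested within B
inner.declare("x", "x declared in B'")
```

Inside the nested block the redeclared `x` hides the outer one, while `y` is still found in the enclosing block.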
Programming Language Basics • Static Scope and Block Structure • Blocks in a C++ program
Programming Language Basics • Explicit Access Control • Classes and structures introduce new scope for their members • If p is an object of a class with a field x, then use of x in p.x refers to field x in the class definition • The scope of declaration x in a class C extends to any subclass C’, except if C’ has a local declaration of the same name x • Public, private and protected – provide explicit control over access to member names in a super class • In C++, class definition may be separated from the definition of some or all of its methods • A name x associated with the class C may have a region of code that is outside its scope followed by another region within its scope
Programming Language Basics • Dynamic Scope • Based on factors that can be known only when the program executes • A use of a name x refers to the declaration of x in the most recently called, not-yet-terminated procedure with such a declaration • Example: macro expansion in the C preprocessor • Dynamic scope resolution is essential for polymorphic procedures
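Dynamic scope can be emulated by searching a stack of active procedure calls from the most recent one backwards, as in this sketch. Representing each call as a dictionary of its local declarations is an assumption made for illustration.

```python
def lookup_dynamic(call_stack, name):
    """Resolve `name` against the most recent active call declaring it.

    `call_stack` lists the active calls, oldest first; each call is a
    dict holding the names that procedure declares.
    """
    for frame in reversed(call_stack):
        if name in frame:
            return frame[name]
    raise NameError(f"{name} is not declared")


# main declares x and calls b; b declares its own x and calls c,
# which declares nothing and uses x.
call_stack = [
    {"x": "x of main"},
    {"x": "x of b"},
    {},
]
```

Which declaration a use of `x` inside `c` refers to depends on the calls that are active at that moment, not on where `c` appears in the program text.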
Programming Language Basics • Dynamic Scope • Method resolution in OOP • The procedure called when x.m() is executed depends on the class of the object denoted by x at that time • Example: • Class C with a method named m() • D is a subclass of C, and D has its own method named m() • There is a use of m of the form x.m(), where x is an object of class C • Normally impossible to tell at compile time whether x will be of class C or of the subclass D • Cannot be decided until run time which definition of m is the right one • Code generated by the compiler must determine the class of the object x, and call one or the other method named m
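The example above can be written out directly. The class and method names follow the slide; the returned strings and the helper `call_m` are added for illustration.

```python
class C:
    def m(self):
        return "C.m"


class D(C):                      # D is a subclass of C
    def m(self):                 # D has its own method named m
        return "D.m"


def call_m(x):
    # Which m runs is decided at run time from the class of the
    # object that x denotes at this moment.
    return x.m()
```

The single call site `x.m()` in `call_m` invokes different definitions of `m` depending on the object passed in.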
Programming Language Basics • Parameter Passing Mechanisms • How actual parameters are associated with formal parameters • Actual parameters – used in the call of a procedure • Formal parameters – used in the procedure definition • Call-by-Value • The actual parameter is evaluated (if it is an expression) or copied (if it is a variable) • Value is placed in the location belonging to the corresponding formal parameter of the called procedure • Computation involving formal parameters done by the called procedure is local to that procedure, and the actual parameters cannot be changed • In C, we can pass a pointer to a variable to allow that variable to be changed by the callee • Array names passed as parameters in C, C++ or Java give the called procedure what is in effect a pointer or reference to the array itself
Programming Language Basics • Parameter Passing Mechanisms • Call-by-Reference • Address of actual parameter is passed to the callee as the value of the corresponding formal parameter • Changes to formal parameter appear as changes to the actual parameter • Essential when the formal parameter is a large object, array or a structure • Strict call-by-value requires that the caller copy the entire actual parameter into the space of the corresponding formal parameter • Copying is expensive when the parameter is large • Call-by-Name • The callee executes as if the actual parameter were substituted literally for the formal parameter in the code of the callee
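Python has no built-in call-by-name, but its effect can be approximated by passing a zero-argument function (a thunk) that is re-evaluated at every use of the formal parameter. The helper names below are hypothetical, and the sketch only shows the re-evaluation behaviour, not textual substitution itself.

```python
def make_counter():
    """Return a function that yields 1, 2, 3, ... on successive calls."""
    count = 0

    def next_value():
        nonlocal count
        count += 1
        return count

    return next_value


def double_by_value(x):
    # x was evaluated once, before the call.
    return x + x


def double_by_name(x_thunk):
    # The actual parameter is re-evaluated at each use, as if its text
    # had been substituted for the formal parameter.
    return x_thunk() + x_thunk()


by_value = double_by_value(make_counter()())
by_name = double_by_name(make_counter())
```

With an actual parameter that has a side effect, the two mechanisms give different answers: the by-value call adds 1 + 1, the by-name call adds 1 + 2.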
Programming Language Basics • Aliasing • Consequence of call-by-reference parameter passing • Possible that two formal parameters can refer to the same location • Such variables are said to be aliases of one another • Example: • a is an array belonging to procedure p, and p calls another procedure q(x,y) with a call q(a,a) • Parameters are passed by value but the array names are references to the location where the array is stored • So, x and y become aliases of each other • Understanding aliasing is essential for a compiler that optimizes a program
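The q(a,a) example behaves the same way in Python, where a list argument is passed as a reference to the list object. The body of `q` below is an assumption chosen to make the aliasing visible.

```python
def q(x, y):
    x[0] = 10        # write through x ...
    return y[0]      # ... and read through y


def p():
    a = [1, 2, 3]
    return q(a, a), a    # x and y both refer to the array a
```

Because `x` and `y` are aliases, the write through `x` is visible through `y`. An optimizer that wanted to replace the read `y[0]` with a previously known value would first have to prove that `x` and `y` cannot refer to the same array.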
LEXICAL ANALYSIS
Objectives • Role of Lexical analyzer • Lexical analysis using formal language definitions with Finite Automata • Specification of Tokens • Recognition of Tokens
Programming Language Structure • A Programming Language is defined by: • SYNTAX • Decides whether a sentence in a language is well-formed • SEMANTICS • Determines the meaning, if any, of a syntactically well-formed sentence • GRAMMAR • Provides a generative finite description of the language • Well developed tools (regular, context-free and attribute grammars) are available for the description of syntax • Lexical analyzer and the Parser handle the syntax of the programming language
The Role of the Lexical Analyzer • Main task of lexical analyzer • Read input characters in a source program • Group them into lexemes • Produce as output a sequence of tokens, one for each lexeme • Stream of tokens is sent to the parser • Whenever a lexeme constituting an identifier is found, it is entered into the symbol table