Chap 5

Chap 5. Compilers. Basic compiler Functions. (1) A high-level programming language is usually described in terms of a grammar. This grammar specifies the syntax of legal statements in the language.

Presentation Transcript


  1. Chap 5 Compilers

  2. Basic compiler Functions (1) A high-level programming language is usually described in terms of a grammar. This grammar specifies the syntax of legal statements in the language.

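  3. Basic compiler Functions For the purposes of compiler construction, a high-level programming language is usually described in terms of a grammar. This grammar specifies the form, or syntax, of legal statements in the language. For example, an assignment statement might be defined by the grammar as a variable name, followed by an assignment operator (:=), followed by an expression. Compilation then becomes a matter of matching the statements written by the programmer to structures defined by the grammar, and generating the appropriate object code for each statement.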

  4. Basic compiler Functions (2) A source program statement is a sequence of tokens, which may be thought of as the fundamental building blocks of the language.

  5. Basic compiler Functions Tokens may be thought of as the fundamental building blocks of the language. For example, a token might be a keyword, a variable name, an integer, an arithmetic operator, etc. The task of scanning the source statement and recognizing and classifying the various tokens is known as lexical analysis. The part of the compiler that performs this function is commonly called the scanner.
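
The classification of tokens described on this slide can be illustrated with a small scanner. This is only a minimal sketch: the keyword set, the token type names ("keyword", "id", "int", "op"), and the function name scan are assumptions chosen for illustration, not part of the original slides.

```python
import re

# Hypothetical keyword set for a small Pascal-like language.
KEYWORDS = {"READ", "WRITE", "BEGIN", "END", "DIV"}

# One alternative per token class; ":=" is listed before ":" so it wins.
TOKEN_PATTERN = re.compile(
    r"\s*(?:(?P<int>\d+)|(?P<word>[A-Za-z][A-Za-z0-9]*)|(?P<op>:=|[-+*();,:]))"
)

def scan(source):
    """Return a list of (token_type, text) pairs for one source statement."""
    tokens = []
    pos = 0
    source = source.rstrip()
    while pos < len(source):
        match = TOKEN_PATTERN.match(source, pos)
        if match is None:
            raise ValueError(f"unrecognized input near position {pos}")
        if match.group("int") is not None:
            tokens.append(("int", match.group("int")))
        elif match.group("word") is not None:
            word = match.group("word")
            kind = "keyword" if word.upper() in KEYWORDS else "id"
            tokens.append((kind, word))
        else:
            tokens.append(("op", match.group("op")))
        pos = match.end()
    return tokens
```

For example, scan("READ(VALUE)") classifies READ as a keyword, VALUE as an identifier, and the two parentheses as operators.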

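  6. Basic compiler Functions After the token scan, each statement in the program must be recognized as some language construct described by the grammar. This process is called syntactic analysis or parsing, and the part of the compiler that performs it is usually called the parser. The last step in the basic translation process is the generation of object code. The compilation process therefore consists of scanning, parsing, and code generation.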

  7. Basic compiler Functions (3) Character strings enclosed between the angle brackets < and > are called nonterminal symbols. Entries not enclosed in angle brackets are terminal symbols of the grammar (i.e., tokens).

  8. Basic compiler Functions (4) A grammar does not describe the semantics, or meaning, of the various statements. Difference between syntax and semantics: I := J + K and I := X + Y

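  9. Basic compiler Functions Suppose X and Y are REAL variables and I, J, K are INTEGER variables. Two assignment statements whose operands differ only in type have identical syntax. Each is an assignment statement; the value to be assigned is given by an expression that consists of two variable names separated by the operator +. The semantics differ, however. One statement specifies that the variables in the expression are to be added using integer arithmetic operations. The other specifies a floating-point addition, with any integer operand converted to floating point before adding. These two statements would be compiled into very different sequences of machine instructions during code generation.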

  10. Basic compiler Functions (5) Compiler functions consist of lexical analysis, syntactic analysis, and code generation: • Lexical analysis involves scanning the program to be compiled and recognizing the tokens that make up the source statements. • During syntactic analysis, the source statements written by the programmer are recognized as language constructs described by the grammar being used. • When the parser recognizes a portion of the source program according to some rule of the grammar, the corresponding routine, called a semantic routine, is executed.

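  11. Basic compiler Functions One notation for writing grammars is BNF (Backus-Naur Form). A BNF grammar consists of a set of rules, each of which defines the syntax of some construct in the programming language. <read> ::= READ ( <id-list> ) This rule defines the syntax of a READ statement, denoted in the grammar as <read>. The symbol ::= can be read "is defined to be." On the left of this symbol is the language construct being defined; on the right is a description of the syntax being defined for it.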

  12. Basic compiler Functions Character strings enclosed between the angle brackets < and > are called nonterminal symbols. Entries not enclosed in angle brackets are terminal symbols of the grammar (i.e., tokens). In the rule for <read>, the tokens are READ, (, and ). Blank spaces in a rule are not significant; they are included only to improve readability.

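  13. Basic compiler Functions <id-list> ::= id | <id-list> , id This rule offers two alternatives, separated by the | symbol. An <id-list> may consist simply of a token id, or it may consist of an <id-list>, followed by the token "," (comma), followed by a token id. The rule is recursive: <id-list> is defined partially in terms of itself. For example, ALPHA is an <id-list> that consists of a single id ALPHA; a longer <id-list> consists of an <id-list> such as ALPHA, followed by a comma, followed by another id. It is often convenient to display the analysis of a source statement in terms of a grammar as a tree, called the parse tree or syntax tree for the statement. Rule 10 of the grammar gives a definition of an <exp>.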

  14. Basic compiler Functions The parse tree for statement 14 from Fig. 5.1 shows the result of performing this sort of syntactic analysis in a compiler. SUMSQ DIV 100 and MEAN * MEAN must be calculated first, since these intermediate results are the operands of the - operation. This precedence is implied by the way the rules of the grammar are constructed. If there is more than one possible parse tree for a given statement, the grammar is said to be ambiguous. Unambiguous grammars are preferred in compiler construction, because an ambiguous grammar can leave doubt about what object code should be generated.

  15. Basic compiler Functions Lexical analysis involves scanning the program to be compiled and recognizing the tokens that make up the source statements. Tokens such as keywords, operators, and identifiers might be defined by rules of the grammar, for example <letter> ::= A | B | C | D | … | Z <digit> ::= 0 | 1 | 2 | 3 | … | 9 A general parser would interpret a sequence of such characters as the construct <ident>. A special-purpose routine such as the scanner can perform this same function much more efficiently.

  16. Basic compiler Functions Scanners are generally designed to recognize both single- and multiple-character tokens directly. For example, the character string READ is interpreted as a single token rather than as a sequence of four tokens R, E, A, D. The character-by-character approach creates considerably more work for the parser. The output of the scanner consists of a sequence of tokens. For efficiency, each token is usually represented by some fixed-length code, rather than as a variable-length character string.

  17. Basic compiler Functions If the token being scanned is a keyword or an operator, such a coding scheme gives sufficient information. For an identifier or an integer, the scanner must also supply a token specifier along with the type code. Figure 5.6 shows the output from a scanner for the program in Fig. 5.1, using the coding scheme in Fig. 5.5. This does not mean that the entire program is scanned at one time, before any other processing. More often, the scanner is called by the parser when it needs another token.
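
The idea of a fixed-length type code plus an optional token specifier can be sketched as follows. The numeric codes below are hypothetical and are not the coding scheme of Fig. 5.5; the function name encode and the (type, text) input format are likewise assumptions made for this example.

```python
# Hypothetical coding scheme: every token type gets a small integer code.
TOKEN_CODES = {"READ": 1, "(": 2, ")": 3, ",": 4, ":=": 5, "+": 6, "id": 7, "int": 8}

def encode(tokens):
    """Map (type, text) pairs to (code, specifier) pairs.

    Keywords and operators need only a code. Identifiers carry their
    name, and integers carry their value, as the token specifier.
    """
    encoded = []
    for kind, text in tokens:
        if kind == "id":
            encoded.append((TOKEN_CODES["id"], text))
        elif kind == "int":
            encoded.append((TOKEN_CODES["int"], int(text)))
        else:
            encoded.append((TOKEN_CODES[text], None))
    return encoded
```

With these assumed codes, the statement READ(VALUE) becomes four fixed-size entries, only one of which (the identifier) needs a specifier.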

  18. Basic compiler Functions The scanner is usually responsible for reading the lines of the source program as needed, and possibly for printing the source listing. Anything the scanner discards, such as comments, is removed from the source statements before parsing begins. The scanner must take into account any special format required of the source statements. For example, in FORTRAN a number in columns 1-5 of a source statement should be interpreted as a statement number, not as an integer. The rules for recognizing tokens may also vary from one part of the program to another.

  19. Figure 5.6 Lexical scan of the program from Fig. 5.1

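  20. Basic compiler Functions In a FORTRAN DO statement, the scanner should recognize DO as a keyword, 10 as a statement number, I as an identifier, etc. However, in the statement DO 10 I = 1 the same characters form an assignment to a variable named DO10I. Remember that blanks are not significant in FORTRAN. In some languages, a word may also represent either a keyword or a variable name defined by the programmer.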

  21. Basic compiler Functions In such a case, the scanner might interact with the parser so that it could tell the proper interpretation of each word, or it might simply place identifiers and keywords in the same class, leaving the task of distinguishing between them to the parser.

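  22. Basic compiler Functions A finite automaton has a set of states and a set of transitions between them. One of the states is designated as the starting state, and one or more states are designated as final states. Finite automata are often represented graphically, as illustrated in Fig. 5.7(a). The automaton moves from state to state as it scans characters; it stops when there is no transition from its current state that matches the next character to be scanned.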

  23. Basic compiler Functions Consider the first input string in Fig. 5.7(b). Figure 5.7 Graphical representation of a finite automaton

  24. Basic compiler Functions The a causes a transition from State 1 to State 2. If the next character to be scanned is c while the automaton is in State 2, there is no transition to follow, so the automaton stops. The automaton recognizes tokens of the form abc…abc… where the grouping abc is repeated one or more times, and the c within each grouping may also be repeated.
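
A table-driven version of such an automaton might look like the sketch below. The transition table is an assumption reconstructed from the description on this slide (four states, with State 4 as the only final state); it is not copied from Fig. 5.7. This sketch also counts an input as recognized only if every character was consumed.

```python
# Transition table assumed from the description: states 1-4, State 4 final.
TRANSITIONS = {
    (1, "a"): 2,
    (2, "b"): 3,
    (3, "c"): 4,
    (4, "c"): 4,  # the c within a grouping may repeat
    (4, "a"): 2,  # start another abc grouping
}
START_STATE = 1
FINAL_STATES = {4}

def run(text):
    """Return (recognized, stopping_state, characters_consumed)."""
    state = START_STATE
    consumed = 0
    for ch in text:
        next_state = TRANSITIONS.get((state, ch))
        if next_state is None:
            break  # no transition: the automaton stops here
        state = next_state
        consumed += 1
    recognized = state in FINAL_STATES and consumed == len(text)
    return recognized, state, consumed
```

Running it on the string ac reproduces the behavior described above: the a moves the automaton to State 2, and it then stops there because State 2 has no transition on c.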

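  25. Basic compiler Functions Identifiers must begin with a letter and may continue with any sequence of letters and digits. The notation A-Z indicates a transition on any character from A to Z. A finite automaton can be drawn that recognizes identifiers of this type. If underscores are also allowed, the automaton can be designed so that it does not recognize identifiers that end with an underscore, or that contain two consecutive underscores.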
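
The identifier rules on this slide can be expressed as a small state machine. The three state names used here are hypothetical labels, not the state numbers of the original figure, and the sketch assumes uppercase letters only.

```python
import string

LETTERS = set(string.ascii_uppercase)
DIGITS = set(string.digits)

def is_identifier(text):
    """Finite-automaton style check, using three hypothetical states.

    start       : nothing scanned yet
    in_name     : last character was a letter or digit (final state)
    after_under : last character was an underscore (not final)
    """
    state = "start"
    for ch in text:
        if state == "start":
            if ch in LETTERS:
                state = "in_name"
            else:
                return False  # must begin with a letter
        elif state == "in_name":
            if ch in LETTERS or ch in DIGITS:
                state = "in_name"
            elif ch == "_":
                state = "after_under"
            else:
                return False
        else:  # after_under
            if ch in LETTERS or ch in DIGITS:
                state = "in_name"
            else:
                return False  # rejects two consecutive underscores
    return state == "in_name"
```

Because after_under is not a final state, an identifier that ends with an underscore is rejected without any extra check.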

  26. Basic compiler Functions Each of the finite automata seen so far was designed to recognize one type of token. It is also possible to build a finite automaton that can recognize all of the tokens listed in Fig. 5.5. Such an automaton recognizes all identifiers and keywords with one final state (State 2). A separate table look-up operation could then be used to distinguish keywords. A separate check could also be made to ensure that identifiers are of a length permitted by the language definition.

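  27. Basic compiler Functions A special case arises with a string such as "VAR" followed by a period. When the period is scanned, the scanner can perform a check to see whether the string being recognized is "END." If it is not, the scanner could, in effect, back up to State 2 (recognizing the "VAR"). Figure 5.10(a) shows a typical algorithm to recognize such a token. The algorithm follows the finite automaton of Fig. 5.8(b), with its first portion corresponding to State 1.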

  28. Figure 5.10 Token recognition using (a) algorithmic code (b) tabular representation of finite automaton

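  29. Basic compiler Functions During syntactic analysis, the source statements written by the programmer are recognized as language constructs described by the grammar being used. Parsing techniques are divided into two general classes: bottom-up and top-down. We consider one bottom-up method and one top-down method, and show the application of these techniques to our example program.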

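  30. Operator-precedence parsing The bottom-up method considered here is the operator-precedence method. It is based on examining pairs of consecutive operators and deciding which should be evaluated first. For example, multiplication and division are performed before addition and subtraction: + < * and * > +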

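  31. Operator-precedence parsing Consider the expression A + B * C - D. The subexpression B * C is to be computed before either of the other operations in the expression is performed. In terms of the parse tree, the * operation appears at a lower level than does either + or -. The subexpression is then interpreted in terms of the rules of the grammar, and the process continues with the rest of the expression. When the end of the expression is reached, the analysis is complete.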
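
The order in which the subexpressions of A + B * C - D are recognized can be traced with a simplified sketch. It uses numeric precedence levels in place of a full precedence matrix, and two stacks rather than one; these simplifications, the LEVEL table, and the function names are assumptions for illustration only.

```python
# Hypothetical precedence levels; a higher number binds more tightly.
LEVEL = {"+": 1, "-": 1, "*": 2, "DIV": 2}

def relation(left, right):
    """Relation between two consecutive operators, left appearing first."""
    # Equal levels yield '>' so that operators associate left to right.
    return "<" if LEVEL[left] < LEVEL[right] else ">"

def parse(tokens):
    """Return the subexpressions in the order they are recognized."""
    operands, operators, steps = [], [], []

    def reduce_top():
        op = operators.pop()
        right = operands.pop()
        left = operands.pop()
        text = f"({left} {op} {right})"
        steps.append(text)
        operands.append(text)

    for token in tokens:
        if token in LEVEL:
            # Reduce while the operator on the stack takes precedence.
            while operators and relation(operators[-1], token) == ">":
                reduce_top()
            operators.append(token)
        else:
            operands.append(token)
    while operators:
        reduce_top()
    return steps
```

Calling parse("A + B * C - D".split()) recognizes (B * C) first, then the addition, then the subtraction, which matches the order described above.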

  32. Operator-precedence parsing The first step is to determine the precedence relations between the operators of the grammar. In this context, operator is taken to mean any terminal symbol (i.e., any token), so there are also precedence relations involving tokens such as BEGIN. Precedence relations do not follow the ordinary rules for comparisons; the relation can depend on the order in which the two tokens appear. That is, when ; is followed by END, the ; has higher precedence. But when END is followed by ;, the END has higher precedence.

  33. Operator-precedence parsing In some cases there is no precedence relation between a pair of tokens. This means that the two tokens cannot appear together in any legal statement. If such a combination occurs during parsing, it should be recognized as a syntax error. There are algorithms for constructing a precedence matrix like Fig. 5.11 from a grammar [Aho et al. (1988)].

  34. Operator-precedence parsing Consider the operator-precedence parse of the statement from line 9 of the program in Fig. 5.1. The statement is scanned one token at a time. Examining each pair of operators identifies the portion of the statement delimited by the relations < and >, which is then interpreted in terms of the grammar. An id, for example, can be recognized as a <factor> according to Rule 12 of the grammar.

  35. Operator-precedence parsing It does not matter which nonterminal symbol is being recognized; the portion is simply interpreted as a nonterminal <N1> and is replaced by <N1>. The parse tree that corresponds to this interpretation appears in the accompanying figure. The parser generally uses a stack to hold tokens that have been scanned but not yet parsed. The rest of the statement is then recognized in the same way. This completes the parsing of the READ statement: the parser has identified the syntax of the statement, which is the goal of the parsing process.

  36. Operator-precedence parsing Consider the step-by-step parsing of the assignment statement from line 14 of the program in Fig. 5.1. At each step, the parser identifies the next portion of the statement to be recognized, which is the first portion delimited by < and >. Once this portion has been determined, it is interpreted according to some rule of the grammar. In Fig. 5.3 the id SUMSQ is interpreted first as a <factor>; in the operator-precedence parse it is simply the single nonterminal <N1>.

  37. Operator-precedence parsing Operator precedence was one of the earliest bottom-up parsing methods. The ideas behind the operator-precedence technique were later developed into a more general method known as shift-reduce parsing. The two main actions that can be taken are shift (push the current token onto the stack) and reduce (recognize symbols on top of the stack according to a rule of the grammar). In the example, the parser shifts (pushing the current token onto the stack) when it encounters the token BEGIN. The shift action is also applied to the next three tokens; then the reduce action is invoked. A set of tokens from the top of the stack is reduced to a nonterminal, which remains on the stack to be reduced later as part of the READ statement.

  38. Operator-precedence parsing Shift roughly corresponds to the action taken when an operator-precedence parser encounters the relations < and =. Reduce roughly corresponds to the action taken when an operator-precedence parser encounters the relation >.
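
The shift and reduce actions can be traced with a deliberately naive sketch. It reduces as soon as the top of the stack matches the right-hand side of a rule, trying longer rules first, and uses no lookahead or parsing table, so it is not a general shift-reduce parser. The RULES list covers only the READ statement, and its layout is an assumption made for this example.

```python
# Grammar rules as (right-hand side, left-hand side); longer rules are tried first.
RULES = [
    (("READ", "(", "<id-list>", ")"), "<read>"),
    (("<id-list>", ",", "id"), "<id-list>"),
    (("id",), "<id-list>"),
]

def shift_reduce(tokens):
    """Return the list of (action, stack) steps taken while parsing."""
    stack, trace = [], []
    for token in tokens:
        stack.append(token)  # shift the current token onto the stack
        trace.append(("shift", list(stack)))
        reduced = True
        while reduced:  # reduce as long as some rule matches the stack top
            reduced = False
            for rhs, lhs in RULES:
                if tuple(stack[-len(rhs):]) == rhs:
                    del stack[-len(rhs):]
                    stack.append(lhs)
                    trace.append(("reduce", list(stack)))
                    reduced = True
                    break
    return trace
```

For the token sequence READ ( id ), the trace ends with a reduce step that leaves only <read> on the stack, after an earlier reduce step has turned the id into an <id-list>.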

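  39. Operator-precedence parsing The top-down method considered here is known as recursive descent. A recursive-descent parser is made up of a procedure for each nonterminal symbol in the grammar. The task of a procedure is to find a substring of the input, beginning with the current token, that can be interpreted as the nonterminal with which the procedure is associated. If it succeeds, it returns with the token pointer advanced past the substring it has just recognized. The procedure for <read> examines the next two input tokens, looking for READ and (. If these are found, the procedure for <read> then calls the procedure for <id-list>. If that succeeds, it examines the next input token, looking for ). If all these tests succeed, the procedure returns an indication of success.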

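  40. Operator-precedence parsing When there are several alternatives defined by the grammar for a nonterminal, the procedure must decide which of the alternatives to try. In a recursive-descent parser, this decision is made by examining the next input token. The procedure for <id-list>, however, cannot decide between its two alternatives, since both id and <id-list> can begin with id. If it tried the second alternative, it would make an immediate recursive call to itself, which leads to an unending chain of calls. This problem is known as immediate left recursion.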

  41. Operator-precedence parsing In an extended notation, the terms between { and } may be omitted, or repeated one or more times. An <id-list> is then defined as an id followed by zero or more occurrences of ", id". Recursive-descent parsing of the READ statement is based on the grammar in Fig. 5.15.
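
A recursive-descent recognizer for the READ statement can be sketched with one method per nonterminal, using a loop for the repeated ", id" part. The class name Parser, the helper accept, and the (type, text) token format are assumptions for illustration, not the procedures of the original figures.

```python
class Parser:
    """Recursive-descent recognizer for:
        <read>    ::= READ ( <id-list> )
        <id-list> ::= id { , id }
    Tokens are (type, text) pairs; the type names are hypothetical.
    """

    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0  # index of the current token

    def current(self):
        if self.pos < len(self.tokens):
            return self.tokens[self.pos]
        return ("eof", "")

    def accept(self, kind, text=None):
        """Advance past the current token if it matches; report success."""
        tok_kind, tok_text = self.current()
        if tok_kind == kind and (text is None or tok_text == text):
            self.pos += 1
            return True
        return False

    def id_list(self):
        if not self.accept("id"):
            return False
        # Iteration replaces the left-recursive alternative.
        while self.accept("op", ","):
            if not self.accept("id"):
                return False
        return True

    def read(self):
        return (
            self.accept("keyword", "READ")
            and self.accept("op", "(")
            and self.id_list()
            and self.accept("op", ")")
        )


def parse_read(tokens):
    """Return True if the whole token list is a valid READ statement."""
    parser = Parser(tokens)
    return parser.read() and parser.pos == len(tokens)
```

Because id_list loops instead of calling itself first, the procedure always consumes a token before repeating, so the unending chain of calls caused by left recursion cannot occur.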

  42. Operator-precedence parsing Consider a graphic representation of the recursive-descent parsing process for the statement being analyzed. In part (ii), READ has called IDLIST, which has examined the token id. READ has then examined the next input token. The parse tree is constructed beginning at the root, hence the term top-down parsing.

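  43. Operator-precedence parsing The code-generation technique we describe involves a set of routines. When the parser recognizes a portion of the source program according to some rule of the grammar, the corresponding routine is executed. Such routines are called semantic routines, because the processing they perform is related to the meaning of the corresponding construct in the language. When they generate object code directly, they are also called code-generation routines. Other compilers generate an intermediate form of the program that can be analyzed further in an attempt to generate more efficient object code.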

  44. Operator-precedence parsing The operator-precedence method ignores certain nonterminals. The routines discussed here generate object code for a SIC/XE machine and make use of two data structures for working storage: a list and a stack. Items inserted into the list are removed in the order of insertion, first in-first out. Items pushed onto the stack are removed in the opposite order, last in-first out. The routines also use token specifiers: for an identifier, the specifier is the name of the identifier, or a pointer to the symbol-table entry; for an integer, it is the value of the integer, such as #100.
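
The difference between the two working-storage structures can be shown in a few lines. The variable names and the sample identifiers below are hypothetical.

```python
from collections import deque

work_list = deque()  # FIFO: items leave in the order they were inserted
work_stack = []      # LIFO: the most recently pushed item leaves first

for name in ["ALPHA", "BETA", "GAMMA"]:
    work_list.append(name)
    work_stack.append(name)

# Remove everything from each structure and record the order of removal.
list_order = [work_list.popleft() for _ in range(3)]
stack_order = [work_stack.pop() for _ in range(3)]
```

After this runs, list_order holds the names in insertion order, while stack_order holds them in reverse.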

  45. Operator-precedence parsing The code-generation routines create segments of object code for the compiled program. As each piece of object code is generated, LOCCTR is updated to reflect the next available address in the compiled program. The parse tree for this statement is repeated for convenience in Fig. 5.18(a). In an operator-precedence parse, recognition occurs when a substring of the input is reduced to a nonterminal <N1>. In a recursive-descent parse, the recognition occurs when a procedure returns to its caller, indicating success. The parser first recognizes the id VALUE as an <id-list>, and then recognizes the complete statement as a <read>.

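  46. Operator-precedence parsing The code generated for a READ statement consists of a call to XREAD, a subroutine of a standard library associated with the compiler. XREAD can be called by any program that wants to perform a READ operation. Since XREAD may be used to perform any READ operation, it must be passed parameters; these are placed immediately after the JSUB that calls it. The first parameter is a value that specifies the number of variables that will be assigned values by the READ. It is followed by the addresses of these variables. For a READ of the single variable VALUE, the first parameter specifies that one variable is to be read.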
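
The parameter layout described on this slide can be sketched as a small code-generation routine that emits assembler-style text. The function name gen_read, the use of the extended-format +JSUB, and the WORD directives for the parameters are assumptions about how the generated code is written out; the slide itself states only that the parameters follow the JSUB.

```python
def gen_read(id_list):
    """Return assembler-style lines for a READ of the given variables.

    Assumed layout: the parameters follow the +JSUB as WORD entries,
    first the variable count, then one address per variable.
    """
    lines = ["+JSUB   XREAD"]
    lines.append(f"WORD    {len(id_list)}")
    for name in id_list:
        lines.append(f"WORD    {name}")
    return lines
```

For gen_read(["VALUE"]) the result is the subroutine call, a count of 1, and a single address entry for VALUE.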