Course Overview PART I: overview material 1 Introduction 2 Language processors (tombstone diagrams, bootstrapping) 3 Architecture of a compiler PART II: inside a compiler 4 Syntax analysis 5 Contextual analysis 6 Runtime organization 7 Code generation PART III: conclusion 8 Interpretation 9 Review Supplementary material: Theoretical foundations (Finite-state machines)
Finite State Machines (aka Finite Automata) • A FSM is similar to a compiler in that: • A compiler recognizes legal programs in some (source) language. • A finite-state machine recognizes legal strings in some language. • Example: Pascal Identifiers • sequences of one or more letters or digits, starting with a letter. [Diagram: state S goes to state A on a letter; A loops to itself on letter | digit.]
Finite State Machines viewed as Graphs • A state • The start state • An accepting state • A transition (an edge labeled with an input symbol, such as a)
Finite State Machines • Transition: s1 --a--> s2 • Is read: in state s1, on input "a", go to state s2 • At end of input: • If in an accepting state => accept • Otherwise => reject • If no transition is possible (got stuck) => reject
Language defined by FSM • The language defined by a FSM is the set of strings accepted by the FSM. • In the language of the Pascal-identifier FSM shown above: • x, mp2, XyZzy, position27. • Not in the language of that FSM: • 123, a?, 13apples.
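The accept/reject rules can be tried out on the Pascal-identifier machine with a short Python sketch. It assumes that S is the start state and A the only accepting state, and that "letter" means an ASCII letter; the function name is made up for this illustration.

```python
import string

LETTERS = set(string.ascii_letters)  # assumption: ASCII letters only
DIGITS = set(string.digits)

def accepts_identifier(text):
    """Simulate the two-state identifier FSM (S = start, A = accepting)."""
    state = "S"
    for ch in text:
        if state == "S" and ch in LETTERS:
            state = "A"                      # S --letter--> A
        elif state == "A" and (ch in LETTERS or ch in DIGITS):
            state = "A"                      # A --letter|digit--> A
        else:
            return False                     # no transition possible: stuck, reject
    return state == "A"                      # end of input: accept only in A

for s in ["x", "mp2", "XyZzy", "position27", "123", "a?", "13apples"]:
    print(s, accepts_identifier(s))
```

The first four strings are accepted and the last three are rejected, matching the lists on the slide.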
Example: Integer Literals • FSM that accepts integer literals with an optional + or - sign. [Diagram: start state S goes to A on + or -; S and A go to B on a digit; B, the accepting state, loops to itself on a digit.]
Formal Definition • Each finite state machine is a 5-tuple (Σ, Q, δ, q, F) that consists of: • An input alphabet Σ • A set of states Q • A start state q • A set of accepting states (or final states) F ⊆ Q • δ is a state transition function: Q × Σ → Q that encodes transitions state_i --input--> state_j
State-Transition Function for the integer-literal example: • δ(S, +) = A • δ(S, -) = A • δ(S, digit) = B • δ(A, digit) = B • δ(B, digit) = B
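One way to make the 5-tuple concrete is to store the transition function as a dictionary keyed by (state, symbol). The sketch below does this for the integer-literal machine. It assumes B is the only accepting state (the slide text lists the transitions but not F), and the names DELTA, START, ACCEPTING, and accepts are chosen for this illustration.

```python
DIGITS = "0123456789"

# Transition function delta as a dictionary: (state, symbol) -> next state.
DELTA = {("S", "+"): "A", ("S", "-"): "A"}
for d in DIGITS:
    DELTA[("S", d)] = "B"   # delta(S, digit) = B
    DELTA[("A", d)] = "B"   # delta(A, digit) = B
    DELTA[("B", d)] = "B"   # delta(B, digit) = B

START = "S"
ACCEPTING = {"B"}           # assumption: B is the only accepting state

def accepts(text):
    state = START
    for ch in text:
        state = DELTA.get((state, ch))
        if state is None:   # missing entry: the machine is stuck
            return False
    return state in ACCEPTING

print(accepts("+752"), accepts("42"), accepts("+"), accepts("12a"))
```

A missing dictionary entry plays the role of "no transition possible", so the transition function here is partial.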
FSM Examples • Accepts strings over alphabet {0,1} that end in 1 [Diagram: two states, A and B, with transitions labeled 0 and 1.]
FSM Examples • Accepts strings over alphabet {a,b} that begin and end with same symbol [Diagram: five states, numbered 1 to 5, with transitions labeled a and b.]
FSM Examples • Accepts strings over {0,1,2} such that sum of digits is a multiple of 3 [Diagram: a state marked "Start" and further states, with transitions labeled 0, 1, and 2.]
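A machine for this language only needs to remember the running digit sum modulo 3, so three states are enough. The Python sketch below encodes each state as the remainder 0, 1, or 2. It assumes that the start state is also the accepting state (remainder 0), which means the empty string is accepted; the function name is hypothetical.

```python
def accepts_multiple_of_three(text):
    """State r means: the digits read so far sum to r modulo 3."""
    state = 0                          # start state: empty sum is 0
    for ch in text:
        if ch not in "012":
            return False               # symbol outside the alphabet {0,1,2}
        state = (state + int(ch)) % 3  # transition on digit ch
    return state == 0                  # accepting state: remainder 0

for s in ["12", "111", "2220", "1", "22"]:
    print(s, accepts_multiple_of_three(s))
```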
FSM Examples • Accepts strings over {0,1} that have an odd number of ones [Diagram: two states, Even and Odd, with transitions labeled 0 and 1.]
FSM Examples • Accepts strings over {0,1} that contain the substring 001 [Diagram: states including '0', '00', and '001', with transitions labeled 0, 1, and 0,1.]
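A deterministic machine for "contains 001" can be written down as a transition dictionary in which each state records how much of 001 has just been seen. This is a sketch of one such machine, not necessarily edge-for-edge the one drawn on the slide; the state name "start" for the unnamed start state is an assumption.

```python
# Each state records the longest prefix of "001" matched so far.
DELTA = {
    ("start", "0"): "0",   ("start", "1"): "start",
    ("0", "0"): "00",      ("0", "1"): "start",
    ("00", "0"): "00",     ("00", "1"): "001",
    ("001", "0"): "001",   ("001", "1"): "001",  # once found, stay accepting
}

def contains_001(text):
    state = "start"
    for ch in text:
        state = DELTA.get((state, ch))
        if state is None:          # symbol outside {0,1}: stuck, reject
            return False
    return state == "001"

for s in ["001", "1001", "0101", "0001"]:
    print(s, contains_001(s))
```

Note the self-loop on '00' for input 0: after reading 000 the last two symbols are still 00, so the machine stays put rather than starting over.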
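Examples • Design a FSM to recognize strings with an equal number of ones and zeros. • Not possible: the machine would have to count arbitrarily high, but it has only finitely many states. • Design a FSM to recognize strings with an equal number of substrings "01" and "10". • Perhaps surprisingly, this is possible: in a string over {0,1} the two counts never differ by more than one, and they are equal exactly when the string is empty or begins and ends with the same symbol.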
FSM Examples • Accepts strings with an equal number of substrings "01" and "10" [Diagram: transitions labeled 0 and 1.]
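The claim can be checked mechanically. The sketch below simulates a five-state machine over {0,1} that remembers only the first symbol and the most recent symbol, and accepts when they match. It assumes the empty string is accepted (zero occurrences of each substring); the state encoding and function names are chosen for this illustration and may differ from the slide's diagram.

```python
def accepts_equal_01_10(text):
    """States: "start", or a pair (first symbol, most recent symbol)."""
    state = "start"
    for ch in text:
        if ch not in "01":
            return False              # symbol outside the alphabet
        if state == "start":
            state = (ch, ch)          # remember the first symbol
        else:
            state = (state[0], ch)    # update the most recent symbol
    # Accept the empty string, or any string whose first and last symbols match.
    return state == "start" or state[0] == state[1]

def count_pairs(text, pair):
    """Count (possibly overlapping) occurrences of a two-symbol substring."""
    return sum(1 for i in range(len(text) - 1) if text[i:i + 2] == pair)

for s in ["0110", "010", "01", "1100"]:
    print(s, accepts_equal_01_10(s),
          count_pairs(s, "01") == count_pairs(s, "10"))
```

For each sample string the two printed booleans agree, which is what the equivalence between the counting description and the finite-state description predicts.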
TEST YOURSELF • Question 1: Draw a finite-state machine that accepts Java identifiers • one or more letters, digits, or underscores, starting with a letter or an underscore. • Question 2: Draw a finite-state machine that accepts only Java identifiers that do not end with an underscore
TEST YOURSELF • Question 3: What strings does this FSM accept? Describe the set of accepted strings in English. [Diagram: four states, q0, q1, q2, and q3, with transitions labeled 0 and 1.]
Two kinds of Finite State Machines • Deterministic (DFSM): • No state has more than one outgoing edge with the same label. [All previous FSM were DFSM.] • Non-deterministic (NFSM): • States may have more than one outgoing edge with same label. • Edges may be labeled with ε (epsilon), the empty string. [Note that some books use the symbol λ.] • The automaton can make an epsilon transition without consuming the current input character.
Example of NFSM • integer-literal example [Diagram: states S, A, and B, with edges labeled +, -, and digit.]
Example of NFSM • Accepts strings over {0,1} that contain the substring 001 [Diagram: states including '0', '00', and '001', with edges labeled 0, 1, and 0,1.]
Non-deterministic finite state machines (NFSM) • sometimes simpler than DFSM • can be in multiple states at the same time • NFSM accepts a string if • there exists a sequence of moves • starting in the start state, • ending in a final state, • that consumes the entire string. • Examples: • Consider the integer-literal NFSM on input "+752" • Consider the second NFSM on input "10110001"
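"Being in multiple states at the same time" can be simulated by tracking the set of all states the NFSM could currently be in. The Python sketch below does this for both examples. The exact edges are assumptions reconstructed from the slide descriptions: the integer-literal NFSM is taken to have an epsilon edge from S to A in place of the direct digit edge from S to B, and the unnamed start state of the second NFSM is called "start".

```python
EPS = ""  # label used here for epsilon edges

def epsilon_closure(states, delta):
    """All states reachable from `states` using epsilon edges only."""
    closure, stack = set(states), list(states)
    while stack:
        s = stack.pop()
        for t in delta.get((s, EPS), ()):
            if t not in closure:
                closure.add(t)
                stack.append(t)
    return closure

def nfsm_accepts(text, delta, start, accepting):
    current = epsilon_closure({start}, delta)
    for ch in text:
        moved = set()
        for s in current:
            moved |= set(delta.get((s, ch), ()))
        current = epsilon_closure(moved, delta)
    return bool(current & accepting)   # some sequence of moves ends in a final state

# Assumed edges: S --+--> A, S -- - --> A, S --eps--> A, A --digit--> B, B --digit--> B
INT_LITERAL = {("S", "+"): {"A"}, ("S", "-"): {"A"}, ("S", EPS): {"A"}}
for d in "0123456789":
    INT_LITERAL[("A", d)] = {"B"}
    INT_LITERAL[("B", d)] = {"B"}

# Assumed edges: loops on 0,1 at "start" and '001'; a 0, 0, 1 path in between.
CONTAINS_001 = {
    ("start", "0"): {"start", "0"}, ("start", "1"): {"start"},
    ("0", "0"): {"00"}, ("00", "1"): {"001"},
    ("001", "0"): {"001"}, ("001", "1"): {"001"},
}

print(nfsm_accepts("+752", INT_LITERAL, "S", {"B"}))
print(nfsm_accepts("10110001", CONTAINS_001, "start", {"001"}))
```

Both calls report acceptance: "+752" via the + edge followed by three digit moves, and "10110001" by staying in the start state until the final 001.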
Equivalence of DFSM and NFSM • Theorem: • For each non-deterministic finite state machine N, we can construct a deterministic finite state machine D such that N and D accept the same language. • [proof omitted] • Theorem: • Every deterministic finite state machine can be regarded as a non-deterministic finite state machine that just doesn't use the extra non-deterministic capabilities.
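The slides omit the proof, but the usual way to build D from N is the subset (powerset) construction: each state of D is a set of states of N. The sketch below is a minimal Python version offered as an illustration, not as the course's own proof. All names are hypothetical, and the example NFSM is the "contains 001" machine with assumed edges and an assumed start-state name "start".

```python
EPS = ""  # label used here for epsilon edges

def closure(states, delta):
    """All NFSM states reachable from `states` using epsilon edges only."""
    result, stack = set(states), list(states)
    while stack:
        s = stack.pop()
        for t in delta.get((s, EPS), ()):
            if t not in result:
                result.add(t)
                stack.append(t)
    return result

def subset_construction(delta, start, accepting, alphabet):
    """Build a DFSM whose states are frozensets of NFSM states."""
    start_set = frozenset(closure({start}, delta))
    d_delta, d_accepting = {}, set()
    seen, todo = {start_set}, [start_set]
    while todo:
        current = todo.pop()
        if current & accepting:            # contains some NFSM final state
            d_accepting.add(current)
        for sym in alphabet:
            moved = set()
            for s in current:
                moved |= set(delta.get((s, sym), ()))
            target = frozenset(closure(moved, delta))
            if not target:
                continue                   # empty set: leave the entry empty (stuck)
            d_delta[(current, sym)] = target
            if target not in seen:
                seen.add(target)
                todo.append(target)
    return d_delta, start_set, d_accepting

def dfsm_accepts(text, d_delta, start, accepting):
    state = start
    for ch in text:
        state = d_delta.get((state, ch))
        if state is None:
            return False
    return state in accepting

NFSM_001 = {
    ("start", "0"): {"start", "0"}, ("start", "1"): {"start"},
    ("0", "0"): {"00"}, ("00", "1"): {"001"},
    ("001", "0"): {"001"}, ("001", "1"): {"001"},
}
D, D_START, D_FINAL = subset_construction(NFSM_001, "start", {"001"}, "01")
print(dfsm_accepts("10110001", D, D_START, D_FINAL))
```

Only subsets reachable from the start set are generated, so the resulting DFSM is usually much smaller than the full powerset.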
How to Implement a FSM A table-driven approach: • Table: • one row for each state in the machine, and • one column for each possible character. • Table[j][k] • which state to go to from state j on input character k, • an empty entry corresponds to the machine getting stuck.
The table-driven program for a DFSM
    state = S                              // S is the start state
    repeat {
        k = next character from the input
        if (k == EOF) then                 // end of input
            if (state is a final state) then accept and stop
            else reject and stop
        state = T[state][k]
        if (state == empty) then reject and stop   // got stuck
    }
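The same loop can be written as a small runnable Python sketch, here for the integer-literal DFSM. It departs from the slide in one respect, which is an assumption made for brevity: all ten digits share a single column instead of each character having its own. The names T, FINAL, column, and run are chosen for this illustration, and B is assumed to be the only final state.

```python
S, A, B = 0, 1, 2      # row indices, one row per state
EMPTY = None           # empty table entry: the machine gets stuck

def column(ch):
    """Map a character to its column index, or None if it has no column."""
    if ch == "+":
        return 0
    if ch == "-":
        return 1
    if ch in "0123456789":
        return 2       # assumption: all digits share one column
    return None

T = [
    # '+'    '-'    digit
    [A,      A,     B],   # state S
    [EMPTY,  EMPTY, B],   # state A
    [EMPTY,  EMPTY, B],   # state B
]
FINAL = {B}            # assumption: B is the only final state

def run(text):
    state = S                      # S is the start state
    for ch in text:                # the loop ends at end of input
        k = column(ch)
        if k is None:
            return False           # character outside the table: reject
        state = T[state][k]
        if state is EMPTY:
            return False           # got stuck: reject
    return state in FINAL          # end of input: accept iff in a final state

for s in ["+752", "0", "+", "7-5"]:
    print(s, run(s))
```

Because the machine's behavior lives entirely in T and FINAL, the same run function works for any other DFSM once the table and the column mapping are swapped out.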