12.1 Languages and Grammars • Define • Vocabulary V (alphabet) • Word (string) • Empty string λ • Free language V* • A language over V • a^n, for a in V, n in N.
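Phrase-Structure Grammars • A phrase-structure grammar is a 4-tuple (V, T, S, P), where V is a vocabulary, T is a subset of V consisting of the terminal symbols, S is a start symbol from V, and P is a finite set of productions.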
The Language Generated by a Grammar • The language L(G) generated by a grammar G is the subset of T* consisting of all the strings of terminal symbols which can be generated from the start symbol S by applying a series of productions.
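The definition of L(G) can be explored mechanically. The sketch below does a breadth-first search over sentential forms and collects the terminal strings it reaches. The grammar used (S → aSb | ab) is a hypothetical one chosen for illustration, not the grammar from the slides, and the function name `generate` is likewise an assumption. The search assumes that no production shortens a sentential form, so forms longer than the length bound can be discarded.

```python
from collections import deque

def generate(productions, start, terminals, max_len):
    """Return all terminal strings of length <= max_len derivable from start.

    Assumes no production shrinks a sentential form, so any form longer
    than max_len can safely be dropped.
    """
    seen = {start}
    queue = deque([start])
    language = set()
    while queue:
        form = queue.popleft()
        if all(ch in terminals for ch in form):
            language.add(form)  # only terminals left: a word of L(G)
            continue
        for lhs, rhs in productions:
            # Apply the production at every position where lhs occurs.
            i = form.find(lhs)
            while i != -1:
                new = form[:i] + rhs + form[i + len(lhs):]
                if len(new) <= max_len and new not in seen:
                    seen.add(new)
                    queue.append(new)
                i = form.find(lhs, i + 1)
    return language

# Hypothetical grammar: S -> aSb | ab
productions = [("S", "aSb"), ("S", "ab")]
result = generate(productions, "S", {"a", "b"}, 6)
print(sorted(result, key=len))
```

Each symbol is a single character here, which keeps the string rewriting simple; a grammar with multi-character symbol names would need lists of symbols instead of strings.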
Example Generate the string aaabb from the start symbol of the grammar at right.
Solution • Note that the language L generated by a grammar consists of all strings of terminal symbols that are derivable from the start symbol S • In the above example,
Example • A grammar that generates the set is given at right • Show how the strings 012 and 001122 can be derived from the start symbol S
Example • Find a grammar that generates the set
Language Types • Type 0 – No restrictions • Type 1 – All productions are of the form w1 → w2, where length(w1) ≤ length(w2) and w1 contains a non-terminal • Type 2 – All productions are of the form N → w, where N is a single non-terminal • Type 3 – All productions are of the form A → a or A → aB
Context-Free Grammars • Type 2 grammars (where all productions are of the form N → w) are called context-free grammars. • If a grammar is context-free, then every word in its language has a parse tree, also called a derivation tree.
Example • Construct a parse tree for aaabb in the grammar at right.
Parsing • To parse is to discover the form of the derivation tree • Top-down parsing starts at the root, with the start symbol, and constructs the tree from there • Bottom-up parsing builds the tree up from the leaves, by grouping symbols into valid right-hand sides of productions
Backus-Naur Form (BNF) • Backus-Naur Form, or BNF, is a notation for describing grammars, first used in the description of the syntax of the ALGOL 60 language • Non-terminal symbols are descriptive terms enclosed in angle brackets (e.g. <sentence> instead of S) • The symbol used for → is ::= • Alternative right-hand sides can be combined using a vertical bar (|) to separate them
Example of a Portion of a Simple BNF Grammar
<statement> ::= <if statement> | <while statement> | <expression>
<expression> ::= <term> | <expression> + <term>
<term> ::= <factor> | <term> * <factor>
<factor> ::= <identifier> | <literal> | ( <expression> )
Parse Tree for 12*(2+x)
<statement> ::= <if statement> | <while statement> | <expression>
<expression> ::= <term> | <expression> + <term>
<term> ::= <factor> | <term> * <factor>
<factor> ::= <identifier> | <literal> | ( <expression> )
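A top-down parser for the <expression> part of this grammar can be sketched as a recursive-descent parser. This is one possible approach, not the method from the slides: the left-recursive rules (<expression> + <term>, <term> * <factor>) are handled with loops, since a naive recursive translation would not terminate. The token rules (digits for <literal>, letters and underscores for <identifier>) and the tuple-based tree shape are assumptions.

```python
import re

# Assumed token shapes: integer literals, identifiers, and the symbols + * ( )
TOKEN = re.compile(r"\s*(\d+|[A-Za-z_]\w*|[+*()])")

def tokenize(text):
    tokens, pos = [], 0
    text = text.rstrip()
    while pos < len(text):
        m = TOKEN.match(text, pos)
        if not m:
            raise SyntaxError(f"unexpected character at position {pos}")
        tokens.append(m.group(1))
        pos = m.end()
    return tokens

def parse(text):
    """Return a parse tree for text as nested tuples."""
    tokens = tokenize(text)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        pos += 1

    def expression():
        # <expression> ::= <term> | <expression> + <term>
        node = ("expression", term())
        while peek() == "+":
            advance()
            node = ("expression", node, "+", term())
        return node

    def term():
        # <term> ::= <factor> | <term> * <factor>
        node = ("term", factor())
        while peek() == "*":
            advance()
            node = ("term", node, "*", factor())
        return node

    def factor():
        # <factor> ::= <identifier> | <literal> | ( <expression> )
        tok = peek()
        if tok is None:
            raise SyntaxError("unexpected end of input")
        if tok == "(":
            advance()
            inner = expression()
            if peek() != ")":
                raise SyntaxError("expected ')'")
            advance()
            return ("factor", "(", inner, ")")
        if tok in ("+", "*", ")"):
            raise SyntaxError(f"unexpected token {tok!r}")
        advance()
        kind = "literal" if tok.isdigit() else "identifier"
        return ("factor", (kind, tok))

    tree = expression()
    if pos != len(tokens):
        raise SyntaxError(f"unexpected token {tokens[pos]!r}")
    return tree

tree = parse("12*(2+x)")
print(tree)
```

The root of the resulting tree is an <expression> whose single child is a <term> of the form <term> * <factor>, with the parenthesized sum as the right-hand <factor>.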
12.2 Finite State Machines with Output • A finite state machine with output is a 6-tuple (S, I, O, f, g, s0), where S is a finite set of states, I is a finite set of input symbols, O is a finite set of output symbols, f is a state transition function, g is an output function, and s0 is a designated state from the set S, called the start state.
Extending the Output Function • Recall that the output function g maps a (state, input symbol) pair to a single output symbol • We can extend that function so that it maps an input string to a string of output symbols. • For instance, in the above example g(01100100) =
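The extension of g from single symbols to strings can be sketched as a short loop that applies f and g once per input symbol. The machine used below is a hypothetical two-state machine (it outputs 1 whenever the number of 1's read so far is odd), not the machine from the example above; the names `run_machine`, `f`, and `g` as dictionaries are assumptions.

```python
def run_machine(f, g, s0, inputs):
    """Apply the output function g symbol by symbol and return the output string."""
    state = s0
    outputs = []
    for symbol in inputs:
        outputs.append(g[(state, symbol)])  # output depends on state and input
        state = f[(state, symbol)]          # then move to the next state
    return "".join(outputs)

# Hypothetical machine: states track whether an even or odd number of 1's was read.
f = {
    ("even", "0"): "even", ("even", "1"): "odd",
    ("odd", "0"): "odd",   ("odd", "1"): "even",
}
# Output 1 exactly when the state reached after the symbol is "odd".
g = {key: ("1" if nxt == "odd" else "0") for key, nxt in f.items()}

output = run_machine(f, g, "even", "1011")
print(output)  # 1101
```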
Example • Input alphabet is {a, b}. Output a 1 if the input string began with bab, output 0 otherwise.
More Examples • Machine to reverse bits • Machine to output 1 for an odd number of 1’s, 0 otherwise
Vending Machine Example • Machine takes quarters only. • Outputs a coke if 75 cents has been entered and the vend button has been pressed (use output symbol c) • Outputs a quarter if 75 cents has already been entered (use output symbol 25) • No output otherwise (use output symbol n)
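One way to write this machine down is as a single table mapping (state, input) to (next state, output). The sketch below makes several assumptions that the slide does not state: the input symbols are named q (quarter) and v (vend button), the states are the amounts entered so far, the credit resets to 0 after a coke is dispensed, and pressing vend with less than 75 cents leaves the credit unchanged.

```python
QUARTER, VEND = "q", "v"  # assumed names for the two input symbols

# (state, input) -> (next state, output); states are cents entered so far.
TABLE = {
    (0, QUARTER): (25, "n"),
    (25, QUARTER): (50, "n"),
    (50, QUARTER): (75, "n"),
    (75, QUARTER): (75, "25"),  # extra quarter is returned
    (0, VEND): (0, "n"),
    (25, VEND): (25, "n"),
    (50, VEND): (50, "n"),
    (75, VEND): (0, "c"),       # dispense a coke, reset credit (assumption)
}

def run_vending(inputs, state=0):
    """Return the list of output symbols produced for a sequence of inputs."""
    outputs = []
    for symbol in inputs:
        state, out = TABLE[(state, symbol)]
        outputs.append(out)
    return outputs

outputs = run_vending("qqvqqv")
print(outputs)  # ['n', 'n', 'n', 'n', '25', 'c']
```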
12.3 Finite State Machines with No Output • Concatenation of sets of strings • Powers A^n of a set of strings A • Kleene Closure A* of a set of strings A
Examples: • Let • Find:
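These three operations are easy to experiment with using Python sets of strings. The sets A and B below are hypothetical, chosen only for illustration. Since A* is infinite whenever A contains a non-empty string, the sketch computes only the members of A* up to a given length; the function names are assumptions.

```python
def concat(A, B):
    """Concatenation AB: every string of A followed by every string of B."""
    return {x + y for x in A for y in B}

def power(A, n):
    """A^n, with A^0 = {λ} (the empty string is written "")."""
    result = {""}
    for _ in range(n):
        result = concat(result, A)
    return result

def kleene_upto(A, max_len):
    """The strings of A* whose length is at most max_len."""
    result = {""}
    frontier = {""}
    while frontier:
        longer = {w for w in concat(frontier, A) if len(w) <= max_len}
        frontier = longer - result  # keep only strings not seen before
        result |= frontier
    return result

A = {"0", "11"}
B = {"1"}
print(concat(A, B))       # {'01', '111'}
print(power(A, 2))        # {'00', '011', '110', '1111'}
print(kleene_upto(A, 2))  # {'', '0', '11', '00'}
```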
Deterministic Finite State Automata • A deterministic finite state automaton (DFSA) is a 5-tuple M = (S, I, f, s0, F), where S is a finite set of states, I is an input alphabet, f: S × I → S is a state transition function, s0 is the start state, and F is a subset of S called the set of final states. • The language accepted by M is the set of strings in I* which, if given as input to M, will leave the machine in a final state.
Examples • Construct a machine with input alphabet {0, 1} which accepts any string with an even number of 1’s • Construct a machine which accepts {0,1}*.
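Both machines can be checked with a small simulator. The sketch below is one possible encoding, with the transition function stored as a dictionary; the state names and the function name `accepts` are assumptions.

```python
def accepts(f, s0, final, string):
    """Run the DFSA on string and report whether it ends in a final state."""
    state = s0
    for symbol in string:
        state = f[(state, symbol)]
    return state in final

# Even number of 1's: the state records the parity of the 1's read so far.
even_ones = {
    ("even", "0"): "even", ("even", "1"): "odd",
    ("odd", "0"): "odd",   ("odd", "1"): "even",
}

# {0,1}*: a single state that is both the start state and a final state.
everything = {("s0", "0"): "s0", ("s0", "1"): "s0"}

print(accepts(even_ones, "even", {"even"}, "0110"))  # True
print(accepts(even_ones, "even", {"even"}, "0111"))  # False
print(accepts(everything, "s0", {"s0"}, "10101"))    # True
```

Note that the empty string is accepted by both machines, because the start state is final in each case.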
Example • What language is accepted by the following machine?
Example • What language is accepted by the following machine?
Non-Deterministic Finite-State Automata • A non-deterministic finite-state automaton (NFSA) is defined the same as a DFSA, except that the state transition function maps each (state, input symbol) pair to a set of states; i.e. f: S × I → P(S). • If f(s,a) is {s',s"}, then the machine in state s could, if its input is a, either transition to state s' or to state s". • If f(s,a) is {} (the empty set), then the machine in state s would consider a an invalid input symbol. • The language accepted by an NFSA is the set of all input strings which could put the machine in a final state if the right choices were made at each step in the execution of the machine.
Example • What language is accepted by the following NFSA?
Constructing a DFSA from an NFSA • Any NFSA can be transformed into a DFSA by using sets of states as the states in the new machine. • The start state of the new machine is {s0}. • The transition function g of the new machine can be defined in terms of the original function f and sets of states. • A set representing a state in the new machine is considered final if any state in the set is final. • Example: Transform the machine on the previous slide into a DFSA.
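The construction described above can be sketched in a few lines, building only the sets of states that are reachable from {s0}. The NFSA used here is a hypothetical one (it accepts the strings over {0, 1} that end in 01), not the machine from the previous slide; the function names are assumptions.

```python
def nfsa_to_dfsa(f, s0, final, alphabet):
    """Subset construction: return (g, start, new_final) for the new machine."""
    start = frozenset([s0])
    g = {}
    seen = {start}
    todo = [start]
    while todo:
        S = todo.pop()
        for a in alphabet:
            # g(S, a) is the union of f(s, a) over all s in S.
            T = frozenset(t for s in S for t in f.get((s, a), ()))
            g[(S, a)] = T
            if T not in seen:
                seen.add(T)
                todo.append(T)
    # A set of states is final if it contains any final state of the NFSA.
    new_final = {S for S in seen if S & final}
    return g, start, new_final

def dfsa_accepts(g, start, new_final, string):
    state = start
    for symbol in string:
        state = g[(state, symbol)]
    return state in new_final

# Hypothetical NFSA accepting strings that end in 01.
f = {
    ("s0", "0"): {"s0", "s1"},
    ("s0", "1"): {"s0"},
    ("s1", "1"): {"s2"},
}
g, start, new_final = nfsa_to_dfsa(f, "s0", {"s2"}, "01")
states = {S for (S, _) in g}
print(len(states))                                # 3
print(dfsa_accepts(g, start, new_final, "1101"))  # True
print(dfsa_accepts(g, start, new_final, "0110"))  # False
```

Missing entries in f are treated as the empty set, matching the convention that an undefined move makes the input invalid.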
Note that the language accepted by the original NFSA is the same as that accepted by the DFSA constructed from it. • This can be expressed as a theorem about languages…
12.4 Language Recognition • Let I be a finite set of input symbols. The “regular expressions over I” are the symbols in I along with the empty string λ, the empty set ∅, and any expression that can be formed from those elements using concatenation, union, and Kleene closure • Regular expressions are used to generate the category of languages called “regular languages”
Regular Languages • Regular expressions always describe sets, and when a symbol, say a, is used in a regular expression it stands for {a}. • Thus (a ∪ ab)* is the Kleene closure of the set {a, ab}. • A language generated in this way by a regular expression is called a regular language. • Examples: Give a regular expression describing… • The set {α | α is λ or α is all zeros or α is all ones} • The set {α | α is λ or α is an alternating sequence of 0’s and 1’s}
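Candidate answers can be tested with Python's `re` module, where union is written `|`, Kleene closure is `*`, and `x?` abbreviates (x ∪ λ). The two patterns for the exercises are one possible answer each, offered as a sketch rather than as the intended solution; `re.fullmatch` is used so that the whole string must belong to the language.

```python
import re

def in_language(pattern, string):
    """True if the entire string matches the regular expression."""
    return re.fullmatch(pattern, string) is not None

KLEENE_A_AB = r"(a|ab)*"     # (a ∪ ab)*
ALL_SAME = r"0*|1*"          # λ, all zeros, or all ones
ALTERNATING = r"1?(01)*0?"   # λ or an alternating sequence of 0's and 1's

print(in_language(KLEENE_A_AB, "aab"))    # True
print(in_language(ALL_SAME, "0011"))      # False
print(in_language(ALTERNATING, "10101"))  # True
print(in_language(ALTERNATING, "1001"))   # False
```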
What are the Regular Languages? • Kleene’s Theorem: The regular languages are precisely those languages which can be recognized by FSA’s. • Draw a DFSA which recognizes the last example of a regular language.
Regular Grammars • Recall that in a regular grammar, all the productions are of the form A → a or A → aB, where A and B are non-terminals and a is a terminal symbol. • A regular language has the property that all of its non-empty strings can be generated by the productions of a regular grammar
Example: • Find a regular grammar capable of generating the nonempty strings recognized by the FSA shown above
Which Languages are not Regular? • Example: {0^n 1^n | n ≥ 0} is not regular
12.5 Turing Machines • A Turing Machine (TM) is a quadruple (S, I, f, s0), where S is a set of states, I is a set of input/output symbols, f is a transition function, and s0 is an initial state. A special symbol B, called the blank symbol, is always an element of I. • The transition function is actually a partial function, since its domain is a subset of S × I, not the complete set. But for every pair (s,i) for which f(s,i) is defined, f(s,i) will be a triple (s′,i′,D), where s′ is a state, i′ is an output symbol, and D is either the symbol R (for right) or the symbol L (for left).
Picture A Turing Machine is conceptualized as taking its input from and writing its output to a discrete tape (actually a sequence of storage cells) that is infinite in both directions. All but a finite number of cells contain the blank symbol B. By convention, the initial position of a TM is with the read head of the machine positioned over the leftmost non-blank symbol. At each step of the machine’s operation, it reads the input symbol from the tape, changes its state, writes a symbol back into the same cell, then moves either left or right.
Example • States: • s0, start state – moving right, looking for 1’s • s1, moving right, looking for 1’s, found 1 • s2, moving right, looking for 1’s, found 2 • s3, moving right, looking for 1’s, found 3 • s4, found four 1’s.
Transition Function • The transition function for this TM is given below:
Final States • A state si is said to be a final state if there are no actions defined for that state, i.e. if there are no pairs of the form (si, x) for which the transition function is defined. • The language recognized by a Turing Machine T is the set of all strings of input symbols which, if placed on the tape with the read head at the initial position, will cause the machine to eventually transition into a final state.
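The definitions of final state and recognized language can be tried out with a small simulator. The transition table below is an assumption: it is a reconstruction that fits the state descriptions of the example (s0 through s3 move right and count 1's, s4 has no actions), since the original table is not reproduced here. The tape is stored as a dictionary so that it is unbounded in both directions, and the step limit is a practical safeguard that is not part of the definition.

```python
def run_tm(f, s0, tape_input, blank="B", max_steps=10_000):
    """Run the machine until no transition is defined; return (state, tape)."""
    tape = {i: ch for i, ch in enumerate(tape_input)}
    state, head = s0, 0  # head starts over the leftmost input symbol
    for _ in range(max_steps):
        symbol = tape.get(head, blank)
        if (state, symbol) not in f:
            break  # f is a partial function: the machine halts here
        state, write, move = f[(state, symbol)]
        tape[head] = write
        head += 1 if move == "R" else -1
    else:
        raise RuntimeError("step limit reached without halting")
    return state, tape

def is_final(f, state):
    """A state is final if f is defined for no pair (state, x)."""
    return all(s != state for (s, _) in f)

def recognizes(f, s0, string):
    state, _ = run_tm(f, s0, string)
    return is_final(f, state)

# Assumed table: in s0..s3, skip 0's and advance to the next state on each 1.
f = {}
for i in range(4):
    f[(f"s{i}", "0")] = (f"s{i}", "0", "R")
    f[(f"s{i}", "1")] = (f"s{i + 1}", "1", "R")

print(recognizes(f, "s0", "0110101"))  # True: four 1's, halts in s4
print(recognizes(f, "s0", "10101"))    # False: halts on a blank in s3
```

With this table the machine recognizes the strings over {0, 1} that contain at least four 1's; a string with fewer 1's makes it halt on the blank in a state that still has actions defined, so that state is not final.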