Functional Design and Programming

  1. Functional Design and Programming Lecture 10: Regular expressions and finite state machines

  2. Literature
     • These notes
     • Randal C. Nelson’s notes on finite automata and regular expressions: http://www.cs.rochester.edu/u/nelson/courses/csc_173/fa/

  3. Exercises
     • Consider:
        • alphabetic identifiers in Standard ML;
        • file names that end with “.txt”;
        • words in a text that contain the subsequence “s”, “e”, “c”, “r”, “e”, “t”;
        • comments in XML.
     • For each of the above:
        • Write a regular expression that generates the strings.
        • Give a finite state automaton that recognizes the strings.
     • See home page for more exercises.

  4. Overview
     • Regular expressions
     • Finite state automata
     • Applications of finite automata and regular expressions
     • Regular expressions and context-free grammars
     • Construction of finite automata
     • Implementation of deterministic finite automata

  5. Regular expressions
     • Expressions that describe (possibly infinite) sets of strings.
     • Examples:
        • .*\.sml: strings ending with “.sml”
        • .*glei.*: strings containing the substring “glei”.
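The two example patterns can be tried out directly. The sketch below uses Python's `re` module, which is an assumption for illustration (the lecture itself works in Standard ML). `fullmatch` is used because a regular expression here describes the whole string, not just a prefix; the helper name `matches` is hypothetical.

```python
import re

# The two patterns from the slide, compiled once.
ends_with_sml = re.compile(r".*\.sml")
contains_glei = re.compile(r".*glei.*")

def matches(pattern, s):
    """Return True if the whole string s is in the set described by pattern."""
    return pattern.fullmatch(s) is not None
```

For instance, `matches(ends_with_sml, "lexer.sml")` holds, while `"lexer.smlx"` is rejected because the pattern must cover the entire string.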

  6. Regular expressions: Definition

  7. Derived regular expressions

  8. Finite State Machines
     • Finite state automaton: description of an abstract machine with a finite number of different states and transitions between states on input symbols.
     • Finite state transducer: like a finite state automaton, but additionally with output symbols on transitions.

  9. Finite State Automata
     • A finite state automaton is a 5-tuple consisting of:
        • a finite set Σ of characters or symbols (the alphabet),
        • a finite set Q of states,
        • a start state q0 ∈ Q,
        • a subset F ⊆ Q of accepting (or final) states,
        • a set of transitions (q, a, q’) with q, q’ ∈ Q and a ∈ Σ, written q --a--> q’.
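One possible way to write the 5-tuple down as data is sketched below in Python (an assumption for illustration, not the lecture's representation). The class name `FSA`, the method `is_well_formed`, and the example automaton `ends_in_b` are all hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class FSA:
    alphabet: frozenset     # the alphabet (Sigma)
    states: frozenset       # Q
    start: str              # q0
    accepting: frozenset    # F
    transitions: frozenset  # set of triples (q, a, q')

    def is_well_formed(self):
        """Check the side conditions of the definition: q0 in Q, F a subset
        of Q, and every transition built from known states and symbols."""
        return (self.start in self.states
                and self.accepting <= self.states
                and all(q in self.states and a in self.alphabet and r in self.states
                        for (q, a, r) in self.transitions))

# Hypothetical example: strings over {a, b} that end with "b".
ends_in_b = FSA(
    alphabet=frozenset("ab"),
    states=frozenset({"q0", "q1"}),
    start="q0",
    accepting=frozenset({"q1"}),
    transitions=frozenset({
        ("q0", "a", "q0"), ("q0", "b", "q1"),
        ("q1", "a", "q0"), ("q1", "b", "q1"),
    }),
)
```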

  10. Finite state automata
     • An FSM accepts a string s = a1a2...an if there is a sequence of transitions from the start state that ends in a final state: q0 --a1--> q’ --a2--> ... --an--> (final state)
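The acceptance condition can be checked by tracking every state reachable on the input read so far. The following is a minimal Python sketch under the assumption that transitions are given as a set of triples and that there are no empty-string transitions; the function name `accepts` and the example automaton are hypothetical.

```python
def accepts(transitions, start, accepting, s):
    """Return True if some sequence of transitions labelled a1...an
    leads from the start state to an accepting state."""
    current = {start}
    for a in s:
        # Follow every transition on symbol a out of every current state.
        current = {r for (q, b, r) in transitions if q in current and b == a}
    return bool(current & accepting)

# Hypothetical nondeterministic example: strings over {a, b} containing "ab".
CONTAINS_AB = {
    ("q0", "a", "q0"), ("q0", "b", "q0"),
    ("q0", "a", "q1"),
    ("q1", "b", "q2"),
    ("q2", "a", "q2"), ("q2", "b", "q2"),
}
```

If the set of current states becomes empty, no continuation can succeed, and the function returns False at the end of the input.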

  11. Finite state automata...
     • An FSM recognizes the language (set of strings) L ⊆ Σ* which it accepts.
     • An FSM is deterministic if no state has more than one transition on any given symbol.
     • Theorem: The same classes of languages over Σ are definable (recognizable) by finite state machines, deterministic finite state machines and regular expressions.
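The determinism condition is easy to test mechanically. A small Python sketch, assuming transitions are stored as a set of triples (q, a, q’); the name `is_deterministic` is hypothetical.

```python
from collections import Counter

def is_deterministic(transitions):
    """True if no state has more than one transition on any given symbol."""
    counts = Counter((q, a) for (q, a, _target) in transitions)
    return all(n <= 1 for n in counts.values())
```

Note that this definition allows a state to have no transition at all on some symbol, so a deterministic machine in this sense may be partial.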

  12. Applications of regular expressions and FSM’s
     • Text searching and processing
     • State-based protocols
     • Dialog/interaction control
     • Hardware verification
     • Protocol verification
     • Programming language processing
     • Natural language processing

  13. Regular expressions and context-free grammars
     • Regular expressions can be understood as restricted CFG’s: RE = CFG extended with * (Kleene closure) and without (mutually) recursive definitions of nonterminals.
     • Regular definitions: A sequence of definitions r_i = R_i for variables r_i such that each R_i is a regular expression with possible occurrences of r_1, ..., r_(i-1) only.
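Because each definition may only mention earlier variables, a sequence of regular definitions can always be expanded into one plain regular expression by substitution. The Python sketch below illustrates this; the `{name}` placeholder syntax, the function `expand`, and the example definitions are assumptions made for the illustration.

```python
import re

# Hypothetical regular definitions, in order; each body may refer
# to earlier names only, written as {name}.
definitions = [
    ("letter", "[A-Za-z]"),
    ("digit", "[0-9]"),
    ("ident", "{letter}({letter}|{digit})*"),
]

def expand(definitions):
    """Substitute earlier definitions into later ones, in order."""
    expanded = {}
    for name, body in definitions:
        for earlier, regex in expanded.items():
            # Wrap in a non-capturing group so precedence is preserved.
            body = body.replace("{" + earlier + "}", "(?:" + regex + ")")
        expanded[name] = body
    return expanded
```

Since no definition can refer to itself or to a later one, the substitution terminates after a single pass, which is exactly what the "no recursion" restriction guarantees.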

  14. Regular expressions and finite state automata
     • Regular expressions are often a convenient way to specify a desired language.
     • Deterministic finite state machines are a good model for implementing recognition of the language efficiently.

  15. Construction of finite automata
     • Regular expression --> nondeterministic finite state automaton (NFA): Thompson’s construction
     • NFA --> regular expression: path algebra construction
     • NFA --> deterministic finite state automaton (DFA): subset construction
     • DFA --> NFA: trivial
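The subset construction named on this slide can be sketched as follows. This is a minimal Python version under stated assumptions: the NFA has no empty-string transitions, transitions are a set of triples, and each DFA state is a frozenset of NFA states. The function names and the example automaton are hypothetical.

```python
def subset_construction(alphabet, transitions, start, accepting):
    """Build a DFA whose states are sets of NFA states."""
    start_set = frozenset({start})
    dfa_transitions = {}   # (state set, symbol) -> state set
    dfa_accepting = set()
    seen = {start_set}
    worklist = [start_set]
    while worklist:
        current = worklist.pop()
        if current & accepting:
            dfa_accepting.add(current)
        for a in alphabet:
            target = frozenset(r for (q, b, r) in transitions
                               if q in current and b == a)
            if not target:
                continue  # no transition on a: the resulting DFA is partial
            dfa_transitions[(current, a)] = target
            if target not in seen:
                seen.add(target)
                worklist.append(target)
    return start_set, dfa_transitions, dfa_accepting

def run_dfa(start, dfa_transitions, dfa_accepting, s):
    """Run the constructed DFA on the string s."""
    state = start
    for a in s:
        state = dfa_transitions.get((state, a))
        if state is None:
            return False
    return state in dfa_accepting

# Hypothetical NFA: strings over {a, b} containing "ab".
NFA = {("q0", "a", "q0"), ("q0", "b", "q0"), ("q0", "a", "q1"),
       ("q1", "b", "q2"), ("q2", "a", "q2"), ("q2", "b", "q2")}
```

Only the reachable subsets are generated, so in practice the result is often much smaller than the worst case of 2^n states.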

  16. Implementation of deterministic FSM’s (1)
     • Table-based implementation:
        • Represent states by indexes 0,...,n-1.
        • Represent characters by indexes 0,...,m-1 (e.g., m=256).
        • Represent transitions by a two-dimensional table T (a vector of vectors of indices, or a 2-dimensional array) such that T(q,a) = SOME q’ if (q,a,q’) is a transition, and NONE otherwise. [SML: Vector.sub(Vector.sub(T, q), a)]
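The table-based scheme above can be sketched in Python, with nested lists standing in for SML vectors and `None` standing in for `NONE`. The example automaton (nonempty strings of decimal digits) and all names are hypothetical.

```python
N_STATES = 2     # states are the indexes 0 and 1
N_SYMBOLS = 256  # characters are the indexes 0..255

# T[q][a] is the successor state, or None if there is no transition.
T = [[None] * N_SYMBOLS for _ in range(N_STATES)]
for c in "0123456789":
    T[0][ord(c)] = 1
    T[1][ord(c)] = 1
ACCEPTING = [False, True]

def run(table, accepting, s, start=0):
    """Run the DFA on s with one table lookup per input character."""
    state = start
    for ch in s:
        a = ord(ch)
        if a >= len(table[state]):
            return False  # character outside the table's alphabet
        state = table[state][a]
        if state is None:
            return False  # no transition on this character
    return accepting[state]
```

Each step costs a constant-time index operation, at the price of n * m table entries even when most of them are empty.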

  17. Implementation of deterministic FSM’s (2)
     • Sparse table-based implementation:
        • Represent states by indexes 0,...,n-1.
        • Represent transitions by a one-dimensional vector of association lists such that lookup(Vector.sub(T, q), a) = SOME q’ if T(q,a) = q’.
        • Optimization: Use a better data structure than association lists (e.g., hash tables, search trees).
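A sparse variant stores only the transitions that exist. The Python sketch below shows both the association-list form and the hash-table optimization, using Python dictionaries as the hash tables; the example automaton (nonempty digit strings) and all names are hypothetical.

```python
def lookup(alist, a):
    """Linear search in an association list; None plays the role of NONE."""
    for key, target in alist:
        if key == a:
            return target
    return None

# One association list per state, holding only existing transitions.
T_ALIST = [
    [(c, 1) for c in "0123456789"],  # state 0
    [(c, 1) for c in "0123456789"],  # state 1
]
# The optimization from the slide: one hash table per state instead.
T_DICT = [dict(row) for row in T_ALIST]
ACCEPTING = [False, True]

def run_alist(s, start=0):
    state = start
    for ch in s:
        state = lookup(T_ALIST[state], ch)
        if state is None:
            return False
    return ACCEPTING[state]

def run_dict(s, start=0):
    state = start
    for ch in s:
        state = T_DICT[state].get(ch)
        if state is None:
            return False
    return ACCEPTING[state]
```

Both functions accept the same strings; they differ only in the cost of a single lookup.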

  18. Implementation of deterministic FSM’s (3)
     • Functional implementation:
        • datatype S = STATE of (symbol -> S) * bool
        • Represent states by values of type S. The transitions out of a state are represented as part of the state (first component). Whether a state is accepting or not is represented by the second component.
        • Execute a transition by applying the first component of the state: fun trans (STATE (t, f)) a = t a
        • Note: Intuitively, trans corresponds to a curried version of T : Q * Σ -> Q. It can be implemented more efficiently than the uncurried version (in principle).
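The same idea can be mimicked in Python, where a state is a pair of a transition function and an accepting flag. This is only an illustrative sketch of the SML datatype above; the class `State`, the example automaton (nonempty digit strings), and the helper `run` are hypothetical.

```python
from typing import Callable, NamedTuple

class State(NamedTuple):
    t: Callable[[str], "State"]  # first component: the transition function
    accepting: bool              # second component: accepting or not

def trans(state, a):
    """Execute one transition by applying the state's first component."""
    return state.t(a)

# Transition functions refer to the states below by name; the names are
# resolved when a function is called, so the mutual references are fine.
def _start_t(a):
    return DIGITS if a in "0123456789" else DEAD

def _digits_t(a):
    return DIGITS if a in "0123456789" else DEAD

def _dead_t(a):
    return DEAD

START = State(_start_t, False)
DIGITS = State(_digits_t, True)
DEAD = State(_dead_t, False)

def run(state, s):
    for a in s:
        state = trans(state, a)
    return state.accepting
```

No table or state index appears anywhere: following a transition is just a function call on the current state.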

  19. Other problems
     • Minimize a DFA.
     • Decide whether two DFA’s are equivalent.
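One standard way to decide equivalence is to explore pairs of states reachable in both automata simultaneously and look for a pair where exactly one side accepts. The Python sketch below takes this approach; it is not taken from the lecture. It assumes each DFA is a triple (start, transitions, accepting) with transitions given as a dictionary from (state, symbol) to state, and it treats a missing entry as a move to a dead state represented by `None`. All names are hypothetical.

```python
from collections import deque

def equivalent(dfa1, dfa2, alphabet):
    """True if the two DFAs accept exactly the same strings over alphabet."""
    start1, delta1, final1 = dfa1
    start2, delta2, final2 = dfa2
    seen = {(start1, start2)}
    queue = deque(seen)
    while queue:
        p, q = queue.popleft()
        if (p in final1) != (q in final2):
            return False  # some string reaches (p, q) and only one side accepts
        for a in alphabet:
            # A missing transition (or the dead state None) yields None.
            pair = (delta1.get((p, a)), delta2.get((q, a)))
            if pair not in seen:
                seen.add(pair)
                queue.append(pair)
    return True

# Hypothetical examples over the alphabet {a}.
EVEN = ("e", {("e", "a"): "o", ("o", "a"): "e"}, {"e"})
MOD4 = (0, {(0, "a"): 1, (1, "a"): 2, (2, "a"): 3, (3, "a"): 0}, {0, 2})
ODD = ("e", {("e", "a"): "o", ("o", "a"): "e"}, {"o"})
```

Here `EVEN` and `MOD4` both accept strings with an even number of a's, so they are equivalent even though they have different numbers of states, which is the situation minimization is meant to resolve.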
