Chapter 4: Regular Expressions
A new method to define languages
• alphabet → language
  • S = {x}, S* = {Λ, x, xx, xxx, …}
  • or directly {x}* = {Λ, x, xx, xxx, …}
• language → language
  • S = {xx, xxx}, S* = {Λ, xx, xxx, xxxx, …}
  • or directly {xx, xxx}* = {Λ, xx, xxx, xxxx, …}
• “letter” → language
  • x* (written in bold)
  • language(x*) = {Λ, x, xx, xxx, …}
  • or informally x* = {Λ, x, xx, xxx, …}
Zaguia/Stojmenovic
Chapter 4: Regular Expressions
• L1 = {a, ab, abb, abbb, …} or simply (ab*)
• L2 = {Λ, ab, abab, ababab, …} or simply (ab)*
Several ways to express the same language
• {x, xx, xxx, xxxx, …}
  xx*   x⁺   xx*x*   x*xx*   (x⁺)x*   x*(x⁺)   x*x*xx*
• L3 = {Λ, a, b, aa, ab, bb, aaa, aab, abb, bbb, aaaa, …} or simply (a*b*) (a’s before b’s)
Remark: language(a*b*) ≠ language((ab)*)
Chapter 4: Regular Expressions
Example: S-ODD
• Rule 1: x ∈ S-ODD
• Rule 2: If w is in S-ODD then xxw is in S-ODD
• S-ODD = language(x(xx)*)
• S-ODD = language((xx)*x)
• But not: S-ODD = language(x*xx*), because x*xx* also generates words of even length, such as xxxx split as xx|x|x
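The S-ODD claims can be tried out mechanically. The sketch below assumes Python and its standard re module, which the slides themselves do not use; concatenation and * are written the same way in both notations, and fullmatch requires the whole word to match. The names s_odd and too_large are illustrative only.

```python
import re

s_odd = re.compile(r"x(xx)*")      # slide expression x(xx)*
too_large = re.compile(r"x*xx*")   # slide expression x*xx*

# For each length n, report whether x^n is matched by each expression.
for n in range(7):
    word = "x" * n
    print(n, bool(s_odd.fullmatch(word)), bool(too_large.fullmatch(word)))
```

The second column is true only for odd n, while the third is true for every n ≥ 1, which is why x*xx* does not define S-ODD.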
Chapter 4: Regular Expressions
• A useful symbol to simplify the writing:
  • x + y: choose either x or y
• Example: S = {a, b, c}
  T = {a, c, ab, cb, abb, cbb, abbb, cbbb, …}
  T = language((a+c)b*)
  (defines the language whose words are constructed from either a or c followed by some b’s, possibly none)
Chapter 4: Regular Expressions
• L = {aaa, aab, aba, abb, baa, bab, bba, bbb}
  all words of exactly three letters from the alphabet {a, b}
  L = (a+b)(a+b)(a+b)
• (a+b)*: all words formed from the alphabet {a, b}
• a(a+b)* = ?
• a(a+b)*b = ?
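One way to explore expressions like these is to list the short words they define. The sketch below is an illustration in Python, not part of the slides: the helper name language and its parameters are hypothetical, and the slide's + (choice) is written | in Python's re syntax.

```python
import re
from itertools import product

def language(pattern, alphabet="ab", max_len=3):
    """List the words over `alphabet` of length <= max_len that match `pattern` in full."""
    regex = re.compile(pattern)
    words = (
        "".join(letters)
        for n in range(max_len + 1)
        for letters in product(alphabet, repeat=n)
    )
    return [w for w in words if regex.fullmatch(w)]

print(language(r"(a|b)(a|b)(a|b)"))   # (a+b)(a+b)(a+b)
print(language(r"a(a|b)*b"))          # a(a+b)*b, words up to length 3
```

The first call reproduces the eight three-letter words listed for L; raising max_len shows more of the infinite languages.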
Chapter 4: Regular Expressions
• Definition: Given an alphabet Σ, the set of regular expressions is defined by the following rules.
  • Rule 1: For every letter in Σ, the letter written in bold is a regular expression. Λ is a regular expression.
  • Rule 2: If r1 and r2 are regular expressions, then so are:
    • (r1)
    • r1r2
    • r1+r2
    • r1*
  • Rule 3: Nothing else is a regular expression.
Chapter 4: Regular Expressions
• Remark: Notice that r1⁺ = r1r1*
• r1 = r2 if and only if language(r1) = language(r2)
• Example: (a+b)*a(a+b)*
  All words that have at least one a.
  abbaab: (Λ)a(bbaab)   (abb)a(ab)   (abba)a(b)
• Words with no a’s? b*
• All words formed from {a, b}? (a+b)*a(a+b)* + b*
  Thus: (a+b)* = (a+b)*a(a+b)* + b*
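Since r1 = r2 means the two languages are equal, a claimed identity can at least be tested on all short words. The following Python sketch does that; it is a bounded check under assumed names (first_difference, max_len), not a proof and not an equivalence algorithm. The slide's + is written | in Python's re syntax.

```python
import re
from itertools import product

def first_difference(p1, p2, alphabet="ab", max_len=8):
    """Return the first word (shortest first) matched by exactly one pattern, or None."""
    r1, r2 = re.compile(p1), re.compile(p2)
    for n in range(max_len + 1):
        for letters in product(alphabet, repeat=n):
            word = "".join(letters)
            if bool(r1.fullmatch(word)) != bool(r2.fullmatch(word)):
                return word
    return None

# (a+b)* versus (a+b)*a(a+b)* + b*
print(first_difference(r"(a|b)*", r"(a|b)*a(a|b)*|b*"))
# a*b* versus (ab)*, which define different languages
print(first_difference(r"a*b*", r"(ab)*"))
```

A result of None only says that no disagreement was found up to max_len letters.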
Chapter 4: Regular Expressions
• Example: The language of all words that have at least two a’s.
  (a+b)*a(a+b)*a(a+b)*
  = b*ab*a(a+b)*
  = (a+b)*ab*ab*
  = b*a(a+b)*ab*
• Example: The language of all words that have exactly two a’s.
  b*ab*ab*
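The two descriptions can be compared against a direct count of the letter a. This is a small Python sketch under the assumption that checking every word up to length 6 is a reasonable sanity test; the variable names are illustrative, and + is written | as usual in Python's re syntax.

```python
import re
from itertools import product

at_least_two = re.compile(r"(a|b)*a(a|b)*a(a|b)*")   # (a+b)*a(a+b)*a(a+b)*
exactly_two = re.compile(r"b*ab*ab*")                # b*ab*ab*

# Compare each expression with the property it is meant to describe.
for n in range(7):
    for letters in product("ab", repeat=n):
        word = "".join(letters)
        count = word.count("a")
        assert bool(at_least_two.fullmatch(word)) == (count >= 2)
        assert bool(exactly_two.fullmatch(word)) == (count == 2)

print("checked all words up to length 6")
```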
Chapter 4: Regular Expressions
Another Example: At least one a and one b?
• First solution: (a+b)*a(a+b)*b(a+b)* + (a+b)*b(a+b)*a(a+b)*
• But (a+b)*a(a+b)*b(a+b)* already expresses all of these words except those of the form some b’s (at least one) followed by some a’s (at least one): bb*aa*
• Second solution: (a+b)*a(a+b)*b(a+b)* + bb*aa*
• Thus: (a+b)*a(a+b)*b(a+b)* + (a+b)*b(a+b)*a(a+b)* = (a+b)*a(a+b)*b(a+b)* + bb*aa*
Chapter 4: Regular Expressions
• The only words that do not contain both an a and a b are the words formed from all a’s or all b’s: a* + b*
• Thus: (a+b)* = (a+b)*a(a+b)*b(a+b)* + bb*aa* + a* + b*
Chapter 4: Regular Expressions
• Example: The language of all words formed from some b’s (possibly 0) and all words where an a is followed by some b’s (possibly 0):
  {Λ, a, b, ab, bb, abb, bbb, abbb, bbbb, …}
  b* + ab* = (Λ + a)b*
• In general: concatenation is distributive over the + operation.
  r1(r2+r3) = r1r2 + r1r3
  (r1+r2)r3 = r1r3 + r2r3
Chapter 4: Regular Expressions
• Example of the distributivity rule: (a+c)b* = ab* + cb*
• 2 operations: language(s) → language
  If S and T are two languages over the same alphabet Σ,
  • S+T: the union of the languages S and T, defined as S ∪ T
  • ST: the product set, that is, the set of words x that can be written x = vw with v a word in S and w a word in T.
• Example: S = {a, bb}, T = {a, ab}, ST = {aa, aab, bba, bbab}
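For finite languages both operations can be written directly with sets of strings. The Python sketch below mirrors the example on this slide; the function name product_set is an illustrative choice.

```python
def product_set(S, T):
    """All words vw with v taken from S and w taken from T."""
    return {v + w for v in S for w in T}

S = {"a", "bb"}
T = {"a", "ab"}

print(sorted(S | T))              # S+T, the union
print(sorted(product_set(S, T)))  # ST, the product set
```

Note that the product is not commutative in general: product_set(T, S) contains abbb, which is not in ST.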
Chapter 4: Regular Expressions
The language associated with a regular expression is defined by the following rules.
• The language associated with a regular expression that is just a single letter is that one-letter word alone. The language associated with Λ is {Λ}.
• If L1 is the language associated with the regular expression r1 and L2 is the language associated with the regular expression r2:
  (i) The product L1L2 is the language associated with the regular expression r1r2, that is: language(r1r2) = L1L2
  (ii) The union L1+L2 is the language associated with the regular expression r1+r2, that is: language(r1+r2) = L1+L2
  (iii) The Kleene closure of L1, written L1*, is the language associated with the regular expression r1*, that is: language(r1*) = L1*
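These rules translate almost line by line into a recursive function. The Python sketch below is one possible rendering, with assumptions stated up front: the class names (Letter, Lam, Concat, Plus, Star) are invented for the example, Λ is represented by the empty string, and because languages may be infinite the function returns only the words of length at most max_len.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Letter:
    char: str          # a single letter of the alphabet

@dataclass(frozen=True)
class Lam:
    pass               # the expression Λ

@dataclass(frozen=True)
class Concat:
    left: object       # r1r2
    right: object

@dataclass(frozen=True)
class Plus:
    left: object       # r1+r2
    right: object

@dataclass(frozen=True)
class Star:
    inner: object      # r1*

def product_set(S, T, max_len):
    """Product set ST, keeping only words of length <= max_len."""
    return {v + w for v in S for w in T if len(v + w) <= max_len}

def language(r, max_len):
    """Words of length <= max_len in the language associated with r."""
    if isinstance(r, Letter):
        return {r.char} if max_len >= 1 else set()
    if isinstance(r, Lam):
        return {""}
    if isinstance(r, Concat):   # rule (i)
        return product_set(language(r.left, max_len), language(r.right, max_len), max_len)
    if isinstance(r, Plus):     # rule (ii)
        return language(r.left, max_len) | language(r.right, max_len)
    if isinstance(r, Star):     # rule (iii): keep appending words until nothing new appears
        base = language(r.inner, max_len)
        result, frontier = {""}, {""}
        while frontier:
            frontier = product_set(frontier, base, max_len) - result
            result |= frontier
        return result
    raise TypeError(f"not a regular expression: {r!r}")

# (a+c)b*
expr = Concat(Plus(Letter("a"), Letter("c")), Star(Letter("b")))
print(sorted(language(expr, 3), key=lambda w: (len(w), w)))
```

The length bound is only there to keep the sets finite; it is not part of the definition on the slide.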
Chapter 4: Regular Expressions
• Remark: Every regular expression has a language associated with it.
• Finite Languages are Regular
  • Let L be a finite language. There is a regular expression that defines it.
  • Algorithm (and proof): Write each letter of each word in L in bold, and write a + between the words.
Chapter 4: Regular Expressions
Example: L = {baa, abbba, bababa}
  baa + abbba + bababa
• The regular expression produced by this algorithm is not necessarily the only one that defines L.
  Example: L = {aa, ab, ba, bb}
  aa + ab + ba + bb or (a+b)(a+b)
• Remark: This algorithm does not work for infinite languages. Regular expressions must be finite, even if the language defined is infinite.
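The algorithm for finite languages is short enough to write out. The Python sketch below gives two hypothetical helpers: one produces the expression in the slide notation, the other produces an equivalent pattern for Python's re module, where + becomes |. Both assume a non-empty list of words.

```python
import re

def finite_language_to_expression(words):
    """Slide notation: the words joined by +, with Λ for the empty word."""
    return " + ".join(w if w else "Λ" for w in words)

def finite_language_to_pattern(words):
    """Same idea in Python's re syntax, escaping any special characters."""
    return "|".join(re.escape(w) for w in words)

L = ["baa", "abbba", "bababa"]
print(finite_language_to_expression(L))

pattern = re.compile(finite_language_to_pattern(L))
print([w for w in ["baa", "ba", "abbba", "bababa", "baabaa"] if pattern.fullmatch(w)])
```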
Chapter 4: Regular Expressions
• Kleene star applied to a subexpression with a star
  (a+b*)*   (aa+ab*)*
  (a+b*)* = (a+b)*
  (aa+ab*)* ≠ (aa+ab)*, since abbabb = abb|abb is in the first language but not in the second
• (a*b*)*
  The letter a and the letter b are in language(a*b*), so (a*b*)* = (a+b)*
• Is it possible to determine if two regular expressions are equivalent?
  • With a set of algebraic rules? Unknown.
  • With an algorithm? Yes.
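The word abbabb separates the two starred expressions, which a short Python check can confirm. This is an illustration using the standard re module, with + written as |; the names left and right are arbitrary.

```python
import re

left = re.compile(r"(aa|ab*)*")    # (aa+ab*)*
right = re.compile(r"(aa|ab)*")    # (aa+ab)*

word = "abbabb"                    # abb followed by abb
print(bool(left.fullmatch(word)), bool(right.fullmatch(word)))
```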
Chapter 4: Regular Expressions
• Examples
• Words with a double letter: (a+b)*(aa+bb)(a+b)*
• Words without a double letter: (ab)*
  But this misses the words that begin with b or end with a. Allowing for them: (Λ+b)(ab)*(Λ+a)
• Every word either has a double letter or does not, so (a+b)*(aa+bb)(a+b)* + (Λ+b)(ab)*(Λ+a) defines all words over {a, b}
Chapter 4: Regular Expressions
Language EVEN-EVEN defined by the expression:
[aa + bb + (ab + ba)(aa+bb)*(ab + ba)]*
• Every word in EVEN-EVEN has an even number of a’s and an even number of b’s.
• Every word that contains an even number of a’s and an even number of b’s is a member of EVEN-EVEN.
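Both directions of the EVEN-EVEN claim can be spot-checked on short words. The Python sketch below compares the expression (with + written | and the square brackets written as ordinary parentheses) against a direct count of letters; the helper name has_even_counts and the length bound of 8 are assumptions of the example, and the check is not a proof.

```python
import re
from itertools import product

EVEN_EVEN = re.compile(r"(aa|bb|(ab|ba)(aa|bb)*(ab|ba))*")

def has_even_counts(word):
    """True when the word has an even number of a's and an even number of b's."""
    return word.count("a") % 2 == 0 and word.count("b") % 2 == 0

# Collect every word up to length 8 on which the two descriptions disagree.
mismatches = [
    word
    for n in range(9)
    for word in map("".join, product("ab", repeat=n))
    if bool(EVEN_EVEN.fullmatch(word)) != has_even_counts(word)
]
print(mismatches)
```

An empty list means the expression and the counting rule agree on every word of up to 8 letters.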