Chapter 3 Regular Languages and Regular Grammars

3.1: Regular Expressions (1) • Regular Expression (RE): • E is a regular expression over Σ if E is one of: • ε • a, where a ∈ Σ • If r and s are regular expressions (REs), then the following expressions are also regular: • r | s (r or s) • rs (r followed by s) • r* (r repeated zero or more times) • Each RE has an equivalent regular language (RL)

3.1: Regular Expressions (2) • Regular Language (RL): • L is a regular language over Σ if L is one of: • ∅, the empty set • {ε}, the set that contains only the empty string • {a}, where a ∈ Σ • If R and S are regular languages (RLs), then the following languages are also regular: • R ∪ S = {w | w ∈ R or w ∈ S} • RS = {rs | r ∈ R and s ∈ S} • R* = R^0 ∪ R^1 ∪ R^2 ∪ R^3 ∪ …

3.1: Regular Expressions (3) • Rules for Specifying Regular Expressions: • ε is a regular expression → L = {ε} • If a is in Σ, a is a regular expression → L = {a}, the set containing the string a. • Let r and s be regular expressions with languages L(r) and L(s). Then, in order of increasing precedence: • a. r | s is a RE → L(r) ∪ L(s) • b. rs is a RE → L(r)L(s) • c. r* is a RE → (L(r))* • d. (r) is a RE → L(r), extra parentheses

3.1: Regular Expressions (4) • Examples: • {0, 1}{00, 11} = {000, 011, 100, 111} • {0}* = {ε, 0, 00, 000, 0000, …} • {ε}* = {ε} • {10, 01}* = {ε, 10, 1010, 101010, …, 01, 0101, 010101, …, 1001, 100101, 10010101, …, 0110, 011010, 01101010, …} • Notational shorthand: • L^0 = {ε} • L^i = L L^(i-1) • L+ = LL*
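The set operations on these slides can be tried out on small finite languages. The following is a minimal Python sketch, not part of the original deck: the helper names `concat`, `power`, and `star_upto` are made up for illustration, and since L* is infinite in general, `star_upto` only builds the finite union L^0 ∪ L^1 ∪ … ∪ L^k.

```python
def concat(R, S):
    """RS = {rs | r in R and s in S}."""
    return {r + s for r in R for s in S}

def power(L, i):
    """L^0 = {""} and L^i = L L^(i-1); the empty Python string stands for ε."""
    result = {""}
    for _ in range(i):
        result = concat(L, result)
    return result

def star_upto(L, k):
    """Finite approximation of L*: the union L^0 ∪ L^1 ∪ ... ∪ L^k."""
    out = set()
    for i in range(k + 1):
        out |= power(L, i)
    return out

print(sorted(concat({"0", "1"}, {"00", "11"})))  # ['000', '011', '100', '111']
print(sorted(star_upto({"0"}, 3), key=len))      # ['', '0', '00', '000']
print(star_upto({""}, 5))                        # {''}
```

The first and last lines reproduce the slide examples {0, 1}{00, 11} and {ε}* = {ε}.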

3.1: Regular Expressions (5) • Let L be the language over {a, b} in which each string contains the substring bb • L = {a, b}*{bb}{a, b}* • L is a regular language (RL). Why? • {a} and {b} are RLs • {a, b} = {a} ∪ {b} is an RL • {a, b}* is an RL • {b}{b} = {bb} is also an RL • Then L = {a, b}*{bb}{a, b}* is an RL

3.1: Regular Expressions (6) • Let L be the language over {a, b} in which each string • begins and ends with an a • contains at least one b • L = {a}{a, b}*{b}{a, b}*{a} • L = aΣ*bΣ*a, where Σ = {a, b} • L is a regular language (RL). Why? • {a} and {b} are RLs • {a, b}* is an RL • Then L = {a}{a, b}*{b}{a, b}*{a} is an RL

3.1: Regular Expressions (7) • L = {a, b}*{bb}{a, b}* • RE = (a|b)*bb(a|b)* • L = {a}{a, b}*{b}{a, b}*{a} • RE = a(a|b)*b(a|b)*a • The RE (a)|((b)*(c)) is equivalent to a|b*c • We say REs r and s are equivalent (r = s) iff r and s represent the same language • Example: r = a|b, s = b|a → r = s. Why? • Since L(r) = L(s) = {a, b}

3.1: Regular Expressions (8) • Let Σ = {a, b} • RE a|b → L = {a, b} • RE (a|b)(a|b) → L = {aa, ab, ba, bb} • RE aa|ab|ba|bb → same as above • RE a* → L = {ε, a, aa, aaa, …} • RE (a|b)* → L = set of all strings of a’s and b’s including ε • RE (a*b*)* → same as above • RE a|a*b → L = {a, b, ab, aab, aaab, …}
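Equivalence claims such as "(a*b*)* is the same as (a|b)*" can be sanity-checked by listing every short string that each RE matches. The sketch below assumes the slide notation maps directly onto Python's `re` syntax (`|`, `*`, parentheses); `language` and `same_up_to` are hypothetical helper names. Agreement up to a length bound is evidence, not a proof of equivalence.

```python
import re
from itertools import product

def language(pattern, alphabet, max_len):
    """All strings over `alphabet` of length <= max_len that `pattern` matches in full."""
    regex = re.compile(pattern)
    return {
        "".join(chars)
        for n in range(max_len + 1)
        for chars in product(alphabet, repeat=n)
        if regex.fullmatch("".join(chars))
    }

def same_up_to(r, s, alphabet="ab", max_len=6):
    """True if r and s match exactly the same strings of length <= max_len."""
    return language(r, alphabet, max_len) == language(s, alphabet, max_len)

print(same_up_to("a|b", "b|a"))                 # True
print(same_up_to("(a|b)(a|b)", "aa|ab|ba|bb"))  # True
print(same_up_to("(a|b)*", "(a*b*)*"))          # True
print(sorted(language("a|a*b", "ab", 4), key=lambda w: (len(w), w)))
# ['a', 'b', 'ab', 'aab', 'aaab']
```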

3.1: Regular Expressions (9) • Algebraic Properties of Regular Expressions (axioms): • r | s = s | r • r | (s | t) = (r | s) | t • (r s) t = r (s t) • r (s | t) = r s | r t • (s | t) r = s r | t r • εr = rε = r • r* = (r*)* = (r | ε)+ = r+ | ε • r** = r* • r+ = r r*
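These identities can be spot-checked on concrete languages. The sketch below is an illustration only: it represents a language as a Python set of strings cut off at length `N`, and the sample languages `r`, `s`, `t` and the helpers `cat`, `star`, and `plus` are arbitrary assumptions. Passing checks on one sample do not prove the axioms.

```python
N = 6  # only strings of length <= N are kept, so every set stays finite

def cat(R, S):
    """Concatenation RS, truncated to strings of length <= N."""
    return {x + y for x in R for y in S if len(x + y) <= N}

def star(L):
    """L* truncated to length <= N, grown until no new strings appear."""
    result, frontier = {""}, {""}
    while frontier:
        frontier = cat(frontier, L) - result
        result |= frontier
    return result

def plus(L):
    """L+ = L L*."""
    return cat(L, star(L))

EPS = {""}                                 # the language {ε}
r, s, t = {"a", "ab"}, {"b"}, {"ba", ""}   # arbitrary sample languages

checks = {
    "r|s = s|r": (r | s) == (s | r),
    "r|(s|t) = (r|s)|t": (r | (s | t)) == ((r | s) | t),
    "(rs)t = r(st)": cat(cat(r, s), t) == cat(r, cat(s, t)),
    "r(s|t) = rs|rt": cat(r, s | t) == cat(r, s) | cat(r, t),
    "(s|t)r = sr|tr": cat(s | t, r) == cat(s, r) | cat(t, r),
    "εr = rε = r": cat(EPS, r) == r == cat(r, EPS),
    "r* = (r*)*": star(r) == star(star(r)),
    "r* = (r|ε)+": star(r) == plus(r | EPS),
    "r* = r+ | ε": star(r) == plus(r) | EPS,
}
for name, ok in checks.items():
    print(name, ok)
```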

3.1: Regular Expressions (10) • abc: concatenation (“followed by”) • a | b | c: alternation (“or”) • *: zero or more occurrences • +: one or more occurrences

3.1: Regular Expressions (11) • All strings of 1s and 0s: (0 | 1)* • All strings of 1s and 0s beginning with a 1: 1 (0 | 1)*

3.1: Regular Expressions (12) • All strings containing two or more 0s: (1|0)*0(1|0)*0(1|0)* • All strings containing an even number of 0s: (1*01*01*)* | 1*

3.1: Regular Expressions (13) • All strings containing an even number of 0s and an even number of 1s: ( 0 0 | 1 1 )*(( 0 1 | 1 0 )( 0 0 | 1 1 )*( 0 1 | 1 0 )( 0 0 | 1 1 )*)* • All strings of alternating 0s and 1s: ( ε | 1 ) ( 0 1 )* ( ε | 0 )
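REs like these are easy to get subtly wrong, so it helps to compare each one against a direct statement of the property on all short strings. The sketch below assumes Python's `re` syntax, with ε written as an empty alternative; `all_strings` and `disagreements` are hypothetical helper names. It tests only strings up to length 10.

```python
import re
from itertools import product

def all_strings(alphabet, max_len):
    """Yield every string over `alphabet` of length 0..max_len."""
    for n in range(max_len + 1):
        for chars in product(alphabet, repeat=n):
            yield "".join(chars)

def disagreements(pattern, predicate, alphabet="01", max_len=10):
    """Strings on which the RE and the plain-Python predicate give different answers."""
    regex = re.compile(pattern)
    return [w for w in all_strings(alphabet, max_len)
            if bool(regex.fullmatch(w)) != predicate(w)]

EVEN_EVEN = r"(00|11)*((01|10)(00|11)*(01|10)(00|11)*)*"
ALTERNATING = r"(|1)(01)*(|0)"   # (ε | 1)(01)*(ε | 0)

print(disagreements(
    EVEN_EVEN,
    lambda w: w.count("0") % 2 == 0 and w.count("1") % 2 == 0))  # []
print(disagreements(
    ALTERNATING,
    lambda w: "00" not in w and "11" not in w))                   # []
```

An empty list means the RE and the predicate agree on every string tested.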

3.1: Regular Expressions (14) • Strings over the alphabet {0, 1} with no consecutive 0's • (1 | 01)* (0 | ε) • 1*(01+)* (0 | ε) • 1*(011*)* (0 | ε) • Strings over the alphabet {a, b} with exactly three b's • a*ba*ba*ba* • Strings over the alphabet {a, b, c} containing (at least once) bc • (a|b|c)*bc(a|b|c)*

3.1: Regular Expressions (15) • Strings over the alphabet {a, b} in which substrings ab and ba occur an unequal number of times • (a+b+)+ | (b+a+)+
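It is not obvious why (a+b+)+ | (b+a+)+ captures "unequal numbers of ab and ba". One way to see it is that the two counts differ exactly when the first and last symbols differ. The sketch below, which assumes Python's `re` syntax and uses the hypothetical helper `count_sub`, checks both statements on all strings up to length 10; it is a bounded check, not a proof.

```python
import re
from itertools import product

UNEQUAL = re.compile(r"(a+b+)+|(b+a+)+")

def count_sub(w, sub):
    """Number of positions in w at which `sub` occurs (overlaps allowed)."""
    return sum(w.startswith(sub, i) for i in range(len(w)))

bad = []
for n in range(11):
    for chars in product("ab", repeat=n):
        w = "".join(chars)
        by_regex = bool(UNEQUAL.fullmatch(w))
        by_count = count_sub(w, "ab") != count_sub(w, "ba")
        by_ends = len(w) > 0 and w[0] != w[-1]
        if not (by_regex == by_count == by_ends):
            bad.append(w)

print(bad)  # []
```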

3.1: Regular Expressions (16) • Describe the following in English: • (0|1)*: all strings over {0, 1} • b*ab*ab*ab*: all strings over {a, b} with exactly 3 a’s

3.1: Regular Expressions (17) • Examples of RE: • 01* • {0, 01, 011, 0111, …} • (01*)(01) • {001, 0101, 01101, 011101, …} • (0 | 1)* = {ε, 0, 1, 00, 01, 10, 11, …} • i.e., all strings of 0s and 1s • (0 | 1)* 00 (0 | 1)* = {00, 1001, …} • i.e., all strings of 0s and 1s containing “00”

3.1: Regular Expressions (18) • More Examples of RE: • (1 | 10)* • the empty string and all strings starting with “1” and containing no “00” • (0 | 1)*011 • all strings ending with “011” • 0*1* • all strings with no “0” after “1” • 00*11* • all strings with at least one “0” and one “1”, and no “0” after “1”

3.1: Regular Expressions (19) • What languages do the following REs represent? • (1 | 01 | 001)*(ε | 0 | 00) • ((0 | 1)(0 | 1))* | ((0 | 1)(0 | 1)(0 | 1))*

3.1: Regular Expressions (20) • Home Study: • Construct a RE over Σ = {0, 1} for each of the following languages: • The set of all strings that do not contain two consecutive “0”s • The set of all strings in which no prefix has two or more “0”s more than “1”s, nor two or more “1”s more than “0”s • The set of all strings ending with “00” • The set of all strings with 3 consecutive 0’s • The set of all strings beginning with “1” which, when interpreted as a binary number, are divisible by 5 • The set of all strings with a “1” at the 5th position from the right • The set of all strings not containing 101 as a substring

3.1: Regular Expressions (21) • Strings over the alphabet {(, )} where the parentheses are balanced • Not regular • L = {a^n b^n | n ≥ 0} • L = {ε, ab, aabb, aaabbb, aaaabbbb, …} • Not regular

3.2: Connection Between RE & RL (1) • A language L is called regular if and only if there exists some DFA M such that L = L(M). • Since every DFA has an equivalent NFA (and vice versa): • A language L is called regular if and only if there exists some NFA N such that L = L(N). • If we have a RE r, we can construct an NFA that accepts L(r).
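To make "there exists some DFA M such that L = L(M)" concrete, here is a hand-built DFA for the earlier language (a|b)*bb(a|b)*, simulated in Python. The state names `q0`, `q1`, `q2` and the transition table are assumptions chosen for this sketch; they do not come from the slides.

```python
# q0: last symbol was not b, q1: exactly one trailing b, q2: "bb" already seen
DELTA = {
    ("q0", "a"): "q0", ("q0", "b"): "q1",
    ("q1", "a"): "q0", ("q1", "b"): "q2",
    ("q2", "a"): "q2", ("q2", "b"): "q2",
}
START, FINAL = "q0", {"q2"}

def accepts(word):
    """Run the DFA on `word`; accept if it stops in a final state."""
    state = START
    for ch in word:
        state = DELTA[(state, ch)]
    return state in FINAL

print(accepts("abba"))  # True
print(accepts("abab"))  # False
```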

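3.2: Connection Between RE & RL (2) • 0. For ∅ in the regular expression, construct an NFA whose start state has no path to a final state: L = { } = ∅ • 1. For ε in the regular expression, construct an NFA with a single ε-move from the start state to the final state: L = {ε} • 2. For a ∈ Σ in the regular expression, construct an NFA with a single move on a from the start state to the final state: L = {a}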

3.2: Connection Between RE & RL (3) • 3.(a) If s and t are regular expressions and Ms and Mt are their NFAs, then s|t has an NFA with L = L(Ms) ∪ L(Mt) • [Figure: new start state i with ε-moves to Ms and Mt, which both have ε-moves to the new final state f] • where i and f are new start / final states, and ε-moves are introduced from i to the old start states of Ms and Mt as well as from all of their final states to f.

3.2: Connection Between RE & RL (4) • 3.(b) If s and t are regular expressions and Ms and Mt are their NFAs, then st (concatenation) has an NFA with L = L(Ms)L(Mt) • [Figure: Ms followed by Mt, overlapping at the final state of Ms and the start state of Mt; alternative: new states i and f, joined to Ms and Mt by ε-moves] • where i is the start state of Ms (or new under the alternative) and f is the final state of Mt (or new). The overlap maps the final states of Ms to the start state of Mt.

3.2: Connection Between RE & RL (5) • 3.(c) If s is a regular expression and Ms is its NFA, then s* (Kleene star) has an NFA with L = L(Ms)* • [Figure: new states i and f around Ms, connected by ε-moves] • where i is a new start state and f is a new final state • ε-move from i to f (to accept the null string) • ε-moves from i to the old start and from the old final state(s) to f • ε-move from the old final state to the old start (WHY?)
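Rules 0 to 3(c) can be turned into a short recursive builder. The Python sketch below is one possible reading, not the slides' own code: the tuple-based syntax tree, the `NFA` class, and the function names are all invented for illustration, and concatenation uses the "alternative" form of rule 3(b) with new states i and f joined by ε-moves. A label of `None` stands for an ε-move.

```python
from itertools import count

class NFA:
    def __init__(self):
        self.edges = {}        # state -> list of (label, target); label None is an ε-move
        self._ids = count()

    def new_state(self):
        q = next(self._ids)
        self.edges[q] = []
        return q

    def add(self, src, label, dst):
        self.edges[src].append((label, dst))

def build(node, nfa):
    """Return (i, f), the new start and final state of the NFA for `node`."""
    i, f = nfa.new_state(), nfa.new_state()
    kind = node[0]
    if kind == "empty":                      # rule 0: no path from i to f
        pass
    elif kind == "eps":                      # rule 1
        nfa.add(i, None, f)
    elif kind == "sym":                      # rule 2
        nfa.add(i, node[1], f)
    elif kind == "alt":                      # rule 3(a): s | t
        for sub in node[1:]:
            s, e = build(sub, nfa)
            nfa.add(i, None, s)
            nfa.add(e, None, f)
    elif kind == "cat":                      # rule 3(b), alternative form: st
        s1, e1 = build(node[1], nfa)
        s2, e2 = build(node[2], nfa)
        nfa.add(i, None, s1)
        nfa.add(e1, None, s2)
        nfa.add(e2, None, f)
    elif kind == "star":                     # rule 3(c): s*
        s, e = build(node[1], nfa)
        nfa.add(i, None, f)                  # accept the null string
        nfa.add(i, None, s)
        nfa.add(e, None, f)
        nfa.add(e, None, s)                  # allow another repetition
    else:
        raise ValueError(f"unknown node kind: {kind}")
    return i, f

def eps_closure(nfa, states):
    """All states reachable from `states` using ε-moves only."""
    stack, seen = list(states), set(states)
    while stack:
        q = stack.pop()
        for label, dst in nfa.edges[q]:
            if label is None and dst not in seen:
                seen.add(dst)
                stack.append(dst)
    return seen

def accepts(node, word):
    nfa = NFA()
    start, final = build(node, nfa)
    current = eps_closure(nfa, {start})
    for ch in word:
        moved = {dst for q in current for label, dst in nfa.edges[q] if label == ch}
        current = eps_closure(nfa, moved)
    return final in current

A, B = ("sym", "a"), ("sym", "b")
REGEX = ("cat", ("cat", ("star", ("alt", A, B)), B), A)   # (a|b)*ba
print(accepts(REGEX, "abba"))  # True
print(accepts(REGEX, "ab"))    # False
```

Creating two new states for every node keeps the builder uniform, at the cost of more states than the overlapping construction would need.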

3.2: Connection Between RE & RL (6) • Build an NFA-ε that accepts (a|b)*ba • [Figure: NFAs for a and b, combined into NFAs for ba and a|b]

3.2: Connection Between RE & RL (7) • Build an NFA-ε that accepts (a|b)*ba • [Figure: NFA for (a|b)*]

3.2: Connection Between RE & RL (8) • Build an NFA-ε that accepts (a|b)*ba • [Figure: NFA for (a|b)* followed by the NFA for ba]

3.2: Connection Between RE & RL (9) • Decomposition for the regular expression (ab*c) | (a(b|c*)): • r0 = b • r1 = r0* • r2 = c • r3 = a • r4 = r1 r2 • r5 = r3 r4 • r6 = c • r7 = b • r8 = r6* • r9 = r7 | r8 • r10 = (r9) • r11 = a • r12 = r11 r10 • r13 = r5 | r12 • What is the NFA? Let’s construct it!

3.2: Connection Between RE & RL (10) • [Figure: NFAs for r0 (b), r1 (b*), r2 (c), r3 (a), r4 = r1 r2, and r5 = r3 r4]

3.2: Connection Between RE & RL (11) • [Figure: NFAs for r6 (c), r7 (b), r8 (c*), r9 = r7 | r8, r10 = (r9), r11 (a), and r12 = r11 r10]

3.2: Connection Between RE & RL (12) • [Figure: NFA for r13 = r5 | r12, with states numbered 1 to 17]

3.2: Connection Between RE & RL (13) • Let’s try a ( b | c )* • 1. a, b, & c • 2. b | c • 3. ( b | c )* • [Figure: NFAs for a, b, and c (states S0, S1), for b | c (states S0 to S5), and for ( b | c )* (states S0 to S7)]

3.2: Connection Between RE & RL (14) • 4. a ( b | c )* • [Figure: NFA for a ( b | c )* with states S0 to S9]

3.2: Connection Between RE & RL (15) • Let the 3 patterns be: a, abb, a*b+ • NFAs: • [Figure: states 1 and 2 for a; states 3, 4, 5, 6 for abb; states 7 and 8 for a*b+]

3.2: Connection Between RE & RL (16) • NFA for: a | abb | a*b+ • [Figure: new start state 0 with ε-moves to states 1, 3, and 7 of the three pattern NFAs]
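The combined NFA can be simulated by tracking the set of states it could be in after each symbol. The transition table in the sketch below is a reconstruction from the state numbers and labels left over from the slide figure (0 to 8), so treat it as an assumption; `closure` and `matched_patterns` are hypothetical helper names.

```python
EPS = None  # label used for ε-moves

# Assumed transitions: 0 -ε-> 1, 3, 7; 1 -a-> 2; 3 -a-> 4 -b-> 5 -b-> 6;
# 7 -a-> 7; 7 -b-> 8; 8 -b-> 8
DELTA = {
    (0, EPS): {1, 3, 7},
    (1, "a"): {2},
    (3, "a"): {4}, (4, "b"): {5}, (5, "b"): {6},
    (7, "a"): {7}, (7, "b"): {8}, (8, "b"): {8},
}
FINAL = {2: "a", 6: "abb", 8: "a*b+"}   # final state -> pattern it recognizes

def closure(states):
    """Add every state reachable through ε-moves."""
    stack, seen = list(states), set(states)
    while stack:
        q = stack.pop()
        for t in DELTA.get((q, EPS), ()):
            if t not in seen:
                seen.add(t)
                stack.append(t)
    return seen

def matched_patterns(word):
    """Patterns whose final state is active after reading all of `word`."""
    current = closure({0})
    for ch in word:
        current = closure({t for q in current for t in DELTA.get((q, ch), ())})
    return sorted(FINAL[q] for q in current if q in FINAL)

print(matched_patterns("a"))    # ['a']
print(matched_patterns("abb"))  # ['a*b+', 'abb']
print(matched_patterns("aab"))  # ['a*b+']
print(matched_patterns("ba"))   # []
```

The input abb ends in two final states at once, because it matches both abb and a*b+.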

Regular Expression to NFA-ε • (a | ba)*a

First Parsing Step • concatenate( (a|ba)* , a )

Second Parsing Step • concatenate( *( a|ba ) , a )

Third Parsing Step • concatenate( *( |( a , ba ) ) , a )

Fourth Parsing Step • concatenate( *( |( a , concatenate( b , a ) ) ) , a )

Identify Leaf Nodes • The leaves are a, b, a, a in concatenate( *( |( a , concatenate( b , a ) ) ) , a )
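The parsing steps for (a | ba)*a can be automated with a small recursive-descent parser. The grammar and the tuple format in the sketch below are assumptions made for illustration (alternation binds loosest, then concatenation, then *); the function `parse` is hypothetical and handles only single-character symbols, `|`, `*`, and parentheses.

```python
def parse(text):
    """Parse an RE into nested tuples: ('|', l, r), ('concatenate', l, r), ('*', x)."""
    tokens = [c for c in text if not c.isspace()]
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def take():
        nonlocal pos
        if pos >= len(tokens):
            raise ValueError("unexpected end of expression")
        pos += 1
        return tokens[pos - 1]

    def expr():                      # expr := term ('|' term)*
        node = term()
        while peek() == "|":
            take()
            node = ("|", node, term())
        return node

    def term():                      # term := factor factor*
        node = factor()
        while peek() is not None and peek() not in "|)":
            node = ("concatenate", node, factor())
        return node

    def factor():                    # factor := atom '*'*
        node = atom()
        while peek() == "*":
            take()
            node = ("*", node)
        return node

    def atom():                      # atom := symbol | '(' expr ')'
        if peek() in (None, "|", ")", "*"):
            raise ValueError(f"unexpected token: {peek()!r}")
        if peek() == "(":
            take()
            node = expr()
            if take() != ")":
                raise ValueError("expected ')'")
            return node
        return take()

    tree = expr()
    if pos != len(tokens):
        raise ValueError(f"unexpected token: {peek()!r}")
    return tree

print(parse("(a | ba)*a"))
# ('concatenate', ('*', ('|', 'a', ('concatenate', 'b', 'a'))), 'a')
```

The printed tuple has the same shape as the tree reached in the fourth parsing step.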

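Convert Leaf Nodes • [Figure: each leaf (a, a, b, a) replaced by its one-transition NFA; operator nodes concatenate, *, |, concatenate remain]

Identify Convertible Node(s) • [Figure: tree with leaf NFAs a, a, b, a and operator nodes concatenate, *, |, concatenate]

Convert Node • [Figure: the inner concatenate node replaced by an NFA for ba (b, ε-move, a); remaining operator nodes: concatenate, *, |]

Identify Convertible Node • [Figure: same tree; remaining operator nodes: concatenate, *, |]

Convert Node • [Figure: the | node replaced by an NFA for a|ba built with ε-moves; remaining operator nodes: concatenate, *]

Identify Convertible Node • [Figure: same tree; remaining operator nodes: concatenate, *]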
