Regular expressions to FSMs: hardware and software techniques. Paul Cockshott
Extended regular grammars • Sequence abe • Alternation a|b|E • Charset [a-z] • Zero or more x* • One or more x+
Reduced forms • Sequence abe -> Sequence • Alternation a|b|E -> Alternation • Charset [a-z] -> a|b|c|d…. • Zero or more x* -> x+|ε • One or more x+ -> x+ • ε is the null (empty) string
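The two non-trivial reductions above (charset to alternation, and x* to x+|ε) can be sketched as string rewrites. This is only an illustration: the helper names `expand_charset` and `reduce_star` are hypothetical and do not come from the slides, and the sketch handles a single range and a single operand rather than a full grammar.

```python
def expand_charset(lo: str, hi: str) -> str:
    """Rewrite the charset [lo-hi] as an explicit alternation lo|...|hi."""
    return "|".join(chr(code) for code in range(ord(lo), ord(hi) + 1))


def reduce_star(x: str) -> str:
    """Rewrite x* as x+|ε: one or more x, or the empty string."""
    return f"{x}+|ε"


print(expand_charset("a", "d"))  # a|b|c|d
print(reduce_star("x"))          # x+|ε
```

After these rewrites only sequence, alternation, one-or-more and ε remain, which are the forms mapped to state machines next.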
Map to state machines • Sequence abc • [Diagram: states 1, 2 and 3 joined by transitions labelled a, b and c]
Map alternation A|x|p • [Diagram: transitions labelled A, p and x]
Map A+|b • [Diagram: transitions labelled A, A and b]
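One way to make the three mappings concrete is a Thompson-style construction, in which each form becomes a small fragment joined by ε-moves. This is a sketch under that assumption; the diagrams on the slides may use a more direct construction without ε-moves, and the names `Frag`, `sym`, `seq`, `alt`, `plus` and `accepts` are illustrative only.

```python
from itertools import count

_ids = count()   # source of fresh state numbers
EPS = None       # label used for ε-moves


class Frag:
    """An automaton fragment with one start state and one accept state."""
    def __init__(self, start, accept, edges):
        self.start, self.accept, self.edges = start, accept, edges


def sym(ch):
    """A single transition labelled ch."""
    s, a = next(_ids), next(_ids)
    return Frag(s, a, [(s, ch, a)])


def seq(f, g):
    """Sequence: the accept state of f leads into the start state of g."""
    return Frag(f.start, g.accept, f.edges + g.edges + [(f.accept, EPS, g.start)])


def alt(*frags):
    """Alternation: a new start state fans out to every branch."""
    s, a = next(_ids), next(_ids)
    edges = []
    for f in frags:
        edges += f.edges + [(s, EPS, f.start), (f.accept, EPS, a)]
    return Frag(s, a, edges)


def plus(f):
    """One or more: loop from the accept state back to the start state."""
    return Frag(f.start, f.accept, f.edges + [(f.accept, EPS, f.start)])


def closure(frag, states):
    """All states reachable from `states` using ε-moves only."""
    stack, seen = list(states), set(states)
    while stack:
        q = stack.pop()
        for p, label, r in frag.edges:
            if p == q and label is EPS and r not in seen:
                seen.add(r)
                stack.append(r)
    return seen


def accepts(frag, text):
    """Simulate the fragment on text, tracking the set of live states."""
    current = closure(frag, {frag.start})
    for ch in text:
        moved = {r for p, label, r in frag.edges if p in current and label == ch}
        current = closure(frag, moved)
    return frag.accept in current


abc = seq(seq(sym("a"), sym("b")), sym("c"))      # sequence abc
a_x_p = alt(sym("A"), sym("x"), sym("p"))         # alternation A|x|p
a_plus_b = alt(plus(sym("A")), sym("b"))          # A+|b

print(accepts(abc, "abc"), accepts(a_x_p, "p"), accepts(a_plus_b, "AAA"))
```

Tracking a set of live states keeps the sketch short; a hardware or table-driven version would first convert the result to a deterministic machine.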
Hardware and software FSMs • Hardware: interpreting machine code in a CPU; interpreting network addresses in a router chip • Software: compilers; software routers; protocol analysers
PLA with latch • [Diagram labels: input char, state latch, next state, action code, clock, product lines, AND plane, OR plane]
AND plane • [Diagram: true and complement lines a, ~a, b, ~b; example product lines a AND ~b, and b]
OR plane • [Diagram: product lines p and q combined into the output p OR q]
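The two planes can be modelled in software as a rough sketch. Here each product line in the AND plane names the inputs it taps and whether it taps the true or the complement line, and each output of the OR plane names the product lines it combines. The functions `and_plane` and `or_plane` and the data layout are assumptions made for illustration, not a description of any particular PLA.

```python
def and_plane(inputs, products):
    """inputs: dict of line name -> bool.
    products: list of dicts mapping a line name to the value it must have
    (True taps the true line, False taps the complement line).
    A product line is high only when every tapped line matches."""
    return [all(inputs[name] == want for name, want in term.items())
            for term in products]


def or_plane(product_lines, outputs):
    """outputs: list of lists of product-line indices.
    Each output is high when any of its product lines is high."""
    return [any(product_lines[i] for i in taps) for taps in outputs]


products = [
    {"a": True, "b": False},   # p = a AND ~b
    {"b": True},               # q = b
]
outputs = [[0, 1]]             # single output: p OR q

for a in (False, True):
    for b in (False, True):
        lines = and_plane({"a": a, "b": b}, products)
        print(a, b, or_plane(lines, outputs))
```

In the latched arrangement, the inputs would be the current character together with the state latch, and some of the outputs would be fed back as the next state on each clock.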
Advantages of PLA • Very fast – uses the minimum logic • Lends itself to logic minimisation • Efficient layout on silicon • Method of choice for parsing simple regular grammars at > CPU speeds in instruction decode units
RAM based FSM • [Diagram labels: instruction, FSM table, +, state sel, state line, last char index, char col, first char index reg, current char, hit, source data; widths 8 bits, 16 bits, 8 bits]
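The core idea of a RAM based FSM, a table indexed by current state and input byte that yields the next state, can be sketched as follows. This is a simplification: it uses a full 256-entry row per state and ignores the first and last char index registers shown in the diagram. The state numbering and the name `run` are assumptions for illustration.

```python
DEAD = 0  # state 0 is a trap state entered on any unexpected byte


def make_table(n_states):
    """One 256-entry row per state; every entry initially points at DEAD."""
    return [[DEAD] * 256 for _ in range(n_states)]


# Recogniser for the sequence abc:
# 1 = start, 2 = seen a, 3 = seen ab, 4 = seen abc (accepting)
table = make_table(5)
table[1][ord("a")] = 2
table[2][ord("b")] = 3
table[3][ord("c")] = 4
ACCEPT = {4}


def run(table, data: bytes, start=1):
    """Step through the source data one byte at a time."""
    state = start
    for byte in data:
        state = table[state][byte]
    return state


print(run(table, b"abc") in ACCEPT)  # True
print(run(table, b"abd") in ACCEPT)  # False
```

Each step is a single memory read, so the cost per input character is constant regardless of the grammar's complexity.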
Add char class map • [Diagram labels: instruction, FSM table, +, state sel, state line, last char index, char col, first char index reg, current char, hit, char class map]
Advantages of char class map • Reduces the size of the FSM table. • If we have n states we would otherwise require 256n locations in the table. With a char class map we require c × n locations, where c is the number of distinct character classes in the grammar.
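The saving can be checked with a small worked example, assuming the grammar [a-z]+, so that there are c = 2 classes (lowercase letter, anything else) and n = 3 states. The names `class_map`, `fsm` and `run_classes` are illustrative. Note that the map itself still needs 256 entries, but that cost is paid once rather than once per state.

```python
# Byte -> character class. Class 0: any other byte; class 1: lowercase letter.
class_map = [0] * 256
for byte in range(ord("a"), ord("z") + 1):
    class_map[byte] = 1

# Rows are states (0 = dead, 1 = start, 2 = accepting);
# columns are character classes, not raw bytes.
fsm = [
    [0, 0],   # dead stays dead
    [0, 2],   # start: a letter moves to accepting
    [0, 2],   # accepting: further letters stay accepting
]


def run_classes(data: bytes, start=1):
    """Translate each byte to its class, then index the narrow table."""
    state = start
    for byte in data:
        state = fsm[state][class_map[byte]]
    return state


n, c = len(fsm), len(fsm[0])
full_size = 256 * n      # locations without the class map
mapped_size = c * n      # locations with the class map

print(run_classes(b"hello"), full_size, mapped_size)  # 2 768 6
```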