Programming language syntax
• Three aspects of languages
  • Syntax: How are sentences formed?
  • Semantics: What does a sentence mean?
  • Pragmatics: How to use the language?
• Only syntax can be described formally
  • Regular expressions and context-free grammars
by Neng-Fa Zhou
Regular expressions (RE)
• The empty string ε is a RE
• Every symbol in Σ (the alphabet) is a RE
• Let r and s be REs. Then the following are also REs:
  • r | s : alternation (r or s)
  • rs : concatenation
  • (r)* : zero or more instances
  • (r)+ : one or more instances
  • (r)? : zero or one instance
Examples (Σ = {a, b})
1. a|b
2. (a|b)(a|b)
3. a*
4. (a|b)*
5. a | a*b
Precedence of operators (all left associative), from high to low:
• r*  r+  r?
• rs
• r|s
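The five examples and the precedence rules can be tried out with Python's `re` module, whose `|`, `*`, `+` and `?` operators follow the same precedence. This is only a sketch: the helper name `matches` and the sample strings are illustrative assumptions, not part of the slides.

```python
import re

def matches(pattern, text):
    """Return True if the whole of text belongs to the language of pattern."""
    return re.fullmatch(pattern, text) is not None

# The five example REs over the alphabet {a, b}, each with a few member strings
# (the member strings are chosen here for illustration).
EXAMPLES = {
    "a|b": ["a", "b"],
    "(a|b)(a|b)": ["aa", "ab", "ba", "bb"],
    "a*": ["", "a", "aaa"],
    "(a|b)*": ["", "abba"],
    "a|a*b": ["a", "b", "aab"],
}

for pattern, samples in EXAMPLES.items():
    for s in samples:
        print(f"{pattern!r} matches {s!r}: {matches(pattern, s)}")

# Precedence: * binds tighter than concatenation, which binds tighter than |,
# so a|a*b is read as a | ((a*)b). The string "aa" is therefore not in the language.
print(matches("a|a*b", "aa"))  # False
```

`re.fullmatch` is used rather than `re.search` so that the whole string must belong to the language, which is the sense in which the slides use REs.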
Algebraic Properties of RE
Regular Definitions
d1 → r1
d2 → r2
....
dn → rn
• Each ri is a RE over Σ ∪ {d1, d2, ..., d(i-1)}
• Not recursive
Examples
• Identifiers
  letter → A | B | ... | Z | a | b | ... | z
  digit → 0 | 1 | ... | 9
  id → letter ( letter | digit )*
• Decimal integers in Java
  DecimalNumeral → 0 | nonZeroDigit digit*
• Hexadecimal integers
  HexaNumeral → (0x | 0X) hexadigit+
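Because regular definitions are not recursive, each one can be expanded by plain string substitution. The sketch below builds the three definitions this way in Python. The slides do not spell out `nonZeroDigit` or `hexadigit`, so the character classes used for them here are assumptions, as is the helper name `is_token`.

```python
import re

# Basic definitions. non_zero_digit and hexa_digit are assumed expansions;
# the slides only name them.
letter = "[A-Za-z]"
digit = "[0-9]"
non_zero_digit = "[1-9]"
hexa_digit = "[0-9A-Fa-f]"

# Each definition refers only to names defined before it (no recursion).
ident = f"{letter}({letter}|{digit})*"
decimal_numeral = f"0|{non_zero_digit}{digit}*"
hexa_numeral = f"(0x|0X){hexa_digit}+"

def is_token(definition, text):
    """Return True if the whole of text matches the expanded definition."""
    return re.fullmatch(definition, text) is not None

print(is_token(ident, "x1"))              # True
print(is_token(ident, "1x"))              # False: must start with a letter
print(is_token(decimal_numeral, "007"))   # False: no leading zeros
print(is_token(hexa_numeral, "0x1F"))     # True
```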
Lex
• A tool for automatically generating lexical analyzers
Lex Specifications
declarations
%%
translation rules
%%
auxiliary procedures

Translation rules have the form:
p1 {action1}
p2 {action2}
...
pn {actionn}
Lex Regular Expressions
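Example-1
%{
#include <stdio.h>
/* Counters updated by the two rules below. */
int num_lines = 0, num_chars = 0;
%}
%%
\n      { ++num_lines; ++num_chars; }
.       { ++num_chars; }
%%
int main(void) {
    yylex();
    printf("# of lines = %d, # of chars = %d\n", num_lines, num_chars);
    return 0;
}
/* Return 1 at end of input so the scanner stops.
   Returning 0 would tell Lex that more input follows. */
int yywrap(void) { return 1; }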
Example-2
D       [0-9]
INT     {D}{D}*
%%
{INT}("."{INT}((e|E)("+"|-)?{INT})?)?    {printf("valid %s\n",yytext);}
.                                         {printf("unrecognized %s\n",yytext);}
%%
int main(int argc, char *argv[]){
    /* Skip the program name; read from the named file if given, else from stdin. */
    ++argv, --argc;
    if (argc>0) yyin = fopen(argv[0],"r");
    else yyin = stdin;
    yylex();
    return 0;
}
/* Return 1 at end of input so the scanner stops. */
int yywrap(void){return 1;}
java.util.regex
import java.util.regex.*;

class Number {
    public static void main(String[] args){
        String regExNum = "\\d+(\\.\\d+((e|E)(\\+|-)?\\d+)?)?";
        // Pattern.matches succeeds only if the entire input matches the pattern.
        if (Pattern.matches(regExNum,args[0]))
            System.out.println("valid");
        else
            System.out.println("invalid");
    }
}
String Pattern Matching in Perl
print "Input a string :";
$_ = <STDIN>;
chomp($_);
if (/^[0-9]+(\.[0-9]+((e|E)(\+|-)?[0-9]+)?)?$/){
    print "valid\n";
} else {
    print "invalid\n";
}
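For comparison, the same number pattern can be written with Python's `re` module. This is a sketch that mirrors the Perl pattern above; the names `NUMBER` and `is_valid_number` are illustrative assumptions.

```python
import re

# digits, optionally followed by a fraction, which may itself carry an exponent.
NUMBER = re.compile(r"[0-9]+(\.[0-9]+((e|E)(\+|-)?[0-9]+)?)?")

def is_valid_number(text):
    """Return True if the whole string is a number in the format above."""
    return NUMBER.fullmatch(text) is not None

for sample in ["12", "3.14", "6.02e+23", "1e5", ".5", "1."]:
    print(sample, "valid" if is_valid_number(sample) else "invalid")
```

Note that the pattern nests the exponent inside the optional fraction, so `1e5` is rejected: an exponent is only allowed after a fractional part.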
Context-free Grammars
G = (Σ, N, P, S)
• Σ is a finite set of terminals
• N is a finite set of non-terminals
• P is a finite set of production rules
• S is the start symbol
CFG: Examples
• Arithmetic expressions
  E → T | E + T | E - T
  T → F | T * F | T / F
  F → id | (E)
• Statements
  IfStatement → if E then Statement else Statement
CFG vs. Regular Expressions
• CFGs are more expressive than REs
  • Every language that can be described by a regular expression can also be described by a CFG
• Example languages that are context-free but not regular
  • if-then-else statements, {a^n b^n | n >= 1}
• Languages that are not context-free
  • L1 = {wcw | w is in (a|b)*}
  • L2 = {a^n b^m c^n d^m | n >= 1 and m >= 1}
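The language {a^n b^n | n >= 1} shows why recursion matters: the number of a's must be remembered while reading the b's. A short recognizer that follows a CFG directly is sketched below. The grammar S → a S b | a b is the usual one for this language but is not given in the slides, so treat it and the function name `is_anbn` as assumptions.

```python
def is_anbn(s):
    """Recognize {a^n b^n | n >= 1} by following S -> a S b | a b."""
    # Base production: S -> a b
    if s == "ab":
        return True
    # Recursive production: S -> a S b.
    # Strip one leading a and one trailing b, then recognize the middle as S.
    if len(s) >= 4 and s[0] == "a" and s[-1] == "b":
        return is_anbn(s[1:-1])
    return False

for sample in ["ab", "aabb", "aaabbb", "aab", "abab", ""]:
    print(repr(sample), is_anbn(sample))
```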
Derivations
• αAβ ⇒ αγβ if A → γ
• α ⇒* α
• If α ⇒* β and β ⇒ γ then α ⇒* γ
• α is a sentential form if S ⇒* α
• α is a sentence if it contains only terminal symbols
Derivations
• Leftmost derivation
  αAβ ⇒ αγβ if α is a string of terminals
• Rightmost derivation
  αAβ ⇒ αγβ if β is a string of terminals
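A leftmost derivation can be replayed mechanically: at each step the leftmost nonterminal is replaced by the right-hand side of one of its productions. The sketch below does this for the arithmetic-expression grammar from the CFG examples. The function names and the choice of the sentence id + id * id are illustrative assumptions.

```python
# Arithmetic-expression grammar: E -> T | E + T | E - T, and so on.
PRODUCTIONS = {
    "E": [["T"], ["E", "+", "T"], ["E", "-", "T"]],
    "T": [["F"], ["T", "*", "F"], ["T", "/", "F"]],
    "F": [["id"], ["(", "E", ")"]],
}

def leftmost_step(form, rhs):
    """Rewrite the leftmost nonterminal of a sentential form using rhs."""
    for i, symbol in enumerate(form):
        if symbol in PRODUCTIONS:
            if rhs not in PRODUCTIONS[symbol]:
                raise ValueError(f"{symbol} -> {' '.join(rhs)} is not a production")
            return form[:i] + rhs + form[i + 1:]
    raise ValueError("the form is already a sentence")

def derive(start, steps):
    """Return every sentential form of the leftmost derivation given by steps."""
    forms = [[start]]
    for rhs in steps:
        forms.append(leftmost_step(forms[-1], rhs))
    return forms

# Right-hand sides applied in order to derive id + id * id from E.
STEPS = [["E", "+", "T"], ["T"], ["F"], ["id"],
         ["T", "*", "F"], ["F"], ["id"], ["id"]]

forms = derive("E", STEPS)
print(" ⇒ ".join(" ".join(form) for form in forms))
```

Every intermediate list in `forms` is a sentential form; only the last one, which contains terminals alone, is a sentence.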
Parse Trees
• A parse tree is any tree in which
  • The root is labeled with S
  • Each leaf is labeled with a token or ε
  • Each interior node is labeled by a nonterminal
  • If an interior node is labeled A and has children labeled X1, ..., Xn, then A → X1...Xn is a production
Parse Trees and Derivations
E → E + E | E * E | E - E | - E | ( E ) | id
YACC
%token DIGIT
%%
lines   : lines expr '\n'    {printf("%d\n",$2);}
        | lines '\n'
        |                    /* empty */
        ;
expr    : expr '+' term      {$$ = $1 + $3;}
        | expr '-' term      {$$ = $1 - $3;}
        | term
        ;
term    : term '*' factor    {$$ = $1 * $3;}
        | term '/' factor    {$$ = $1 / $3;}
        | factor
        ;
factor  : '(' expr ')'       {$$ = $2;}
        | DIGIT
        ;
%%
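To see what the semantic actions compute, the same grammar can be evaluated by a small hand-written recursive-descent parser. This is a sketch and not what YACC generates (YACC builds a bottom-up LALR parser): the left-recursive rules are turned into loops, spaces are skipped, and the class and function names are illustrative assumptions. As in the grammar, operands are single digits.

```python
class Parser:
    def __init__(self, text):
        self.text = text.replace(" ", "")  # ignore blanks (an added convenience)
        self.pos = 0

    def peek(self):
        """Return the current character, or "" at end of input."""
        return self.text[self.pos] if self.pos < len(self.text) else ""

    def expr(self):
        # expr : expr '+' term | expr '-' term | term
        value = self.term()
        while self.peek() in ("+", "-"):
            op = self.peek()
            self.pos += 1
            rhs = self.term()
            value = value + rhs if op == "+" else value - rhs
        return value

    def term(self):
        # term : term '*' factor | term '/' factor | factor
        value = self.factor()
        while self.peek() in ("*", "/"):
            op = self.peek()
            self.pos += 1
            rhs = self.factor()
            # int(a / b) truncates toward zero, like C integer division.
            value = value * rhs if op == "*" else int(value / rhs)
        return value

    def factor(self):
        # factor : '(' expr ')' | DIGIT
        ch = self.peek()
        if ch == "(":
            self.pos += 1
            value = self.expr()
            if self.peek() != ")":
                raise SyntaxError(f"expected ')' at position {self.pos}")
            self.pos += 1
            return value
        if "0" <= ch <= "9":
            self.pos += 1
            return int(ch)
        raise SyntaxError(f"unexpected input at position {self.pos}")

def evaluate(line):
    """Evaluate one expression line and reject trailing input."""
    parser = Parser(line)
    value = parser.expr()
    if parser.pos != len(parser.text):
        raise SyntaxError(f"unexpected input at position {parser.pos}")
    return value

print(evaluate("1+2*3"))    # 7
print(evaluate("(1+2)*3"))  # 9
print(evaluate("9-4-3"))    # 2, since '-' is left associative
```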
DCG in Prolog
Strings with an equal number of 0’s and 1’s
• DCG
  :-table e/2.
  e --> [].
  e --> [0],e,[1].
  e --> [1],e,[0].
  e --> e,e.
• Prolog clauses
  :-table e/2.
  e(A, A).
  e(A, B) :- 'C'(A, 0, C), e(C, D), 'C'(D, 1, B).
  e(A, B) :- 'C'(A, 1, C), e(C, D), 'C'(D, 0, B).
  e(A, B) :- e(A, C), e(C, B).
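The grammar above is left recursive (e --> e,e), which is why the Prolog version relies on tabling. A rough Python analogue is sketched below, with `functools.lru_cache` memoizing results per substring. The cache only loosely plays the role of the table: termination here comes from splitting e --> e,e into two non-empty parts, which is an assumption of this sketch and not how tabling works. The function name `derives` is also an assumption.

```python
from functools import lru_cache
from itertools import product

@lru_cache(maxsize=None)
def derives(s):
    """Return True if the nonterminal e derives the string s of 0's and 1's."""
    # e --> []
    if s == "":
        return True
    if len(s) >= 2:
        # e --> [0], e, [1]
        if s[0] == "0" and s[-1] == "1" and derives(s[1:-1]):
            return True
        # e --> [1], e, [0]
        if s[0] == "1" and s[-1] == "0" and derives(s[1:-1]):
            return True
    # e --> e, e  (both parts non-empty, so every recursive call is on a shorter string)
    return any(derives(s[:i]) and derives(s[i:]) for i in range(1, len(s)))

# Cross-check against a direct count for all strings up to length 8.
for n in range(9):
    for chars in product("01", repeat=n):
        s = "".join(chars)
        assert derives(s) == (s.count("0") == s.count("1"))

print(derives("0110"), derives("0111"))
```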