CSC 305: PROGRAMMING PARADIGM
CHAPTER 2: Introduction to Language, Syntax and Semantics
Contents
• Describing languages
  • Sentences, Language, Lexemes and Tokens
• Describing Syntax
  • Language Recognizers, Generators, Grammars
• Describing Semantics
  • Operational, Axiomatic, Denotational
Group Work 1.0 (Chapter 1)
• Find ONE programming language for each of the paradigms (Imperative, Object-oriented, Logic, Functional).
• Explain the language overview and design process for each language.
• Present your findings during the next class.
Languages
• A method of communication
• Spoken and written languages can be described as a system of symbols (sometimes known as lexemes) and the grammars (rules) by which the symbols are manipulated.
• A language is a set of sentences.
• A sentence is a string of characters over some alphabet.
Programming language is ..
• a system of signs used to communicate a task/algorithm to a computer, causing the task to be performed.
• The task to be performed is called computation, which follows absolutely precise and unambiguous rules.
Syntax and Semantics
• Syntax is the form of a language's expressions, statements and program units.
• Semantics is the meaning of those expressions, statements and program units.
• Example: the while statement in Java
  • Syntax: while (boolean_expression) <statement>
  • Semantics: as long as boolean_expression evaluates to true, <statement> is executed repeatedly.
Programming Language
Example of a program that adds two integers and prints: 1 + 1 = 2

    #include <stdio.h>

    int add(int x, int y) {      /* syntax for function add() */
        return x + y;
    }

    int main(void) {             /* syntax for function main() */
        int foo = 1, bar = 1;
        printf("%d + %d = %d\n", foo, bar, add(foo, bar));
        return 0;
    }
Lexeme
• The lowest-level syntactic unit of a language
• Lexemes include identifiers, literals, operators and special words
• Example: the block { return x + y; } consists of the lexemes
  {   return   x   +   y   ;   }
Token
• A token is a category of lexemes
• Example: { return x + y; }

    Lexeme    Token
    {         open
    return    keyword
    x         identifier
    +         plus op
    y         identifier
    ;         separator
    }         close
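The lexeme-to-token table above can be reproduced with a very small lexical analyzer. The following is a minimal sketch, not a full C lexer: the pattern list `TOKEN_PATTERNS` and the function name `tokenize` are hypothetical, and only the seven lexemes of the slide's example are covered.

```python
import re

# Hypothetical token categories, named after the table on the slide.
# Order matters: the keyword pattern must be tried before the identifier pattern.
TOKEN_PATTERNS = [
    ("keyword", re.compile(r"return\b")),
    ("identifier", re.compile(r"[A-Za-z_]\w*")),
    ("plus op", re.compile(r"\+")),
    ("separator", re.compile(r";")),
    ("open", re.compile(r"\{")),
    ("close", re.compile(r"\}")),
]


def tokenize(source):
    """Split source into (lexeme, token) pairs, skipping whitespace."""
    pairs = []
    pos = 0
    while pos < len(source):
        if source[pos].isspace():
            pos += 1
            continue
        for token, pattern in TOKEN_PATTERNS:
            match = pattern.match(source, pos)
            if match:
                pairs.append((match.group(), token))
                pos = match.end()
                break
        else:
            # No pattern matched: the character is not part of this tiny language.
            raise ValueError(f"unexpected character {source[pos]!r} at position {pos}")
    return pairs


for lexeme, token in tokenize("{ return x + y; }"):
    print(f"{lexeme:<8}{token}")
```

Each lexeme is the matched text and each token is the name of the pattern that matched it, which mirrors the two columns of the table.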
Language Recognizers
• Determine whether a given program is in the language, i.e. whether it is syntactically correct.
• Example: a compiler
• The syntax analyzer, also known as the parser, is the part of a compiler that does this.
Compiler
• A program that converts an entire source program into machine language before the program is executed
Compiler process
Source code → Lexical Analyzer → Tokenized code → Syntactic Analyzer → Parsed code → Semantic Analyzer → Qualified code → Code Generator → Object code → Optimizer → Final code
Interpreter
• A program that translates and executes one program statement at a time
• Does not produce an object program
Language Generators
• A device used to generate the sentences of a language
• The syntax of a sentence can be checked by comparing it with the structure of the generator.
• The formal method for describing syntax is the grammar.
Grammars
• Describe the syntax of a programming language.
• Two notations: context-free grammars and Backus-Naur Form (BNF)
• Developed by Noam Chomsky (context-free grammars) and John Backus (BNF)
• Grammar classes:
  • Context-free grammars: describe the syntax of a whole PL
  • Regular grammars: describe the tokens of a PL
Context-free grammars
• Context-free grammars are powerful enough to describe the syntax of most programming languages, which is why that syntax is usually specified with them.
• Context-free grammars are simple enough to allow the construction of efficient parsing algorithms which, for a given string, determine whether and how it can be generated from the grammar.
• BNF (Backus-Naur Form) is the most common notation used to express context-free grammars.
Regular grammars
• A regular grammar is a formal grammar.
• The two main categories of formal grammar are:
  • generative grammars, which are sets of rules for how strings in a language can be generated
  • analytic grammars, which are sets of rules for how a string can be analyzed to determine whether it is a member of the language
Classification of grammars
• The Chomsky (1959) hierarchy consists of the following:
  • Type 0 grammar (unrestricted)
  • Type 1 grammar (context-sensitive)
  • Type 2 grammar (context-free)
  • Type 3 grammar (regular)
Type 0 grammar (unrestricted)
An unrestricted grammar is a formal grammar G = (N, Σ, P, S), where
• N is a set of nonterminal symbols,
• Σ is a set of terminal symbols, with N and Σ disjoint,
• P is a set of production rules of the form α → β, where α and β are strings of symbols in (N ∪ Σ)* and α is not the empty string, and
• S ∈ N is a specially designated start symbol.
As the name implies, there are no real restrictions on the types of production rules that unrestricted grammars can have.
• They generate exactly the languages that can be recognized by a Turing machine.
• These languages are also known as the recursively enumerable languages.
Type 1 grammar (context-sensitive)
• A formal grammar G = (N, Σ, P, S) is context-sensitive if all rules in P are of the form
  αAβ → αγβ
  where A ∈ N, α and β are strings in (N ∪ Σ)*, and γ is a non-empty string in (N ∪ Σ)*.
• The name context-sensitive is explained by the α and β that form the context of A and determine whether A can be replaced with γ or not. This is different from a context-free grammar, where the context of a nonterminal is not taken into consideration.
• Generates the context-sensitive languages.
Type 2 grammar (context-free grammar)
• Generates the context-free languages.
• Context-free languages are the theoretical basis for the syntax of most PLs.
• A context-free grammar G can be defined as a 4-tuple G = (Vt, Vn, P, S), where
  • Vt is a finite set of terminals
  • Vn is a finite set of non-terminals
  • P is a finite set of production rules
  • S is an element of Vn, the distinguished starting non-terminal
• Elements of P are of the form A → w, where A ∈ Vn and w is a string in (Vt ∪ Vn)*.
• Example:
  S → x | y | z | S + S | S - S | S * S | S / S | ( S )
• This grammar can, for example, generate the string "( x + y ) * x - z * y / ( x + x )".
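To see the example grammar act as a language definition, the sketch below checks whether a string can be derived from S. It is a hedged illustration under one stated assumption: the left-recursive rules S → S + S and so on are replaced by the equivalent form "operand, then any number of (operator operand) pairs", where an operand is x, y, z or a parenthesized S. This accepts the same strings but does not build the derivation. The name `is_sentence` is hypothetical.

```python
OPERANDS = {"x", "y", "z"}
OPERATORS = {"+", "-", "*", "/"}


def is_sentence(text):
    """Return True if text can be derived from S in the slide's grammar."""
    # Every symbol of this grammar is one character, so drop the spaces
    # and treat each remaining character as a terminal symbol.
    tokens = [ch for ch in text if not ch.isspace()]
    pos = 0

    def parse_operand():
        # operand -> x | y | z | ( S )
        nonlocal pos
        if pos < len(tokens) and tokens[pos] in OPERANDS:
            pos += 1
            return True
        if pos < len(tokens) and tokens[pos] == "(":
            pos += 1
            if not parse_s():
                return False
            if pos < len(tokens) and tokens[pos] == ")":
                pos += 1
                return True
        return False

    def parse_s():
        # S -> operand (operator operand)*   (assumed equivalent form)
        nonlocal pos
        if not parse_operand():
            return False
        while pos < len(tokens) and tokens[pos] in OPERATORS:
            pos += 1
            if not parse_operand():
                return False
        return True

    # The whole input must be consumed for the string to be a sentence.
    return parse_s() and pos == len(tokens)


print(is_sentence("( x + y ) * x - z * y / ( x + x )"))
print(is_sentence("x + * y"))
```

The first call uses the string quoted on the slide; the second string has two operators in a row, which no rule of the grammar can produce.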
Type 3 grammar (regular)
A right regular grammar is a formal grammar (N, Σ, P, S) such that all the production rules in P are of one of the following forms:
• A → a, where A is a non-terminal in N and a is a terminal in Σ
• A → aB, where A and B are in N and a is in Σ
• A → ε, where A is in N and ε denotes the empty string, i.e. the string of length 0
In a left regular grammar, all rules obey the forms:
• A → a, where A is a non-terminal in N and a is a terminal in Σ
• A → Ba, where A and B are in N and a is in Σ
• A → ε, where A is in N and ε is the empty string
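Because every right regular rule consumes at most one terminal and names at most one following non-terminal, membership can be decided by tracking the set of non-terminals that could still be pending. The sketch below assumes a hypothetical encoding: a rule A → aB is the tuple `("a", "B")`, A → a is `("a", None)`, and A → ε is the empty tuple. The grammar shown (binary strings ending in 1) is an illustration, not taken from the slides.

```python
# Hypothetical right regular grammar for binary strings that end in 1:
#   S -> 0S | 1S | 1
BINARY_ENDING_IN_ONE = {
    "S": [("0", "S"), ("1", "S"), ("1", None)],
}

# Marker for "the derivation has no non-terminal left" (reached via A -> a).
DONE = "<done>"


def derives(grammar, start, text):
    """Return True if the right regular grammar can derive text from start."""
    pending = {start}
    for symbol in text:
        following = set()
        for nonterminal in pending:
            for rule in grammar.get(nonterminal, []):
                # rule is () for A -> ε, (a, None) for A -> a, (a, B) for A -> aB
                if rule and rule[0] == symbol:
                    following.add(DONE if rule[1] is None else rule[1])
        pending = following
    # Accept if the derivation already ended, or can end now through A -> ε.
    return DONE in pending or any(() in grammar.get(n, []) for n in pending)


print(derives(BINARY_ENDING_IN_ONE, "S", "0101"))
print(derives(BINARY_ENDING_IN_ONE, "S", "0110"))
```

This is the same idea as running a nondeterministic finite automaton whose states are the non-terminals, which is why regular grammars are a natural fit for describing tokens.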
Grammar
<program> → begin <stmt_list> end
<stmt_list> → <stmt> | <stmt> ; <stmt_list>
<stmt> → <var> = <expression>
<var> → A | B | C
<expression> → <var> + <var> | <var> - <var> | <var>
• A program consists of the special word begin, followed by a list of statements separated by semicolons, followed by the special word end.
• An expression is either a single variable or two variables separated by a + or - operator.
• The only variable names are A, B and C.
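A grammar can also be used as a generator. The sketch below enumerates the sentences this grammar produces, up to a bound on how deeply rules may be nested so that the recursive <stmt_list> rule terminates. The dictionary encoding and the names `GRAMMAR` and `generate` are assumptions made for illustration.

```python
from itertools import product

# The grammar from the slide: each non-terminal maps to its alternatives,
# and each alternative is a sequence of symbols.
GRAMMAR = {
    "<program>": [["begin", "<stmt_list>", "end"]],
    "<stmt_list>": [["<stmt>"], ["<stmt>", ";", "<stmt_list>"]],
    "<stmt>": [["<var>", "=", "<expression>"]],
    "<var>": [["A"], ["B"], ["C"]],
    "<expression>": [["<var>", "+", "<var>"], ["<var>", "-", "<var>"], ["<var>"]],
}


def generate(symbol, depth):
    """Return every sentence derivable from symbol within depth nested rule applications."""
    if symbol not in GRAMMAR:
        return {symbol}              # a terminal stands for itself
    if depth == 0:
        return set()                 # nesting budget used up
    sentences = set()
    for alternative in GRAMMAR[symbol]:
        parts = [generate(part, depth - 1) for part in alternative]
        # Combine one choice per symbol of the alternative, in order.
        for combination in product(*parts):
            sentences.add(" ".join(combination))
    return sentences


one_statement_programs = generate("<program>", 5)
print(len(one_statement_programs))
print(sorted(one_statement_programs)[:3])
```

With a depth of 5 only single-statement programs fit; raising the bound to 6 also admits two-statement programs such as begin A = B ; C = A - B end.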
Grammar Example
Derivation of A = B * (A + C):
<assign> => <id> = <expr>
         => A = <expr>
         => A = <id> * <expr>
         => A = B * <expr>
         => A = B * ( <expr> )
         => A = B * ( <id> + <expr> )
         => A = B * ( A + <expr> )
         => A = B * ( A + <id> )
         => A = B * ( A + C )
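BNF
• Invented by John Backus and later refined by Peter Naur; it is nearly identical to Noam Chomsky's context-free grammars.
• A BNF specification is a set of derivation rules.
• BNF is a notation for context-free grammars, so it can describe the syntax of a whole programming language.
• Fundamentals:
  • BNF is a metalanguage, i.e. a language used to describe another language.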
Example of BNF
<postal-address> ::= <name-part> <street-address> <zip-part>
This translates into English as: a postal address consists of a name part, followed by a street-address part, followed by a zip-code part.
Example of BNF
<street-address> ::= [<apt>] <house-num> <street-name> <EOL>
This translates into English as: a street address consists of an optional apartment specifier, followed by a house number, followed by a street name, followed by an end-of-line.
EBNF
• Extended BNF (EBNF) addresses a drawback of BNF: long rules are hard to read and write.
• The extensions increase the readability and writability of the production rules.
• New notations:
  • Braces { }: a sequence of zero or more instances of the enclosed elements
  • Brackets [ ]: optional elements
  • Parentheses ( ): a group of elements
Parse Tree
• Naturally describes the hierarchical syntactic structure of the sentences of the language being defined.
• Every internal node is labeled with a non-terminal symbol.
• Every leaf is labeled with a terminal symbol.
• Every subtree describes one instance of an abstraction in the sentence.
Parse Tree Example
A = B * (A + C)

    <assign>
      <id>
        A
      =
      <expr>
        <id>
          B
        *
        <expr>
          (
          <expr>
            <id>
              A
            +
            <expr>
              <id>
                C
          )
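A parse tree like this one can be built by a small recursive-descent parser. The sketch below is illustrative only and rests on an assumption: the slides never list the grammar for this example, so the rules <assign> → <id> = <expr>, <id> → A | B | C and <expr> → <id> + <expr> | <id> * <expr> | ( <expr> ) | <id> are inferred from the derivation steps shown earlier. The names `parse_assign` and `leaves` are hypothetical, and trees are plain nested tuples whose first element is the non-terminal label.

```python
def parse_assign(text):
    """Parse an assignment and return its parse tree as nested tuples."""
    # Every symbol in this grammar is a single character.
    tokens = [ch for ch in text if not ch.isspace()]
    pos = 0

    def expect(symbol):
        nonlocal pos
        if pos >= len(tokens) or tokens[pos] != symbol:
            raise SyntaxError(f"expected {symbol!r} at token {pos}")
        pos += 1
        return symbol

    def parse_id():
        # <id> -> A | B | C
        nonlocal pos
        if pos < len(tokens) and tokens[pos] in ("A", "B", "C"):
            pos += 1
            return ("<id>", tokens[pos - 1])
        raise SyntaxError(f"expected an identifier at token {pos}")

    def parse_expr():
        # <expr> -> ( <expr> ) | <id> + <expr> | <id> * <expr> | <id>
        nonlocal pos
        if pos < len(tokens) and tokens[pos] == "(":
            return ("<expr>", expect("("), parse_expr(), expect(")"))
        left = parse_id()
        if pos < len(tokens) and tokens[pos] in ("+", "*"):
            operator = tokens[pos]
            pos += 1
            return ("<expr>", left, operator, parse_expr())
        return ("<expr>", left)

    # <assign> -> <id> = <expr>
    tree = ("<assign>", parse_id(), expect("="), parse_expr())
    if pos != len(tokens):
        raise SyntaxError(f"unexpected token {tokens[pos]!r}")
    return tree


def leaves(tree):
    """Read the terminal symbols off the tree from left to right."""
    if isinstance(tree, str):
        return [tree]
    return [leaf for child in tree[1:] for leaf in leaves(child)]


tree = parse_assign("A = B * (A + C)")
print(tree)
print(" ".join(leaves(tree)))
```

Internal nodes carry non-terminal labels and the leaves, read left to right, spell out the original sentence, matching the properties of a parse tree listed on the previous slide.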
Grammar and Recognizers
• A recognizer for the language generated by a grammar can be constructed algorithmically.
• One of the first syntax analyzer generators is yacc (yet another compiler-compiler) (Johnson, 1975).
Semantics
• The meaning of words and other parts of a language.
• Reveals the meaning behind the syntax/grammar.
• Categorized as follows:
  • Static semantics
  • Dynamic semantics
Static semantics
• Described with an attribute grammar (AG), which is an extension of a context-free grammar (CFG).
• An AG is a mechanism to formalize both the context-free syntax and the context-sensitive rules (such as type compatibility) that a CFG cannot express.
• An AG is used to define the static semantics of a language.
• Static semantics can be checked by the compiler at compile time.
Attribute Grammar
Example: A = A + B
• Syntax rule:
  <expr> → <var>[2] + <var>[3]
• Semantic rule:
  <expr>.actual_type ← if (<var>[2].actual_type = int) and (<var>[3].actual_type = int) then int else real end if
• Predicate:
  <expr>.actual_type == <expr>.expected_type
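The semantic rule and predicate above can be read as a small type check. The sketch below is a hedged illustration: the slide does not say where expected_type comes from, so the code assumes it is the declared type of the assignment target, and the symbol-table dictionary and both function names are hypothetical.

```python
def actual_type_of_expr(left_type, right_type):
    """Semantic rule: the sum is int only if both operands are int, otherwise real."""
    if left_type == "int" and right_type == "int":
        return "int"
    return "real"


def check_assignment(declared_types, target, left, right):
    """Predicate: <expr>.actual_type must equal <expr>.expected_type."""
    # Assumption: the expected type is the declared type of the target variable.
    expected_type = declared_types[target]
    actual_type = actual_type_of_expr(declared_types[left], declared_types[right])
    return actual_type == expected_type


# A = A + B with A declared int and B declared real: the sum is real,
# so the predicate fails for the int target.
print(check_assignment({"A": "int", "B": "real"}, "A", "A", "B"))
print(check_assignment({"A": "int", "B": "int"}, "A", "A", "B"))
```

Nothing here needs the program to run, which is why this kind of check belongs to static semantics and can be done at compile time.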
Dynamic semantics
• Concerns the behavior of a program at run time.
• There are several ways to specify dynamic semantics:
  • By a language reference manual
    • The common method of describing a PL
  • By a defining translator
    • A common method of questioning the behavior of a PL
  • By a formal definition
    • A method of questioning the behavior of a PL by using mathematical methods
    • Includes operational, axiomatic and denotational semantics
Operational semantics
• Example: the while structure in C
    while (expression) statement;
• It might be defined by the following operations:
  1. Evaluate the expression, yielding a value.
  2. If the value is true, execute the statement and repeat from step 1.
  3. If the value is false, terminate the while statement.
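The three operations can be written out as an explicit procedure. This is a sketch under stated assumptions: the program state is a dictionary, the expression and the statement are passed in as Python functions, and an unconditional Python loop stands in for the "repeat from step 1" jump. The name `run_while` is hypothetical.

```python
def run_while(expression, statement, state):
    """Execute 'while (expression) statement;' step by step on the given state."""
    executions = 0
    while True:
        value = expression(state)   # step 1: evaluate the expression
        if not value:
            break                   # step 3: value is false, terminate
        statement(state)            # step 2: value is true, execute the statement
        executions += 1             # ...and go back to step 1
    return executions


def add_and_advance(state):
    # Loop body: total += i; i += 1;
    state["total"] += state["i"]
    state["i"] += 1


state = {"i": 0, "total": 0}
count = run_while(lambda s: s["i"] < 3, add_and_advance, state)
print(count, state)
```

The meaning of the loop is given entirely by how it changes the state, which is the defining idea of operational semantics.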
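Axiomatic semantics
• Based on logical statements called assertions.
• Pre-condition: an assertion that must hold before a statement is executed.
• Post-condition: an assertion that holds after the statement has been executed.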
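Pre- and post-conditions can be illustrated by checking them around a statement at run time. Note the limitation: axiomatic semantics proves that the post-condition follows for every state satisfying the pre-condition, whereas this sketch only tests one state at a time. The triple {x > 0} x = x + 1 {x > 1} and the name `execute_checked` are assumptions chosen for illustration.

```python
def execute_checked(precondition, statement, postcondition, state):
    """Run statement on a copy of state, checking both assertions."""
    if not precondition(state):
        raise ValueError("pre-condition does not hold for this state")
    new_state = statement(dict(state))
    if not postcondition(new_state):
        raise AssertionError("post-condition does not hold after the statement")
    return new_state


def increment_x(state):
    # The statement: x = x + 1
    state["x"] = state["x"] + 1
    return state


# {x > 0}  x = x + 1  {x > 1}
result = execute_checked(
    lambda s: s["x"] > 0,
    increment_x,
    lambda s: s["x"] > 1,
    {"x": 5},
)
print(result)
```

A state with x = 0 is rejected before the statement runs, because the triple makes no promise about states that violate the pre-condition.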
Denotational semantics
• Defines PL behavior by applying mathematical functions to programs and program components to represent their meaning.
• Definitions use double brackets [[ ]] to separate the syntactic definition from the semantic definition.
• Example:
  • Syntactic: the expressions 2*4, 5+3 and 008 all denote the integer 8.
  • Semantic: [[2*4]] = [[5+3]] = [[008]] = [[8]] = 8
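The double brackets can be thought of as a function from syntax to mathematical values. The following minimal sketch covers only the forms used in the example (decimal numerals, + and *); the function name `meaning` is hypothetical and plays the role of [[ ]].

```python
def meaning(expression):
    """[[expression]]: map a syntactic expression to the integer it denotes."""
    expression = expression.replace(" ", "")
    # Split on + first so that * binds more tightly than +.
    if "+" in expression:
        left, right = expression.split("+", 1)
        return meaning(left) + meaning(right)
    if "*" in expression:
        left, right = expression.split("*", 1)
        return meaning(left) * meaning(right)
    # A numeral denotes its value; leading zeros do not change it.
    return int(expression)


for text in ("2*4", "5+3", "008", "8"):
    print(text, "->", meaning(text))
```

The four strings are different pieces of syntax, yet the function sends all of them to the same integer, which is exactly what the equation [[2*4]] = [[5+3]] = [[008]] = [[8]] states.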