Introduction and Syntax
Course objectives
• Discuss features of programming languages.
• Discuss how the features are implemented in a simple computer architecture (what is happening under the hood).
• Introduce a few specific programming languages: C++, Java, Ada, Scheme, ML.
Desiderata
• Ease of human use: writing, debugging, reading, maintaining.
• Automated/formal analysis.
• Efficient implementation.
• Portability: machine independence.
Categories of languages
• Imperative (Assembly, FORTRAN, Algol, C, Ada …): Statements that modify memory locations execute in sequence.
• Object-oriented (Smalltalk, C++, Java): Computation via interaction of objects.
• Functional (LISP, Scheme, ML): Computation as definition and application of functions.
Categories of languages (cntd)
• Logic-based (Prolog): Computation as inference from statements and rules.
• Special-purpose (PostScript, JavaScript, database languages): Languages geared toward a specific application.
Components of a PL
• Syntax: What constitutes a well-formed program?
• Semantics: What computational activity constitutes a proper execution of a given program on a given input?
• Implementation.
Compilers and interpreters
An interpreter preserves the text of the program and executes the program by constant referral to that text.
A compiler translates the program from the source language into a target language. The target may be either machine language or (more commonly) a language at a lower level than the source (e.g. assembly or C).
Compilers and interpreters (cntd)
The distinction between compiled and interpreted languages is not clear-cut:
• Most interpreters begin by doing some degree of preprocessing (e.g. comment removal).
• The target language of a compiler may itself be interpreted (e.g. Java compiles into bytecode, which is interpreted on the JVM).
• On many processors, machine language is itself interpreted by microcode.
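The blurred line between compiling and interpreting can be sketched with a toy example. The mini-language below (arithmetic expressions written as nested tuples) and the instruction names `PUSH`, `ADD`, `MUL` are assumptions made up for illustration; they do not correspond to any real compiler or virtual machine.

```python
# Hypothetical mini-language: arithmetic expressions as nested tuples,
# e.g. ("+", 1, ("*", 2, 3)). Only "+" and "*" are supported.

def interpret(expr):
    """Interpreter: walk the program representation every time it runs."""
    if isinstance(expr, (int, float)):
        return expr
    op, left, right = expr
    a, b = interpret(left), interpret(right)
    return a + b if op == "+" else a * b

def compile_expr(expr):
    """Compiler: translate the tree once into a flat list of stack instructions."""
    if isinstance(expr, (int, float)):
        return [("PUSH", expr)]
    op, left, right = expr
    return compile_expr(left) + compile_expr(right) + [("ADD" if op == "+" else "MUL",)]

def run(code):
    """The target language is itself interpreted, much as bytecode is by a VM."""
    stack = []
    for instr in code:
        if instr[0] == "PUSH":
            stack.append(instr[1])
        else:
            b, a = stack.pop(), stack.pop()
            stack.append(a + b if instr[0] == "ADD" else a * b)
    return stack.pop()

program = ("+", 1, ("*", 2, 3))
bytecode = compile_expr(program)
```

Here `interpret(program)` and `run(bytecode)` compute the same value; the second path compiles first and then interprets the compiled form.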
Advantages of interpreters
• Some language features are difficult or impossible to compile (e.g. writing and executing code on the fly).
• Portability and machine independence. (This is the main reason that Java bytecode and JavaScript are interpreted.)
• Interpreters are easier to write than compilers.
Advantages of interpreters (cntd)
Interpreters also make it easier to provide:
• Interactive environments.
• Source code available for debugging.
• Lightweight coding (e.g. no declarations).
It is not impossible, though, to get these features in compiled languages.
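As a small illustration of "writing and executing code on the fly", the sketch below builds source text at run time and executes it with Python's built-in `exec`. The helper name `make_power_function` is hypothetical and exists only for this example.

```python
def make_power_function(exponent):
    """Generate the source of a function at run time, then execute that source."""
    # The function body depends on a value known only during execution.
    source = f"def power(x):\n    return x ** {exponent}\n"
    namespace = {}
    exec(source, namespace)      # built-in: run the generated source text
    return namespace["power"]

cube = make_power_function(3)
```

A language that is compiled ahead of time would need to ship a compiler or interpreter with the program to support the same feature.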
Advantages of compilers
• Speed: compiled code often runs one or two orders of magnitude faster than interpreted code.
• Object code can be distributed without publishing the source code.
Time of features
A compile-time feature of a program can be determined before execution begins (and is therefore independent of the input data).
A run-time feature of a program can only be detected during execution and is generally dependent on the input.
Syntax
The syntax of a program is its formal structure. The line separating syntactic from non-syntactic features is somewhat arbitrary. All syntactic features are detectable at compile time, but not every compile-time feature is syntactic. E.g. it is detectable at compile time that “x=1/0” is an error, but this is not (generally) a syntactic error.
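The difference between a syntactic error and a later error can be observed in Python, used here only as a convenient example. The built-in `compile` parses source text without running it, so a `SyntaxError` is raised before execution. Python happens to report `1/0` only at run time, although a compiler could in principle flag it earlier. The function name `classify` is an assumption for this sketch.

```python
def classify(source):
    """Report at which stage, if any, a small program fails."""
    try:
        code = compile(source, "<example>", "exec")   # parse only; nothing runs yet
    except SyntaxError:
        return "syntax error (found before execution)"
    try:
        exec(code, {})
    except ZeroDivisionError:
        return "run-time error (found during execution)"
    return "ok"
```

For instance, `classify("x = = 1")` fails in the first stage, while `classify("x = 1 / 0")` is well formed and fails only when executed.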
Tokens
In almost all programming languages, a program is a sequence of tokens. Types of tokens:
• Special symbols: “;”, “+”, “++”, “{” …
• Reserved words: “if”, “then”, “function” …
• Numbers: “5”, “4123”, “1.2E+08”, “0x4AC”
• Identifiers: “x”, “i”, “append”, “employee” …
Tokens (cntd)
Language-specific rules for:
• Which forms are in each of these categories.
• How tokens are delimited (usually by white space or special symbols, but not in FORTRAN).
• When two tokens are the same (e.g. case sensitivity).
Tokens (cntd)
A lexer (or tokenizer) divides the source code into tokens and categorizes the tokens. Generally, the token forms are described by a regular language and recognized by the equivalent finite automaton.
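A minimal tokenizer along these lines can be sketched with regular expressions. The token categories follow the slides; the exact patterns, the set of reserved words, and the name `tokenize` are assumptions chosen for illustration and do not describe any particular language.

```python
import re

# One regular expression per token category; order matters for alternation.
TOKEN_SPEC = [
    ("NUMBER",     r"0x[0-9A-Fa-f]+|\d+(?:\.\d+)?(?:[eE][+-]?\d+)?"),
    ("IDENTIFIER", r"[A-Za-z_]\w*"),
    ("SYMBOL",     r"\+\+|[;+{}()=*/-]"),
    ("SKIP",       r"\s+"),
]
RESERVED = {"if", "then", "function", "return", "int"}   # illustrative set
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))

def tokenize(text):
    """Split text into (category, lexeme) pairs, dropping white space."""
    tokens = []
    pos = 0
    while pos < len(text):
        m = MASTER.match(text, pos)
        if m is None:
            raise ValueError(f"unexpected character {text[pos]!r} at position {pos}")
        kind = m.lastgroup
        if kind != "SKIP":
            value = m.group()
            if kind == "IDENTIFIER" and value in RESERVED:
                kind = "RESERVED"    # reserved words look like identifiers
            tokens.append((kind, value))
        pos = m.end()
    return tokens
```

Calling `tokenize("x++;")` yields an identifier followed by the two symbols `++` and `;`, which shows why the longer symbol is listed before the shorter one in the pattern.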
Syntax tree
int increment (int i) { return (1+i); }
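One possible syntax tree for the `increment` function can be written out by hand as nested tuples. The node labels below borrow nonterminal names such as `FunDefn`, `FunDecl`, `Block`, and `Term` from the BNF slide, but the exact shape of the tree is an assumption.

```python
# Hand-built tree for: int increment (int i) { return (1+i); }
# Node labels are illustrative assumptions, not taken from a figure.
tree = ("FunDefn",
        ("FunDecl", ("Type", "int"), ("Name", "increment"),
         ("ArgList", ("VarDecl", ("Type", "int"), ("Name", "i")))),
        ("Block",
         ("Return", ("Term", ("num", "1"), ("arithOp", "+"), ("var", "i")))))

def render(node, depth=0):
    """Return the tree as indented lines, one node or leaf per line."""
    if isinstance(node, str):
        return ["  " * depth + node]
    label, *children = node
    lines = ["  " * depth + label]
    for child in children:
        lines.extend(render(child, depth + 1))
    return lines

print("\n".join(render(tree)))
```

Punctuation such as parentheses, braces, and the semicolon does not appear in the tree; it only guides the parser in building the structure.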
Backus-Naur Form (BNF)
FunDefn ::= FunDecl Block
Term ::= num | var | Term arithOp Term | funCall
ArgList ::= () | (VarDecl [, VarDecl]*)
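A BNF rule can be turned into a recognizer fairly directly. The sketch below checks whether a list of tokens forms a `Term`. Because the rule as written is left-recursive, the sketch uses the equivalent iterative form `Term ::= Atom (arithOp Atom)*`. The shape of `funCall` is not given on the slide, so it is assumed here to be `var ()` or `var ( Term [, Term]* )`. All function names are hypothetical.

```python
import re

ARITH_OPS = {"+", "-", "*", "/"}

def is_term(tokens):
    """Return True if the whole token list is a Term."""
    return parse_term(tokens, 0) == len(tokens)

def parse_term(tokens, pos):
    """Term ::= Atom (arithOp Atom)*. Returns the next position, or -1 on failure."""
    pos = parse_atom(tokens, pos)
    while pos != -1 and pos < len(tokens) and tokens[pos] in ARITH_OPS:
        pos = parse_atom(tokens, pos + 1)
    return pos

def parse_atom(tokens, pos):
    """Atom ::= num | var | funCall (funCall shape is an assumption)."""
    if pos >= len(tokens):
        return -1
    tok = tokens[pos]
    if re.fullmatch(r"\d+", tok):
        return pos + 1                                 # num
    if not re.fullmatch(r"[A-Za-z_]\w*", tok):
        return -1
    pos += 1
    if pos < len(tokens) and tokens[pos] == "(":       # funCall
        pos += 1
        if pos < len(tokens) and tokens[pos] == ")":
            return pos + 1                             # empty argument list
        while True:
            pos = parse_term(tokens, pos)
            if pos == -1 or pos >= len(tokens):
                return -1
            if tokens[pos] == ")":
                return pos + 1
            if tokens[pos] != ",":
                return -1
            pos += 1
    return pos                                         # plain var
```

For example, `is_term(["1", "+", "i"])` succeeds, while `is_term(["1", "+"])` fails because an operator must be followed by another atom.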
Context-Free Language
A BNF definition defines a context-free language. The syntax of programming languages is “almost” context-free: a few syntactic constraints are not (e.g. identifiers must be declared before they are used).
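Constraints such as declaration before use are usually checked in a separate pass over the parsed program rather than by the grammar. A minimal sketch, assuming the program has already been reduced to a hypothetical sequence of `("decl", name)` and `("use", name)` events in source order:

```python
def undeclared_uses(events):
    """Return the names that are used before any declaration of them."""
    declared = set()
    errors = []
    for kind, name in events:
        if kind == "decl":
            declared.add(name)
        elif name not in declared:
            errors.append(name)
    return errors
```

The set `declared` plays the role of a very simple symbol table; it carries context from earlier in the program that the grammar alone does not capture.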