Languages and Compilers (SProg og Oversættere)

Languages and Compilers (SProg og Oversættere) Bent Thomsen, Department of Computer Science, Aalborg University. With acknowledgement to Elsa Gunter, whose slides this lecture is based on.

Type Checking • When is op(arg1,…,argn) allowed? • Type checking assures that operations are applied to the right number of arguments of the right types • Right type may mean same type as was specified, or may mean that there is a predefined implicit coercion that will be applied • Used to resolve overloaded operations
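
The question "when is op(arg1,…,argn) allowed?" can be sketched as a lookup in a signature table. The table `SIGNATURES`, the coercion map `COERCIONS`, and the function names below are hypothetical and chosen only for illustration. Exact matches are tried before matches that need a coercion, which is one simple way to resolve overloading.

```python
# Hypothetical signature table: operator -> list of (parameter types, result type).
SIGNATURES = {
    "+": [(("int", "int"), "int"), (("real", "real"), "real")],
    "not": [(("bool",), "bool")],
}

# Assumed implicit coercions: a value of the key type may be used
# where any of the listed types is expected.
COERCIONS = {"int": {"real"}}


def fits(actual, expected):
    """True if `actual` equals `expected` or can be implicitly coerced to it."""
    return actual == expected or expected in COERCIONS.get(actual, set())


def check(op, arg_types):
    """Return the result type of op applied to arg_types, or raise TypeError."""
    candidates = SIGNATURES.get(op, [])
    # First pass: exact matches only. Second pass: allow coercions.
    for exact in (True, False):
        for params, result in candidates:
            if len(params) != len(arg_types):
                continue  # wrong number of arguments
            if exact and tuple(arg_types) == params:
                return result
            if not exact and all(fits(a, p) for a, p in zip(arg_types, params)):
                return result
    raise TypeError(f"{op} is not applicable to {tuple(arg_types)}")
```

Under these assumptions, `check("+", ["int", "real"])` selects the `real` version of `+` by coercing the first argument, while `check("not", ["int"])` is rejected.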

Type Checking • Type checking may be done statically at compile time or dynamically at run time • Untyped languages (e.g. LISP, Prolog) do only dynamic type checking • Typed languages can do most type checking statically

Dynamic Type Checking • Performed at run-time before each operation is applied • Types of variables and operations left unspecified until run-time • Same variable may be used at different types
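
Python is itself dynamically checked, so it can illustrate the points above. In this minimal sketch the helper names `describe` and `try_add` are made up for the example. The same variable holds values of different types, and the type check happens only when the operation is applied.

```python
def describe(value):
    """Return the name of the run-time type of a value."""
    return type(value).__name__


x = 42
first = describe(x)     # 'int'
x = "forty-two"         # the same variable is now used at a different type
second = describe(x)    # 'str'


def try_add(a, b):
    """Apply + and report the run-time type error, if any, instead of crashing."""
    try:
        return a + b
    except TypeError as err:
        return f"run-time type error: {err}"


result_ok = try_add(1, 2.5)    # checked and accepted when + is applied
result_bad = try_add(1, "2")   # rejected only when + is actually applied
```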

Static Type Checking • Performed after parsing, before code generation • Type of every variable and signature of every operator must be known at compile time

Static Type Checking • Can eliminate need to store type information in data object if no dynamic type checking is needed • Catches many programming errors at earliest point

Strongly Typed Language • When no application of an operator to arguments can lead to a run-time type error, language is strongly typed • Depends on definition of “type”

Strongly Typed Language • C is “strongly typed” but type coercions may cause unexpected (undesirable) effects; no array bounds check (in fact, no runtime checks at all) • SML “strongly typed” but still must do dynamic array bounds checks, arithmetic overflow checks

How to Handle Type Mismatches • Type checking to refuse them • Apply implicit function to change type of data • Coerce int into real • Coerce char into int

Conversion Between Types: • Explicit: all conversions between different types must be specified • Implicit: some conversions between different types implied by language definition • Implicit conversions called coercions

Coercion Examples Example in Pascal: var A: real; B: integer; A := B • Implicit coercion - an automatic conversion from one type to another

Coercions Versus Conversions • When A has type real and B has type int, many languages allow the coercion implicit in A := B • In the other direction (A of type int, B of type real), often no coercion is allowed; an explicit conversion must be used: • A := round(B); go to the integer nearest B • A := trunc(B); delete the fractional part of B
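
The same distinction can be seen in Python, used here only as an illustration. Python has no typed assignment, so the implicit widening is shown in mixed arithmetic rather than in `A := B`. The narrowing direction needs an explicit call, as with Pascal's `round` and `trunc`.

```python
import math

b = 7                 # an int
a = b + 0.0           # implicit coercion: b is widened to float, giving 7.0

r = 2.7
nearest = round(r)                 # explicit conversion to the nearest integer: 3
truncated = math.trunc(r)          # explicit conversion dropping the fraction: 2

neg_nearest = round(-2.7)          # -3
neg_truncated = math.trunc(-2.7)   # -2 (truncation goes toward zero)
```

The two conversions differ for any value with a non-zero fractional part at or above one half, which is why the programmer has to choose one explicitly.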

Type Equality (aka Type Compatibility) • When are two types “the same”? • Name equivalence: two types equal only if they have the same name • Simple but restrictive • Usually loosened to allow two types to be equal when one is defined with the name of the other (declaration equivalence)

Type Equality • Structure equivalence: Two types are equivalent if the underlying data structures for each type are the same • Problem: how far to go – are two records with the same number of fields of same type, but different labels equivalent?
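
A small sketch of the two notions of type equality. It assumes record types are described by a hypothetical `RecordType` value holding a name and a tuple of (label, type) pairs. The `compare_labels` flag makes the "how far to go" question explicit.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RecordType:
    """Hypothetical description of a record type."""
    name: str
    fields: tuple  # tuple of (label, type_name) pairs


def name_equivalent(t1, t2):
    """Name equivalence: equal only if the type names are the same."""
    return t1.name == t2.name


def structurally_equivalent(t1, t2, compare_labels=True):
    """Structure equivalence: same number of fields with the same types,
    optionally also requiring the same labels."""
    if len(t1.fields) != len(t2.fields):
        return False
    for (label1, type1), (label2, type2) in zip(t1.fields, t2.fields):
        if type1 != type2:
            return False
        if compare_labels and label1 != label2:
            return False
    return True


point = RecordType("Point", (("x", "real"), ("y", "real")))
complex_number = RecordType("Complex", (("re", "real"), ("im", "real")))
```

Here `point` and `complex_number` are not name equivalent. They are structurally equivalent only if labels are ignored.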

Elementary Data Types • Data objects contain single data value with no components • Standard elementary types include: integers, reals, characters, booleans, enumerations, pointers (references in SML)

Specification of Elementary Data Types • Basic attributes of type usually used by compiler and then discarded • Some partial type information may occur in data object • Values usually match with hardware types: 8 bits, 16 bits, 32 bits, 64 bits • Operations: primitive operations with hardware support, and user-defined operations built from primitive ones

Integers – Specification • Range of integers for some fixed minint to some fixed maxint, typically -2^31 through 2^31 – 1 or –2^30 through 2^30 - 1 • Standard collection of operators: +, -, *, /, mod, ~ (negation) • Standard relational operations: =, <, >, <=, >=, =/=

Integers - Implementation • Binary representation in 2’s complement arithmetic • Three different standard representations:

Integers - Implementation • First kind: sign bit S (0 for +, 1 for -) followed by the binary integer data [Figure: S | Data]

Integers – Implementation • Second kind: type descriptor T and an address, with sign bit S and data [Figure: T | Address; S | Data] • Third kind: type descriptor T, sign bit S and data [Figure: T | S | Data]

Integer Numeric Data • Positive values: 0 1 0 0 1 1 0 0 = 64 + 8 + 4 = 76 (the leftmost bit is the sign bit)
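
The bit pattern above can be decoded mechanically. The sketch below assumes an 8-bit two's complement word. The function names are made up for this example.

```python
def decode_twos_complement(bits):
    """Interpret a string of '0'/'1' characters as a two's complement integer."""
    value = int(bits, 2)
    if bits[0] == "1":              # sign bit set: subtract 2^width
        value -= 1 << len(bits)
    return value


def encode_twos_complement(value, width=8):
    """Return the two's complement bit string of `value` in `width` bits."""
    return format(value & ((1 << width) - 1), f"0{width}b")


positive = decode_twos_complement("01001100")   # 64 + 8 + 4 = 76
negative_bits = encode_twos_complement(-76)     # '10110100'
```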

Subranges • Example (Ada): A:integer range 10..20 • Subtype of integers (implicit coercion into integer)

Subranges • Data may require fewer bits than integer type • Data in example above require only 4 bits • Range checking usually requires some run-time information and dynamic type checking
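
A minimal sketch of what a subrange costs and what it checks. It assumes values are stored as an offset from the lower bound, which is how 10..20 fits in 4 bits. The class `Subrange` is hypothetical.

```python
class Subrange:
    """Hypothetical model of an integer subrange such as 10..20."""

    def __init__(self, low, high):
        self.low = low
        self.high = high

    def bits_needed(self):
        """Bits needed to store a value as an offset from `low`."""
        return max(1, (self.high - self.low).bit_length())

    def check(self, value):
        """Run-time range check performed on assignment to the subrange."""
        if not self.low <= value <= self.high:
            raise ValueError(f"{value} is outside {self.low}..{self.high}")
        return value


a_range = Subrange(10, 20)
bits = a_range.bits_needed()   # offsets 0..10 fit in 4 bits
```

The check needs the bounds at run time, which is the run-time information the slide refers to.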

IEEE Floating Point Format • IEEE standard 754 specifies both a 32- and 64-bit standard • At least one supported by most hardware • Numbers consist of three fields: • S (sign), E (exponent), M (mantissa) [Figure: S | E | M]

Floating Point Numbers: Theory • Every non-zero number may be uniquely written as $(-1)^S \cdot 2^{e} \cdot m$ where $1 \le m < 2$ and S is either 0 or 1

Floating Point Numbers: Theory • Every non-zero number may be uniquely written as $(-1)^S \cdot 2^{E - \text{bias}} \cdot (1 + M/2^N)$ where $0 \le M < 2^N$ • N is the number of bits for M (23 or 52) • Bias is 127 for the 32-bit format • Bias is 1023 for the 64-bit format

IEEE Floating Point Format (32 Bits) • S: a one-bit sign field. 0 is positive. • E: an exponent in excess-127 notation. Values (8 bits) range from 0 to 255, corresponding to exponents of 2 that range from -127 to 128.

IEEE Floating Point Format (32 Bits) • M: a mantissa of 23 bits. Since the first bit of the mantissa in a normalized number is always 1, it can be omitted and inserted automatically by the hardware, yielding an extra 24th bit of precision.

Exponent Bias • With 8 bits (256 values), 127 is added to the true exponent to get E • If E = 127 then 127-127 = 0 is the true exponent • If E = 129 then 129-127 = 2 is the true exponent • If E = 120 then 120-127 = -7 is the true exponent

Floating Point Number Range • In 32-bit format, the exponent has 8 bits, giving a range from –127 to 128 for the exponent • This gives a number range from roughly $10^{-38}$ to $10^{38}$

Floating Point Number Range • In 64-bit format, the exponent is extended to 11 bits, giving a range from -1023 to +1024 for the exponent • This gives a number range from roughly $10^{-308}$ to $10^{308}$

Decoding IEEE format • Given E and M, the value of the representation is (parameters: value): • $E = 255$ and $M \neq 0$: an invalid number • $E = 255$ and $M = 0$: $\infty$ • $0 < E < 255$: $2^{E-127}(1 + M/2^{23})$ • $E = 0$ and $M \neq 0$: $2^{-126}(M/2^{23})$ • $E = 0$ and $M = 0$: 0
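
The decoding cases above translate almost directly into code. This sketch assumes the 32-bit format and applies the sign bit S as in the general formula. `decode_single` and `hardware_value` are names invented for the example; the second uses Python's standard `struct` module to let the machine do the same decoding for comparison.

```python
import math
import struct


def decode_single(word):
    """Decode a 32-bit pattern (given as an int) case by case."""
    s = word >> 31                  # 1 sign bit
    e = (word >> 23) & 0xFF         # 8 exponent bits
    m = word & 0x7FFFFF             # 23 mantissa bits
    sign = -1.0 if s else 1.0
    if e == 255:
        return math.nan if m != 0 else sign * math.inf
    if e == 0:
        # Denormalized numbers (and zero when m == 0): no hidden leading 1.
        return sign * 2.0 ** -126 * (m / 2 ** 23)
    return sign * 2.0 ** (e - 127) * (1 + m / 2 ** 23)


def hardware_value(word):
    """Reinterpret the same 32 bits as an IEEE 754 single via struct."""
    return struct.unpack(">f", struct.pack(">I", word))[0]
```

For example, `decode_single(0x3FC00000)` gives 1.5, and both functions agree on the smallest denormalized pattern `0x00000001`.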

Example Floating Point Numbers • $+1 = 2^{0} \cdot 1 = 2^{127-127} \cdot (1 + 0)$: 0 01111111 000000… • $+1.5 = 2^{0} \cdot 1.5 = 2^{127-127} \cdot (1 + 2^{22}/2^{23})$: 0 01111111 100000… • $-5 = -2^{2} \cdot 1.25 = -2^{129-127} \cdot (1 + 2^{21}/2^{23})$: 1 10000001 010000…
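
The three examples can be reproduced by asking the machine for the bits. This is a sketch using Python's standard `struct` module. The helper name `fields` is made up.

```python
import struct


def fields(x):
    """Return the (S, E, M) bit strings of x stored as an IEEE 754 single."""
    (word,) = struct.unpack(">I", struct.pack(">f", x))
    bits = format(word, "032b")
    return bits[0], bits[1:9], bits[9:]


one = fields(1.0)          # ('0', '01111111', 23 zero bits)
one_and_half = fields(1.5) # ('0', '01111111', '1' followed by 22 zero bits)
minus_five = fields(-5.0)  # ('1', '10000001', '01' followed by 21 zero bits)
```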

Other Numeric Data • Short integers (C) - 16 bit, 8 bit • Long integers (C) - 64 bit • Boolean or logical - 1 bit with value true or false (often stored as bytes) • Byte - 8 bits

Other Numeric Data • Character - Single 8-bit byte - 256 characters • ASCII is a 7 bit 128 character code • Unicode is a 16-bit character code (Java) • In C, a char variable is simply 8-bit integer numeric data

Enumerations • Motivation: Type for case analysis over a small number of symbolic values • Example (Ada): type DAYS is (Mon, Tues, Wed, Thu, Fri, Sat, Sun) • Implementation: Mon → 0; … Sun → 6 • Treated as ordered type (Mon < Wed) • In C, always implicitly coerced to integers
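
As an illustration in Python, `enum.IntEnum` from the standard library behaves much like the description above. Members map to small integers, are ordered, and (as in C) can be used where an integer is expected. The helper `is_weekend` is a made-up example of case analysis.

```python
from enum import IntEnum


class Days(IntEnum):
    Mon = 0
    Tues = 1
    Wed = 2
    Thu = 3
    Fri = 4
    Sat = 5
    Sun = 6


def is_weekend(day):
    """Case analysis over the symbolic values."""
    return day in (Days.Sat, Days.Sun)


ordered = Days.Mon < Days.Wed   # True: the type is ordered
as_int = Days.Sun + 0           # 6: implicit use as an integer
```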

Pointers • A pointer type is a type in which the range of values consists of memory addresses and a special value, nil (or null) • Use of pointers to create arbitrary data structures

Pointer Data • Each pointer can point to an object of another data structure • Its l-value is its address; its r-value is the address of another object • Accessing r-value of r-value of pointer called dereferencing

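Pointer Aliasing • A := B • Numeric assignment: the value 0.4 of B is copied into A, replacing 7.2; A and B remain separate objects • Pointer assignment: A is made to point to the same object (0.4) as B, so A and B become aliases [Figure: before and after diagrams for both cases]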
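
The difference between the two assignments can be imitated in Python. This is only an analogy. Floats are immutable, so assigning one behaves like a copy of the value, while a one-element list stands in for a heap object reached through a pointer.

```python
# Numeric assignment: A receives a copy of B's value.
a = 7.2
b = 0.4
a = b
b = 1.5              # changing B afterwards does not affect A
numeric_a = a        # still 0.4

# Pointer assignment: A and B refer to the same object afterwards.
pa = [7.2]           # one-element lists play the role of heap cells
pb = [0.4]
pa = pb              # pa is now an alias for pb's cell
pb[0] = 1.5          # an update through B ...
aliased_a = pa[0]    # ... is visible through A: 1.5
```

After the pointer assignment nothing refers to the cell that held 7.2 any more. In a language without garbage collection, that cell would be lost.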

Problems with Pointers • Dangling Pointer: after Delete A, B still points to the deallocated object • Garbage (lost heap-dynamic variables) [Figure: before and after diagrams of A and B for both cases]

Ways to Create Dangling Pointers int *A, *B; A = new int; *A = 5; B = A; delete A; /* B still points to the address of the object A pointed to, which has been returned to the heap */

Ways to Create Dangling Pointers int *A; int *sub() { int B; B = 5; return &B; } int main() { A = sub(); . . . } /* A has been assigned the address of an object that is out of scope */

SML references • An alternative to allowing pointers directly • References in SML can be typed • … but they introduce some abnormalities

SML imperative constructs • SML reference cells • Different types for location and contents • x : int is a non-assignable integer value • y : int ref is a location whose contents must be an integer • !y is the contents of location y • ref x is an expression creating a new cell initialized to x • SML assignment operator := is applied to a memory cell and new contents • Examples • y := x+3 places the value of x+3 in cell y; requires x : int • y := !y + 3 adds 3 to the contents of y and stores the result in location y

SML examples • Create cell and change contents: val x = ref "Bob"; x := "Bill"; • Create cell and increment: val y = ref 0; y := !y + 1; • While loop: val i = ref 0; while !i < 10 do i := !i + 1; !i;
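
For readers who do not know SML, the three examples can be mirrored with a small cell class. `Ref` is a hypothetical stand-in, not part of any library. `get` plays the role of `!` and `set` the role of `:=`. Unlike SML, Python does not statically enforce that the contents keep one type.

```python
class Ref:
    """A mutable cell, loosely modelling an SML reference."""

    def __init__(self, contents):   # models: ref contents
        self.contents = contents

    def get(self):                  # models: !r
        return self.contents

    def set(self, new_contents):    # models: r := new_contents
        self.contents = new_contents


# Create cell and change contents
x = Ref("Bob")
x.set("Bill")

# Create cell and increment
y = Ref(0)
y.set(y.get() + 1)

# While loop
i = Ref(0)
while i.get() < 10:
    i.set(i.get() + 1)
```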

Composite Data Types • Composite data types are sets of data objects built from data objects of other types • Elements called data structures • Some created by users, eg an array of integers • Some created internally by compiler, eg symbol table, or subroutine activation record

Specification of Structured Data Types • Number of components • Fixed or varying over life of data structure • Arrays and records have fixed number • Lists have variable number • If variable number of components, is there a maximum number possible?

Specification of Structured Data Types • Type of each component • Homogeneous: all components have same type • Arrays • Heterogeneous: components have varying types • Records (also lists in some languages, but not SML)

Specification of Structured Data Types • Method of accessing components • Array subscripting • Record labels • SML datatype pattern matching

Operations on Data Structures • Creation and deletion of structures • Whole-structure operations • Assigning to variable • Iterating a function over the structure • Computing its length or size
