CS 331, Principles of Programming Languages Chapter 4 Types: Data Representation
Data Representation Issues • Storage Management • automatic, static • Scope • Internal vs. External
Type Concepts • Instances of a type • individual constants, variables, expressions • Basic types • built into a language, e.g. float and int in C • instances of basic types are also known as first-class objects • User-defined types • defined using type expressions involving existing types
Basic and User-defined Types • In most modern languages, basic types include integer, real, character, and Boolean • Most PLs have two mechanisms for user-defined types, namely array and record
Internal vs. External Representations • Integers in C are not the same as integers in arithmetic • word length introduces issues related to overflow, e.g. a 16-bit int covers only -32768..32767 • so some operations don’t act as they should • Floats and doubles are not the same as real numbers (or rationals!) • conversion to binary introduces small errors, e.g. 0.1 has no exact binary representation
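Both pitfalls on this slide can be sketched in a few lines of C. The helper names `checked_add16` and `nearly_equal` are hypothetical, chosen only for this illustration, and the sketch assumes the fixed-width types from `<stdint.h>` are available.

```c
#include <stdint.h>

/* Hypothetical helper: add two 16-bit integers without wrapping.
   Returns 1 and stores the sum if it fits in int16_t, otherwise returns 0. */
int checked_add16(int16_t a, int16_t b, int16_t *sum)
{
    int32_t wide = (int32_t)a + (int32_t)b;   /* exact in 32 bits */
    if (wide < INT16_MIN || wide > INT16_MAX)
        return 0;                             /* would overflow 16 bits */
    *sum = (int16_t)wide;
    return 1;
}

/* Hypothetical helper: compare doubles with a tolerance, because values
   such as 0.1 and 0.2 are only approximated in binary. */
int nearly_equal(double x, double y, double eps)
{
    double diff = x - y;
    if (diff < 0)
        diff = -diff;
    return diff <= eps;
}
```

With these, `checked_add16(32767, 1, &s)` reports failure rather than producing a wrapped value, and `nearly_equal(0.1 + 0.2, 0.3, 1e-9)` holds even though the two sides are not bit-for-bit equal.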
Storage Classes • Typically, automatic variables are associated with a given block • allocated (and initialized) when block is entered • known only within that block • freed when block is exited • Compare to static variables, which are associated with a given block, but are allocated and initialized only once
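The difference between the two storage classes can be seen in a minimal C sketch. The function names `next_automatic` and `next_static` are made up for this example.

```c
/* Automatic: 'count' is allocated and initialized each time the block is entered. */
int next_automatic(void)
{
    int count = 0;
    count++;
    return count;          /* the same value on every call */
}

/* Static: 'count' is allocated and initialized only once, and keeps its
   value between calls, although it is still known only inside this block. */
int next_static(void)
{
    static int count = 0;
    count++;
    return count;          /* grows by one on each call */
}
```

Calling `next_automatic()` repeatedly keeps returning 1, while successive calls to `next_static()` return 1, 2, 3, and so on.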
User-directed Allocation • In C, we have malloc and free • memory leaks can be a problem • uninitialized or invalid pointers can be problems • In C++ we have constructors and destructors • In Pascal and Modula, we have new() and dispose() • In Lisp and Java, garbage collection is used
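A small C sketch of disciplined use of malloc and free follows. The helpers `make_filled` and `release` are hypothetical names for this illustration; it shows one common way to avoid the leak and invalid-pointer problems listed above, not the only one.

```c
#include <stdlib.h>

/* Hypothetical helper: allocate an array of n ints, each set to 'value'.
   Returns NULL if n is 0 or the allocation fails.
   The caller owns the result and must free it. */
int *make_filled(size_t n, int value)
{
    if (n == 0)
        return NULL;
    int *p = malloc(n * sizeof *p);
    if (p == NULL)
        return NULL;
    for (size_t i = 0; i < n; i++)
        p[i] = value;      /* malloc does not initialize the memory */
    return p;
}

/* Hypothetical helper: free the array and null the caller's pointer,
   so a stale (dangling) pointer is not used by accident. */
void release(int **pp)
{
    free(*pp);
    *pp = NULL;
}
```

Every successful `make_filled` call should be paired with exactly one `release`; forgetting the second half is the memory leak the slide warns about.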
Data Aggregates • Arrays are homogeneous • all the elements are of the same type • Records (known as structs in C) are heterogeneous • components may be of different types • Sets are available in languages like Pascal and Modula • Type expressions are used to define these
Storage Allocation for Arrays • Typically, array elements occupy a contiguous block of memory • hence C’s use of subscript 0: the first element sits at offset 0 from the start of the block • For multi-dimensional arrays, there are two schemes • row-major (used by C): the rightmost subscript varies fastest • column-major (used by Fortran): the leftmost subscript varies fastest
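The two layouts amount to two offset formulas, sketched here in C for a two-dimensional array. The function names are hypothetical, and offsets are counted in elements from the start of the block.

```c
#include <stddef.h>

/* Offset of element [i][j] in a rows x cols array stored row-major:
   the rightmost subscript j varies fastest. */
size_t row_major(size_t i, size_t j, size_t rows, size_t cols)
{
    (void)rows;                    /* not needed for this layout */
    return i * cols + j;
}

/* Offset of the same element stored column-major:
   the leftmost subscript i varies fastest. */
size_t col_major(size_t i, size_t j, size_t rows, size_t cols)
{
    (void)cols;                    /* not needed for this layout */
    return j * rows + i;
}

/* Check that the row-major formula agrees with how C lays out int a[3][4]. */
int matches_c_layout(void)
{
    int a[3][4];
    for (size_t i = 0; i < 3; i++)
        for (size_t j = 0; j < 4; j++) {
            size_t bytes = (size_t)((char *)&a[i][j] - (char *)a);
            if (bytes != row_major(i, j, 3, 4) * sizeof(int))
                return 0;
        }
    return 1;
}
```

For a 3 x 4 array, element [1][2] is at offset 6 in row-major order and at offset 7 in column-major order.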
Records • An example in C
struct date { int day; int month; int year; } DoB, DoD;
• In Modula, it would be
TYPE date = RECORD day, month, year: CARDINAL END;
VAR DoB, DoD: date;
linuxbeta.gl.umbc.edu> cat -n recordeg.mod
     1  MODULE recordeg;
     2
     3  FROM InOut IMPORT WriteLn, WriteCard;
     4
     5  TYPE date = RECORD
     6    day, month, year: CARDINAL
     7  END;
     8  VAR DoB, DoD: date;
     9
    10  BEGIN
    11    DoB.year := 1808;
    12    DoD.year := 1865;
    13    WriteCard(DoB.year,10);
    14    WriteCard(DoD.year,10);
    15    WriteLn;
    16  END recordeg.
linuxbeta.gl.umbc.edu> source modulasetup
linuxbeta.gl.umbc.edu> gpmodula recordeg.mod
linuxbeta.gl.umbc.edu> build recordeg
linuxbeta.gl.umbc.edu> recordeg
1808 1865
linuxbeta.gl.umbc.edu>
Varying Records in C
enum field_type {integer, real};
struct {
    enum field_type ft;              /* tag: says which union member is valid */
    union { int I; float F; } data;
} foo;
if (foo.ft == integer) printf("%d", foo.data.I);
else if (foo.ft == real) printf("%f", foo.data.F);
else printf("error");
• Note that only one of foo.data.I and foo.data.F is defined at any time, not both • Allows data of different types to share storage • Polymorphic data types!
Varying Records in Modula
TYPE field_type = (integer, real);
TYPE foo_type = RECORD
  CASE ft: field_type OF
    integer: I: INTEGER |
    real: F: REAL
  END (* of CASE *)
END; (* of RECORD *)
VAR foo: foo_type;
IF foo.ft = integer THEN WriteInt(foo.I, 10)
ELSIF foo.ft = real THEN WriteReal(foo.F, 10)
ELSE WriteString("error")
END;
Sets as Types • There are situations where sets come in handy • when only certain data values are allowed, e.g. program options or file permissions, and no existing subrange type is appropriate • [Mon..Sun] is not a set (Sethi book has a mistake on p. 123)
Example: Sets in Modula or Pascal • Operations of membership (IN), union (+), intersection (*), and set difference (-) are common • Commonly implemented as bit strings
TYPE colors = (white,yellow,blue,green,cyan,black,red);
(* note that white < yellow < blue < … < red *)
VAR CRT1, CRT2: SET OF colors;
VAR testColor: colors;
CRT1 := {cyan,yellow,green};
CRT2 := {red,green,blue};
IF (testColor IN CRT1) THEN WriteLn END;
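The bit-string implementation mentioned above can be sketched in C, which has no built-in set type. Everything here (the `color_set` type and the function names) is a hypothetical illustration: each enumeration value gets one bit, and the set operations become bitwise operations.

```c
enum color { WHITE, YELLOW, BLUE, GREEN, CYAN, BLACK, RED };

typedef unsigned int color_set;    /* bit k is 1 when color k is in the set */

/* The one-element set {c}. */
color_set set_of(enum color c)                         { return 1u << c; }

/* Membership test (IN). */
int is_member(color_set s, enum color c)               { return (int)((s >> c) & 1u); }

/* Union (+), intersection (*), and set difference (-). */
color_set set_union(color_set a, color_set b)          { return a | b; }
color_set set_intersection(color_set a, color_set b)   { return a & b; }
color_set set_difference(color_set a, color_set b)     { return a & ~b; }
```

With CRT1 built as {cyan, yellow, green} and CRT2 as {red, green, blue}, the intersection is {green} and CRT1 minus CRT2 is {cyan, yellow}, each computed with a single machine instruction.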
Type Coercion • PLs differ in their approach to type coercion • coercion refers to automatic conversion from one type to another • if a PL is strongly typed, then coercion is restricted and/or must be requested explicitly • if a PL is weakly typed, then coercion is applied silently by the compiler, which can hide errors • in recent years, the trend is towards strong typing
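C applies several coercions silently, which makes it a convenient language for a short illustration of how automatic conversion can surprise. The function names below are hypothetical.

```c
/* The division is done on ints first; only the result is coerced to double. */
double half_implicit(int n)
{
    return n / 2;              /* 7 gives 3.0, not 3.5 */
}

/* An explicit conversion before dividing gives the expected value. */
double half_explicit(int n)
{
    return (double)n / 2;      /* 7 gives 3.5 */
}

/* Assigning a double to an int silently discards the fractional part. */
int to_int_implicit(double d)
{
    int i = d;                 /* 3.9 becomes 3 */
    return i;
}
```

A more strongly typed language would reject `half_implicit` or `to_int_implicit`, or require the conversion to be written out as in `half_explicit`.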
Determining an Object’s Type • For static or automatic objects it’s easy • static float x; int y; • For other objects it can be hard
void *f;
if (x) f = new float[250];
else f = new char[1000];
… /* other statements */
/* one of the two casts in the following statement must be wrong, but at compile time C++ can’t tell which one, since it can’t know x in advance */
cout << sqrt(((float *)f)[0]) << strlen((char *)f);
Static and Dynamic Checking • Type checking is needed to make sure operations are well-defined on the objects to which they’re being applied • Static type-checking is done once, typically at compile-time • Dynamic type-checking is done whenever an operation is applied to an object whose type could not be determined in advance
How is Dynamic Type-Checking Done? • When an object is created, a “tag” is attached to that object to indicate its type • When that object is involved in some operation, the tag is checked to make sure that the operation is defined on such objects • A hassle in terms of storage and execution time • Smalltalk and other O-O PLs do this, but not C++
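The tagging scheme described above can be imitated by hand in C. This is only a sketch under the assumption of three possible types; the `struct value` layout and all function names are hypothetical, and a real dynamically typed language would do this inside its runtime system.

```c
#include <stddef.h>
#include <string.h>

enum tag { TAG_INT, TAG_REAL, TAG_STRING };

/* Every object carries a tag saying which member of the union is valid. */
struct value {
    enum tag tag;
    union { int i; double r; const char *s; } as;
};

/* The tag is attached when the object is created. */
struct value make_int(int i)
{
    struct value v;
    v.tag = TAG_INT;
    v.as.i = i;
    return v;
}

struct value make_string(const char *s)
{
    struct value v;
    v.tag = TAG_STRING;
    v.as.s = s;
    return v;
}

/* "length" is defined only on strings, so the tag is checked first.
   Returns 0 and stores the length on success, -1 on a type error. */
int value_length(const struct value *v, size_t *out)
{
    if (v->tag != TAG_STRING)
        return -1;                 /* run-time type error */
    *out = strlen(v->as.s);
    return 0;
}
```

The cost mentioned on the slide is visible here: each value stores an extra tag field, and each operation spends a comparison checking it before doing any real work.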