230 likes | 243 Views
Explore the concept of data types, their significance, and the role they play in program semantics. Learn about strong typing, static typing, type systems, type checking, and type conversion. Delve into scalar and composite types along with type equivalence and compatibility rules.
E N D
Chapter 7:: Data Types Programming Language Pragmatics Michael L. Scott Adapted from Scott, 2006
Data Types • We all have developed an intuitive notion of what types are; what's behind the intuition? • collection of values from a "domain" (the denotational approach) • internal structure or content of data, described down to the level of a small set of fundamental types (the structural approach) • an interface providing a set of well defined operations (abstraction approach used by OO languages) Adapted from Scott, 2006
Data Types • What are types good for? • implicit context for many operations • e.g., what operation should be used for (a + b)? • checking of program semantics - making sure that certain meaningless operations do not occur • type checking cannot prevent all meaningless operations • can catch enough of them to be useful Adapted from Scott, 2006
Data Types • STRONG TYPING has become a popular buzz-word • like structured programming • informally, it means that the language prevents you from applying an operation to data on which it is not appropriate • This was Pascal's primary selling point in the 1980s • STATIC TYPING means that the language is strongly typed, and that the compiler can do all the checking at compile time Adapted from Scott, 2006
Type Systems • Examples • C is weakly typed (and checks almost nothing at runtime) • Common Lisp is strongly typed, but not statically typed • Ada is statically typed • Pascal is almost statically typed • Except for variant records • Java is strongly typed, with a non-trivialmix of things that can be checked statically and things that have to be checked dynamically Adapted from Scott, 2006
Type Systems • Common terms: • Scalar types - one-dimensional • Discrete types (ordinal types) – countable • integer • boolean • char • real • rational • complex Adapted from Scott, 2006
Type Systems • Composite (nonscalar) types: • records – introduced by Cobol • Called struct in C, C++ • arrays • indexed composite types • strings • sets – introduced by Pascal • pointers – l-values • reference to a base type • lists • files Adapted from Scott, 2006
Type Checking • A TYPE SYSTEM has rules for • type equivalence (when are the types of two values the same?) • type compatibility (when can a value of type A be used in a context that expects type B?) • type inference (what is the type of an expression, given the types of the operands?) Adapted from Scott, 2006
Type Checking • Type compatibility / type equivalence • Compatibility is the more useful concept, because it tells you what you can DO • The terms are often (incorrectly, but we do it too) used interchangeably. Adapted from Scott, 2006
Type Checking • Two major approaches: • structural equivalence and • name equivalence • Name equivalence is based on programmer's declarations • Structural equivalence is based on the content of the types • Name equivalence is more rigorous • If the programmer makes the effort to name 2 types differently, the programmer probably wants them to be treated as different, even if their structures are identical. Adapted from Scott, 2006
Type Checking • Structural equivalence depends on simple comparison of type descriptions • substitute out all names • expand all the way to built-in types • Original types are equivalent if the expanded type descriptions are the same Adapted from Scott, 2006
Type Conversion • Converting one type to another (casting) is required when: • Types are structurally equivalent, but the language uses name equivalence. • The conversion is only conceptual, not physical • The types have different but intersecting sets of values (e.g., one is a subrange of the other) • Runtime check tests the validity of the conversion • Types are physically different, but values of one type correspond to values of the other • e.g., all integers can be represented as reals • Nonconverting cast • Treat a variable of one type as another type, without changing the physical representation • e.g., treating a char as an int in C Adapted from Scott, 2006
Type Checking • Coercion • Automatic, implicit type conversion • When an expression of one type is used in a context where a different type is expected, one normally gets a type error • But what aboutvar a : integer; b, c : real; ... c := a + b; Adapted from Scott, 2006
Type Checking • Coercion • Many languages allow things like this, and COERCE an expression to be of the proper type • Fortran has lots of coercion, all based on operand type • C has lots of coercion, too, but with simpler rules: • all floats in expressions become doubles • short int and char become int in expressions Adapted from Scott, 2006
Type Checking • In effect, coercion rules are a relaxation of type checking • Recent thought is that this is probably a bad idea • Languages such as Modula-2 and Ada do not permit coercions • C++, however, provides programmer-extensible coercion rules • They're one of the hardest parts of the language to understand Adapted from Scott, 2006
Arrays • Arrays are the most common and important composite data types • Unlike records, which group related fields of disparate types, arrays are usually homogeneous • Semantically, arrays can be thought of as a mapping from an index type to a component or element type • Usually the only operations permitted are selection of an element and assignment, however • Fortran 90 offers many array operations supporting matrix algebra • Ada and Fortran 90 allow arrays to be compared for equality Adapted from Scott, 2006
Arrays • Dimensions, Bounds, and Allocation • global lifetime, static shape — If the shape of an array is known at compile time, and if the array can exist throughout the execution of the program, then the compiler can allocate space for the array in static global memory • local lifetime, static shape — If the shape of the array is known at compile time, but the array should not exist throughout the execution of the program, then space can be allocated in the subroutine’s stack frame at run time. • arbitrary lifetime, shape bound at elaboration time— In Java and C# an array is a reference to an object, whose space is allocated on the heap Adapted from Scott, 2006
Arrays • Contiguous elements (see next slide) • column major - only in Fortran • (and only due to a peculiarity of the IBM 704 computer, on which Fortran was first implemented) • row major • used by everybody else • Makes a multidimensional array simply an array of subarrays Adapted from Scott, 2006
Arrays Adapted from Scott, 2006
Arrays • Two layout strategies for arrays (see next slide): • Contiguous elements • Row pointers • Row pointers – an array is an array of pointers to arrays • an option in C • allows rows to be put anywhere - nice for big arrays on machines with segmentation problems • nice for matrices whose rows are of different lengths • e.g. an array of strings • avoids multiplication for offset calculation • requires extra space for the pointers Adapted from Scott, 2006
Arrays Adapted from Scott, 2006
Strings • Strings are really just arrays of characters • They are often special-cased, to give them flexibility (like dynamic sizing) that is not available for arrays in general • It's easier to provide these things for strings than for arrays in general because strings are one-dimensional Adapted from Scott, 2006
Equality testing • Shallow comparison • Do the expressions refer to the same object? • Address comparison • l-value comparison • Deep comparison • Do the expressions refer to objects which are in some sense equal/equivalent? • Value(s) comparison • r-value comparison Adapted from Scott, 2006