This paper explores the importance of defining data types in programming languages, highlighting the concepts of abstract data types, user-defined types, and the clear definition of what constitutes a "type." It discusses various ways data types have been defined, the shortcomings of existing definitions, and proposes a novel approach based on equivalence classes of variables. The benefits of allowing type extension and the goals of user-defined types are also examined, emphasizing the role of types as classes of variables for writing reusable and strongly typed code. The paper aims to bridge the gap between traditional definitions and modern programming language requirements.
Abstract Types Defined as Classes of Variables D. L. Parnas, J.E. Shore, D.M. Weiss
Abstract Data Types • Defining Data Types is Important • Scalar types are somewhat consistently defined for a programming language, and are precisely defined for a given compiler and runtime environment. • Ex: What is the sizeof(int)? • Definition of an abstract data type tends to be less clear • Furthermore some clarity is necessary about the differences between a user-defined type and an abstract type • This paper is about the definition of what we refer to as a “type” in programming languages
Why is the Definition of Data Types Important? • Existing programming languages allow the user to define new types • “typedef” in C • Classes in Java/C++ • Is every user-defined type really a “type” from a programming language perspective? • Formal definition of a “type” is necessary • Remember the age of this paper and put this paper into modern day perspective
Ways that Types have been Defined • Syntactic: A type is the information given to a variable in its definition • Value Space: A type is defined/constrained by a set of allowable values • Behavior: A type is defined by a value space and a set of operations on the elements in that space • Representation: Any type can be defined in terms of the primitive types from which it was built • Representation + Behavior: A type is determined by its representation and the set of operations that defines its behavior. Some of these definitions sound like how we define a class…
The Problem… • The proposed definitions fall short • Consider the desire for strongly typed languages with a clear and simple set of semantic rules • Consider the desire of abstracting the design of code that needs to work with multiple type classes • Parnas’ analysis of the previously used definitions resulted in: • Exclusion of important practical cases • Exceptions that complicated the basic language semantics
Approach for Defining a Type • Parnas defines types as equivalence classes of variables. • Variables are considered to be primitive • Variables will be used in the definition of mode (class of variable) and type • The novelty of Parnas’ approach is based on the observation that a type can be defined in terms of a variable and its permitted contexts.
Why Allow Types to be Extended? • Type extension support is not a necessary feature of a programming language: it does not increase the class of functions that can be computed by a given language, but it is helpful. • Abstraction: User-defined types provide an abstraction from using an equivalent set of primitive types. • Redundancy and Compile-Time Checking: User-defined types give the compiler more information than primitive types do, which can help the compiler produce better code. • Abbreviation: User-defined types tend to offer an abbreviated syntax. This simplifies coding, making programs easier to build and easier to understand. • Portability: Defining user types is a good way to manage portability, given that the program may need to be recompiled and executed on different platforms. E.g., is an int 16 or 32 bits?
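The compile-time checking goal above can be sketched in Java. This is only an illustration, not an example from the paper: `Meters`, `Seconds`, and `toKilometers` are hypothetical names, and both wrapper types are assumed to hold the same primitive (`double`).

```java
// Hypothetical user-defined types; both wrap the same primitive (double).
final class Meters {
    final double value;
    Meters(double value) { this.value = value; }
    Meters plus(Meters other) { return new Meters(value + other.value); }
}

final class Seconds {
    final double value;
    Seconds(double value) { this.value = value; }
}

class TypeExtensionDemo {
    // Accepts only a distance; passing a Seconds value is rejected by the compiler.
    static double toKilometers(Meters distance) {
        return distance.value / 1000.0;
    }

    public static void main(String[] args) {
        Meters trip = new Meters(1500.0).plus(new Meters(500.0));
        System.out.println(toKilometers(trip)); // prints 2.0
        // toKilometers(new Seconds(60.0));     // would not compile: incompatible types
    }
}
```

With plain `double` parameters, the commented-out call would compile and silently mix units; the extra type information is what lets the compiler refuse it.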
The Existing Definitions Don’t Map to the Goals of Clearly Defining a Data Type • Existing definitions: Syntactic, Value Space, Behavior, Representation, Representation + Behavior • Objectives/goals of type extension support: Abstraction, Redundancy + Compile-Time Checking, Abbreviation, Portability • The mapping of the existing definitions to the objectives of user-defined types is unclear. It is much clearer for scalar types.
Parnas’ Approach for Defining a Type • To achieve the goals outlined for defining a type one must consider the situations in which one variable may be substituted for another • The existing definitions tend to place restrictions on the context in which variable substitutions are allowed • Parnas’ Approach: A variable is considered primitive, and types are defined as various equivalence classes of variables • The key to this approach is to clearly define when one variable can be substituted for another variable legally.
Mode of a Variable • A mode is a group of variables that are represented and accessed in the same manner • Defines an equivalence class on variables. (i.e., Any value that can be stored in a particular variable of a given mode can be stored in any variable of that mode) • The mode of a variable provides enough information for the compiler to generate code
Types = Classes of Modes • Each mode defines a class of variables • Variables of the same mode can be substituted for each other in any context without producing a compile-time error • A type is a class of variables that can be substituted for each other in some restricted (compiler-defined) contexts • Types may consist of more than one mode (mode classes) • Supports the goal of writing reusable (type-flexible) code that is strongly type-checked
Abstract Data Types • An ADT is a type that includes more than one mode and can deal with all members of the mode class without distinguishing between them • Enables developers to write more generic (and possibly more reusable) code • Consider templates and polymorphism • User-Defined Types ≠ Abstract Types • ADTs are special cases of user-defined types
Conditions for Grouping Modes into Types • Grouping modes into types can be useful for solving common programming problems elegantly • Spec-Types • Rep-Types • Param-Types • Variant Types
Spec-Types • Types consisting of modes with identical externally visible behavior • Appropriate for types that can be defined solely by the operators allowed on the type. • These variable types tend to have a well-defined specification • Example (Java; two alternative snippets with the same specification, one for int and one for long):
int x = 10; int y = 20; int z = Math.min(x, y); System.out.println("Min is " + z);
long x = 10; long y = 20; long z = Math.min(x, y); System.out.println("Min is " + z);
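A spec-type can also be sketched with user-defined modes. In the Java sketch below, the interface `IntStack` plays the role of the specification and the two classes are modes with different representations but identical externally visible behavior. All names here are hypothetical, chosen for illustration rather than taken from the paper.

```java
import java.util.Arrays;

// The specification: the only thing client code may rely on.
interface IntStack {
    void push(int value);
    int pop();            // caller is expected to check isEmpty() first
    boolean isEmpty();
}

// Mode 1: array representation that doubles its capacity when full.
final class ArrayIntStack implements IntStack {
    private int[] items = new int[4];
    private int size = 0;
    public void push(int value) {
        if (size == items.length) items = Arrays.copyOf(items, size * 2);
        items[size++] = value;
    }
    public int pop() { return items[--size]; }
    public boolean isEmpty() { return size == 0; }
}

// Mode 2: linked representation.
final class LinkedIntStack implements IntStack {
    private static final class Node {
        final int value;
        final Node next;
        Node(int value, Node next) { this.value = value; this.next = next; }
    }
    private Node top = null;
    public void push(int value) { top = new Node(value, top); }
    public int pop() { int v = top.value; top = top.next; return v; }
    public boolean isEmpty() { return top == null; }
}

class SpecTypeDemo {
    // Written against the specification only; works for either mode.
    static String pushThenDrain(IntStack stack) {
        for (int i = 1; i <= 3; i++) stack.push(i);
        StringBuilder out = new StringBuilder();
        while (!stack.isEmpty()) out.append(stack.pop()).append(' ');
        return out.toString().trim();
    }

    public static void main(String[] args) {
        System.out.println(pushThenDrain(new ArrayIntStack()));  // prints 3 2 1
        System.out.println(pushThenDrain(new LinkedIntStack())); // prints 3 2 1
    }
}
```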
Rep-Types • Types consisting of modes with identical representation • Types with identical internal representations are of the same rep-type but may not have the same characteristics when the operators are applied to the representation. • Example: Consider booleans in “C”. They are integers, but a boolean with the value 0 or 1 is different from an integer with a value of 0 or 1. Also think about enums, bitmasks, etc. in the “C” language • These are some of the reasons why “C” is not considered a type-safe language
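The same point can be sketched in Java, where the two classes below share one representation (a single `int`) but give the "combine" operator different meanings. `Flags` and `Count` are hypothetical names used only for this illustration.

```java
// Same representation (one int), different behavior under "combine".
final class Flags {
    final int bits;
    Flags(int bits) { this.bits = bits; }
    Flags combine(Flags other) { return new Flags(bits | other.bits); } // bitwise union
}

final class Count {
    final int n;
    Count(int n) { this.n = n; }
    Count combine(Count other) { return new Count(n + other.n); }       // arithmetic sum
}

class RepTypeDemo {
    public static void main(String[] args) {
        System.out.println(new Flags(1).combine(new Flags(1)).bits); // prints 1
        System.out.println(new Count(1).combine(new Count(1)).n);    // prints 2
    }
}
```

Both values are stored as the integer 1, yet substituting one for the other would change the result, which is why a shared representation alone does not make two modes interchangeable.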
Param-Types • Mode descriptions can be parameterized • The class of all modes that can be obtained by passing type information as a parameter can be considered a type (param-type) • Helpful when the goal is code sharing • Param-Types enable the same code to work on variables of different types • Mode information needs to be parameterized • Example: Templates in C++
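The slide names C++ templates; a comparable sketch using Java generics is shown below. The method name `largest` is hypothetical. One difference worth keeping in mind: C++ typically instantiates separate code for each type argument, while Java generics are implemented by type erasure, so a single compiled method body serves every `T`.

```java
import java.util.Arrays;
import java.util.List;

class ParamTypeDemo {
    // T is the type parameter; the same source works for any Comparable type.
    // Assumes the list is non-empty.
    static <T extends Comparable<T>> T largest(List<T> items) {
        T best = items.get(0);
        for (T item : items) {
            if (item.compareTo(best) > 0) best = item;
        }
        return best;
    }

    public static void main(String[] args) {
        System.out.println(largest(Arrays.asList(3, 9, 4)));               // prints 9
        System.out.println(largest(Arrays.asList("pear", "apple", "plum"))); // prints plum
    }
}
```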
Variant-Types • Variant-Types are types consisting of Modes with some common properties • It’s a weaker form of a Spec-Type • Enables programs to be written using the common properties of a type, while ignoring the special properties of a given type. • Example 1: Inheritance: an object may be treated like any other type defined in its inheritance hierarchy. • Example 2: Polymorphism: an object of a specific type may be used interchangeably with an object of another type if they share some common properties, but have different behaviors
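A minimal Java sketch of a variant-type, using inheritance: `totalArea` relies only on the common property (`area`) and ignores anything specific to one mode, such as `isSquare`. The class names are hypothetical and chosen for illustration.

```java
abstract class Shape {
    abstract double area();   // the common property shared by every mode
}

final class Rectangle extends Shape {
    final double width, height;
    Rectangle(double width, double height) { this.width = width; this.height = height; }
    double area() { return width * height; }
    boolean isSquare() { return width == height; } // special property, unused by shared code
}

final class Triangle extends Shape {
    final double base, height;
    Triangle(double base, double height) { this.base = base; this.height = height; }
    double area() { return 0.5 * base * height; }
}

class VariantTypeDemo {
    // Uses only what all Shapes have in common.
    static double totalArea(Shape[] shapes) {
        double sum = 0.0;
        for (Shape s : shapes) sum += s.area();
        return sum;
    }

    public static void main(String[] args) {
        Shape[] shapes = { new Rectangle(2.0, 3.0), new Triangle(4.0, 3.0) };
        System.out.println(totalArea(shapes)); // prints 12.0
    }
}
```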
Flexibility in Type Definition and Code Sharing • Having a flexible definition of type enables various degrees of code sharing: • Sometimes the same code can operate on different types (param-types) • Sometimes source code can be shared (but not the binaries) since the implementation is machine dependent (spec-types) • Sometimes it is necessary for code sharing purposes to determine type information at runtime (not all languages support this, but consider the instanceof operator in Java).
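The runtime case can be sketched with Java's `instanceof` operator. The method `describe` is a hypothetical helper, not something from the paper; it shares one entry point across unrelated types and picks the behavior at runtime.

```java
class RuntimeTypeDemo {
    // One shared entry point; the concrete type is only known at runtime.
    static String describe(Object value) {
        if (value instanceof Integer) {
            return "integer";
        } else if (value instanceof String) {
            // The cast is safe because instanceof has just confirmed the type.
            return "string of length " + ((String) value).length();
        }
        return "unknown";
    }

    public static void main(String[] args) {
        System.out.println(describe(42));     // prints integer
        System.out.println(describe("type")); // prints string of length 4
        System.out.println(describe(3.5));    // prints unknown
    }
}
```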
Summary • Defining a type is important for programming language design, and the effective usage of a programming language • Consider code reuse • At the time of this paper (1975) many of the concepts about typing were not implemented in available programming languages, but many of these concepts are supported in today’s programming languages • This paper is another case where Parnas was thinking ahead and his observations have been shown to be “right on”.