CS 3304 Comparative Languages. Lecture 17: Data Abstraction and Object Orientation 20 March 2012. Introduction. Control or process abstraction is a very old idea (subroutines!), though few languages provide it in a truly general form (Scheme comes close).
CS 3304 Comparative Languages • Lecture 17: Data Abstraction and Object Orientation • 20 March 2012
Introduction • Control or process abstraction is a very old idea (subroutines!), though few languages provide it in a truly general form (Scheme comes close). • Data abstraction is somewhat newer, though its roots can be found in Simula67: • An Abstract Data Type is one that is defined in terms of the operations that it supports (i.e., that can be performed upon it) rather than in terms of its structure or implementation. • Why abstractions? • Easier to think about: hide what doesn't matter. • Protection: prevent access to things you shouldn't see. • Plug compatibility: • Replacement of pieces, often without recompilation, definitely without rewriting libraries. • Division of labor in software projects.
Data Abstraction Development • We discussed data abstraction earlier, in the unit on naming and scoping. • Recall that we traced the historical development of abstraction mechanisms: • Static set of variables: Basic. • Locals: Fortran. • Statics: Fortran, Algol 60, C. • Modules: Modula-2, Ada 83. • Module types: Euclid. • Objects: Smalltalk, C++, Eiffel, Java, Oberon, Modula-3, Ada 95. • Statics allow a subroutine to retain values from one invocation to the next, while hiding the name in between.
Modules • Modules allow a collection of subroutines to share some statics, still with hiding: • If you want to build an abstract data type, though, you have to make the module a manager. • The abstraction provided by modules and module types has at least three important benefits: • It reduces conceptual load by minimizing the amount of detail that the programmer must deal with. • It provides fault containment: • Prevents the programmer from using a program component in the wrong way. • Limits the portion of a program’s text in which the component can be used. • It provides a significant degree of independence among program components, which is otherwise difficult to achieve.
Module-As-Manager • Module types allow the module to be the abstract data type: you can declare a bunch of them, which is generally more intuitive: • It avoids explicit object parameters to many operations. • One minor drawback: if you have an operation that needs to look at the innards of two different types, in Modula-2 you would define both types in the same manager module. • In C++ you need to make one of the classes (or some of its members) “friends” of the other class. • Disadvantage of the module-as-manager approach: explicit create/initialize and destroy/finalize routines for all abstractions: • Even without dynamic allocation inside the module, users don't have the necessary knowledge to do initialization. • Ada 83 is a little better here: you can provide initializers for pieces of private types, but this is not a general approach. • OO languages often give you constructors and maybe destructors. • Destructors: important primarily in the absence of garbage collection.
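The contrast between the two styles can be sketched in C++. This is a minimal illustration, assuming a hypothetical fixed-size stack abstraction; the names stack_manager, stack_type, demo_manager, and demo_type are invented for the example and do not come from the lecture.

```cpp
#include <cassert>
#include <cstddef>

// Module-as-manager style: the "module" is a namespace that manages
// instances of a record type. Every operation takes the instance
// explicitly, and the user must call create/destroy by hand.
namespace stack_manager {
    struct stack { int items[16]; std::size_t top; };
    void create(stack& s) { s.top = 0; }
    void push(stack& s, int v) { assert(s.top < 16); s.items[s.top++] = v; }
    int pop(stack& s) { assert(s.top > 0); return s.items[--s.top]; }
    void destroy(stack& s) { s.top = 0; }
}

// Module-type (class) style: the type itself is the abstraction, the
// instance is the implicit `this` parameter, and the constructor
// performs initialization automatically.
class stack_type {
    int items[16];
    std::size_t top;
public:
    stack_type() : top(0) {}
    void push(int v) { assert(top < 16); items[top++] = v; }
    int pop() { assert(top > 0); return items[--top]; }
};

int demo_manager() {
    stack_manager::stack s;
    stack_manager::create(s);      // forgetting this call leaves `top` uninitialized
    stack_manager::push(s, 1);
    stack_manager::push(s, 2);
    int v = stack_manager::pop(s);
    stack_manager::destroy(s);
    return v;
}

int demo_type() {
    stack_type s;                  // constructor runs here
    s.push(1);
    s.push(2);
    return s.pop();
}
```

Both demo functions return the value pushed last; the difference is only in who is responsible for initialization and in whether the object is passed explicitly.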
Object-Oriented (OO) Programming • OO Programming can be seen as an attempt to enhance opportunities for code reuse by making it easy to define new abstractions as extensions or refinements of existing abstractions. • Objects add inheritance and dynamic method binding. • Simula 67 introduced these, but didn't have data hiding. • The 3 key factors in OO programming: • Encapsulation (data hiding). • Inheritance. • Dynamic method binding. • The class contains: • Data members (fields). • Subroutine members (methods).
Public and Private Class Members • Public: members available to users of the abstraction. • Private: all other members required by the implementation. • Two reasons to put things in the declaration: • The declaration must contain all the information that a programmer needs to use the abstraction correctly. • The declaration must contain all the information that the compiler needs in order to generate code. • At the very least the compiler needs to know the size of an object, even though the programmer isn’t allowed to get at many or most of the fields (members) that contribute to that size: that's why private fields have to be in the declaration.
C++ list_node Class Header
class list_node {
    list_node* prev;
    list_node* next;
    list_node* head_node;
public:
    int val;
    list_node();
    list_node* predecessor();
    list_node* successor();
    bool singleton();
    void insert_before(list_node* new_node);
    void remove();
    ~list_node();
};
C++ Types of Class Members • C++ distinguishes among these types of class members: • Public members: accessible to anybody. • Protected members: accessible to members of this class and of derived classes. • Private members: accessible just to members of this class. • C++ structure (struct): a class whose members are public by default. • C++ base classes can also be public, private, or protected. • Example: • Anybody can convert (assign) a circle* into a shape*: class circle : public shape { ... • Only members and friends of circle or its derived classes can convert (assign) a circle* into a shape*: class circle : protected shape { ... • Only members and friends of circle can convert (assign) a circle* into a shape*: class circle : private shape { ...
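The effect of public versus private base classes on pointer conversion can be sketched as follows. The classes square and triangle and the helper count_sides are hypothetical stand-ins chosen for this example (the slide's own example uses circle).

```cpp
#include <cassert>

class shape {
public:
    virtual ~shape() {}
    virtual int sides() const { return 0; }
};

// Public base: any code may convert a square* into a shape*.
class square : public shape {
public:
    int sides() const override { return 4; }
};

// Private base: only members and friends of triangle may perform the
// conversion, so the class exposes it through a member function.
class triangle : private shape {
public:
    int sides() const override { return 3; }
    const shape* as_shape() const { return this; }  // allowed inside a member
};

int count_sides(const shape* s) {
    assert(s != nullptr);
    return s->sides();
}

int demo_public_base() {
    square sq;
    return count_sides(&sq);           // implicit square* -> shape*
}

int demo_private_base() {
    triangle t;
    // count_sides(&t);                // would not compile: shape is an inaccessible base
    return count_sides(t.as_shape());  // conversion done by a member of triangle
}
```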
Tiny Subroutines • OO programs tend to have: • More subroutine calls than regular imperative programs. • Shorter subroutines. • Avoid using public fields: use getters and setters instead. • C# property mechanism: specifically designed to facilitate the declaration of methods (accessors) to “get” and “set” values:
class list_node {
    ...
    int val;
    public int Val {
        get { return val; }
        set { val = value; }
    }
    ...
}
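C++ has no property mechanism, so the same getter/setter idea is usually written out by hand. A minimal sketch, assuming a simplified list_node with only a value field; the non-negative check in the setter is a hypothetical invariant added purely to show why a setter can be more useful than a public field.

```cpp
#include <cassert>

class list_node {
    int val_;                          // private field, hidden from users
public:
    explicit list_node(int v) : val_(v) {}
    int val() const { return val_; }   // "getter"
    void set_val(int v) {              // "setter" with a validation hook
        assert(v >= 0);                // hypothetical invariant, for illustration only
        val_ = v;
    }
};

int demo_accessors() {
    list_node n(1);
    n.set_val(n.val() + 41);
    return n.val();
}
```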
Using Classes • Derived (child, subclass) classes – extend class hierarchy by creating new classes from base (parent, superclass) classes: • A single root superclass: • Smalltalk, Java: Object. • C++: no such class. • General-purpose base class: contains only the fields and methods needed to implement common operations. • Modifying base class methods: • Redefinition: exposes implementation details. • Leave the implementation details to the base class by invoking the method of the parent class: • Java, Smalltalk, Objective-C use super. • C# uses base. • C++ uses :: (why not super?). • Eiffel: Explicitly rename methods inherited from a base class.
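A short C++ sketch of a derived class that refines a base class method and delegates the details to the parent with the :: scope operator. The account and logged_account classes are hypothetical. One common answer to the "why not super?" question is that a C++ class may have several base classes, so the base must be named explicitly.

```cpp
#include <cassert>

class account {
protected:
    long balance_ = 0;
public:
    virtual ~account() {}
    virtual void deposit(long amount) {
        assert(amount > 0);
        balance_ += amount;
    }
    long balance() const { return balance_; }
};

class logged_account : public account {
    int deposits_ = 0;
public:
    void deposit(long amount) override {
        ++deposits_;
        account::deposit(amount);   // reuse the base version instead of copying its logic
    }
    int deposits() const { return deposits_; }
};

long demo_balance() {
    logged_account a;
    a.deposit(10);
    a.deposit(5);
    return a.balance();
}

int demo_deposits() {
    logged_account a;
    a.deposit(10);
    a.deposit(5);
    return a.deposits();
}
```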
Containers/Collections • Container (in OO programming) is an abstraction that holds a collection of objects of some given class. • Examples: sets, stacks, queues, and dictionaries. • Building containers, e.g. a list: • Objects are derived from container element base class: an object cannot be placed in a container unless its class is derived from the element class of the container. • List nodes are separate objects containing pointers (or references) to the listed objects, rather than the data of the objects themselves (e.g., C++/Java/C# standard libraries). • Make the list node a member (a subobject) of the listed object. • Containers illustrate a more general issue of the design of consistent, intuitive, and useful class hierarchies: a complex and difficult art.
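Two of the list-building approaches above can be sketched in C++. The types list_element, gadget, and widget are hypothetical, and the sketch leaves out the third approach (embedding the node as a member). Note how the first approach needs a cast to get back to the element's real type.

```cpp
#include <cassert>
#include <list>

// Approach 1: listed objects derive from a container element base class.
struct list_element {
    list_element* next = nullptr;
    virtual ~list_element() {}
};
struct gadget : list_element {
    int id;
    explicit gadget(int i) : id(i) {}
};
int sum_ids_derived(list_element* head) {
    int sum = 0;
    for (list_element* e = head; e != nullptr; e = e->next) {
        gadget* g = dynamic_cast<gadget*>(e);  // the list only knows list_element
        assert(g != nullptr);
        sum += g->id;
    }
    return sum;
}

// Approach 2: list nodes are separate objects holding pointers to the
// listed objects, as in the standard library containers.
struct widget { int id; };
int sum_ids_external(const std::list<const widget*>& items) {
    int sum = 0;
    for (const widget* w : items) {
        assert(w != nullptr);
        sum += w->id;
    }
    return sum;
}

int demo_derived() {
    gadget a(1), b(2);
    a.next = &b;
    return sum_ids_derived(&a);
}

int demo_external() {
    widget a{1}, b{2};
    std::list<const widget*> items{&a, &b};
    return sum_ids_external(items);
}
```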
Encapsulation and Inheritance • Encapsulation mechanism: • Grouping together in one place data and subroutines that operate on them. • Hiding irrelevant details from the users of an abstraction. • OO programming: • An extension of the “module-as-type” mechanism. • A “module-as-manager” framework. • Data hiding mechanisms of modules in non-object-oriented languages. • Data-hiding issues when adding inheritance to make classes. • Adding inheritances to records: allowing (static) modules to continue to provide data hiding.
Modules • Module-based languages: • Scope rules for data hiding. • Clu/Euclid: declaration (header) and definition (body) of a module always appear together. • Modula-2: header and body of a module can be separated but everything in the header is public. • Ada: a header can be divided into public and private parts. An opaque type is exported by putting its definition in the private part and its name in the public part. • Changes in a separate module body do not require recompilation of the module’s users. • this parameter: a single instance of each module subroutine that is passed (at run-time) the address of the storage of the appropriate module instance. • Module headers: Java/C# implementation is responsible for extracting the header information.
Classes • Inheritance: OO languages must supplement the scope rules. • C++ visibility rules: • Any class can limit the visibility of its members. • A derived class can restrict the visibility of members of a base class, but can never increase it. • A derived class that limits the visibility of members of a base class by declaring that base class protected or private can restore the visibility of individual members of the base class. • Eiffel: derived classes can both restrict and increase the visibility of members of base classes. Selective availability. • Java/C#: do not provide protected and private designations for base classes. • Smalltalk/Objective-C: code at run-time can attempt to call any method name in any object.
Nesting (Inner Classes) • Many languages allow class declarations to nest. • What is the visibility of the outer class’s members within the inner class methods? • C++/C#: an inner class can directly reach only static members of the outer class; non-static members require an explicit outer object. • Java: access allowed to arbitrary members of the outer class. • Each instance of the inner class belongs to an instance of the outer class. • Classes can be nested inside methods: local classes. • Anonymous classes: appear in-line inside a call to new.
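The C++ rule can be sketched as follows, assuming C++11 or later (where a nested class is a member of its enclosing class and may name its private members). The classes outer and inner and the helper demo_nested are hypothetical.

```cpp
#include <cassert>

class outer {
    static int instances_;   // static: one copy shared by all outer objects
    int id_;                 // non-static: one copy per outer object
public:
    outer() : id_(++instances_) {}

    class inner {
    public:
        // A static member of outer is reachable directly.
        static int count() { return instances_; }
        // A non-static member needs an explicit outer object, because an
        // inner object does not belong to any particular outer instance.
        static int id_of(const outer& o) { return o.id_; }
    };
};
int outer::instances_ = 0;

// Returns the difference between the ids of two consecutively created objects.
int demo_nested() {
    outer a, b;
    assert(outer::inner::count() >= 2);
    return outer::inner::id_of(b) - outer::inner::id_of(a);
}
```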
Type Extensions • Smalltalk, Objective-C, Eiffel, Java, C#: designed from outset as object-oriented languages. • Modula-3, Ada 95, Oberon, CLOS, Fortran 2003: object-oriented extensions to languages in which modules already provide encapsulation. • No changes of the existing module mechanism. • Inheritance and dynamic method binding through a mechanism for extending records. • Ada 95: • Tagged types provide inheritance. • Packages provide encapsulation.
Extending without Inheritance • The desire to extend the functionality of an existing abstraction is one of the principal motivations for object-oriented programming. • Sometimes inheritance is not allowed: • Java: final class. • C#: sealed class. • C# 3.0: extension methods give the appearance of extending an existing class:
static class AddToString {
    public static int toInt(this string s) {
        return int.Parse(s);
    }
}
• An extension method must be static and declared in a static class.
Initialization and Finalization • Choosing a constructor. • References and values: • If variables are references, then every object must be created explicitly - appropriate constructor is called. • If variables are values, then object creation can happen implicitly as a result of elaboration. • Execution order: • When an object of a derived class is created in C++, the constructors for any base classes will be executed before the constructor for the derived class. • Garbage collection.
Constructors • The lifetime of an object is defined to be the interval during which it occupies space and can hold data. • Most object-oriented languages provide some sort of special mechanism to initialize an object automatically at the beginning of its lifetime. • When written in the form of a subroutine, this mechanism is known as a constructor. • A constructor does not allocate space. • A few languages provide a similar destructor mechanism to finalize an object automatically at the end of its lifetime.
References and Values • Java/Simula/Smalltalk: a variable refers to an object. • C++/Ada 95: a variable can have a value that is an object. • The value/reference tradeoff. • C++ characteristics: • Every object must be initialized before it can be used. • Declarations and constructors: a variable of class type declared with no initial value is initialized by the zero-argument constructor. • Copy constructor: declare a new object whose initial value is “the same” as that of the existing object: • A single-argument constructor. • Assignment versus initialization.
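The difference between initialization and assignment in C++ can be made visible by counting which special member function runs. A minimal sketch; the tracked class and its counters are hypothetical instrumentation, not part of the lecture.

```cpp
#include <cassert>

class tracked {
public:
    static int defaults, copies, assigns;
    tracked() { ++defaults; }                                   // zero-argument constructor
    tracked(const tracked&) { ++copies; }                       // copy constructor
    tracked& operator=(const tracked&) { ++assigns; return *this; }
};
int tracked::defaults = 0;
int tracked::copies = 0;
int tracked::assigns = 0;

// Encodes the three counters as defaults*100 + copies*10 + assigns.
int demo_init_vs_assign() {
    tracked::defaults = tracked::copies = tracked::assigns = 0;
    tracked a;        // no initial value: zero-argument constructor
    assert(tracked::defaults == 1);
    tracked b = a;    // initialization: copy constructor, despite the '='
    tracked c(a);     // the same thing in constructor syntax
    c = b;            // assignment: operator=, because c already exists
    return tracked::defaults * 100 + tracked::copies * 10 + tracked::assigns;
}
```

The encoded result is 121: one default construction, two copy constructions, one assignment.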
Execution Order • Constructors: the constructor of the base class is called before the constructor of the derived class. • How does the C++ compiler obtain arguments for the base class constructor? • The header of the constructor of the derived class specifies the base class constructor arguments. • Similar syntax can be used to specify constructor arguments or initial values for members of the class. • Java also requires that the constructor of the base class is called before the constructor of the derived class: super(args); • If missing, Java automatically inserts a call to the zero-argument constructor of the base class. • Members are initialized to default values (null for references) unless given an initial value. • Objective-C: no special notion of constructor.
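A sketch of the C++ syntax described above: the header of the derived class constructor passes arguments to the base class constructor and gives initial values to members. The classes logged_base and logged_derived, and the log string they build, are hypothetical and exist only to make the execution order observable.

```cpp
#include <cassert>
#include <string>

class logged_base {
protected:
    std::string log_;
public:
    explicit logged_base(int id) { log_ += "base(" + std::to_string(id) + ") "; }
    const std::string& log() const { return log_; }
};

class logged_derived : public logged_base {
    int scale_;
public:
    // Base class arguments and member initial values appear in the
    // constructor header, before the body runs.
    logged_derived(int id, int scale) : logged_base(id), scale_(scale) {
        assert(scale_ > 0);
        log_ += "derived";
    }
};

std::string demo_ctor_order() {
    logged_derived d(7, 2);
    return d.log();
}
```

The returned log reads "base(7) derived", showing that the base class constructor body ran first.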
Garbage Collection • When a C++ object is destroyed, destructors are called in the reverse order of inheritance: the derived class destructor runs first, then the base class destructor. • By far the most common use of destructors is manual storage reclamation. • The whole idea of garbage collection is to avoid explicit destruction. • Java and C# let a class declare a finalizer (finalize() in Java, destructor syntax in C#) that runs before the garbage collector reclaims the space for an object; exactly when it runs is not guaranteed.
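Destructor order and manual storage reclamation can be sketched in C++ as follows. The classes resource_base and resource_derived are hypothetical; the external log string is only there to record the order of the destructor calls.

```cpp
#include <cassert>
#include <string>

class resource_base {
protected:
    std::string* log_;
public:
    explicit resource_base(std::string* log) : log_(log) { assert(log_ != nullptr); }
    virtual ~resource_base() { *log_ += "~base"; }
};

class resource_derived : public resource_base {
    int* buffer_;
public:
    explicit resource_derived(std::string* log)
        : resource_base(log), buffer_(new int[4]) {}
    // Copying is disabled so the buffer cannot be freed twice.
    resource_derived(const resource_derived&) = delete;
    resource_derived& operator=(const resource_derived&) = delete;
    ~resource_derived() override {
        delete[] buffer_;          // manual storage reclamation
        *log_ += "~derived ";
    }
};

std::string demo_dtor_order() {
    std::string log;
    {
        resource_derived d(&log);
    }   // d leaves scope here: ~resource_derived runs, then ~resource_base
    return log;
}
```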
Dynamic Method Binding • Virtual functions in C++ are an example of dynamic method binding: you don't know at compile time what type the object referred to by a variable will be at run time. • Simula also had virtual functions (all of which are abstract). • In Smalltalk, Eiffel, Modula-3, and Java all member functions are virtual. • Note that inheritance does not obviate the need for generics: • You might think: hey, I can define an abstract list class and then derive int_list, person_list, etc. from it, but the problem is you won’t be able to talk about the elements because you won't know their types. • That's what generics are for: abstracting over types. • Java originally lacked generics and relied on (checked) dynamic casts; generics were added in Java 5.
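The difference between dynamic and static binding in C++ shows up when a derived object is used through a base class pointer. A minimal sketch with hypothetical person and student classes: role is virtual, kind is not.

```cpp
#include <cassert>
#include <string>

class person {
public:
    virtual ~person() {}
    virtual std::string role() const { return "person"; }   // dynamically bound
    std::string kind() const { return "person"; }           // statically bound
};

class student : public person {
public:
    std::string role() const override { return "student"; } // overrides
    std::string kind() const { return "student"; }           // hides, does not override
};

// The static type of p is person*; only the virtual call looks at the
// run-time type of the object.
std::string describe(const person* p) {
    assert(p != nullptr);
    return p->role() + "/" + p->kind();
}

std::string demo_binding() {
    student s;
    return describe(&s);
}
```

The result is "student/person": the virtual call reaches the derived version, the non-virtual call is resolved from the pointer's static type.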
Semantic and Performance • Data members of classes are implemented just like structures (records). • With (single) inheritance, derived classes have extra fields at the end. • A pointer to the parent and a pointer to the child contain the same address: the child just knows that the struct goes farther than the parent does.
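The layout claim can be checked with a small C++ sketch. It assumes the layout used by common compilers for single, non-virtual inheritance (base subobject at offset zero); the C++ standard does not promise this for every class type, so treat the address comparison as an observation about typical implementations. The parent and child structs are hypothetical.

```cpp
#include <cassert>

struct parent { int a; int b; };
struct child : parent { int c; };   // c is laid out after the inherited fields

bool demo_same_address() {
    child ch;
    ch.a = 1; ch.b = 2; ch.c = 3;
    child* cp = &ch;
    parent* pp = cp;                // implicit upcast, no address adjustment expected
    assert(pp->a == 1 && pp->b == 2);
    return static_cast<void*>(pp) == static_cast<void*>(cp);
}

bool demo_child_is_larger() {
    return sizeof(child) > sizeof(parent);
}
```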
Virtual and Nonvirtual Methods • Non-virtual functions require no space at run time; the compiler just calls the appropriate version, based on the type of the variable: • Member functions are passed an extra, hidden, initial parameter: this (called current in Eiffel and self in Smalltalk). • C++ philosophy is to avoid run-time overhead whenever possible (a legacy of C): • Languages like Smalltalk have (much) more run-time support.
Member Lookup • Virtual functions are the only thing that requires any trickiness (Figure). • They are implemented by creating a dispatch table (vtable) for the class and putting a pointer to that table in the data of the object. • Objects of a derived class have a different dispatch table. • In the dispatch table, functions defined in the parent come first, though some of the pointers point to overridden versions. • You could put the whole dispatch table in the object itself. • That would save a little time, but potentially waste space.
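The dispatch-table mechanism can be imitated by hand with plain structs and function pointers. This is only a sketch of the idea, not what any particular compiler emits; shape, shape_vtable, and the base_/tri_ functions are hypothetical names.

```cpp
#include <cassert>

struct shape;

// One table of function pointers per "class".
struct shape_vtable {
    int (*area)(const shape*);
    int (*width)(const shape*);
};

// Each object stores a pointer to its class's table, followed by its data.
struct shape {
    const shape_vtable* vptr;
    int w, h;
};

int base_area(const shape* s) { return s->w * s->h; }      // rectangle-like default
int base_width(const shape* s) { return s->w; }
int tri_area(const shape* s) { return s->w * s->h / 2; }   // overriding version

const shape_vtable base_vtable = { base_area, base_width };
// The "derived" table keeps the same slot order: the area slot points to
// the override, the width slot is inherited unchanged.
const shape_vtable tri_vtable = { tri_area, base_width };

// A virtual call: load vptr, index the table, call through the pointer.
int call_area(const shape* s) {
    assert(s != nullptr && s->vptr != nullptr);
    return s->vptr->area(s);
}

int demo_dispatch() {
    shape r = { &base_vtable, 4, 3 };
    shape t = { &tri_vtable, 4, 3 };
    return call_area(&r) * 100 + call_area(&t);
}
```

With a 4 by 3 shape the two calls yield 12 and 6, so the encoded result is 1206, even though call_area never mentions either concrete kind of shape.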
Single Inheritance • Note that if you can query the type of an object, then you need to be able to get from the object to run-time type info. • The standard implementation technique is to put a pointer to the type info at the beginning of the vtable. • Of course you only have a vtable in C++ if your class has virtual functions. • That's why you can't do a dynamic cast on a pointer whose static type doesn't have virtual functions.
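A short C++ sketch of a checked downcast through run-time type information. The animal, dog, and cat classes are hypothetical; the virtual destructor is what makes animal polymorphic, which dynamic_cast requires for a downcast.

```cpp
#include <cassert>

class animal {
public:
    virtual ~animal() {}      // at least one virtual function makes the class polymorphic
};
class dog : public animal {
public:
    int tricks() const { return 3; }
};
class cat : public animal {};

// Returns the number of tricks if `a` really points to a dog, otherwise -1.
int tricks_of(animal* a) {
    assert(a != nullptr);
    if (dog* d = dynamic_cast<dog*>(a)) {   // checked at run time via type info
        return d->tricks();
    }
    return -1;
}

int demo_dog() { dog d; return tricks_of(&d); }
int demo_cat() { cat c; return tricks_of(&c); }
```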
Polymorphism • Dynamic method binding introduces subtype polymorphism into any code that expects a reference to an object of a specific class. • An object of a derived class that supports the operations of the base class can also be used. • A combination of inheritance and dynamic method binding still does not eliminate the need for generics (see Example 9.45): • Needed to avoid tedious type casting. • Needed to avoid potentially unsafe code. • Generics exist for the purpose of abstracting over unrelated types, something that inheritance does not support. • Eiffel, Java, and C# also provide generics. • Virtual methods often preclude the in-line expansion of subroutines at compile time.
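In C++ the generic facility is the template. A minimal sketch of a container that abstracts over unrelated element types, so that elements come back with their static type and no casts are needed; simple_list is a hypothetical wrapper around std::vector written for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// T is the element type; the same code serves int, std::string, and so on.
template <typename T>
class simple_list {
    std::vector<T> items_;
public:
    void append(const T& item) { items_.push_back(item); }
    const T& at(std::size_t i) const {
        assert(i < items_.size());
        return items_[i];
    }
    std::size_t size() const { return items_.size(); }
};

int demo_int_list() {
    simple_list<int> xs;
    xs.append(40);
    xs.append(2);
    return xs.at(0) + xs.at(1);      // elements are known to be int
}

std::size_t demo_string_list() {
    simple_list<std::string> names;
    names.append("Ada");
    return names.at(0).size();       // elements are known to be std::string
}
```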
Object Closures • Encapsulate a method with context for later execution. • Encapsulate a method and its arguments for later execution (Figure 9.5):
class fn_call {
public:
    virtual void trigger() = 0;
};
…
class call_foo : public fn_call {
    int arg1; …
public:
    call_foo(int a, …) : arg1(a), … { }
    void trigger() { foo(arg1, …); }
};
• Java uses this technique to encapsulate start-up arguments for newly created threads of control.
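A complete, runnable variant of the object-closure pattern, written for C++11. The fn_call base class follows the slide; the target function add_to, the class call_add_to, and the pending queue are hypothetical details filled in so the example can execute.

```cpp
#include <cassert>
#include <memory>
#include <vector>

class fn_call {
public:
    virtual ~fn_call() {}
    virtual void trigger() = 0;
};

// Hypothetical target function: adds `amount` to `*total`.
void add_to(int* total, int amount) { *total += amount; }

// The object captures the arguments now; the call happens in trigger().
class call_add_to : public fn_call {
    int* total_;
    int amount_;
public:
    call_add_to(int* total, int amount) : total_(total), amount_(amount) {
        assert(total_ != nullptr);
    }
    void trigger() override { add_to(total_, amount_); }
};

int demo_closures() {
    int total = 0;
    std::vector<std::unique_ptr<fn_call>> pending;
    pending.push_back(std::unique_ptr<fn_call>(new call_add_to(&total, 40)));
    pending.push_back(std::unique_ptr<fn_call>(new call_add_to(&total, 2)));
    // Nothing has run yet; the stored calls are triggered later.
    for (auto& call : pending) call->trigger();
    return total;
}
```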
Multiple Inheritance I • In C++, you can say:
class professor : public teacher, public researcher { ... };
Here you get all the members of teacher and all the members of researcher. • If there's anything that's in both (same name and argument types), then calls to the member are ambiguous; the compiler disallows them. • Virtual base classes: In the usual case if you inherit from two classes that are both derived from some other class B, your implementation includes two copies of B’s data members. • That's often fine, but other times you want a single copy of B. • For that you make B a virtual base class.
Multiple Inheritance II • You can of course create your own member in the merged class:
void professor::print() { teacher::print(); researcher::print(); ... }
• Or you could get both:
void professor::tprint() { teacher::print(); }
void professor::rprint() { researcher::print(); }
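Both points, resolving an ambiguous member and sharing a single copy of a common base, can be sketched together. The common base class person and its id field are hypothetical additions; teacher, researcher, and professor follow the slides, with print returning a string so the result can be inspected.

```cpp
#include <cassert>
#include <string>

class person {
public:
    int id;
    explicit person(int i) : id(i) {}
};

// `virtual` makes professor contain a single shared person subobject.
class teacher : public virtual person {
public:
    teacher() : person(0) {}
    std::string print() const { return "teacher"; }
};
class researcher : public virtual person {
public:
    researcher() : person(0) {}
    std::string print() const { return "researcher"; }
};

class professor : public teacher, public researcher {
public:
    // The most derived class initializes the virtual base.
    explicit professor(int i) : person(i) { assert(i > 0); }
    // Without this member, a call to print() on a professor would be
    // ambiguous and rejected by the compiler.
    std::string print() const {
        return teacher::print() + "+" + researcher::print();
    }
};

std::string demo_print() { professor p(7); return p.print(); }
int demo_shared_id() { professor p(7); return p.id; }   // unambiguous: one person subobject
```

If the two `virtual` keywords were removed, professor would hold two person subobjects and the expression `p.id` would no longer compile without qualification.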
OO Programming Revisited • Anthropomorphism is central to the OO paradigm: you think in terms of real-world objects that interact to get things done. • Many OO languages are strictly sequential, but the model also adapts well to parallelism. • Strict interpretation of the term: • Uniform data abstraction: everything is an object. • Inheritance. • Dynamic method binding. • Lots of conflicting uses of the term out there; an object-oriented style is available in many languages: • Data abstraction crucial. • Inheritance required by most users of the term object-oriented. • Centrality of dynamic method binding a matter of dispute.
OO Languages • Modula-3: single inheritance, all methods virtual, no constructors or destructors. • Ada 95: tagged types, single inheritance, no constructors or destructors, notion of child packages as an alternative to friends, class-wide parameters: • Methods static by default. • Can define a parameter or pointer that grabs the object-specific version of all methods: base class doesn't have to decide what will be virtual. • Java: interfaces, mix-in inheritance alternative to multiple inheritance, all methods virtual. • Is C++ object-oriented? Uses all the right buzzwords.
Smalltalk Object Model • Smalltalk is the canonical object-oriented language. • It has all three of the characteristics listed above. • It's based on the thesis work of Alan Kay at Utah in the late 1960s. • It went through five generations at Xerox PARC, where Kay worked after graduating. • Smalltalk-80 is the current standard.
Summary • Three fundamental concepts of object-oriented programming: encapsulation, inheritance, and dynamic method binding. • Inheritance provides a natural basis for polymorphic subroutines. • Dynamic method binding extends polymorphism. • Type extension mechanism: encapsulation is associated with modules, but inheritance and dynamic method binding are associated with a special form of record. • Smalltalk is the purest and most flexible of the object-oriented languages.