COEN 171 - Data Abstraction and OOP • Data Abstraction • Problems with subprogram abstraction • Encapsulation • Data abstraction • Language issues for ADTs • Examples • Ada • C++ • Java • Parameterized ADTs
COEN 171 - Data Abstraction and OOP • Object-oriented programming • Components of object-oriented programming languages • Fundamental properties of the object-oriented model • Relation to data abstraction • Design issues for OOPL • Examples • Smalltalk 80 • C++ • Ada 95 • Java • Comparisons • C++ and Smalltalk • C++ and Ada 95 • C++ and Java • Implementation issues
Subprogram Problems • No way to selectively provide visibility for subprograms • No convenient ways to collect subprograms together to perform a set of services • Program that uses subprogram (client program) must know details of all data structures used by subprogram • client can “work around” services provided by subprogram • hard to make client independent of implementation techniques for data structures • discourages reuse • Difficult to build on and modify the services provided by subprogram • Many languages don’t provide for separately compiled subprograms
Encapsulation • One solution • a grouping of subprograms that are logically related that can be separately compiled • called encapsulations • Examples of encapsulation mechanisms • nested subprograms in some ALGOL-like languages • Pascal • FORTRAN 77 and C • files containing one or more subprograms can be independently compiled • FORTRAN 90, Modula-2, Modula-3, C++, Ada (and other contemporary languages) • separately compilable modules
Data Abstraction • A better solution than just encapsulation • Can write programs that depend on abstract properties of a type, rather than implementation • Informally, an Abstract Data Type (ADT) is a [collection of] data structures and operations on those data structures • example is floating point number • can define variables of that type • operations are predefined • representation is hidden and can’t manipulate except through built-in operations • ADT • isolates programs from the representation • maintains integrity of data structure by preventing direct manipulation
Data Abstraction (continued) • Formally, an ADT is a user-defined data type where • the representation of and operations on objects of the type are defined in a single syntactic unit; also, other units can create objects of the type. • the representation of objects of the type is hidden from the program units that use these objects, so the only operations possible are those provided in the type's definition. • Advantages of first restriction are same as those for encapsulation • program organization • modifiability (everything associated with a data structure is together) • separate compilation
Data Abstraction (continued) • Advantage of second restriction is reliability • by hiding the data representations, user code cannot directly access objects of the type • user code cannot depend on the representation, allowing the representation to be changed without affecting user code • By this definition, built-in types are ADTs • e.g., int type in C • the representation is hidden • operations are all built-in • user programs can define objects of int type • User-defined abstract data types must have the same characteristics as built-in abstract data types
Data Abstraction (continued) • ADTs provide mechanisms to limit visibility • public part indicates what can be seen (and used from) outside • what is exported • private part describes what will be hidden from clients • made available to allow compiler to determine needed information • C++ allows specified program units access to the private information • friend functions and classes
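The public/private split and the friend mechanism described above can be sketched in C++ roughly as follows. This is a minimal illustration, not code from the course: the names `IntStack` and `depth` are hypothetical, and the choice of `std::vector` as the hidden representation is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical ADT: clients can use only the operations in the public part.
class IntStack {
public:
    void push(int value) { items_.push_back(value); }

    int pop() {
        assert(!items_.empty());  // popping an empty stack is a client error
        int top = items_.back();
        items_.pop_back();
        return top;
    }

    bool empty() const { return items_.empty(); }

    // A friend is an outside function that is granted access to the private part.
    friend std::size_t depth(const IntStack& s);

private:
    // Hidden representation (an assumption for this sketch). It could be
    // replaced by a linked list without changing any client code.
    std::vector<int> items_;
};

// Not a member function, but allowed to read items_ because it is a friend.
std::size_t depth(const IntStack& s) { return s.items_.size(); }
```

A client that writes `s.items_` directly is rejected by the compiler, which is the reliability argument made for the second restriction on ADTs.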
Language Issues for ADTs • Language requirements for data abstraction • a syntactic unit in which to encapsulate the type definition. • a method of making type names and subprogram headers visible to clients, while hiding actual definitions • public/private • some primitive operations must be built into the language processor (usually just assignment and comparisons for equality and inequality) • some operations are commonly needed, but must be defined by the type designer • e.g., iterators, constructors, destructors • Can put ADTs in PL • as a type definition extended to include operations (C++) • use directly to declare variables • as a collection of objects and operations (Ada) • may need to be instantiated before declaring variables
Language Issues for ADTs (continued) • Language design issues • encapsulate a single type, or something more? • what types can be abstract? • can abstract types be parameterized? • how are imported types and operations qualified? • Simula-67 was first language to address this issue • classes provided encapsulation, but no information hiding
Data Abstraction in Ada • Abstraction mechanism is the package • Each package has two pieces (can be in same or separate files) • specification • public part • private part • body • implementation of all operations exported in public part • may include other procedures, functions, type and variable declarations, which are hidden from clients • all variables are static • may provide initialization section • executed when declaration involving package is elaborated • Any type can be exported • Operations on exported types may be restricted • private (:=, =, /=, plus operations exported) • limited private (only operations exported)
Data Abstraction in Ada (continued) • Evaluation • exporting any type as private is good • cost is recompilation of clients when the representation is changed • can’t import specific entities from other packages • good facilities for separate compilation
Data Abstraction in C++ • Based on C struct type and Simula 67 classes • Class is the encapsulation device • all instances of a class share a single copy of the member functions • each instance of a class has its own copy of the class data members • instances can be static, semidynamic, or explicit dynamic • Information Hiding • private clause for hidden entities • public clause for interface entities • protected clause - for inheritance
Data Abstraction in C++ (continued) • Constructors • functions to initialize the data members of instances • may also allocate storage if part of the object is heap-dynamic • can include parameters to provide parameterization of the objects • implicitly called when an instance is created • can be explicitly called • name is the same as the class name • Destructors • functions to clean up after an instance is destroyed; usually just to reclaim heap storage • implicitly called when the object’s lifetime ends • can be explicitly called • name is the class name, preceded by a tilde (~)
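A small sketch of implicit constructor and destructor calls for an object whose storage is partly heap-dynamic. The class `Buffer` and the counter `live_buffers` are hypothetical names introduced only to make the calls observable.

```cpp
#include <cassert>

// Counts buffers that currently exist, so the destructor's effect is visible.
static int live_buffers = 0;

class Buffer {
public:
    // Constructor: same name as the class; allocates the heap-dynamic part.
    explicit Buffer(int size) : data_(nullptr), size_(size) {
        assert(size > 0);
        data_ = new int[size];
        ++live_buffers;
    }

    // Destructor: class name preceded by a tilde; reclaims the heap storage.
    ~Buffer() {
        delete[] data_;
        --live_buffers;
    }

    // Copying is disabled so the storage is never released twice.
    Buffer(const Buffer&) = delete;
    Buffer& operator=(const Buffer&) = delete;

    int size() const { return size_; }

private:
    int* data_;
    int size_;
};

int live_inside_scope() {
    Buffer b(16);         // constructor is called implicitly here
    return live_buffers;  // the return value is read while b is still alive
}                         // destructor is called implicitly here
```

After `live_inside_scope()` returns, `live_buffers` is back to zero, because the destructor ran when the lifetime of `b` ended.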
Data Abstraction in C++ (continued) • Friend functions • grant access to private members to specified unrelated units or functions • Evaluation • classes are similar to Ada packages for providing abstract data types • difference is packages are encapsulations, whereas classes are types
Data Abstraction in Java • Similar to C++ except • all user-defined types are classes • all objects are allocated from the heap and accessed through reference variables • individual entities in classes (methods and variables) have access control modifiers (public or private), rather than C++ clauses • functions can only be defined in classes • Java has a second scoping mechanism, package scope, that is used instead of friends • all entities in all classes in a package that don’t have access control modifiers are visible throughout the package
Parameterized ADTs • Ada generic packages may be parameterized with • type of element stored in data structure • operators among those elements • Must be instantiated before declaring variables • instantiation of generic behaves like text substitution • package BST_Integer is new binary_search_tree(INTEGER); • like text of generic package substituted here, with parameters substituted • EXCEPT references to non-local variables, etc. occur as if happen at point where generic was declared • If have multiple instantiations, need to disambiguate when declare exported types • package BST_Real is new binary_search_tree(REAL); • tree1: BST_Integer.bst; • tree2: BST_Real.bst;
Parameterized ADTs (continued) • C++ • classes can be somewhat generic by writing parameterized constructor functions: stack (int size) { stk_ptr = new int [size]; max_len = size - 1; top = -1; } stack stk(100); • class itself may be parameterized as a templated class • Java did not support generic abstract data types until generics were added in Java 5
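The two C++ options above (a parameterized constructor and a templated class) can be combined in one sketch. The class name `Stack` and its members are hypothetical. The element type is a template parameter, and the capacity is a constructor parameter, as in the slide's `stack (int size)` fragment.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical templated stack: T is the element type.
template <typename T>
class Stack {
public:
    // Parameterized constructor: the capacity is chosen per object.
    explicit Stack(std::size_t capacity)
        : data_(new T[capacity]), capacity_(capacity), count_(0) {}

    ~Stack() { delete[] data_; }

    Stack(const Stack&) = delete;
    Stack& operator=(const Stack&) = delete;

    // Returns false instead of overflowing when the stack is full.
    bool push(const T& value) {
        if (count_ == capacity_) return false;
        data_[count_++] = value;
        return true;
    }

    T pop() {
        assert(count_ > 0);  // popping an empty stack is a client error
        return data_[--count_];
    }

    bool empty() const { return count_ == 0; }

private:
    T* data_;
    std::size_t capacity_;
    std::size_t count_;
};
```

`Stack<int> a(100);` and `Stack<double> b(10);` are then two distinct instantiated types. Unlike Ada generics, no separate instantiation statement is needed: the compiler instantiates the template at the point of use.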
Object-Oriented Programming • The problem with Abstract Data Types is that they are static • can’t modify types or operations • except for generics/templates • means extra work to modify existing ADTs • Object-oriented programming (OOP) languages extend data abstraction ideas to • allow hierarchies of abstractions • make modifying existing abstractions for other uses very easy • Leads to new approach to programming • identify real world objects of problem domain and processing required of them • create simulations of those objects, processes, and the communication between them by modifying existing objects whenever possible
Object-Oriented Programming (continued) • Two approaches to designing OOPL • start from scratch (Smalltalk 1972!!) • allows cleaner design • better integration of object features • no installed base • modify an existing PL (C++, Ada 95) • can build on body of existing code • OO features usually not as smoothly integrated • backward compatibility carries over warts from the initial language design
OOPL Components • Object: encapsulated operations plus local variables that define an object’s state • state is retained between executions • objects send and receive messages • Messages: requests from sender to receiver to perform work • can be parameterized • in pure OOL are also objects • return results • Methods: descriptions of operations to be done when a message is received • Classes: templates for objects, with methods and state variables • objects are instantiations of classes • classes are also objects (have instantiation method to create new objects)
Fundamental Properties of OO Model • Abstract Data Types • encapsulation into a single syntactic unit that includes operations and variables • also information hiding capabilities • Inheritance • fundamental defining characteristic of OOPL • classes are hierarchical • subclass/superclass or parent/derived • classes lower in the hierarchy inherit variables and methods of ancestor classes • can redefine those, or add additional, or eliminate some • single inheritance (tree structure) or multiple inheritance (acyclic graph) • if single inheritance can talk about a root class
Fundamental Properties of OO Model (continued) • Polymorphism • special kind of dynamic binding • message to method • same message can be sent to different objects, and the object will respond properly • similar to function overloading except • overloading is static (known at compile time) • polymorphism is dynamic (class of object known at run time)
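The static/dynamic distinction above can be seen side by side in C++. In this sketch (all names hypothetical), `describe` is overloaded and resolved from the static type of its argument, while `name` is a virtual function resolved from the run-time class of the object.

```cpp
#include <cassert>
#include <string>

struct Shape {
    virtual ~Shape() = default;
    virtual std::string name() const { return "shape"; }
};

struct Circle : Shape {
    std::string name() const override { return "circle"; }
};

// Overloading: the compiler picks one of these from the declared (static) type.
std::string describe(const Shape&)  { return "overload for Shape"; }
std::string describe(const Circle&) { return "overload for Circle"; }

void demo() {
    Circle c;
    const Shape& s = c;  // static type Shape, dynamic type Circle

    assert(describe(s) == "overload for Shape");   // decided at compile time
    assert(describe(c) == "overload for Circle");
    assert(s.name() == "circle");                  // decided at run time
}
```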
Comparison with Data Abstraction • Class == generic package • Object == instantiation of generic • actually, closer to instance of exported type • Messages == calls to operations exported by ADT • Methods == bodies (code) for operations exported by ADT • EXCEPT • data abstraction mechanism allows only one level of generic/instantiation • OO model allows multiple levels of inheritance • no dynamic binding of method invocation in ADTs
OOP Language Design Issues • Exclusivity of objects • everything is an object • elegant and pure, but slow for primitive types • add objects to complete typing system • fast for primitive types, but confusing • include an imperative style typing system for primitive types, but everything else is an object • relatively fast, and less confusion • Are subclasses subtypes? • does an “is a” relationship hold between parent and child classes?
OOP Language Design Issues (continued) • Interface or implementation inheritance? • if only interface of parent class is visible to subclass, interface inheritance • may be inefficient • if interface and implementation visible to subclass, implementation inheritance • Type checking and polymorphism • if overriding methods must have the same parameter types and return type, checking may be static • Otherwise need dynamic type checking, which is slow and delays error detection • Single or multiple inheritance • multiple is extremely convenient • multiple also makes the language and implementation more complex, and is less efficient
OOP Language Design Issues (continued) • Allocation and deallocation of objects • if all objects are allocated from heap, references to them are uniform (as in Java) • is deallocation explicit (heap-dynamic objects in C++) or implicit (Java) • Should all binding of messages to methods be dynamic? • if yes, inefficient • if none are, great loss of flexibility
Smalltalk 80 • Smalltalk is the prototypical pure OOPL • All entities in a program are objects • referenced by pointers • All computation is done by sending messages (perhaps parameterized by object names) to objects • message invokes a method • reply returns result to sender, or notifies that action has been done • Also incorporates graphical programming environment • program editor • compiler • class library browser • with associated classes • also written in Smalltalk • can be modified
Smalltalk 80 (continued) • Messages • object to receive message • message • method to invoke • possibly parameters • Unary messages • specify only object and method • firstAngle sin • invokes sin method of firstAngle object • Binary messages • infix order • total / 100 • sends message / 100 to object total • which invokes / method of total with parameter 100
Smalltalk 80 (continued) • Keyword messages • indicate parameter values by specifying keywords • keywords also identify the method • firstArray at: 1 put: 5 • invokes at:put: method of firstArray with parameters 1 and 5 • Message expressions • messages may be combined in expressions • unary have highest precedence, then binary, then keyword • associate left to right • order may be specified by parentheses • messages may be cascaded • ourPen home; up; goto: 500@500 • equivalent to ourPen home. ourPen up. ourPen goto: 500@500
Smalltalk 80 (continued) • Assignment • object <- object • index <- index + 5 • Blocks • unnamed objects specified by [ <expressions> ] • expressions are separated by . • evaluated when they are sent the value message • always in the context of their definition • may be assigned to variables • foo <- [ ... ] • Logical loops • blocks may contain conditions • all blocks have whileTrue methods • sends value to condition block • evaluates body block if result is true [ <logical condition> ] whileTrue: [ <body of loop> ]
Smalltalk 80 (continued) • Iterative loops • all integer objects have a timesRepeat: method • also have • to:do: • to:by:do: • a block is the loop body • Selection • true and false are also objects • each has ifTrue:, ifFalse:, ifTrue:ifFalse:, and ifFalse:ifTrue: methods 12 timesRepeat: [ ... ] 6 to: 10 do: [ ... ] total = 0 "returns true or false object" ifTrue: [ ... ] "true object executes this; false ignores" ifFalse: [ ... ] "false object executes this; true ignores"
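To show how control structures can be built from messages and blocks alone, here is a rough C++ imitation of the Smalltalk idea. It is only an analogy: `Block`, `Boolean`, `asObject`, `ifTrueIfFalse`, `timesRepeat`, and `whileTrue` are hypothetical C++ names standing in for Smalltalk blocks and the `ifTrue:ifFalse:`, `timesRepeat:`, and `whileTrue:` messages.

```cpp
#include <cassert>
#include <functional>

using Block = std::function<void()>;  // stand-in for a Smalltalk block

// true and false are objects; each decides for itself which block to evaluate.
struct Boolean {
    virtual ~Boolean() = default;
    virtual void ifTrueIfFalse(const Block& whenTrue, const Block& whenFalse) const = 0;
};

struct True : Boolean {
    void ifTrueIfFalse(const Block& whenTrue, const Block&) const override { whenTrue(); }
};

struct False : Boolean {
    void ifTrueIfFalse(const Block&, const Block& whenFalse) const override { whenFalse(); }
};

// Converts a C++ bool into one of the two Boolean objects.
const Boolean& asObject(bool b) {
    static const True yes{};
    static const False no{};
    if (b) return yes;
    return no;
}

// Analogue of "n timesRepeat: [ ... ]".
void timesRepeat(int n, const Block& body) {
    assert(n >= 0);
    for (int i = 0; i < n; ++i) body();
}

// Analogue of "[ condition ] whileTrue: [ body ]".
void whileTrue(const std::function<bool()>& condition, const Block& body) {
    while (condition()) body();
}
```

Selection needs no `if` at the call site: `asObject(total == 0).ifTrueIfFalse(...)` sends one message, and dynamic binding picks the block that runs. The same mechanism accounts for the message-passing overhead noted in the Smalltalk evaluation.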
Smalltalk 80 (continued) • Dynamic binding • when a message arrives at an object, the class of which the object is an instance is searched for a corresponding method • if not there, search superclass, etc. • Only single inheritance • every class is an offspring of the root class Object • Evaluation • simple, consistent syntax • relatively slow • message passing overhead for all control constructs • dynamic binding of message to method • dynamic binding allows type errors to be detected only at run-time
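The search described above (look in the receiver's class, then in each superclass in turn) can be sketched as a walk along a chain of method dictionaries. This is an illustration under stated assumptions, not how any particular Smalltalk system is implemented: `ClassDesc`, `send`, and `demo_lookup` are hypothetical, methods take no arguments and return strings, and the `doesNotUnderstand: ` result is a stand-in for Smalltalk's error handling.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <string>

// Hypothetical run-time description of a class: a method dictionary plus a
// link to the superclass (nullptr for the root class).
struct ClassDesc {
    std::string name;
    const ClassDesc* superclass;
    std::map<std::string, std::function<std::string()>> methods;
};

// Binds a message to a method at run time by searching up the class chain.
std::string send(const ClassDesc& receiverClass, const std::string& selector) {
    assert(!selector.empty());
    for (const ClassDesc* c = &receiverClass; c != nullptr; c = c->superclass) {
        auto found = c->methods.find(selector);
        if (found != c->methods.end()) {
            return found->second();  // first match wins, so subclasses override
        }
    }
    return "doesNotUnderstand: " + selector;  // no class in the chain has it
}

// Builds a three-level single-inheritance chain and sends one message.
std::string demo_lookup(const std::string& selector) {
    ClassDesc object{"Object", nullptr, {}};
    object.methods["printString"] = [] { return std::string("an Object"); };

    ClassDesc number{"Number", &object, {}};
    number.methods["abs"] = [] { return std::string("Number>>abs"); };

    ClassDesc integer{"Integer", &number, {}};
    integer.methods["printString"] = [] { return std::string("an Integer"); };

    return send(integer, selector);
}
```

A misspelled selector is only discovered when the search falls off the root class, which is the run-time type error mentioned in the evaluation.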
C++ • Essentially all of variable declaration, types, and control structures are those of C • C++ classes represent an addition to type structure of C • Inheritance • multiple inheritance allowed • classes may be stand-alone • three information hiding modes • public: everyone may access • private: no one else may access • protected: class and subclasses may access • when deriving a class from a base class, specify a protection mode • public mode: public, protected, and private are retained in subclass • private mode: everything in base class is private • may reexport public members of base class
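The derivation modes above, sketched with hypothetical class names. `PublicChild` keeps the base class's public interface visible to clients. `PrivateChild` hides it and then re-exports one member with a using-declaration.

```cpp
#include <cassert>

class Base {
public:
    int pub() const { return 1; }    // everyone may access
protected:
    int prot() const { return 2; }   // class and subclasses may access
private:
    int priv() const { return 3; }   // no one else may access
};

// Public derivation: pub() stays public and prot() stays protected.
class PublicChild : public Base {
public:
    int sum() const { return pub() + prot(); }  // priv() is not accessible here
};

// Private derivation: inherited members become private in PrivateChild.
class PrivateChild : private Base {
public:
    using Base::pub;  // re-export one public member of the base class
    int sum() const { return pub() + prot(); }
};

void demo() {
    PublicChild a;
    PrivateChild b;
    assert(a.pub() == 1);
    assert(b.pub() == 1);   // legal only because of the using-declaration
    assert(a.sum() == 3);
    assert(b.sum() == 3);
}
```

Without the using-declaration, `b.pub()` does not compile. A `PrivateChild` also cannot be passed where a `Base&` is expected, so private derivation does not create a subtype.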
C++ (continued) • Dynamic binding • C++ member functions are statically bound unless the function definition is identified as virtual • if virtual function name is called with a pointer or reference variable with the base class type, which member function to execute must be determined at run-time • pure virtual functions are set to 0 in class header • must be redefined in derived classes • classes containing a pure virtual function can never be instantiated directly • must be derived
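A sketch of the binding rules above, using hypothetical classes. `area` is pure virtual and therefore dynamically bound. `kind` is an ordinary member function, so a call through a base-class reference is bound statically to the base version.

```cpp
#include <cassert>
#include <string>
#include <type_traits>

class Shape {
public:
    virtual ~Shape() = default;
    virtual double area() const = 0;              // pure virtual: set to 0 in the class header
    std::string kind() const { return "shape"; }  // not virtual: statically bound
};

class Square : public Shape {
public:
    explicit Square(double side) : side_(side) {}
    double area() const override { return side_ * side_; }
    std::string kind() const { return "square"; }  // hides Shape::kind, does not override it
private:
    double side_;
};

// A class with a pure virtual function can never be instantiated directly.
static_assert(std::is_abstract<Shape>::value, "Shape is abstract");

double area_of(const Shape& s) { return s.area(); }       // resolved at run time
std::string kind_of(const Shape& s) { return s.kind(); }  // resolved at compile time

void demo() {
    Square sq(3.0);
    assert(area_of(sq) == 9.0);       // Square::area, chosen dynamically
    assert(kind_of(sq) == "shape");   // Shape::kind, chosen from the static type
    assert(sq.kind() == "square");
}
```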
Java • General characteristics • all data are objects except the primitive types • all primitive types have wrapper classes that store one data value • all objects are heap-dynamic, referenced through reference variables, and most are explicitly allocated • Inheritance • single inheritance only • but implementing interface can provide some of the benefits of multiple inheritance • an interface can include only method declarations and named constants • methods can be final (can’t be overridden) public class Clock extends Applet implements Runnable
Java (continued) • Dynamic binding is the default • except for final methods • Package provides additional encapsulation mechanism • packages are a container for related classes • entries defined without access modifier (private, protected, public) have package scope • visible throughout package but not outside • similarly, protected entries are visible throughout package
Ada 95 • Type extension builds on derived types with tagged types • tag associated with type identifies particular type • Classes are packages with tagged types package Object_Package is type Object is tagged private; procedure Draw (O: in out Object); private type Object is tagged record X_Coord, Y_Coord: Real; end record; end Object_Package;
Ada 95 (continued) • Then may derive a new class by using new reserved word and modifying tagged type exported • Overriding inherited operations defines new methods with Object_Package; use Object_Package; package Circle_Package is type Circle is new Object with record radius: Real; end record; procedure Draw (C: in out Circle); end Circle_Package;
Ada 95 (continued) • Derived packages form tree of classes • Can refer to type and all types beneath it in tree by type'Class • Object'Class • Square'Class • Then use these as parameters to procedures to provide dynamic binding of procedure invocation [Figure: class tree with nodes Object, Circle, Square, Ellipse, Rectangle] procedure foo (OC: Object'Class) is A: Real; begin A := Area(OC); -- which Area is determined at run time end foo;
Ada 95 (continued) • Pure abstract base types are defined using the word abstract in type and subprogram definitions package World is type Thing is abstract tagged null record; function Area(T: in Thing) return Real is abstract; end World; with World; use World; package My_World is type Object is new Thing with record ... end record; function Area(O: in Object) return Real is ... end Area; type Circle is new Object with record ... end record; function Area(C: in Circle) return Real is ... end Area; end My_World;
Comparing C++ and Smalltalk • Inheritance • C++ provides greater flexibility of access control • C++ provides multiple inheritance • good or bad? • Dynamic vs. static binding • Smalltalk full dynamic binding with great flexibility • C++ allows programmer to control binding time • virtual functions, which all must return same type • Control • Smalltalk does everything through message passing • C++ provides conventional control structures
Comparing C++ and Smalltalk (continued) • Classes as types • C++ classes are types • all instances of a class are the same type, and one can legally access the instance variables of another • Smalltalk classes are not types, and the language is essentially typeless • C++ provides static type checking, Smalltalk does not • Efficiency • C++ substantially more efficient with run-time CPU and memory requirements • Elegance • Smalltalk is consistent, fundamentally object-oriented • C++ is a hybrid language in which compatibility with C was an essential design consideration
Comparing C++ and Ada 95 • Ada 95 has more consistent type mechanism • C++ has C type structure, plus classes • C++ provides cleaner multiple inheritance • C++ must make dynamic/static function invocation decision at time root class is defined • must be virtual function • Ada 95 allows that decision to be made at time derived class is defined • C++ allows dynamic binding only for pointers and reference types • Ada 95 doesn’t provide constructor and destructor functions • initialization and cleanup must be explicitly invoked
Comparing C++ and Java • Java more consistent with OO model • all classes must descend from Object • No friend mechanism in Java • packages provide cleaner alternative • Dynamic binding “normal” way of binding messages to methods • Java allows single inheritance only • but interfaces provide some of the same capability as multiple inheritance
Implementing OO Constructs • Store state of an object in a class instance record • template known at compile time • access instance variables by offset • subclass instantiates CIR from parent before populating local instance variables • CIR also provides a mechanism for accessing code for dynamically bound methods • CIR points to table (virtual method table) which contains pointers to code for each dynamically bound method
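The class instance record (CIR) and virtual method table described above can be imitated by hand with plain structs and function pointers. This is a teaching sketch, not what any particular compiler emits: all names (`ShapeCir`, `SquareCir`, `VTable`, `call_area`, and so on) are hypothetical.

```cpp
#include <cassert>

struct ShapeCir;  // class instance record (CIR) for the parent class

// Virtual method table: one code pointer per dynamically bound method.
struct VTable {
    double (*area)(const ShapeCir* self);
};

// Parent CIR: pointer to the table, then the parent's instance variables.
// The layout is fixed at compile time, so fields are reached by offset.
struct ShapeCir {
    const VTable* vtable;
    double x;
    double y;
};

// Subclass CIR: the parent's record comes first, then the local variables.
struct SquareCir {
    ShapeCir base;
    double side;
};

double shape_area(const ShapeCir*) { return 0.0; }

double square_area(const ShapeCir* self) {
    // Valid because base is the first member of the standard-layout SquareCir.
    const SquareCir* sq = reinterpret_cast<const SquareCir*>(self);
    return sq->side * sq->side;
}

const VTable shape_vtable = {shape_area};
const VTable square_vtable = {square_area};

// Instantiation: fill in the parent part, then the local instance variables.
SquareCir make_square(double x, double y, double side) {
    SquareCir sq;
    sq.base.vtable = &square_vtable;
    sq.base.x = x;
    sq.base.y = y;
    sq.side = side;
    return sq;
}

// A dynamically bound call: follow the CIR to the table, then to the code.
double call_area(const ShapeCir* object) {
    assert(object != nullptr && object->vtable != nullptr);
    return object->vtable->area(object);
}
```

`call_area` knows nothing about squares: two pointer dereferences select the code, and this indirection is the cost of dynamic binding relative to a statically bound call.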