Learn about data abstraction in software engineering, its benefits, implementation, and advantages/disadvantages, using case studies and examples from University of Virginia's Computer Science program. Explore specifying abstract data types and maintaining programs efficiently.
Data Abstraction CS 201j: Engineering Software University of Virginia Computer Science Nathanael Paul nate@virginia.edu
Overview • Data abstraction • Specification/Design of Abstract Data Types (ADTs) • Implementation of ADTs
The Problem • Programs are complex. • Windows XP: ~45 million lines of code • Mathematica: over 1.5 million • Abstraction helps • Many-to-one – “forget the details” • Must separate “what” from “how”
Information Hiding • Modularity - Procedural abstraction • By specification • Locality • Modifiability • By parameterization • Data Abstraction • What you can do with the data is separated from how it is represented
Software development cycle • Specifications – What do you want to do? • Design – How will you do what you want? • Implement – Code it. • Test – Check if it works. • Maintain – School projects don’t usually make it this far. Bugs are cheaper earlier in the cycle!
Database Implementation • Database on library web-server stores information on users: userID, name, email, etc. • You are responsible for implementing the interface between the web-server and database • What happens when we ask for the email address for a specific user?
Client asks for email address • Client to Server: “What is email address of nate?” [Figure: Client, Server, Database]
Client/Server/Database Interaction • Server to Database: “I need Nate’s email.” • The interaction between the server and database is your part. [Figure: Client, Server, Database]
Client/Server/Database Interaction • nate@virginia.edu is returned. [Figure: Client, Server, Database]
Example: Database System • Need a new data type • Abstract Data Types (ADTs) • Help separate what from how • Client will use the specifications for interaction with data • Client of the web database should not know the “guts” of the implementation
Data abstraction in Java • An ADT is defined by a class • The ADT in the web/database application will be a User • A private instance variable hides the class internals • public String getEmail (); • What is private in the implementation? • OVERVIEW, EFFECTS, MODIFIES • A class does not provide data abstraction by itself
Accessibility
class User {
    // OVERVIEW: A mutable object where the User is a library member.
    public String email;
    …
}

/* Client code using a User object, myUser */
String nateEmail = myUser.email;
sendEmail(nateEmail);

/* The client’s code can only see what is made public in the User class. The user’s email data is public in the User class. This is BAD. */
Program Maintenance • Suppose storage space is at a premium • Everyone in the database is userid@virginia.edu, so we can drop the virginia.edu: nate@virginia.edu becomes nate • What kind of problems will occur with the code just seen?
Program Maintenance • Suppose storage space is at a premium • Everyone in the database is userid@virginia.edu, so we can drop the virginia.edu: nate@virginia.edu becomes nate • What kind of problems could occur had the client code been able to access the email address directly? Email was public in User class. String nateEmail = myUser.email; sendEmail(nateEmail); ***ERROR!!!***
Accessibility (fixed)
class User {
    // OVERVIEW: A mutable object where the User is a library member.
    private String email;
    …
    public String getEmail() {
        // EFFECTS: returns user’s primary email
        return email;
    }
}

// Client code using a User object, myUser
String nateEmail = myUser.getEmail();
sendEmail(nateEmail);

/* This code properly uses data abstraction when returning the full email address. */
Accessibility (fixed)
class User {
    // OVERVIEW: A mutable object where the User is a library member.
    private String email;
    …
    public String getEmail() {
        // EFFECTS: returns user’s primary email
        return email + "@virginia.edu";
    }
}

// Client code using a User object, myUser
String nateEmail = myUser.getEmail();
sendEmail(nateEmail);

/* The database dropped the @virginia.edu, and only one line of code needed changing. */
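The fixed User class can be exercised end to end. The following is a minimal sketch, not the course's actual code: the constructor User(String userID), the DOMAIN constant, getUserID(), and the EmailDemo class are assumptions added for illustration. The representation stores only the user ID, and getEmail() rebuilds the full address, so the client code is the same before and after the storage change.

```java
// Minimal sketch: the representation stores only the user ID.
// The constructor, DOMAIN constant, and EmailDemo are hypothetical.
class User {
    private static final String DOMAIN = "@virginia.edu";
    private final String userID; // e.g. "nate"; the domain is not stored

    User(String userID) {
        this.userID = userID;
    }

    public String getUserID() {
        // EFFECTS: returns the user's ID
        return userID;
    }

    public String getEmail() {
        // EFFECTS: returns the user's primary email address
        return userID + DOMAIN;
    }
}

class EmailDemo {
    public static void main(String[] args) {
        User myUser = new User("nate");
        // Client code only calls getEmail(); it never sees the stored form.
        String nateEmail = myUser.getEmail();
        System.out.println(nateEmail);
    }
}
```

Because the field is private, the compiler rejects any client that tries to read the stored form directly, which is what keeps the change local to User.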
Advantages/Disadvantages of Data Abstraction? - More code to write and maintain initially - Overhead of calling a method - Greater initial time investment + Client doesn’t need to know about representation + Maintenance is easier + Increases locality and modifiability
Bad Users at the Library • The library now wants to crack down on bad Users with overdue books, so the code will need to work with a group of Users. • What should be used to represent the group? What data structures do we know about? How should we integrate this code with what we have? • What operations should be supported? • deleteUser(String userID); • isInGroup(String userID);
Library keeping track of “bad” people • You need to write some code that will manipulate a group of Users that are on the “bad” list. • The implementation below uses an array
class GroupUsers {
    // OVERVIEW: Operations provided to manage a mutable group of users
    private User[] latePeople;
    …
    public String toString() {
        // EFFECTS: Returns the user names as a string, ready to print to standard output
        …
    }
}
Array implementation initialization for GroupUsers
class GroupUsers {
    // OVERVIEW: Unbounded, mutable group of Users
    private User[] latePeople;
    …
    public GroupUsers(String[] userIDs) {
        // EFFECTS: Initializes this group from userIDs
        // (a constructor has no return type, so no "void" here)
        latePeople = new User[userIDs.length + 10];
        for (int i = 0; i < userIDs.length; i++) {
            latePeople[i] = new User(userIDs[i]);
        }
    }
}
ADT design • Mutable/Immutable ADTs • Mutable – object’s fields or values change • Immutable – object’s fields permanently set at creation • Is this being modified? • Tradeoffs • Immutability simpler and safer • Immutability is slower (creation/deletion of objects)
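The mutable/immutable tradeoff can be sketched with two small classes. This is an illustrative assumption, not code from the course: the names MutableUser, ImmutableUser, setEmail, and withEmail are hypothetical. The mutable version changes the object in place; the immutable version sets its field once and creates a new object for every "change", which is where the extra creation cost comes from.

```java
// Hypothetical classes for illustration only.

// Mutable: setEmail changes this object in place.
class MutableUser {
    private String email;

    MutableUser(String email) {
        this.email = email;
    }

    public String getEmail() {
        return email;
    }

    public void setEmail(String newEmail) {
        // MODIFIES: this
        // EFFECTS: this user's email becomes newEmail
        this.email = newEmail;
    }
}

// Immutable: the field is final and permanently set at creation.
final class ImmutableUser {
    private final String email;

    ImmutableUser(String email) {
        this.email = email;
    }

    public String getEmail() {
        return email;
    }

    public ImmutableUser withEmail(String newEmail) {
        // EFFECTS: returns a new user with newEmail; this is unchanged
        return new ImmutableUser(newEmail);
    }
}
```

An ImmutableUser can be shared freely because no caller can change it, which is the sense in which immutability is simpler and safer.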
Classification of ADT operations • Creator (constructor) • GroupUsers(String userIDs[ ]) • Producer • addUser(String userID) • Mutator • setUserEmail(String email) • Observer • isMember (String userID)
A bad implementation • Most common characteristics • Modifying implementation forces other code to be changed (violates modifiability) • Must understand more code than necessary to reason about code (violates locality) • Maintenance is difficult
A good implementation • User class needed a way to store state of a user, so operations will build around the stored state. • Methods should be (procedure abstraction): • As easy to code as possible • Efficient • Exhibit locality • Should enable better testing, maintenance
Changing the group implementation • The “guts” of the implementation are subject to change. • What happens on GroupUsers’ deleteUser(String userID)?
deleteUser(String userID) • The array must shift down an average of n/2 items when deleting an element [Figure: an array of <user> slots with the deleted element marked X]
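The shifting cost is easiest to see in code. The following is a minimal sketch of an array-backed group under stated assumptions: the class name ArrayGroup and the helpers indexOf and size are hypothetical, and it stores user IDs as plain strings instead of User objects to stay short. Deleting the element at position pos moves every later element down one slot, so a deletion at a random position moves about n/2 elements on average.

```java
// Sketch of an array-backed group; ArrayGroup, indexOf and size are
// hypothetical names, and IDs are stored directly instead of User objects.
class ArrayGroup {
    private String[] latePeople;
    private int numUsers;

    ArrayGroup(String[] userIDs) {
        // Leave 10 spare slots, mirroring the array constructor shown earlier.
        latePeople = new String[userIDs.length + 10];
        for (int i = 0; i < userIDs.length; i++) {
            latePeople[i] = userIDs[i];
        }
        numUsers = userIDs.length;
    }

    public boolean isInGroup(String userID) {
        // EFFECTS: returns true if userID is in this group
        return indexOf(userID) >= 0;
    }

    public void deleteUser(String userID) {
        // MODIFIES: this
        // EFFECTS: removes userID from this if present
        int pos = indexOf(userID);
        if (pos < 0) {
            return;
        }
        // Shift every later element down one slot to close the gap.
        for (int i = pos; i < numUsers - 1; i++) {
            latePeople[i] = latePeople[i + 1];
        }
        latePeople[numUsers - 1] = null;
        numUsers--;
    }

    public int size() {
        return numUsers;
    }

    private int indexOf(String userID) {
        for (int i = 0; i < numUsers; i++) {
            if (latePeople[i].equals(userID)) {
                return i;
            }
        }
        return -1;
    }
}
```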
Linked Lists: A new data structure • Each User has its own representation, but we store the collection in a list. • In the following implementation, each user object is contained in a Node object. [Figure: Head → User 1 → User 2 → User 3 → X]
List-node implementation
class Node {
    // OVERVIEW: A mutable node used in a linked list of users
    private User theUser;
    private Node next; // next points to the next “bad” user
    …
}
[Figure: latePeople → User 1 → User 2 → …]
List implementation
class GroupUsers {
    // OVERVIEW: Mutable, unbounded group of users
    private Node latePeople; /* head of list */
    private int numUsers;
    …
}
/* Nodes are users with an additional member field called next. The Node class was added, so the User class would not need modification. */
Adding a user into GroupUsers
/* in GroupUsers.java */
public void addUser(User newUser) {
    // MODIFIES: this
    // EFFECTS: this_post = this_pre U { newUser }
    // Note: this assumes the head node latePeople already exists (is not null).
    latePeople.add(new Node(newUser));
    numUsers++;
}
Adding a node into a group of nodes (Node.java)
public void add(Node n) {
    // MODIFIES: this
    // EFFECTS: n is inserted just after this in the list
    // first user in list?
    if (this.next == null) {
        this.next = n;
    } else {
        n.next = this.next;
        this.next = n;
    }
}
deleteUser(String userID) cont. [Figure: the list Head → User 1 → User 2 → User 3 before deletion, and Head → User 1 → User 3 after User 2 is deleted]
deleteUser(String userID) (Node.java)
public void delete(String userID) {
    // MODIFIES: this
    // EFFECTS: this_post = this_pre – node
    //          where node.userID = userID
    Node currNode;
    Node prevNode;
    if (this.next == null) return;
    prevNode = this;
    currNode = this.next;
    while (currNode.next != null) {
        if (userID.equals(currNode.getUserID())) {
            // unlink currNode by pointing its predecessor past it
            prevNode.next = currNode.next;
            break;
        }
        currNode = currNode.next;
        prevNode = prevNode.next;
    }
    // user at end of list?
    if (currNode.next == null && userID.equals(currNode.getUserID())) {
        prevNode.next = null;
    }
}
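The list pieces can be assembled into one runnable sketch. Several details here are assumptions not stated in the slides: the head is treated as a sentinel node that holds no user, delete returns a boolean so that GroupUsers can keep numUsers accurate, and contains, isInGroup, size, and the stand-in User class are hypothetical additions. Looping on the current node instead of its next field also removes the separate end-of-list check.

```java
// Minimal stand-in for the User class; only the ID is needed here.
class User {
    private final String userID;

    User(String userID) {
        this.userID = userID;
    }

    public String getUserID() {
        return userID;
    }
}

class Node {
    private final User theUser; // null for the sentinel head (an assumption)
    private Node next;

    Node(User theUser) {
        this.theUser = theUser;
    }

    public void add(Node n) {
        // MODIFIES: this, n
        // EFFECTS: n is inserted just after this in the list
        n.next = this.next;
        this.next = n;
    }

    public boolean delete(String userID) {
        // MODIFIES: this
        // EFFECTS: unlinks the first node after this whose user has userID;
        //          returns true if a node was removed
        Node prev = this;
        Node curr = this.next;
        while (curr != null) {
            if (userID.equals(curr.theUser.getUserID())) {
                prev.next = curr.next;
                return true;
            }
            prev = curr;
            curr = curr.next;
        }
        return false;
    }

    public boolean contains(String userID) {
        // EFFECTS: returns true if some node after this holds userID
        for (Node curr = this.next; curr != null; curr = curr.next) {
            if (userID.equals(curr.theUser.getUserID())) {
                return true;
            }
        }
        return false;
    }
}

class GroupUsers {
    // OVERVIEW: Mutable, unbounded group of users
    private final Node latePeople = new Node(null); // sentinel head
    private int numUsers;

    public void addUser(User newUser) {
        // MODIFIES: this
        // EFFECTS: this_post = this_pre U { newUser }
        latePeople.add(new Node(newUser));
        numUsers++;
    }

    public void deleteUser(String userID) {
        // MODIFIES: this
        // EFFECTS: removes the user with userID from this if present
        if (latePeople.delete(userID)) {
            numUsers--;
        }
    }

    public boolean isInGroup(String userID) {
        return latePeople.contains(userID);
    }

    public int size() {
        return numUsers;
    }
}
```

Deleting from the list only redirects one next reference, so no elements are shifted, in contrast to the array version.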
Linked List vs. Array • Array is better for: • Accessing a randomly desired element • Linked list is better at: • Inserting • Deleting • Dynamic resizing • Users of your implementation may need to use a list or an array for efficiency, so you need an implementation that can be changed easily.
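One way to keep that choice easy to change is to hide the container behind the ADT's operations. The sketch below is an assumption for illustration, not the course's code: the class name LateList is hypothetical, it stores user IDs as strings, and it uses the standard java.util.List interface where the slides hand-build the array and list. Switching between an array-backed and a linked representation then touches a single line, and clients see no difference.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical ADT: clients use only the public operations below.
class LateList {
    // Representation choice lives on this one line. Replacing
    // ArrayList with java.util.LinkedList requires no client changes.
    private final List<String> latePeople = new ArrayList<>();

    public void addUser(String userID) {
        // MODIFIES: this
        // EFFECTS: adds userID to this if it is not already present
        if (!latePeople.contains(userID)) {
            latePeople.add(userID);
        }
    }

    public void deleteUser(String userID) {
        // MODIFIES: this
        // EFFECTS: removes userID from this if present
        latePeople.remove(userID);
    }

    public boolean isInGroup(String userID) {
        // EFFECTS: returns true if userID is in this group
        return latePeople.contains(userID);
    }

    public int size() {
        return latePeople.size();
    }
}
```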