CS 240: Data Structures

Monday, June 16th • Lists: array-based, link-based • Dynamic vs. static

What is a primitive? • A primitive is a built-in data type. • Generally, it has complete functionality. • They have a value and are located in memory. • During compilation, they have a name. • Only a memory address is used while running. Primitive Type: int Value (32 bits): ? Address: ? Primitive Type: char Value (8 bits): ? Address: ?

Data Manipulation • Therefore, we use the address of the data to locate the value we want. • How about an array of ints? • Ok, so if we know where the array starts we can find successive data.
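The idea that successive array elements can be found from the starting address can be sketched in C++. This is a minimal illustration, not part of the original slides; the helper name byte_offset is hypothetical.

```cpp
#include <cstddef>

// Hypothetical helper: how many bytes lie between element 0 and element i
// of an int array that starts at `base`.
std::ptrdiff_t byte_offset(const int* base, std::size_t i) {
    const char* start = reinterpret_cast<const char*>(base);
    const char* elem  = reinterpret_cast<const char*>(base + i);
    return elem - start;  // equals i * sizeof(int)
}
```

With int a[3] = {5, 10, 15};, byte_offset(a, 2) equals 2 * sizeof(int), and *(a + 1) reads the second value (10) using only the start address.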

Data Manipulation • An array is a form of data abstraction. class array { T value; memory_address next_data; memory_address own_location; }; Here next_data is always equal to the size of T. This is done automatically when we use the built-in array implementation (this abstraction only works in theory, since the next location would be offset by the other class members). If you created an array of this type (and set next_data to the size of array instead of T), it would mirror the real implementation.

Data Manipulation • These abstractions allow us to group data together so that when we access data we can get more than 1 piece of information. Address: ? ADT: array (int) 96 bits (32 bits) -> int (32 bits) -> 12 (32 bits) -> *this Primitive Type: int Value (32 bits): ? Address: ?

array testdata[3]; Address: X Address: X + 12 Address: X + 24 ADT: array (int) 96 bits (32 bits) -> int (5) (32 bits) -> 12 (32 bits) -> *this (X) ADT: array (int) 96 bits (32 bits) -> int (10) (32 bits) -> 12 (32 bits) -> *this (X+12) ADT: array (int) 96 bits (32 bits) -> int (15) (32 bits) -> 12 (32 bits) -> *this (X+24) This actually requires array to have a constructor to set: next_data = sizeof(array); own_location = this; Therefore: &testdata[0] == testdata[0].own_location; &testdata[1] == testdata[1].own_location; and &testdata[1] == testdata[0].own_location + testdata[0].next_data (counting in bytes). This is a representation of how it is actually done in memory.

Data Manipulation • You wouldn’t implement an array like we did in the last slide. However, by using it slightly differently we can achieve a different goal.

array testdata[3]; Address: X Address: X + 12 Address: X + 24 ADT: array (int) 96 bits (32 bits) -> int (5) (32 bits) -> 24 (32 bits) -> *this (X) ADT: array (int) 96 bits (32 bits) -> int (10) (32 bits) -> 12 (32 bits) -> *this (X+12) ADT: array (int) 96 bits (32 bits) -> int (15) (32 bits) -> -12 (32 bits) -> *this (X+24) If we use [ ], we will get all the data in the same order as before: 5 10 15 However, if we use next_data: 5 15 10 We now have a new representation – an array-based linked list
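The array-based linked list above can be sketched as follows. This is an illustration under stated assumptions: offsets are counted in elements rather than bytes, and an offset of 0 is assumed to mark the end of the list (the slide does not define a terminator). Cell and follow_offsets are hypothetical names.

```cpp
#include <vector>

// One array slot: a value plus the distance (in elements) to the next slot.
struct Cell {
    int value;
    int next_offset;  // assumption: 0 means "no next element"
};

// Visit the cells in linked order, starting at index `start`.
std::vector<int> follow_offsets(const Cell* cells, int start) {
    std::vector<int> visited;
    int i = start;
    while (true) {
        visited.push_back(cells[i].value);
        if (cells[i].next_offset == 0) break;  // end of the list
        i += cells[i].next_offset;             // jump forward or backward
    }
    return visited;
}
```

For Cell t[3] = {{5, 2}, {10, 0}, {15, -1}};, indexing with [ ] still reads 5 10 15, while follow_offsets(t, 0) visits 5 15 10.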

We don’t need an array… Address: X Address: Y Address: Z ADT: array (int) 96 bits (32 bits) -> int (5) (32 bits) -> Z-X (32 bits) -> *this (X) ADT: array (int) 96 bits (32 bits) -> int (10) (32 bits) -> -Y (32 bits) -> *this (Y) ADT: array (int) 96 bits (32 bits) -> int (15) (32 bits) -> Y-Z (32 bits) -> *this (Z) In our assignments we won’t use “own_location”. We always know where the first item is and can find the remaining items by using next_data.

Without own_location Address: X Address: Y Address: Z ADT: array (int) 96 bits (32 bits) -> int (5) (32 bits) -> Z ADT: array (int) 96 bits (32 bits) -> int (10) (32 bits) -> 0 ADT: array (int) 96 bits (32 bits) -> int (15) (32 bits) -> Y If our first item (5) is at address X, then we find the second item (15) by using next_data, and the third item (10) by using the next_data of the second. Since third_item.next_data == 0, we are done.
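The same walk can be written with real pointers. In this sketch, nullptr plays the role of the address 0 on the slide; IntNode and walk are hypothetical names used only for illustration.

```cpp
#include <vector>

// A value plus the address of the next item (nullptr = address 0 = "done").
struct IntNode {
    int value;
    IntNode* next;
};

// Start at the first item and keep following `next` until it is null.
std::vector<int> walk(const IntNode* first) {
    std::vector<int> seen;
    for (const IntNode* p = first; p != nullptr; p = p->next) {
        seen.push_back(p->value);
    }
    return seen;
}
```

With x holding 5 and pointing to z (15), and z pointing to y (10, next is nullptr), walk(&x) visits 5, 15, 10.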

Lists • Here is a “random access” list, in array form

Lists • Here is a “traditional” list, in array form • Wait…. Where do we start? • We have to maintain that data separately. • We usually refer to it as “first” • first = i+20

Implications • How is the data being accessed? • ??? • These are perfect candidates to use with pointers.

What is a list? • A list is a container class • Like an array • However… • The list includes: • 1) The address of the first piece of data • 2) The data • 3) For each data, the address of the next piece.

List Data Since each piece of data is attached to the location of the next piece, we can represent them as an ADT. We generally refer to this as a Node: class Node { T thedata; //Remember, T can be any type memory_address next_node; }; Ok, Node has our data. What about “memory_address”? Well, Node tells us where the next Node is…. Therefore, memory_address is really a Node pointer, and the member becomes: Node * next_node;

Floating Nodes • Now we have a node (instead of separate data): Address: ? Address: ? Type: Unknown (T) Value (X bits): ? ADT: Node Size: X+32 bits thedata: (X bits) -> T next_node (32 bits) -> Node * Address: ? Type: Node * Value (32 bits): ? Bonus: Instead of having to keep track of two pieces of data, we keep track of one! If we know where the node is, we know where the data and next_node pointer are!

Floating Nodes • Ok, we can create a node now. • First, we should decide what kind of data node will hold. • Later, we will make it possible for node to hold anything (a couple of weeks from now). Address: 0xABCD ADT: Node Size: X+32 bits thedata: (X bits) -> T next_node (32 bits) -> Node *

Floating Nodes • Strings sound good. • For ease, we will directly access the elements of Node in these slides. We can also write methods to place the data and change the pointers. Address: 0xABCD ADT: Node Size: X+32 bits thedata: (32 bits) -> String next_node (32 bits) -> Node *

Floating Nodes class Node { public: string thedata; Node * next_node; }; Address: 0xABCD ADT: Node Size: X+32 bits thedata: (32 bits) -> String next_node (32 bits) -> Node *

Creating a Node • Remember, we need to know where the first node in our list is. • Therefore: “Node * first = new Node();” • Remember to delete it when you are done! first - Address: 0x50F4 ADT: Node Size: X+32 bits thedata: (32 bits) -> String next_node (32 bits) -> Node *

Storing Data • For some string we’ll call “userinput” with value “apple” • first->thedata = userinput; • first->next_node = NULL; first - Address: 0x50F4 ADT: Node Size: X+32 bits thedata: (32 bits) -> String next_node (32 bits) -> Node * ADT: Node Size: X+32 bits thedata: (32 bits) -> String “apple” next_node (32 bits) -> Node * NULL

Adding Data • How do we add data? • Well, we have to find an empty location in our list. • Access the list and search for an empty location (a next_node that is NULL)! first - Address: 0x50F4 ADT: Node Size: X+32 bits thedata: (32 bits) -> String “apple” next_node (32 bits) -> Node * NULL

Adding Data • Let’s add “donut”. • first->next_node = new Node(); • Node * accessptr = first->next_node; first - Address: 0x50F4 ??? - Address: 0x846c ADT: Node Size: X+32 bits thedata: (32 bits) -> String “apple” next_node (32 bits) -> Node * NULL ADT: Node Size: X+32 bits thedata: (32 bits) -> String “apple” next_node (32 bits) -> Node * 0x846c ADT: Node Size: X+32 bits thedata: (32 bits) -> String Uninitialized next_node (32 bits) -> Node * Uninitialized

Adding Data • Node * accessptr = first->next_node; • accessptr->thedata = “donut”; • accessptr->next_node = NULL; first - Address: 0x50F4 ??? - Address: 0x846c ADT: Node Size: X+32 bits thedata: (32 bits) -> String “apple” next_node (32 bits) -> Node * 0x846c ADT: Node Size: X+32 bits thedata: (32 bits) -> String Uninitialized next_node (32 bits) -> Node * Uninitialized ADT: Node Size: X+32 bits thedata: (32 bits) -> String “donut” next_node (32 bits) -> Node * NULL
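The steps from the last few slides (create the first node, find the empty next_node, attach a new node) might be collected into helpers like these. The function names append and destroy are hypothetical; the member names follow the slides.

```cpp
#include <string>

struct Node {
    std::string thedata;
    Node* next_node;
};

// Attach a new node holding `value` at the end of the list.
// Returns the first node (which is the new node if the list was empty).
Node* append(Node* first, const std::string& value) {
    Node* fresh = new Node{value, nullptr};
    if (first == nullptr) return fresh;
    Node* accessptr = first;
    while (accessptr->next_node != nullptr) {  // search for the empty location
        accessptr = accessptr->next_node;
    }
    accessptr->next_node = fresh;
    return first;
}

// Every node created with new must be deleted when we are done.
void destroy(Node* first) {
    while (first != nullptr) {
        Node* next = first->next_node;
        delete first;
        first = next;
    }
}
```

Calling append(nullptr, "apple") and then append(first, "donut") produces the two-node list from the slide; destroy(first) releases both nodes.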

This can get much larger first: 0x50F4 0x3120 0x8458 ADT: Node Size: X+32 bits thedata: “apple” next_node: 0x846c ADT: Node Size: X+32 bits thedata: “cashew” next_node: 0x4278 ADT: Node Size: X+32 bits thedata: “hat” next_node: NULL Organizing this will allow us to make something useful! 0x4278 0x846c 0x5610 ADT: Node Size: X+32 bits thedata: “tomato” next_node: 0x5610 ADT: Node Size: X+32 bits thedata: “donut” next_node: 0x3120 ADT: Node Size: X+32 bits thedata: “banana” next_node: 0x8458

Why lists? • So far, all we have done is manage items using contiguous memory. • If we needed more room, we would ask for it and copy all of our data into the larger space. Movin on up!

Contiguous Memory • So, what’s the big deal? • Isn’t moving up good? • Of course! But it is expensive! • We don’t want to try to hold everything! It gets messy!

The Collector • Using our “mycontainer”, our CDs wouldn’t be very organized. • If we organize when we insert….

Insertion Issues • When can insertion be a problem? • Inserting at the end of the mycontainer is easy! • Inserting in the middle isn’t too bad! • Inserting at the front…. • With a list, we can insert easily!
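To make the cost difference concrete, here is a rough sketch comparing insertion at the front of an array with insertion at the front of a list. It is not the course's “mycontainer”; both function names are hypothetical, and the array version assumes the caller has already reserved room for one more element.

```cpp
#include <cstddef>

// Array: every existing element must shift one slot to the right.
// Returns how many elements were moved. Assumes capacity >= count + 1.
std::size_t insert_front_array(int* data, std::size_t count, int value) {
    for (std::size_t i = count; i > 0; --i) {
        data[i] = data[i - 1];
    }
    data[0] = value;
    return count;
}

struct Node {
    int value;
    Node* next;
};

// List: one new node and one pointer assignment, regardless of length.
Node* insert_front_list(Node* first, int value) {
    return new Node{value, first};
}
```

The array version moves every element already stored, while the list version does the same small amount of work no matter how long the list is.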

Let us represent this list: first: 0x50F4 0x3120 0x8458 ADT: Node Size: X+32 bits thedata: “apple” next_node: 0x846c ADT: Node Size: X+32 bits thedata: “cashew” next_node: 0x4278 ADT: Node Size: X+32 bits thedata: “hat” next_node: NULL 0x4278 0x846c 0x5610 ADT: Node Size: X+32 bits thedata: “tomato” next_node: 0x5610 ADT: Node Size: X+32 bits thedata: “donut” next_node: 0x3120 ADT: Node Size: X+32 bits thedata: “banana” next_node: 0x8458
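One way to represent the list is to follow next_node from first and write the values out in link order. The function render below is a hypothetical helper, shown only as a sketch.

```cpp
#include <string>

struct Node {
    std::string thedata;
    Node* next_node;
};

// Build a one-line picture of the list by following next_node from `first`.
std::string render(const Node* first) {
    std::string out;
    for (const Node* p = first; p != nullptr; p = p->next_node) {
        out += p->thedata;
        out += " -> ";
    }
    out += "NULL";
    return out;
}
```

Following the addresses on the slide (0x50F4, 0x846c, 0x3120, 0x4278, 0x5610, 0x8458), the nodes are visited in the order apple, donut, cashew, tomato, banana, hat, no matter where each node sits in memory.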

Nodes • A node’s pointer points to 1 of 2 places: • To another node • To Null • We can add additional pointers to our Node and maintain additional data for various reasons.

List Construction • To create a list: • We need to create a node pointer (which points to Null) • Is that it? • Well, we need functions to act on the list. • Some things – like size – may be easier to manage as part of the list.
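A minimal sketch of such a list class follows, assuming string data as in the earlier slides. The class layout and the member names push_front, size, and empty are illustrative choices, not the assignment's required interface.

```cpp
#include <cstddef>
#include <string>

class List {
private:
    struct Node {
        std::string thedata;
        Node* next_node;
    };
    Node* first;        // points to Null when the list is empty
    std::size_t count;  // size is easier to maintain here than to recount

public:
    List() : first(nullptr), count(0) {}

    ~List() {  // delete every node the list created
        while (first != nullptr) {
            Node* next = first->next_node;
            delete first;
            first = next;
        }
    }

    // Copying is disabled to keep the sketch short and safe.
    List(const List&) = delete;
    List& operator=(const List&) = delete;

    void push_front(const std::string& value) {
        first = new Node{value, first};
        ++count;
    }

    std::size_t size() const { return count; }
    bool empty() const { return first == nullptr; }
};
```

Keeping count inside the list means size() does not need to traverse the nodes.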

List Insertion • Inserting into the list has two cases: • The list was empty: • The list was not empty: • However, our insertion policy may make this more difficult: • Just insert • Insert in order • This technically doesn’t add another case
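The “insert in order” policy could look like the sketch below, assuming int data and ascending order. The function name insert_in_order is hypothetical. Note that the empty-list case and the “new smallest value” case are handled by the same branch.

```cpp
struct Node {
    int value;
    Node* next;
};

// Insert `value` so the list stays sorted ascending; returns the first node.
Node* insert_in_order(Node* first, int value) {
    Node* fresh = new Node{value, nullptr};
    // The list was empty, or the new value belongs before the first node.
    if (first == nullptr || value < first->value) {
        fresh->next = first;
        return fresh;
    }
    // The list was not empty: find the node the new one should follow.
    Node* prev = first;
    while (prev->next != nullptr && prev->next->value <= value) {
        prev = prev->next;
    }
    fresh->next = prev->next;
    prev->next = fresh;
    return first;
}
```

The caller stores the returned pointer back into first, because inserting into an empty list or before the current first node changes where the list starts.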

List Removal • If the value is in the list, removing a value requires that we ensure the list is still valid: Start 56 60 72 80 90 Null Now, let’s remove 80. Ok, let’s traverse the list! That’s not good.

List Removal • Let’s try this again. Start 56 60 72 80 90 Null Now, let’s remove 80. Ok, let’s traverse the list! He made it!
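A removal that keeps the list valid has to reconnect the node before the removed one to the node after it. Here is a sketch under the assumption of int data; the function name remove_value is hypothetical.

```cpp
struct Node {
    int value;
    Node* next;
};

// Remove the first node holding `value`. Returns true if a node was removed.
// `first` is passed by reference because removing the first node changes it.
bool remove_value(Node*& first, int value) {
    Node* prev = nullptr;
    Node* cur = first;
    while (cur != nullptr && cur->value != value) {
        prev = cur;
        cur = cur->next;
    }
    if (cur == nullptr) return false;  // value is not in the list
    if (prev == nullptr) {
        first = cur->next;       // removing the first node
    } else {
        prev->next = cur->next;  // bridge over the removed node (72 -> 90)
    }
    delete cur;
    return true;
}
```

For the list 56 60 72 80 90, remove_value(first, 80) leaves 72 pointing at 90, so a traversal reaches Null instead of following a pointer into deleted memory.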
