Welcome to CIS 068 ! Lesson 10: Data Structures CIS 068
Overview • Description, usage, and Java implementation of • Collections • Lists • Sets • Hashing
Definition • Data Structures • Definition (www.nist.gov): • “An organization of information, usually in memory, for better algorithm efficiency, such as queue, stack, linked list, heap, dictionary, and tree, or conceptual unity, such as the name and address of a person.”
Efficiency • “An organization of information … for better algorithm efficiency ...”: • Isn’t the efficiency of an algorithm defined by its order of magnitude O( )?
Efficiency • Yes, but the efficiency actually achieved depends on the implementation, in particular on the data structure used.
Introduction • Data structures define the structure of a collection of data items, i.e. values of primitive data types or objects • The structure provides different ways to access the data • Different tasks need different ways to access the data • Different tasks need different data structures
Introduction • Typical properties of different structures: • fixed length / variable length • access by index / access by iteration • duplicate elements allowed / not allowed
Examples • Tasks: • Read 300 integers • Read an unknown number of integers • Read 5th element of sorted collection • Read next element of sorted collection • Merge element at 5th position into collection • Check if object is in collection
Examples • Although you can invent any data structure you want, there are ‘classic structures’, providing: • Coverage of most (classic) problems • Analysis of efficiency • Basic implementation in modern languages, like Java
Data Structures in Java • Let’s see what Java has to offer:
The Collection Hierarchy • Collection: top interface, specifying requirements for all collections
Collection Interface
Collection Interface !
Iterator Interface • Purpose: • Sequential access to collection elements • Note: the technique used so far, accessing elements one after another by index, is not reasonable in general (why?) • Methods: hasNext(), next(), remove()
Iterator Interface • Iterator points ‘between’ the elements of the collection (figure: elements 1 2 3 4 5): • First position (before element 1): hasNext() = true, remove() throws an error • After 2 calls to next(): the returned element is 2, the current position is between 2 and 3, and remove() deletes element 2 • Position after the last next(): hasNext() = false
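The iterator positions described above can be tried out with a short sketch. It assumes a modern Java with generics (the slides predate them); the class name `IteratorPositions` and the method `removeSecond` are hypothetical and exist only for this illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

class IteratorPositions {
    // Walks through the iterator states on a copy of the given list and
    // returns the copy after its second element has been removed.
    static List<Integer> removeSecond(List<Integer> source) {
        List<Integer> list = new ArrayList<>(source);
        Iterator<Integer> it = list.iterator();

        // First position: no element has been returned yet, so remove() is illegal.
        try {
            it.remove();
            throw new AssertionError("expected IllegalStateException");
        } catch (IllegalStateException expected) {
            // remove() needs a preceding call to next()
        }

        it.next();   // returns the first element
        it.next();   // returns the second element
        it.remove(); // deletes the element returned last, i.e. the second one
        return list;
    }

    public static void main(String[] args) {
        System.out.println(removeSecond(Arrays.asList(1, 2, 3, 4, 5)));
    }
}
```

For the input 1 2 3 4 5 the returned list holds 1 3 4 5, matching the ‘remove() deletes element 2’ case.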
Iterator Interface Usage • Typical usage of iterator:
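The code from this slide is not preserved; a minimal sketch of the usual hasNext() / next() loop could look as follows. The class `IteratorUsage` and its method `sum` are hypothetical names, and generics are assumed.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.Iterator;

class IteratorUsage {
    // Sums all elements using only the Iterator, never an index,
    // so the method works for any Collection (lists, sets, ...).
    static int sum(Collection<Integer> c) {
        int total = 0;
        Iterator<Integer> it = c.iterator();
        while (it.hasNext()) {
            total += it.next();
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(sum(Arrays.asList(1, 2, 3)));
    }
}
```

Because the loop never asks for a position, the same method accepts collections that have no index at all.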
Back to Collections: AbstractCollection
AbstractCollection • Facilitates implementation of the Collection interface • Provides a skeletal implementation • Implementation of a concrete class: • Provide data structure (e.g. array) • Provide access to data structure
AbstractCollection • Concrete class must provide an implementation of Iterator • To maintain the ‘abstract character’ of the data, the methods implemented in the abstract class (the non-abstract ones) use Iterator methods to access the data • Sketch of the concrete class myCollection: implements Iterator; int[ ] data; Iterator iterator(){ return this; } hasNext(){ … } … • Sketch of AbstractCollection: add(){ Iterator i=iterator(); … } clear(){ Iterator i=iterator(); … }
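As a hedged illustration of this division of labour, the sketch below extends java.util.AbstractCollection with a hypothetical read-only class `IntArrayCollection`. It differs from the slide's sketch in one respect: iterator() returns a fresh iterator object instead of `this`, so that several traversals can run independently. In java.util.AbstractCollection, only iterator() and size() must be supplied; inherited methods such as contains() and toString() are written in terms of the iterator, while add() throws UnsupportedOperationException unless it is overridden.

```java
import java.util.AbstractCollection;
import java.util.Iterator;
import java.util.NoSuchElementException;

// Hypothetical read-only collection backed by an int array.
class IntArrayCollection extends AbstractCollection<Integer> {
    private final int[] data;

    IntArrayCollection(int... data) {
        this.data = data.clone();
    }

    @Override
    public int size() {
        return data.length;
    }

    @Override
    public Iterator<Integer> iterator() {
        // A fresh iterator per call; it is the only code that touches the array.
        return new Iterator<Integer>() {
            private int next = 0;

            @Override
            public boolean hasNext() {
                return next < data.length;
            }

            @Override
            public Integer next() {
                if (!hasNext()) {
                    throw new NoSuchElementException();
                }
                return data[next++];
            }
        };
    }
}
```

With only these two methods in place, a call such as `new IntArrayCollection(1, 2, 3).contains(2)` already works, because the inherited contains() walks the iterator.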
Back to Collections: List Interface
List Interface • Extends the Collection interface • Adds methods to insert and retrieve objects by their position (index) • Note: the Collection interface could NOT specify the position • A new Iterator, the ListIterator, is introduced • ListIterator extends Iterator, allowing for bidirectional traversal (previousIndex()...)
List Interface • Incorporates index! • A new Iterator type (can move forward and backward)
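A small sketch of the backward direction, assuming generics; `ListIteratorDemo` and `reversed` are hypothetical names. It starts a ListIterator behind the last element and walks toward the front with hasPrevious() / previous().

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.ListIterator;

class ListIteratorDemo {
    // Returns the elements in reverse order by moving a ListIterator backward.
    static <T> List<T> reversed(List<T> list) {
        List<T> result = new ArrayList<>();
        // listIterator(n) positions the iterator after the element at index n - 1.
        ListIterator<T> it = list.listIterator(list.size());
        while (it.hasPrevious()) {
            result.add(it.previous());
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(reversed(Arrays.asList("a", "b", "c")));
    }
}
```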
Example: Selection-Sorting a List • Part 1: call to selection sort • The actual implementation of List does not matter! • Use only the Iterator properties of ListIterator (upcasting)
Example: Selection-Sorting a List • Part 2: selection sort • Access at index ‘fill’ • Inner loop • Swap
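The original code of this example is not preserved. The following is a reconstruction under assumptions, not the slides' own listing: a selection sort that only uses List methods, with the index variable named `fill` as on the slide. `SelectionSortDemo` is a hypothetical class name.

```java
import java.util.Arrays;
import java.util.LinkedList;
import java.util.List;

class SelectionSortDemo {
    // Sorts the list in place in ascending order. Only List methods are used,
    // so any implementation (ArrayList, LinkedList, ...) is accepted.
    static <T extends Comparable<T>> void selectionSort(List<T> list) {
        for (int fill = 0; fill < list.size() - 1; fill++) {
            int min = fill;
            // Inner loop: find the smallest element in positions fill..end.
            for (int i = fill + 1; i < list.size(); i++) {
                if (list.get(i).compareTo(list.get(min)) < 0) {
                    min = i;
                }
            }
            // Swap the smallest element into position 'fill'.
            T tmp = list.get(fill);
            list.set(fill, list.get(min));
            list.set(min, tmp);
        }
    }

    public static void main(String[] args) {
        List<Integer> numbers = new LinkedList<>(Arrays.asList(3, 1, 2));
        selectionSort(numbers);
        System.out.println(numbers);
    }
}
```

The caller can pass any List, but the cost of get(i) differs: it is O(1) for an array-based list and O(n) for a linked list.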
Back to Collections: AbstractList • ...again the implementation of some methods... • Note: still abstract!
Concrete Lists • ArrayList and Vector: at last, concrete implementations!
ArrayList and Vector • Vector: • Kept for compatibility reasons (only) • Use ArrayList instead • ArrayList: • Underlying data structure is an array • List properties add advantages over a plain array: • Size can grow and shrink • Elements can be inserted and removed in the middle
Collections • The underlying array data structure has • advantages for index-based access • disadvantages for insertion / removal of middle elements: the following elements must be copied, so insertion / removal is O(n) • Alternative: linked lists
Linked List • Flexible structure, providing • Insertion and removal at any place in O(1) once the position (node) is known, compared to O(n) for array-based list • Sequential access • Random access at O(n), compared to O(1) for array-based list
Linked List • List of dynamically allocated nodes • Nodes arranged into a linked structure • Data structure ‘node’ must provide • Data itself (example: the bead-body) • A possible link to another node (ex.: the link) • Figure: children’s pop-beads as an example of a linked list
Linked List • Figure: a new node whose ‘next’ link points to the old node; the old node’s ‘next’ link is (null)
Connecting Nodes • Creating the nodes • Connecting
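The code on this slide is not preserved. A minimal sketch of a node type and of the two steps (creating, then connecting) might look like this; `ConnectingNodes`, `Node` and `build` are hypothetical names, and the field is called `link` to match the p.link notation used for inserting nodes.

```java
class ConnectingNodes {
    // Minimal singly linked node: the data itself plus a possible link.
    static class Node {
        String data;
        Node link; // null marks the end of the list

        Node(String data) {
            this.data = data;
        }
    }

    // Creates two nodes and connects them: p -> q -> (null).
    static Node build() {
        Node p = new Node("p"); // creating the nodes
        Node q = new Node("q");
        p.link = q;             // connecting
        return p;
    }

    public static void main(String[] args) {
        Node head = build();
        System.out.println(head.data + " -> " + head.link.data);
    }
}
```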
Inserting Nodes • Insert node r between p and q: • p.link = r • r.link = q • q can now be accessed by p.link.link
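A hedged sketch of the insertion step; `InsertingNodes` and `insertAfter` are hypothetical names. Unlike the slide, the method has no separate reference q to the old successor, so it sets r.link first; otherwise the rest of the list would be lost when p.link is overwritten.

```java
class InsertingNodes {
    static class Node {
        int data;
        Node link;

        Node(int data) {
            this.data = data;
        }
    }

    // Inserts r directly after p in O(1).
    static void insertAfter(Node p, Node r) {
        r.link = p.link; // r now points to p's old successor (q on the slide)
        p.link = r;      // p now points to r
    }

    public static void main(String[] args) {
        Node p = new Node(1);
        Node q = new Node(3);
        p.link = q;
        insertAfter(p, new Node(2));
        System.out.println(p.link.data + " " + p.link.link.data);
    }
}
```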
Removing Nodes • (figure: nodes p and q)
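The diagram for this slide is not available. One common way to remove the node q that follows p, assumed here rather than taken from the slides, is to let p skip over it. `RemovingNodes` and `removeAfter` are hypothetical names.

```java
class RemovingNodes {
    static class Node {
        int data;
        Node link;

        Node(int data) {
            this.data = data;
        }
    }

    // Unlinks and returns the node after p, or returns null if p is the last node.
    static Node removeAfter(Node p) {
        Node q = p.link;
        if (q != null) {
            p.link = q.link; // p skips q
            q.link = null;   // detach q completely
        }
        return q;
    }

    public static void main(String[] args) {
        Node p = new Node(1);
        p.link = new Node(2);
        p.link.link = new Node(3);
        removeAfter(p);
        System.out.println(p.link.data);
    }
}
```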
Traversing a List • (figure: following the links until (null) is reached)
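Traversal can be sketched as a loop that follows the links until null is reached. The names `TraversingList`, `Node` and `sum` are hypothetical, and int data is an assumption made for the example.

```java
class TraversingList {
    static class Node {
        int data;
        Node link;

        Node(int data, Node link) {
            this.data = data;
            this.link = link;
        }
    }

    // Visits every node once, starting at head, and adds up the data values.
    static int sum(Node head) {
        int total = 0;
        for (Node n = head; n != null; n = n.link) {
            total += n.data;
        }
        return total;
    }

    public static void main(String[] args) {
        Node head = new Node(1, new Node(2, new Node(3, null)));
        System.out.println(sum(head));
    }
}
```

Reaching the k-th element requires k steps of this loop, which is why random access in a linked list is O(n).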
Doubly Linked Lists • Single linked list: each node holds data and a successor link; the last successor is (null) • Double linked list: each node holds data, a successor link, and a predecessor link; the first predecessor and the last successor are (null)
Back to Collections: AbstractSequentialList and LinkedList
LinkedList • An implementation example: see textbook
Sets • Example task: check whether a collection contains object o • Solution using a List: -> O(n) operation!
Sets • Comparison to List: • Set is designed to overcome the limitation of O(n) • Contains unique elements • contains() / remove() operate in O(1) or O(log n) • No get() method, no index-access... • ...but iterator can (still) be used to traverse set
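These properties can be observed with java.util.HashSet; the sketch below assumes generics, and `SetDemo` and `distinct` are hypothetical names.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Iterator;
import java.util.List;
import java.util.Set;

class SetDemo {
    // Copies the items into a HashSet; duplicates are dropped on insertion.
    static Set<String> distinct(List<String> items) {
        return new HashSet<>(items);
    }

    public static void main(String[] args) {
        Set<String> names = distinct(Arrays.asList("Apu", "Bob", "Apu"));
        System.out.println(names.size());          // "Apu" is stored only once
        System.out.println(names.contains("Bob")); // lookup without scanning a list

        // There is no get(index), but an iterator still visits every element
        // (in no particular order for a HashSet).
        Iterator<String> it = names.iterator();
        while (it.hasNext()) {
            System.out.println(it.next());
        }
    }
}
```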
Back to Collections: Interface Set
Hashing • How can method ‘contains()’ be implemented to be an O(1) operation? • http://ciips.ee.uwa.edu.au/~morris/Year2/PLDS210/hash_tables.html
Hashing • How can method ‘contains()’ be implemented to be an O(1) operation? • Idea: • Retrieving an object from an array can be done in O(1) if the index is known • Determine the index to store and retrieve an object by the object itself!
Hashing • Determine the index ... by the object itself: • Example: • Store Strings “Apu”, “Bob”, “Daria” as Set. • Define function H: String -> integer: • Take first character, A=1, B=2,... • Store names in String array at position H(name)
Hashing • Apu: first character A, H(A) = 1 • Bob: first character B, H(B) = 2 • Daria: first character D, H(D) = 4 • ...
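The first-character function H can be written down directly. This is a sketch under assumptions: names start with a letter A to Z, the array has 27 slots so that index 0 stays unused, and two names with the same first letter simply overwrite each other (no collision handling). `FirstLetterHash`, `h`, `store` and `contains` are hypothetical names.

```java
class FirstLetterHash {
    // H from the example: A=1, B=2, ..., based on the first character only.
    static int h(String name) {
        return Character.toUpperCase(name.charAt(0)) - 'A' + 1;
    }

    // Stores each name at position H(name) of a String array.
    static String[] store(String... names) {
        String[] table = new String[27]; // indices 1..26 are used
        for (String name : names) {
            table[h(name)] = name;
        }
        return table;
    }

    // O(1) lookup: compute the index from the object itself and compare.
    static boolean contains(String[] table, String name) {
        return name.equals(table[h(name)]);
    }

    public static void main(String[] args) {
        String[] table = store("Apu", "Bob", "Daria");
        System.out.println(h("Daria") + " " + contains(table, "Bob"));
    }
}
```

The final equals() check matters: a name such as “Bart” has the same index as “Bob”, so an occupied slot alone does not prove membership.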
Hashing • The function H(o) is called the hashcode of the object o • Properties of a hashcode function: • If a.equals(b) then H(a) = H(b) • BUT NOT NECESSARILY VICE VERSA: • H(a) = H(b) does NOT guarantee a.equals(b)! • If H() has ‘sufficient variation’, then it is very likely that different objects have different hashcodes
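Both directions can be checked with Java's own String.hashCode(). The strings “Aa” and “BB” are a well-known pair with equal hashcodes; `HashCodeContract` and `sameHash` are hypothetical names.

```java
class HashCodeContract {
    // True if both objects report the same hashcode.
    static boolean sameHash(Object a, Object b) {
        return a.hashCode() == b.hashCode();
    }

    public static void main(String[] args) {
        // Equal objects must have equal hashcodes.
        String a = new String("Bob");
        String b = new String("Bob");
        System.out.println(a.equals(b) + " " + sameHash(a, b));

        // Equal hashcodes do not imply equal objects:
        // 'A'*31 + 'a' = 65*31 + 97 = 2112 and 'B'*31 + 'B' = 66*31 + 66 = 2112.
        System.out.println("Aa".equals("BB") + " " + sameHash("Aa", "BB"));
    }
}
```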
Hashing • Additionally, an array is needed that has sufficient space to contain at least all elements. • The hashcode must not address an index outside the array; this can easily be achieved by: • H1(o) = H(o) % n • % = modulo function, n = array length • The larger the array, the more H1() varies!
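A sketch of H1 in Java, with one caveat that goes beyond the slide: hashCode() may be negative, and Java's % operator then yields a negative remainder, which is not a valid array index. The sketch therefore uses Math.floorMod, whose result has the sign of the divisor. `HashIndex` and `index` are hypothetical names.

```java
class HashIndex {
    // H1(o) = H(o) mod n, always in the range 0 .. n-1 for n > 0.
    static int index(Object o, int n) {
        return Math.floorMod(o.hashCode(), n);
    }

    public static void main(String[] args) {
        int n = 5; // array length
        // Integer.hashCode() is the int value itself, so H(-7) = -7.
        System.out.println(-7 % n);                       // negative remainder
        System.out.println(index(Integer.valueOf(-7), n)); // valid index
        System.out.println(index("Bob", n));
    }
}
```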