Data structures and algorithms in the collection framework
Elements of the collection framework
• Interfaces: Abstract data types
  • Specifications
• Classes: Data types
  • Implementations
• Exceptions: A few exceptions
• Algorithms
  • Working on the abstract data types.
Interfaces
• Collection
  • Methods common to all collections (except Map)
• List
  • An ordered collection (a sequence).
• Set
  • No duplicate elements.
  • The equals(Object obj) method on the elements is important.
• Queue
  • Normally implemented as FIFO (first-in-first-out)
  • New in Java 5.0
• SortedSet
  • Set that guarantees that elements are traversed in order
• Map
  • Contains pairs (key, value)
  • Values are retrieved by the key: value = map.get(key)
• SortedMap
  • Map that guarantees that elements are sorted according to the key.
Object ordering
• SortedSet and SortedMap rely on object ordering.
• The elements in a SortedSet (keys in a SortedMap) must implement the Comparable interface
  • public interface Comparable<T> { int compareTo(T o); }
• Or, you must specify a Comparator when constructing the sorted collection
  • public interface Comparator<T> { int compare(T o1, T o2); }
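The two ways of supplying an order can be sketched as follows. This is a minimal illustration, not part of the original slides: the Person class and the OrderingDemo helper are hypothetical names invented here.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.SortedSet;
import java.util.TreeSet;

// Hypothetical element type, invented for this illustration.
class Person implements Comparable<Person> {
    final String name;
    final int age;

    Person(String name, int age) {
        this.name = name;
        this.age = age;
    }

    // Natural order: alphabetical by name.
    @Override
    public int compareTo(Person other) {
        return name.compareTo(other.name);
    }
}

class OrderingDemo {
    // Names in natural (Comparable) order.
    static List<String> byName(List<Person> people) {
        SortedSet<Person> sorted = new TreeSet<Person>(people);
        return names(sorted);
    }

    // Names in the order given by an explicit Comparator (by age).
    static List<String> byAge(List<Person> people) {
        Comparator<Person> ageOrder = new Comparator<Person>() {
            @Override
            public int compare(Person a, Person b) {
                return Integer.compare(a.age, b.age);
            }
        };
        SortedSet<Person> sorted = new TreeSet<Person>(ageOrder);
        sorted.addAll(people);
        return names(sorted);
    }

    private static List<String> names(Iterable<Person> people) {
        List<String> result = new ArrayList<String>();
        for (Person p : people) {
            result.add(p.name);
        }
        return result;
    }

    public static void main(String[] args) {
        List<Person> people = Arrays.asList(
                new Person("Carol", 30), new Person("Alice", 41), new Person("Bob", 25));
        System.out.println(byName(people));
        System.out.println(byAge(people));
    }
}
```

Note that a TreeSet treats two elements as duplicates when the comparison returns 0, so with the age comparator two persons of the same age would collapse into one entry.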
Some implementations
• List
  • LinkedList, ArrayList
  • Vector (old implementation)
• Set
  • HashSet, TreeSet (TreeSet is a SortedSet)
• Queue
  • LinkedList, PriorityQueue
• Map
  • HashMap, TreeMap (TreeMap is a SortedMap)
  • Hashtable (old implementation)
• The above shows only a few implementations. More (special purpose) implementations exist in Java 5.0.
Implementations overview (figure)
List implementations
• LinkedList
  • Performance: Delete operation is fast
    • O(1) if you are already iterating through the list
• ArrayList
  • Performance: Good get(index), O(1)
  • Generally faster than LinkedList
  • Tuning parameter: Initial capacity of the array.
    • Set in the constructor.
• Vector
  • Old implementation (from before the collection framework was introduced in Java)
  • Has been retrofitted to suit the collection framework.
  • Synchronized
Set and Map implementations
• HashSet
  • Is fastest
  • Tuning parameter: Initial capacity
    • Set in the constructor
• TreeSet
  • Offers ordering
• HashMap
  • Fastest
  • Tuning parameter: Initial capacity
    • Set in the constructor
• TreeMap
  • Offers ordering by key
Queue implementations
• LinkedList
  • Implements the Queue interface
• PriorityQueue
  • Orders elements according to the natural order (Comparable) or a specific Comparator.
  • Head of the queue is the “smallest” element.
  • Internal data structure is a “heap”.
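A short sketch of how a PriorityQueue hands out its elements, first in natural order and then with a reversing Comparator. The class and method names are made up for this example; the initial capacity 11 passed to the constructor is an arbitrary choice.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.PriorityQueue;
import java.util.Queue;

class PriorityQueueDemo {
    // Removes elements from the head until the queue is empty.
    static List<Integer> drain(Queue<Integer> queue) {
        List<Integer> out = new ArrayList<Integer>();
        while (!queue.isEmpty()) {
            out.add(queue.poll());   // poll() always returns the current head
        }
        return out;
    }

    // Natural order: the head is the smallest element.
    static List<Integer> smallestFirst(List<Integer> values) {
        Queue<Integer> queue = new PriorityQueue<Integer>(values);
        return drain(queue);
    }

    // Reversed comparator: the head is now the largest element.
    static List<Integer> largestFirst(List<Integer> values) {
        Queue<Integer> queue =
                new PriorityQueue<Integer>(11, Collections.<Integer>reverseOrder());
        queue.addAll(values);
        return drain(queue);
    }

    public static void main(String[] args) {
        List<Integer> values = Arrays.asList(5, 1, 3);
        System.out.println(smallestFirst(values));
        System.out.println(largestFirst(values));
    }
}
```

Only the head is guaranteed to be the smallest element; iterating over a PriorityQueue with an iterator does not visit the elements in sorted order.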
Iterator
• You want to access the individual data items in your collections
• List: no problem
  • get(int index)
• Collection and Set have no order
  • Get-by-index does not make sense
• You access the data items using an iterator
    Collection<Integer> set = new HashSet<Integer>();
    Iterator<Integer> iterator = set.iterator();
    while (iterator.hasNext()) {
        Integer i = iterator.next();
        doSomething(i);
    }
Iterator pattern
• Iterator is an interface
  • You don’t need to know the name of the implementing class, since you will never need “new SomeIterator()”.
• Design pattern: Iterator
  • Iterator is such a general phenomenon that it has been termed a “design pattern”
  • The collection creates an iterator (known only by its interface)
  • The iterator fetches the individual objects from the collection.
• You may use the iterator pattern with your own classes
  • Idea: BorrowerCatalog creates a BorrowerIterator
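The BorrowerCatalog idea could look roughly like this. It is a sketch under assumptions: the slides name the two classes but give no details, so the fields, the add method, and the use of plain String names for borrowers are invented here.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

// Hypothetical catalog class; borrowers are plain names to keep the sketch short.
class BorrowerCatalog implements Iterable<String> {
    private final List<String> borrowers = new ArrayList<String>();

    void add(String name) {
        borrowers.add(name);
    }

    // Clients see only the Iterator interface, never the implementing class.
    @Override
    public Iterator<String> iterator() {
        return new BorrowerIterator();
    }

    private class BorrowerIterator implements Iterator<String> {
        private int next = 0;   // index of the element returned by the next call

        @Override
        public boolean hasNext() {
            return next < borrowers.size();
        }

        @Override
        public String next() {
            if (!hasNext()) {
                throw new NoSuchElementException();
            }
            return borrowers.get(next++);
        }

        @Override
        public void remove() {
            throw new UnsupportedOperationException("remove is not supported");
        }
    }

    public static void main(String[] args) {
        BorrowerCatalog catalog = new BorrowerCatalog();
        catalog.add("Anna");
        catalog.add("Bo");
        for (String name : catalog) {   // works because the catalog is Iterable
            System.out.println(name);
        }
    }
}
```

Because the catalog implements java.lang.Iterable, it can be used directly in a for-each loop, and the iterator class can stay private.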
For-each loop in Java 5.0
• Iterating a list
    List<String> myList = new ArrayList<String>();
    for (String str : myList) { doSomething(str); }
• You probably don’t need to use iterators explicitly very often.
• Works with any class implementing the interface java.lang.Iterable.
  • Your collection/catalog classes might implement Iterable.
• Works with arrays as well
    int[] numbers = new int[10];
    for (int number : numbers) { doSomething(number); }
• For-each loop is just syntactic sugar
  • It adds nothing new to the Java language, it just makes things a little easier for the programmer (and program reader).
Iterating a map
• The general Map interface cannot create an iterator.
• If you want to iterate a map, you must get the set of keys, and then iterate the keys
    // assumes map is a Map<Integer, String>
    Set<Integer> keys = map.keySet();
    for (int key : keys) {
        System.out.println(key + ": " + map.get(key));
    }
• Or use the entry set of the map
    for (Map.Entry<Integer, String> entry : map.entrySet()) {
        System.out.println(entry.getKey() + ": " + entry.getValue());
    }
Wrapper implementations
• The “wrapper” idea
  • Goal: Extend the functionality of an object transparently to its clients (i.e. the users of the class)
• Two kinds of wrappers in the collection framework
  • Synchronized wrappers
  • Unmodifiable wrappers
Synchronized wrappers
• Purpose
  • Collection implementations are generally unsynchronized
    • Methods are not synchronized
    • Exceptions to the “rule”: Vector, Hashtable (old implementations)
  • Synchronized collections can be made using methods in the class Collections.
• Static methods in class Collections
  • Collection synchronizedCollection( Collection c )
  • List synchronizedList( List l )
  • Set synchronizedSet( Set s )
  • Map synchronizedMap( Map m )
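A small sketch of a synchronized wrapper in use: two threads add to the same wrapped list. The class name, method names, and the thread setup are assumptions made for the illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class SynchronizedListDemo {
    // Two threads add perThread elements each to one shared list; returns the final size.
    static int fillFromTwoThreads(final int perThread) {
        final List<Integer> shared =
                Collections.synchronizedList(new ArrayList<Integer>());
        Runnable task = new Runnable() {
            @Override
            public void run() {
                for (int i = 0; i < perThread; i++) {
                    shared.add(i);   // each add() call is synchronized by the wrapper
                }
            }
        };
        Thread t1 = new Thread(task);
        Thread t2 = new Thread(task);
        t1.start();
        t2.start();
        try {
            t1.join();
            t2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return shared.size();
    }

    // Iteration spans many calls, so the caller must lock on the wrapper itself.
    static int sum(List<Integer> synchronizedList) {
        int total = 0;
        synchronized (synchronizedList) {
            for (int value : synchronizedList) {
                total += value;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(fillFromTwoThreads(1000));
    }
}
```

The wrapper makes each single method call thread-safe. Iterating is a sequence of calls, which is why the sum method synchronizes on the list manually.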
Unmodifiable wrappers
• Purpose
  • Collections are generally modifiable
    • They have methods like add, remove, etc.
  • Unmodifiable collections can be made using methods in the class Collections
    • If you attempt to call modifying methods like add, remove, etc. you will get an UnsupportedOperationException.
    • Not very good object-oriented design, but it works.
• Static methods in class Collections
  • unmodifiableCollection( Collection c )
  • unmodifiableList( List l )
  • unmodifiableSet( Set s )
  • unmodifiableMap( Map m )
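The behaviour of an unmodifiable wrapper can be sketched like this. The class and helper names are invented for the example.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class UnmodifiableDemo {
    // Returns true if the list rejects add() with UnsupportedOperationException.
    static boolean rejectsAdd(List<String> list) {
        try {
            list.add("x");
            return false;
        } catch (UnsupportedOperationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        List<String> names = new ArrayList<String>();
        names.add("Anna");

        List<String> readOnly = Collections.unmodifiableList(names);
        System.out.println(rejectsAdd(readOnly));   // the wrapper refuses to be modified

        names.add("Bo");                            // the wrapped list is still modifiable
        System.out.println(readOnly.size());        // the wrapper is a view and sees the change
    }
}
```

The wrapper is a read-only view, not a copy: whoever still holds a reference to the wrapped list can change what the view shows.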
Decorator pattern
• Also known as the “wrapper pattern”.
  • Synchronized wrappers and unmodifiable wrappers are applications of the decorator pattern.
• Idea
  • Extend the functionality of an object transparently to its clients (i.e. the users of the class)
• How
  • A class extends another class and at the same time aggregates an object of the super class.
  • Methods in the subclass do something special and call the aggregated object’s method for the basic work.
• Alternative to (lots of) inheritance
• Used extensively in java.io
  • BufferedReader wraps any Reader
    • Providing buffering, for speed.
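A minimal decorator in the style of java.io might look as follows. UpperCaseReader is a hypothetical class written for this illustration; it extends Reader and at the same time holds a Reader that does the basic work.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

// Hypothetical decorator: upper-cases whatever the wrapped Reader delivers.
class UpperCaseReader extends Reader {
    private final Reader wrapped;   // the aggregated object of the super class type

    UpperCaseReader(Reader wrapped) {
        this.wrapped = wrapped;
    }

    @Override
    public int read(char[] buffer, int offset, int length) throws IOException {
        int count = wrapped.read(buffer, offset, length);   // basic work is delegated
        // The "something special": convert the characters that were just read.
        // At end of input count is -1 and the loop body never runs.
        for (int i = offset; i < offset + count; i++) {
            buffer[i] = Character.toUpperCase(buffer[i]);
        }
        return count;
    }

    @Override
    public void close() throws IOException {
        wrapped.close();
    }

    // Decorators stack: BufferedReader wraps UpperCaseReader wraps StringReader.
    static String firstLineUpper(String text) {
        try {
            BufferedReader reader =
                    new BufferedReader(new UpperCaseReader(new StringReader(text)));
            try {
                return reader.readLine();
            } finally {
                reader.close();
            }
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(firstLineUpper("hello\nworld"));
    }
}
```

Clients that expect a Reader can be handed the decorator without knowing about it, which is what "transparent" means here.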
Layered implementation
• Interface
  • A specification
• Abstract class
  • A partial implementation of a specification
  • Used by one or more [concrete] classes
• [Concrete] class
  • Implementation of a specification
• Example
  • List: Interface
  • AbstractList: Partial implementation
  • ArrayList and LinkedList: [Concrete] implementations
Custom implementations
• Reasons for writing a custom implementation
  • Persistence
    • You want the collection to reside on the hard disk (file or database), not just in main memory.
  • High performance, special purpose
    • You want a fast implementation for some special purpose.
• Use an abstract class (partial implementation) if possible
  • It will do most of the work.
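As a sketch of how much work the abstract class does, here is a special-purpose, read-only list built on AbstractList. IntRange is a hypothetical class invented for this example; only get and size are written by hand.

```java
import java.util.AbstractList;

// Hypothetical special-purpose list: the integers from..to-1, stored without an array.
class IntRange extends AbstractList<Integer> {
    private final int from;
    private final int to;

    IntRange(int from, int to) {
        if (to < from) {
            throw new IllegalArgumentException("to < from");
        }
        this.from = from;
        this.to = to;
    }

    @Override
    public Integer get(int index) {
        if (index < 0 || index >= size()) {
            throw new IndexOutOfBoundsException("index: " + index);
        }
        return from + index;
    }

    @Override
    public int size() {
        return to - from;
    }

    public static void main(String[] args) {
        IntRange range = new IntRange(3, 6);
        // iterator(), contains(), indexOf(), toString() etc. are inherited.
        for (int value : range) {
            System.out.println(value);
        }
        System.out.println(range.contains(4));
        System.out.println(range);
    }
}
```

Modifying methods such as add and set are inherited too; in AbstractList they throw UnsupportedOperationException unless they are overridden.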
Algorithms
• Class Collections and class Arrays
  • Not to be confused with the interface Collection
• Some static methods
  • Sorting lists
  • Searching ordered lists
  • Finding extreme values (max, min)
Sorting lists
• Works with lists
  • Not general collections, since they have no notion of a sequence.
• Algorithm: Merge sort
  • Fast: n*log(n) guaranteed, even faster on nearly sorted lists.
  • Stable: Doesn’t reorder equal elements.
  • Idea: Divide and conquer + recursion
    • Divide the list into 2 sub-lists and sort the sub-lists.
    • Conquer: Merge the sorted sub-lists.
  • http://www.codecodex.com/wiki/index.php?title=Merge_sort
• Methods
  • Collections.sort( List l )
  • Collections.sort( List l, Comparator c )
    • If you want to use your own comparator, not the natural order.
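The divide-and-merge idea can be sketched as below. This is a teaching sketch for lists of Integer with invented names; it is not the code inside Collections.sort, which is tuned differently.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class MergeSortSketch {
    // Returns a new sorted list; the argument is left unchanged.
    static List<Integer> sort(List<Integer> list) {
        if (list.size() <= 1) {
            return new ArrayList<Integer>(list);
        }
        // Divide: split in the middle and sort each half recursively.
        int middle = list.size() / 2;
        List<Integer> left = sort(list.subList(0, middle));
        List<Integer> right = sort(list.subList(middle, list.size()));
        // Conquer: merge the two sorted halves.
        return merge(left, right);
    }

    static List<Integer> merge(List<Integer> left, List<Integer> right) {
        List<Integer> merged = new ArrayList<Integer>(left.size() + right.size());
        int l = 0;
        int r = 0;
        while (l < left.size() && r < right.size()) {
            // "<= 0" takes from the left half on ties, which keeps the sort stable.
            if (left.get(l).compareTo(right.get(r)) <= 0) {
                merged.add(left.get(l++));
            } else {
                merged.add(right.get(r++));
            }
        }
        while (l < left.size()) {
            merged.add(left.get(l++));
        }
        while (r < right.size()) {
            merged.add(right.get(r++));
        }
        return merged;
    }

    public static void main(String[] args) {
        System.out.println(sort(Arrays.asList(5, 2, 9, 1, 5)));
    }
}
```

In everyday code you would simply call Collections.sort(list), which sorts the list in place.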
Searching ordered lists
• Works with ordered lists
  • Not general lists
    • If the list is not ordered, use Collections.sort( List l ) before searching
  • Not general collections
• Algorithm: Binary search
  • Speed
    • O( log n ) for random access list implementations (like ArrayList)
    • O( n ) for iterator-based list implementations (like LinkedList)
  • Idea: Divide-and-conquer + recursion
    • Find the middle element.
    • If middle element equals searchingFor
      • Found
    • Else if searchingFor < middle element
      • Search in the left hand part of the list
    • Else
      • Search in the right hand part of the list
• Methods
  • Collections.binarySearch( List l, Object searchingFor )
  • Collections.binarySearch( List l, Object searchingFor, Comparator c )
    • If the list is not ordered by the natural order, but by the specified comparator.
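A sketch of binary search next to the library call. The hand-written version uses a loop instead of recursion and returns -1 when the value is absent; both choices, like the class and method names, are assumptions made for this example. Collections.binarySearch reports a missing key differently, as -(insertion point) - 1.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class BinarySearchSketch {
    // Returns the index of target in the sorted list, or -1 if it is absent.
    static int indexOf(List<Integer> sorted, int target) {
        int low = 0;
        int high = sorted.size() - 1;
        while (low <= high) {
            int middle = (low + high) >>> 1;   // midpoint without int overflow
            int value = sorted.get(middle);
            if (value == target) {
                return middle;
            }
            if (target < value) {
                high = middle - 1;             // continue in the left hand part
            } else {
                low = middle + 1;              // continue in the right hand part
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        List<Integer> sorted = Arrays.asList(10, 20, 30, 40);
        System.out.println(indexOf(sorted, 30));                    // 2
        System.out.println(Collections.binarySearch(sorted, 30));   // 2
        System.out.println(Collections.binarySearch(sorted, 25));   // -3: would be inserted at index 2
    }
}
```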
Finding extreme values
• Finds min, max in any collection
  • Algorithm is iterator-based.
  • If the collection is a list that is known to be sorted, don’t use min, max. Simply call get(0) or get( size() - 1 )
• Methods in Collections
  • Object min( Collection c )
  • Object max( Collection c )
  • Object min( Collection c, Comparator comp )
  • Object max( Collection c, Comparator comp )
    • If you prefer your own comparator to the natural order.
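A brief usage sketch of min and max, with the natural order and with a custom comparator. The class name and the length-based comparator are invented for the illustration.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;
import java.util.Comparator;

class ExtremesDemo {
    // Shortest word, using a comparator instead of the natural (alphabetical) order.
    static String shortest(Collection<String> words) {
        return Collections.min(words, new Comparator<String>() {
            @Override
            public int compare(String a, String b) {
                return Integer.compare(a.length(), b.length());
            }
        });
    }

    public static void main(String[] args) {
        Collection<String> words = Arrays.asList("pear", "fig", "banana");
        System.out.println(Collections.min(words));   // banana (alphabetically first)
        System.out.println(Collections.max(words));   // pear (alphabetically last)
        System.out.println(shortest(words));          // fig
    }
}
```

Both methods throw NoSuchElementException when the collection is empty.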
Arrays
• Algorithms similar to Collections
  • Working on arrays of different types
    • int, double, etc. and Object
• Convenience methods
  • List view of arrays
    • List list = Arrays.asList( someArray )
  • Useful for printing an array
    • System.out.println( list )
    • System.out.println( Arrays.asList(someArray) )
    • Java 5.0: System.out.println( Arrays.toString(someArray) )
The term “framework”
• The term “framework” denotes a set of classes that can be extended …
• Examples
  • Collections framework
    • You can extend the framework by creating your own implementations
  • Swing
    • You can extend the framework by extending JPanel and many other classes.
  • java.io
• Some frameworks can be used as is – others need custom extensions before they are useful.
Hashing
• Binary search is O( log n )
  • We want something better: O(1)
• Idea:
  • Compute a number (called “hash value”) from the data we are searching for.
  • Use the hash value as an index in an array (called “hash table”)
  • Every element in the array holds a “bucket” of elements.
  • If every bucket holds few elements (preferably 1) then hashing is O(1)
Hash function
• A good hash function should distribute elements evenly in the hash table
  • The worst hash function always returns 0
• Example
  • Hash table with 10 slots
  • hash( int i ) { return i % 10; }
    • % is the remainder operator.
• More generally
  • Hash table with N slots
  • hash( T t ) { return operation( t ) % N; }
    • The operation should be fast and distribute elements well.
• Java, class Object
  • int hashCode() is a hash function
  • Your classes should override hashCode() whenever they override equals()
• hashCode() and equals()
  • a.equals(b) is true ⇒ a.hashCode() == b.hashCode()
  • a.hashCode() == b.hashCode() does not necessarily mean that a.equals(b) is true
  • a.hashCode() != b.hashCode() ⇒ a.equals(b) is false
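The relation between hashCode() and equals() can be illustrated with a small value class. Point and its hash formula 31 * x + y are assumptions chosen for this example, not something taken from the slides.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical value class, invented for this illustration.
class Point {
    final int x;
    final int y;

    Point(int x, int y) {
        this.x = x;
        this.y = y;
    }

    @Override
    public boolean equals(Object obj) {
        if (this == obj) {
            return true;
        }
        if (!(obj instanceof Point)) {
            return false;
        }
        Point other = (Point) obj;
        return x == other.x && y == other.y;
    }

    // Uses the same fields as equals(), so equal points always get equal hash codes.
    @Override
    public int hashCode() {
        return 31 * x + y;
    }

    public static void main(String[] args) {
        Set<Point> points = new HashSet<Point>();
        points.add(new Point(1, 2));
        // Found, because an equal point hashes to the same bucket.
        System.out.println(points.contains(new Point(1, 2)));   // true

        // Same hash code (31), yet the points are not equal.
        Point a = new Point(0, 31);
        Point b = new Point(1, 0);
        System.out.println(a.hashCode() == b.hashCode());       // true
        System.out.println(a.equals(b));                        // false
    }
}
```

If Point overrode equals() but not hashCode(), the contains call would usually fail, because the two equal points would most likely land in different buckets.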
Hash table
• A hash table is basically an array.
• What if 2 elements compute the same hash value (i.e. the same array index)?
• Two solutions
  • Linear probing: 1 element in every array cell
    • Try the next empty slot in the hash table
    • Mark slots of removed elements as “here was once an element”
  • Chaining: A list of elements in every array cell
    • Add the element to the list
  • In any case searching can degenerate if the hash function does not distribute elements evenly.
• Problem
  • If a hash table is almost full, searching degenerates
• Solution
  • Rehashing: Create a larger hash table + update hash function + move elements to new hash table.
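Chaining and rehashing can be sketched in a few lines. This is a teaching sketch for strings, not how java.util.HashSet is written; the class name, the doubling of the table, and the threshold of two elements per slot on average are arbitrary assumptions.

```java
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;

// Teaching sketch of a chained hash table for strings.
class ChainedHashSet {
    private List<LinkedList<String>> buckets;
    private int size = 0;

    ChainedHashSet(int slots) {
        if (slots < 1) {
            throw new IllegalArgumentException("need at least one slot");
        }
        buckets = newBuckets(slots);
    }

    private static List<LinkedList<String>> newBuckets(int slots) {
        List<LinkedList<String>> table = new ArrayList<LinkedList<String>>(slots);
        for (int i = 0; i < slots; i++) {
            table.add(new LinkedList<String>());
        }
        return table;
    }

    // Hash function: hash code (sign bit cleared) modulo the current table size.
    private int indexFor(String element) {
        return (element.hashCode() & 0x7fffffff) % buckets.size();
    }

    boolean add(String element) {
        LinkedList<String> bucket = buckets.get(indexFor(element));
        if (bucket.contains(element)) {
            return false;                      // already present
        }
        bucket.add(element);                   // chaining: append to the bucket's list
        size++;
        if (size > 2 * buckets.size()) {       // assumed threshold: 2 elements per slot
            rehash();
        }
        return true;
    }

    boolean contains(String element) {
        return buckets.get(indexFor(element)).contains(element);
    }

    int size() {
        return size;
    }

    int slots() {
        return buckets.size();
    }

    // Rehashing: build a larger table and move every element into it.
    private void rehash() {
        List<LinkedList<String>> old = buckets;
        buckets = newBuckets(old.size() * 2);  // indexFor now uses the new size
        for (LinkedList<String> bucket : old) {
            for (String element : bucket) {
                buckets.get(indexFor(element)).add(element);
            }
        }
    }
}
```

Changing the table size changes the result of indexFor, which is the "update hash function" step: every element has to be placed again in the new table.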
Binary search tree
• Basic tree terms
  • Node, descendant, root, leaf
  • A tree has 1 root
• Binary tree
  • A node has at most 2 (bi-) children (direct descendants).
• Search tree
  • Nodes are ordered. Small values to the left and large values to the right.
  • Makes searching fast.
Balanced search trees
• A binary search tree might degenerate into a list.
  • Searching is no longer fast
• We want the search tree to be balanced.
  • Without having to completely reorganize the tree at every insert / delete.
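The degeneration can be seen with a small unbalanced binary search tree. The sketch below is written for this illustration (the class name and the height method are assumptions) and does no balancing at all.

```java
// Teaching sketch of an unbalanced binary search tree of ints.
class IntSearchTree {
    private static class Node {
        final int value;
        Node left;    // subtree with smaller values
        Node right;   // subtree with larger values

        Node(int value) {
            this.value = value;
        }
    }

    private Node root;

    void insert(int value) {
        root = insert(root, value);
    }

    private Node insert(Node node, int value) {
        if (node == null) {
            return new Node(value);
        }
        if (value < node.value) {
            node.left = insert(node.left, value);
        } else if (value > node.value) {
            node.right = insert(node.right, value);
        }
        return node;   // duplicates are ignored
    }

    boolean contains(int value) {
        Node node = root;
        while (node != null) {
            if (value == node.value) {
                return true;
            }
            node = value < node.value ? node.left : node.right;
        }
        return false;
    }

    // Number of nodes on the longest path from the root to a leaf.
    int height() {
        return height(root);
    }

    private int height(Node node) {
        return node == null ? 0 : 1 + Math.max(height(node.left), height(node.right));
    }

    public static void main(String[] args) {
        IntSearchTree mixed = new IntSearchTree();
        for (int v : new int[] {4, 2, 6, 1, 3, 5, 7}) {
            mixed.insert(v);
        }
        System.out.println(mixed.height());    // 3: a well-balanced tree

        IntSearchTree sorted = new IntSearchTree();
        for (int v = 1; v <= 7; v++) {
            sorted.insert(v);
        }
        System.out.println(sorted.height());   // 7: degenerated into a list
    }
}
```

A search visits at most height() nodes, so the same 7 values cost up to 3 steps in the first tree and up to 7 in the second. Balanced trees keep the height close to log n by reorganizing only small parts of the tree on insert and delete.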