Lecture 5. Java Collection: Built-in Data Structures for Java
Cheng-Chia Chen
The Java Collection API
Interfaces:
• Collection
  • Set, extended by SortedSet
  • List
• Map, extended by SortedMap
• Iterator, extended by ListIterator
• Comparator
Summary of all interfaces in the Java Collection API
Collection interfaces: the primary means by which collections are manipulated.
• Collection (extended by Set and List)
  • A group of objects.
  • May or may not be ordered; may or may not contain duplicates.
• Set (extended by SortedSet)
  • The familiar set abstraction.
  • No duplicates; may or may not be ordered.
• SortedSet
  • Elements are automatically sorted, either in their natural ordering (see the Comparable interface) or by a Comparator object provided when a SortedSet instance is created.
• List
  • Ordered collection, also known as a sequence.
  • Duplicates permitted; allows positional access.
• Map (extended by SortedMap)
  • A mapping from keys to values.
  • Each key can map to at most one value (a function).
• SortedMap
  • A map whose mappings are automatically sorted by key, either in the keys' natural ordering or by a comparator provided when a SortedMap instance is created.
Infrastructure
Iterators: similar to the Enumeration interface, but more powerful, and with improved method names.
• Iterator (extended by ListIterator)
  • Provides the functionality of the Enumeration interface.
  • Supports removal of elements from the backing collection.
• ListIterator
  • Iterator for use with lists.
  • Supports bidirectional iteration, element replacement, element insertion, and index retrieval.
Ordering
• Comparable ( compareTo(Object) )
  • Imparts a natural ordering to classes that implement it.
  • The natural ordering may be used to sort a list or maintain order in a sorted set or map. Many classes have been retrofitted to implement this interface.
• Comparator ( compare(Object, Object) )
  • Represents an order relation, which may be used to sort a list or maintain order in a sorted set or map. Can override a type's natural ordering, or order objects of a type that does not implement the Comparable interface.
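To make the difference between a natural ordering and a Comparator concrete, here is a minimal sketch. The class name OrderingDemo and the BY_LENGTH comparator are hypothetical names chosen for illustration, and the code uses generics, which postdate the raw-type API shown in these slides.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

final class OrderingDemo {
    // Hypothetical comparator: shorter strings first, ties broken alphabetically.
    static final Comparator<String> BY_LENGTH = new Comparator<String>() {
        public int compare(String a, String b) {
            int byLength = Integer.compare(a.length(), b.length());
            return byLength != 0 ? byLength : a.compareTo(b);
        }
    };

    // Sorts a copy using String's natural ordering (Comparable.compareTo).
    static List<String> naturalOrder(List<String> words) {
        List<String> copy = new ArrayList<String>(words);
        Collections.sort(copy);
        return copy;
    }

    // Sorts a copy using the comparator, which overrides the natural ordering.
    static List<String> lengthOrder(List<String> words) {
        List<String> copy = new ArrayList<String>(words);
        Collections.sort(copy, BY_LENGTH);
        return copy;
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("pear", "fig", "banana", "kiwi");
        System.out.println(naturalOrder(words)); // [banana, fig, kiwi, pear]
        System.out.println(lengthOrder(words));  // [fig, kiwi, pear, banana]
    }
}
```

The same comparator could also be passed to a sorted set or map at construction time to keep its elements in that order.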
Infrastructure
Runtime Exceptions
• UnsupportedOperationException
  • Thrown by collections if an unsupported optional operation is called.
• ConcurrentModificationException
  • Thrown by Iterator and ListIterator if the backing collection is modified unexpectedly while the iteration is in progress.
  • Also thrown by sublist views of lists if the backing list is modified unexpectedly.
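Both exceptions are easy to provoke in a few lines. The sketch below is only an illustration (the class and method names are hypothetical); it relies on ArrayList's fail-fast iterator, which detects the modification in this single-threaded case.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.ConcurrentModificationException;
import java.util.Iterator;
import java.util.List;

final class CollectionExceptionsDemo {
    // Returns true if adding through an unmodifiable view is rejected.
    static boolean addToReadOnlyViewFails() {
        List<String> readOnly = Collections.unmodifiableList(new ArrayList<String>());
        try {
            readOnly.add("x"); // add is an optional operation; this view does not support it
            return false;
        } catch (UnsupportedOperationException expected) {
            return true;
        }
    }

    // Returns true if changing the list behind the iterator's back is detected.
    static boolean modifyDuringIterationFails() {
        List<String> list = new ArrayList<String>(Arrays.asList("a", "b", "c"));
        try {
            for (Iterator<String> i = list.iterator(); i.hasNext(); ) {
                i.next();
                list.add("d"); // bypasses the iterator, so the next call to next() fails
            }
            return false;
        } catch (ConcurrentModificationException expected) {
            return true;
        }
    }
}
```

Removing through the iterator itself (i.remove()) is the supported way to change a collection during iteration.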
classes of the java collection API • AbstractCollection (Collection) • AbstractSet (Set) HashSet, TreeSet(SortedSet) • AbstractList (List) ArrayList, AbstractSequentialList LinkedList • AbstractMap (Map) • HashMap • TreeMap(SortedMap) • WeakHashMap • Arrays • Collections
General-Purpose Implementation classes The primary implementations of the collection interfaces. • HashSet : Hash table implementation of the Set interface. • TreeSet : Red-black tree implementation of the SortedSet interface. • ArrayList : Resizable-array implementation of the List interface. • (Essentially an unsynchronized Vector.) The best all-around implementation of the List interface. • LinkedList : Doubly-linked list implementation of the List interface. • May provide better performance than the ArrayList implementation if elements are frequently inserted or deleted within the list. • Useful for queues and double-ended queues (deques). • HashMap : Hash table implementation of the Map interface. • (Essentially an unsynchronized Hashtable that supports null keys and values.) The best all-around implementation of the Map interface. • TreeMap : Red-black tree implementation of the SortedMap interface.
The java.util.Collection interface • represents a group of objects, known as its elements. • no direct implementation • The primary use : pass around collections of objects where maximum generality is desired. Ex: List l = new ArrayList( c ) ;// c is a Collection object
The definition
public interface Collection {
    // Basic properties
    int size();
    boolean isEmpty();
    boolean contains(Object element);   // uses equals() for comparison
    boolean equals(Object o);
    int hashCode();                     // a new equals() requires a new hashCode()
    // Basic operations
    boolean add(Object o);              // Optional; returns true if this collection changed
    boolean remove(Object o);           // Optional; uses equals() (not ==)
The Collection interface definition (continued)
    // Bulk operations
    boolean containsAll(Collection c);  // true if c ⊆ this
    // The following four methods are optional; the first three return true if the contents changed
    boolean addAll(Collection c);       // Optional; this = this ∪ c
    boolean removeAll(Collection c);    // Optional; this = this - c
    boolean retainAll(Collection c);    // Optional; this = this ∩ c
    void clear();                       // Optional; this = {}
    // Transformations
    Iterator iterator();                // collection => iterator
    Object[] toArray();                 // collection => array
    Object[] toArray(Object[] a);
        // if |this| <= |a|, copy this into a and return a;
        // otherwise create a new array b whose component type is that of a,
        // copy this into b, and return b
}
Why use Iterators instead of Enumerations
• Iterator allows the caller to remove elements from the underlying collection during the iteration with well-defined semantics.
• Method names have been improved.
public interface Iterator {
    boolean hasNext();   // cf. hasMoreElements()
    Object next();       // cf. nextElement()
    void remove();       // Optional
}
// A simple Collection filter using an Iterator; an Enumeration could not do this.
// noGood(...) stands for any predicate on elements.
static void filter(Collection c) {
    for (Iterator i = c.iterator(); i.hasNext(); ) {
        if (noGood(i.next()))
            i.remove();  // the element just returned by i.next() is removed from c
        // A second i.remove() here would throw IllegalStateException:
        // remove() may be called only once per call to next().
    }
}
Examples of bulk and array operations for Collection
• Remove all instances of an element e from a collection c:
    c.removeAll(Collections.singleton(e));     // remove all e's in c
    c.removeAll(Collections.singleton(null));  // remove all nulls in c
• Collections to arrays:
    Collection c = new LinkedList();  // LinkedList is an implementation of Collection
    c.add("a"); c.add("b");           // c == ["a", "b"]; component type = Object
    Object[] ar1 = c.toArray();       // ok, ar1.length == 2
    String[] ar2 = (String[]) c.toArray();
        // ClassCastException at run time: an array with Object component type
        // cannot be cast to String[]. Note: this compiles.
    String[] ar3 = (String[]) c.toArray(new String[0]);
        // ok, since the array returned by c.toArray(String[]) has String component type
The Set Interface
• is a Collection that cannot contain duplicate elements.
• models the mathematical set abstraction.
• contains no methods other than those inherited from Collection.
  • same signatures but different semantics (meaning):
    Collection c = new LinkedList(); Set s = new HashSet();
    String o = "a";
    c.add(o); c.add(o);   // both return true; c.size() == 2
    s.add(o); s.add(o);   // the 2nd add() returns false; s.size() == 1
• It adds the restriction that duplicate elements are prohibited.
    Collection noDups = new HashSet(c);  // a simple way to eliminate duplicates from c
• Two Set objects are equal if they contain the same elements.
• Two direct implementations:
  • HashSet
  • TreeSet
Basic Operations
• A simple program to detect duplicates using a Set:
    import java.util.*;
    public class FindDups {
        public static void main(String args[]) {
            Set s = new HashSet();  // or new TreeSet(), another implementation of Set
            // the following code uses Set methods only
            for (int i = 0; i < args.length; i++)
                if (!s.add(args[i]))
                    System.out.println("Duplicate detected: " + args[i]);
            System.out.println(s.size() + " distinct words detected: " + s);
        }
    }
    % java FindDups i came i saw i left
    Duplicate detected: i
    Duplicate detected: i
    4 distinct words detected: [came, left, saw, i]
  (The order in which a HashSet prints its elements is unspecified.)
Bulk Operations for Set objects
• s1.containsAll(s2) returns true if s2 is a subset of s1.
• s1.addAll(s2), s1.retainAll(s2), s1.removeAll(s2):
  • s1 = s1 ∪ s2, s1 = s1 ∩ s2, s1 = s1 - s2, respectively
  • return true iff s1 is modified.
• For nondestructive operations:
    Set union = new HashSet(s1);          // make a copy of s1
    union.addAll(s2);
    Set intersection = new HashSet(s1);   // may also use TreeSet(s1)
    intersection.retainAll(s2);
    Set difference = new HashSet(s1);
    difference.removeAll(s2);
Example
• Show all arguments that occur exactly once and those that occur more than once.
    import java.util.*;
    public class FindDups2 {
        public static void main(String args[]) {
            Set uniques = new HashSet(), dups = new HashSet();
            for (int i = 0; i < args.length; i++)
                if (!uniques.add(args[i]))
                    dups.add(args[i]);
            uniques.removeAll(dups);  // destructive set-difference
            System.out.println("Unique words: " + uniques);
            System.out.println("Duplicate words: " + dups);
        }
    }
The List Interface • A List is an ordered Collection(sometimes called a sequence). • may contain duplicate elements. • The List interface includes operations for: • Positional Access: set/get elements based on their numerical position in the list. • Search: search for a specified object in the list and return its numerical position. • List Iteration: extend Iterator semantics to take advantage of the list's sequential nature. • Range-view: perform arbitrary range operations on the list. • Three direct implementations: • ArrayList : resizable array • LinkedList : doubly linked-list • Vector : synchronized ArrayList.
The List interface definition
public interface List extends Collection {
    // Positional access
    Object get(int index);                     // 0-based
    Object set(int index, Object element);     // Optional; returns the old value
    void add(int index, Object element);       // Optional; add(Object) appends at the end
    Object remove(int index);                  // Optional
    boolean addAll(int index, Collection c);   // Optional
    // Search
    int indexOf(Object o);
    int lastIndexOf(Object o);
    // Range-view
    List subList(int from, int to);
    // Iteration
    ListIterator listIterator();               // cursor at position 0
    ListIterator listIterator(int f);          // cursor at position f
}
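A short sketch of the positional-access and search methods may help. The class name ListBasicsDemo is hypothetical, and generics are used although the slides show the raw-type API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class ListBasicsDemo {
    // Applies a few positional operations and reports what each one returned.
    static String demo() {
        List<String> list = new ArrayList<String>(Arrays.asList("a", "b", "c", "b"));
        String old = list.set(1, "x");     // returns "b"; list is now [a, x, c, b]
        list.add(0, "z");                  // shifts everything right: [z, a, x, c, b]
        String removed = list.remove(2);   // removes and returns "x": [z, a, c, b]
        int firstB = list.indexOf("b");    // 3
        int lastA = list.lastIndexOf("a"); // 1
        return list + " old=" + old + " removed=" + removed
                + " indexOf(b)=" + firstB + " lastIndexOf(a)=" + lastA;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```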
List in comparison with Vector (1.1)
• Shorter getter/setter names (the old Vector name is in the comment):
  • Object get(int)           // Object elementAt(int)
  • void add(int, Object)     // void insertElementAt(Object, int)
  • Object set(int, Object)   // void setElementAt(Object, int)
  • Object remove(int)        // void removeElementAt(int)
• Note: from Java 1.2, Vector also implements the List interface.
• List concatenation:
  • list1.addAll(list2);                // destructive: list1 becomes list1 followed by list2 (addAll returns a boolean, not the list)
  • List list3 = new ArrayList(list1);  // or new LinkedList(list1)
  • list3.addAll(list2);                // nondestructive: list3 is list1 followed by list2
• Two List objects are equal if they contain the same elements in the same order.
  • List l1 = new LinkedList(l2);       // l2 is an ArrayList
  • l1.equals(l2) returns true, but l1 == l2 is false.
ListIterator
    Elements:     E(0)   E(1)   E(2)   E(3)
    Cursor:     ^      ^      ^      ^      ^
    Index:      0      1      2      3      4
  previous() returns the element just before the cursor; next() returns the element just after it.
public interface ListIterator extends Iterator {
    // from Iterator
    boolean hasNext();
    Object next();
    // backward iteration
    boolean hasPrevious();
    Object previous();
    int nextIndex();      // == cursor position == position of the object next() would return
    int previousIndex();  // == nextIndex() - 1 == position of the object previous() would return;
                          // == -1 if the cursor is at 0
    void remove();        // Optional
    void set(Object o);   // Optional
    void add(Object o);   // Optional
}
Set and Add operations in ListIterator
    Elements:     E(0)   E(1)   E(2)   E(3)
    Cursor:     ^      ^      ^      ^      ^
    Index:      0      1      2      3      4
• set(Object), remove(): replace (with the specified element) or remove the last element returned by next() or previous().
  • Ex: with the cursor at index 2, the affected element is E(1) if next() was called more recently than previous(), and E(2) otherwise.
• add(Object): inserts the specified element into the list, immediately before the current cursor position.
  • Ex: add(o) with the cursor at index 2 => E(0) E(1) o (cursor) E(2) E(3)
• Backward iteration:
    for (ListIterator i = list.listIterator(list.size()); i.hasPrevious(); ) {
        processing(i.previous());
    }
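The cursor rules above can be checked with a small runnable sketch. The class and method names are hypothetical and generics are assumed; it shows remove() and set() acting on the element just returned, add() inserting before the cursor, and a backward traversal.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.ListIterator;

final class ListIteratorDemo {
    // Drops empty strings, upper-cases "b", and inserts "b2" right after it.
    static List<String> edit(List<String> source) {
        List<String> list = new ArrayList<String>(source);
        ListIterator<String> it = list.listIterator();
        while (it.hasNext()) {
            String s = it.next();
            if (s.isEmpty()) {
                it.remove();      // removes the element just returned by next()
            } else if (s.equals("b")) {
                it.set("B");      // replaces the element just returned by next()
                it.add("b2");     // inserted before the cursor, so next() will not return it
            }
        }
        return list;
    }

    // Collects the elements from last to first using hasPrevious()/previous().
    static List<String> reversed(List<String> source) {
        List<String> out = new ArrayList<String>();
        for (ListIterator<String> it = source.listIterator(source.size()); it.hasPrevious(); ) {
            out.add(it.previous());
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(edit(Arrays.asList("a", "", "b", "c"))); // [a, B, b2, c]
        System.out.println(reversed(Arrays.asList("a", "b", "c"))); // [c, b, a]
    }
}
```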
Some Examples
• A possible implementation of List.indexOf(Object):
    public int indexOf(Object o) {
        for (ListIterator i = listIterator(); i.hasNext(); )
            if (o == null ? i.next() == null : o.equals(i.next()))
                return i.previousIndex();  // or i.nextIndex() - 1
        return -1;  // object not found
    }
• Replace all occurrences of one specified value with another:
    public static void replace(List l, Object val, Object newVal) {
        for (ListIterator i = l.listIterator(); i.hasNext(); )
            if (val == null ? i.next() == null : val.equals(i.next()))
                i.set(newVal);
    }
Range-view Operation
• subList(int f, int t) returns a List view of the portion of this list whose indices range from f, inclusive, to t, exclusive: [f, t).
  • Ex: subList(1, 3) = E(1), E(2)
• This half-open range mirrors the typical for-loop:
    for (int i = f; i < t; i++) { ... }  // iterate over the sublist
• Changes to the sublist are reflected in the backing list.
  • Ex: the following idiom removes a range of elements from a list:
    list.subList(from, to).clear();
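Because a sublist is a view, any List operation can be applied to just a range. The following sketch (hypothetical class and method names, generics assumed) removes a range and searches within a range.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class SubListDemo {
    // Returns a copy of source with the elements at indices [from, to) removed.
    static List<Integer> removeRange(List<Integer> source, int from, int to) {
        List<Integer> list = new ArrayList<Integer>(source);
        list.subList(from, to).clear(); // clearing the view removes the range from list
        return list;
    }

    // Searches only indices [from, to) and returns an index into the full list, or -1.
    static int indexOfInRange(List<Integer> list, int from, int to, Integer target) {
        int i = list.subList(from, to).indexOf(target);
        return i < 0 ? -1 : i + from;   // translate the view index back to the backing list
    }

    public static void main(String[] args) {
        System.out.println(removeRange(Arrays.asList(0, 1, 2, 3, 4), 1, 3));  // [0, 3, 4]
        System.out.println(indexOfInRange(Arrays.asList(5, 7, 5, 9), 1, 4, 5)); // 2
    }
}
```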
The Map Interface
• A Map is an object that maps keys to values.
• A map cannot contain duplicate keys:
  • Each key can map to at most one value.
• Three implementations:
  • HashMap, which stores its entries in a hash table, is the best-performing implementation.
  • TreeMap, which stores its entries in a red-black tree, guarantees the order of iteration.
  • Hashtable has also been retrofitted to implement Map.
• By convention, all implementations should provide two constructors (as Collection implementations do). Assume M is your implementation:
  • M()        // empty map
  • M(Map m)   // a copy of the map m
The Map interface
public interface Map {  // Map does not extend Collection
    // Basic operations
    Object put(Object key, Object value);  // optional; puts or replaces, returns the replaced object
    Object get(Object key);
    Object remove(Object key);             // optional
    boolean containsKey(Object key);
    boolean containsValue(Object value);
    int size();
    boolean isEmpty();
The Map interface (continued)
    // Bulk operations
    void putAll(Map t);   // optional
    void clear();         // optional
    // Collection views, backed by the Map: a change on either is reflected in the other
    public Set keySet();          // keys cannot be duplicated, by definition
    public Collection values();   // values can be duplicated
    public Set entrySet();        // no equivalent in Dictionary
    // Nested interface for entrySet elements
    public interface Entry {
        Object getKey();
        Object getValue();
        Object setValue(Object value);
    }
}
Possible Exceptions thrown by Map methods
• UnsupportedOperationException
  • if the method is not supported by this map.
• ClassCastException
  • if the class of a key or value in the specified map prevents it from being stored in this map.
  • ex: m.put("Name", new Integer(2))  // when m accepts only String values
• IllegalArgumentException
  • if some aspect of a key or value in the specified map prevents it from being stored in this map.
  • ex: new TreeMap().subMap("a", "m").put("z", v)  // the key lies outside the range of the submap view
• NullPointerException
  • if this map does not permit null keys or values, and the specified key or value is null.
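Each of these exceptions can be triggered with a standard JDK map. The sketch below is illustrative only: the class and method names are hypothetical, and each method picks one map for which the given put is known to fail.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Hashtable;
import java.util.Map;
import java.util.TreeMap;

final class MapExceptionsDemo {
    // Attempts the put and reports "ok" or the simple name of the exception thrown.
    static String tryPut(Map<Object, Object> map, Object key, Object value) {
        try {
            map.put(key, value);
            return "ok";
        } catch (RuntimeException e) {
            return e.getClass().getSimpleName();
        }
    }

    // put is an optional operation; an unmodifiable view does not support it.
    static String readOnly() {
        return tryPut(Collections.unmodifiableMap(new HashMap<Object, Object>()), "k", "v");
    }

    // A TreeMap with natural ordering cannot compare an Integer key with a String key.
    static String mixedKeys() {
        Map<Object, Object> sorted = new TreeMap<Object, Object>();
        sorted.put("a", "v");
        return tryPut(sorted, Integer.valueOf(1), "v");
    }

    // Hashtable permits neither null keys nor null values.
    static String nullKey() {
        return tryPut(new Hashtable<Object, Object>(), null, "v");
    }

    // A submap view rejects keys outside its range ["a", "m").
    static String outOfRange() {
        return tryPut(new TreeMap<Object, Object>().subMap("a", "m"), "z", "v");
    }
}
```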
Basic Operations
• A simple program to generate a frequency table:
    import java.util.*;
    public class Freq {
        private static final Integer ONE = new Integer(1);
        public static void main(String args[]) {
            Map m = new HashMap();
            // Initialize the frequency table from the command line
            for (int i = 0; i < args.length; i++) {
                Integer freq = (Integer) m.get(args[i]);  // the key is a String
                m.put(args[i], (freq == null ? ONE : new Integer(freq.intValue() + 1)));  // the value is an Integer
            }
            System.out.println(m.size() + " distinct words detected:");
            System.out.println(m);
        }
    }
    > java Freq if it is to be it is up to me to delegate
    8 distinct words detected:
    {to=3, me=1, delegate=1, it=2, is=2, if=1, be=1, up=1}
Bulk Operations
• clear(): removes all of the mappings from the Map.
• putAll(Map) is the Map analogue of the Collection interface's addAll(…) operation.
  • It can be used to create an attribute map with default values. Here is a static factory method demonstrating this technique:
    static Map newAttributeMap(Map defaults, Map overrides) {
        Map result = new HashMap(defaults);
        result.putAll(overrides);  // entries in overrides replace the defaults with the same key
        return result;
    }
Collection view methods
• allow a Map to be viewed as a Collection in three ways:
  • keySet: the Set of keys contained in the Map.
  • values: the Collection of values contained in the Map. This Collection is not a Set, as multiple keys can map to the same value.
  • entrySet: the Set of key-value pairs contained in the Map.
    • The Map interface provides a small nested interface called Map.Entry that is the type of the elements in this Set.
• The standard idiom for iterating over the keys in a Map:
    for (Iterator i = m.keySet().iterator(); i.hasNext(); ) {
        System.out.println(i.next());
        if (noGood(…)) i.remove();  // the view supports removal from the backing Map
    }
• Iterating over key-value pairs:
    for (Iterator i = m.entrySet().iterator(); i.hasNext(); ) {
        Map.Entry e = (Map.Entry) i.next();
        System.out.println(e.getKey() + ": " + e.getValue());
    }
Permutation groups of words
    import java.util.*;
    import java.io.*;
    public class Perm {
        public static void main(String[] args) {
            int minGroupSize = Integer.parseInt(args[1]);
            // Read words from the file and put them into a simulated multimap
            Map m = new HashMap();
            try {
                BufferedReader in = new BufferedReader(new FileReader(args[0]));
                String word;
                while ((word = in.readLine()) != null) {
                    String alpha = alphabetize(word);  // normalize the word: success => ccesssu
                    List l = (List) m.get(alpha);
                    if (l == null)
                        m.put(alpha, l = new ArrayList());
                    l.add(word);
                }
            } catch (IOException e) {
                System.err.println(e);
                System.exit(1);
            }
            // Print all permutation groups at or above the size threshold
            for (Iterator i = m.values().iterator(); i.hasNext(); ) {
                List l = (List) i.next();
                if (l.size() >= minGroupSize)
                    System.out.println(l.size() + ": " + l);
            }
        }
        // Bucket sort of the letters in s; only 'a'..'z' are kept,
        // and characters with codes of 256 or more are not supported
        private static String alphabetize(String s) {
            int count[] = new int[256];
            int len = s.length();
            for (int i = 0; i < len; i++)
                count[s.charAt(i)]++;
            StringBuffer result = new StringBuffer(len);
            for (char c = 'a'; c <= 'z'; c++)
                for (int i = 0; i < count[c]; i++)
                    result.append(c);
            return result.toString();
        }
    }
Some results % java Perm dictionary.txt 8 9: [estrin, inerts, insert, inters, niters, nitres, sinter, triens, trines] 8: [carets, cartes, caster, caters, crates, reacts, recast, traces] 9: [capers, crapes, escarp, pacers, parsec, recaps, scrape, secpar, spacer] 8: [ates, east, eats, etas, sate, seat, seta, teas] 12: [apers, apres, asper, pares, parse, pears, prase, presa, rapes, reaps, spare, spear] 9: [anestri, antsier, nastier, ratines, retains, retinas, retsina, stainer, stearin] 10: [least, setal, slate, stale, steal, stela, taels, tales, teals, tesla] 8: [arles, earls, lares, laser, lears, rales, reals, seral] 8: [lapse, leaps, pales, peals, pleas, salep, sepal, spale] 8: [aspers, parses, passer, prases, repass, spares, sparse, spears] 8: [earings, erasing, gainers, reagins, regains, reginas, searing, seringa] 11: [alerts, alters, artels, estral, laster, ratels, salter, slater, staler, stelar, talers] 9: [palest, palets, pastel, petals, plates, pleats, septal, staple, tepals] …
The SortedSet Interface
• A SortedSet is a Set that maintains its elements in ascending order, sorted according to the elements' natural order (via the Comparable interface), or according to a Comparator provided at SortedSet creation time.
• In addition to the normal Set operations, the SortedSet interface provides operations for:
  • Range-view: performs arbitrary range operations on the sorted set.
  • Endpoints: returns the first or last element in the sorted set.
  • Comparator access: returns the Comparator used to sort the set (if any).
• Standard constructors (let S be the implementation):
  • S( [ Comparator ] )   // empty set
  • S(SortedSet)          // copy of the set
• Implementation: TreeSet
The SortedSet
public interface SortedSet extends Set {
    // Range-view
    SortedSet subSet(Object f, Object t);  // returns [f, t); empty (not null) if f equals t
    SortedSet headSet(Object t);           // [first(), t)
    SortedSet tailSet(Object f);           // [f, last()]
    // Endpoints
    Object first();
    Object last();
    // Comparator access
    Comparator comparator();               // null if the natural ordering is used
}
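The range views are easiest to understand on a concrete set. This is a minimal sketch with hypothetical names and sample data, using generics and a TreeSet with natural String ordering.

```java
import java.util.Arrays;
import java.util.SortedSet;
import java.util.TreeSet;

final class SortedSetDemo {
    // Sample data; a TreeSet keeps it sorted: [ant, bee, cow, doc, emu].
    static SortedSet<String> words() {
        return new TreeSet<String>(Arrays.asList("doc", "ant", "cow", "bee", "emu"));
    }

    // Shows each range view and endpoint on the sample set.
    static String ranges() {
        SortedSet<String> s = words();
        return s.subSet("bee", "doc")        // half-open: includes bee, excludes doc
                + " " + s.headSet("cow")     // everything strictly before cow
                + " " + s.tailSet("cow")     // cow and everything after it
                + " " + s.subSet("cow", "cow") // equal endpoints give an empty set
                + " " + s.first() + " " + s.last();
    }

    // Counts the words starting with the given letter by using a range view.
    static int countStartingWith(char letter) {
        String from = String.valueOf(letter);
        String to = String.valueOf((char) (letter + 1));
        return words().subSet(from, to).size();
    }

    public static void main(String[] args) {
        System.out.println(ranges()); // [bee, cow] [ant, bee] [cow, doc, emu] [] ant emu
        System.out.println(countStartingWith('c')); // 1
    }
}
```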
The SortedMap Interface
• A SortedMap is a Map that maintains its entries in ascending order, sorted according to the keys' natural order, or according to a Comparator provided at SortedMap creation time.
• In addition to the normal Map operations, the SortedMap interface provides operations for:
  • Range-view: performs arbitrary range operations on the sorted map.
  • Endpoints: returns the first or last key in the sorted map.
  • Comparator access: returns the Comparator used to sort the map (if any).
• Constructors provided by implementations:
  • M([Comparator])   // empty SortedMap
  • M(SortedMap)      // copy of the map
• Implementation: TreeMap
The SortedMap interface
// analogous to SortedSet
public interface SortedMap extends Map {
    Comparator comparator();
    // Range-view operations
    SortedMap subMap(Object fromKey, Object toKey);
    SortedMap headMap(Object toKey);
    SortedMap tailMap(Object fromKey);
    // Member access; both throw NoSuchElementException if the map is empty
    Object firstKey();
    Object lastKey();
}
// Don't forget that the bulk of the other Map operations are also available.
SortedMap Operations
• Operations inherited from Map behave identically to those of normal maps, with two exceptions:
  • [keySet() | entrySet() | values()].iterator() traverses the collection in key order.
  • toArray(…) on these views returns the keys, values, or entries in key order.
• Although not guaranteed by the interface, the toString() method of all the JDK's SortedMap implementations returns a string containing all the elements in key order.
Example
    SortedMap m = new TreeMap();
    m.put("Sneezy", "common cold");
    m.put("Sleepy", "narcolepsy");
    m.put("Grumpy", "seasonal affective disorder");
    System.out.println(m.keySet());
    System.out.println(m.values());
    System.out.println(m.entrySet());
• Running this snippet produces this output:
    [Grumpy, Sleepy, Sneezy]
    [seasonal affective disorder, narcolepsy, common cold]
    [Grumpy=seasonal affective disorder, Sleepy=narcolepsy, Sneezy=common cold]
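The same three entries can also illustrate the range-view and endpoint operations. The sketch below uses hypothetical class and method names and generics; the bounds "H" and "Sm" are arbitrary strings chosen to bracket one key.

```java
import java.util.SortedMap;
import java.util.TreeMap;

final class SortedMapDemo {
    // Builds the three-entry map used in the example above.
    static SortedMap<String, String> dwarfs() {
        SortedMap<String, String> m = new TreeMap<String, String>();
        m.put("Sneezy", "common cold");
        m.put("Sleepy", "narcolepsy");
        m.put("Grumpy", "seasonal affective disorder");
        return m;
    }

    // Keys selected by each range view, followed by the two endpoints.
    static String ranges() {
        SortedMap<String, String> m = dwarfs();
        return m.headMap("Sleepy").keySet()      // keys strictly before "Sleepy"
                + " " + m.tailMap("Sleepy").keySet() // "Sleepy" and later keys
                + " " + m.subMap("H", "Sm").keySet() // keys in ["H", "Sm")
                + " " + m.firstKey() + " " + m.lastKey();
    }

    // Range views are backed by the map: clearing the tail removes those entries.
    static String clearTail() {
        SortedMap<String, String> m = dwarfs();
        m.tailMap("Sleepy").clear();
        return m.toString();
    }

    public static void main(String[] args) {
        System.out.println(ranges());    // [Grumpy] [Sleepy, Sneezy] [Sleepy] Grumpy Sneezy
        System.out.println(clearTail()); // {Grumpy=seasonal affective disorder}
    }
}
```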
Actual Collection and Map Implementations
• Implementations are the actual data objects used to store collections (and Maps). There are three kinds of implementations:
• General-purpose Implementations
  • the public classes that provide the primary implementations of the core interfaces.
• Wrapper Implementations
  • used in combination with other implementations (often the general-purpose implementations) to provide added functionality.
• Convenience Implementations
  • mini-implementations, typically made available via static factory methods, that provide convenient, efficient alternatives to the general-purpose implementations for special collections (like singleton sets).
Properties of the implementations
• consistent names as well as consistent behavior.
• full implementations [of all the optional operations].
• All permit null elements, keys and values (except that TreeSet and TreeMap with natural ordering throw NullPointerException for a null element or key).
• unsynchronized.
  • remedies the deficiency of Hashtable and Vector
  • can become synchronized through the synchronization wrappers
• All have fail-fast iterators, which detect illegal concurrent modification during iteration and fail quickly and cleanly.
• All are Serializable.
• All support a public clone() method.
• You should be thinking about the interfaces rather than the implementations. The choice of implementation affects only performance.
HashSet vs TreeSet (and HashMap vs TreeMap)
• HashSet/HashMap is much faster (constant time vs. log time for most operations), but offers no ordering guarantees.
• Always use HashSet/HashMap unless you need the operations in SortedSet/SortedMap, or in-order iteration.
• Choose an appropriate initial capacity for your HashSet if iteration performance is important.
  • The default initial capacity is 101 in JDK 1.2 (later releases use 16), and that's often more than you need.
  • It can be specified using the int constructor.
  • Set s = new HashSet(17);  // set the initial number of buckets to 17
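A short sketch of the trade-off, with hypothetical names and generics assumed: both sets hold the same distinct elements and compare equal, but only the TreeSet promises an iteration order.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;
import java.util.TreeSet;

final class SetChoiceDemo {
    // Fast, but the iteration order is unspecified.
    static Set<String> hashed(String[] words) {
        return new HashSet<String>(Arrays.asList(words));
    }

    // Slower per operation, but always iterates in ascending order.
    static Set<String> sorted(String[] words) {
        return new TreeSet<String>(Arrays.asList(words));
    }

    public static void main(String[] args) {
        String[] words = {"i", "came", "i", "saw", "i", "left"};
        System.out.println(sorted(words));                       // [came, i, left, saw]
        System.out.println(hashed(words).equals(sorted(words))); // true: same elements
    }
}
```

Since callers see only the Set interface, the implementation can be switched later without touching the rest of the program.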
ArrayList vs LinkedList
• Most of the time, you'll probably use ArrayList.
  • It offers constant-time positional access.
  • Think of ArrayList as Vector without the synchronization overhead.
• Use LinkedList if you frequently add elements to the beginning of the List, or iterate over the List deleting elements from its interior.
  • These operations are constant time in a LinkedList but linear time in an ArrayList.
Wrapper Implementations • are implementations that delegate all of their real work to a specified collection, but add some extra functionality on top of what this collection offers. • These implementations are anonymous: • the JDK provides a static factory method. • All are found in the Collections class which consists solely of static methods. • Synchronization Wrappers • public static Collection synchronizedCollection(Collection c); • public static Set synchronizedSet(Set s); • public static List synchronizedList(List list); • public static Map synchronizedMap(Map m); • public static SortedSet synchronizedSortedSet(SortedSet s); • public static SortedMap synchronizedSortedMap(SortedMap m);
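As an illustration of a synchronization wrapper, the sketch below lets several threads append to one wrapped ArrayList. The class and method names are hypothetical; the synchronized block around the iteration follows the rule documented for these wrappers that the caller must hold the wrapper's lock while iterating.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

final class SyncWrapperDemo {
    // Starts `threads` workers that each add `perThread` elements, then counts the result.
    static int fillConcurrently(int threads, final int perThread) {
        final List<Integer> list = Collections.synchronizedList(new ArrayList<Integer>());
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(new Runnable() {
                public void run() {
                    for (int i = 0; i < perThread; i++) {
                        list.add(i); // each individual call is synchronized by the wrapper
                    }
                }
            });
            workers[t].start();
        }
        try {
            for (Thread w : workers) {
                w.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        int count = 0;
        synchronized (list) { // iteration spans many calls, so lock the wrapper explicitly
            for (Integer ignored : list) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(fillConcurrently(4, 1000)); // 4000
    }
}
```

With a plain ArrayList in place of the wrapper, concurrent adds could lose elements or throw.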
read-only access to Collection/Maps • Unmodifiable Wrappers • public static Collection unmodifiableCollection(Collection c); • public static Set unmodifiableSet(Set s); • public static List unmodifiableList(List list); • public static Map unmodifiableMap(Map m); • public static SortedSet unmodifiableSortedSet(SortedSet s); • public static SortedMap unmodifiableSortedMap(SortedMap m);
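An unmodifiable wrapper is a read-only view, not a copy. The following sketch (hypothetical names, generics assumed) shows that writes through the view are rejected while changes to the backing list remain visible through it.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

final class ReadOnlyViewDemo {
    static String demo() {
        List<String> backing = new ArrayList<String>(Arrays.asList("a"));
        List<String> view = Collections.unmodifiableList(backing);
        boolean rejected;
        try {
            view.add("b");   // mutators on the view throw
            rejected = false;
        } catch (UnsupportedOperationException e) {
            rejected = true;
        }
        backing.add("c");    // the owner can still modify the backing list
        return rejected + " " + view; // the view reflects the owner's change
    }

    public static void main(String[] args) {
        System.out.println(demo()); // true [a, c]
    }
}
```

A class can therefore keep the backing collection private and hand out only the view.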
Convenience Implementations
• mini-implementations that can be more convenient and more efficient than the general-purpose implementations
• available via static factory methods or exported constants in class Arrays or Collections.
• List-view of an Array
  • List l = Arrays.asList(new Object[100]);  // list of 100 nulls
• Immutable Multiple-Copy List
  • List l = new ArrayList(Collections.nCopies(1000, new Integer(1)));
• Immutable Singleton Set
  • c.removeAll(Collections.singleton(e));
  • profession.values().removeAll(Collections.singleton(LAWYER));
• (Immutable) Empty Set, Empty List and Empty Map Constants
  • Collections.EMPTY_SET, Collections.EMPTY_LIST, Collections.EMPTY_MAP
  • ex: Collections.EMPTY_LIST.add("1")  // throws UnsupportedOperationException
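The following sketch exercises three of these convenience implementations. The class name and sample values are hypothetical, and generics are assumed.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

final class ConvenienceDemo {
    static String demo() {
        // List view of an array: writes go through to the array itself.
        String[] array = {"x", "y", "z"};
        List<String> view = Arrays.asList(array);
        view.set(0, "X");

        // nCopies is immutable, so copy it into an ArrayList to get a pre-filled, editable list.
        List<Integer> zeros = new ArrayList<Integer>(Collections.nCopies(3, 0));
        zeros.set(1, 7);

        // A singleton set makes removeAll delete every occurrence of one value.
        List<String> letters = new ArrayList<String>(Arrays.asList("a", "b", "a", "c"));
        letters.removeAll(Collections.singleton("a"));

        return array[0] + " " + zeros + " " + letters;
    }

    public static void main(String[] args) {
        System.out.println(demo()); // X [0, 7, 0] [b, c]
    }
}
```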
The Arrays class
• static List asList(Object[] a)  // a cannot be of a primitive array type such as int[]
  • Returns a fixed-size list backed by the specified array; add() and remove() are not supported.
• static int binarySearch(Type[] a, Type key)
  • Searches the specified (sorted) array of Type for the specified value using the binary search algorithm. Type can be a primitive type (other than boolean) or Object.
• static boolean equals(Type[] a1, Type[] a2)
• static void fill(Type[] a [, int f, int t], Type val)
  • Assigns the specified val to each element of the specified array (or range) of Type.
• static void sort(Type[] a [, int f, int t])
  • Sorts the specified array (or range) into ascending order.
• static void sort(Object[] a [, int f, int t], Comparator c)
  • Sorts the specified array (or range) of objects according to the order induced by the specified comparator.
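A compact sketch of these utilities in use. The class name and sample values are hypothetical; Arrays.toString, used only for printing, was added after the JDK version these slides describe.

```java
import java.util.Arrays;

final class ArraysDemo {
    static String demo() {
        int[] a = {5, 2, 9, 1};
        Arrays.sort(a);                       // ascending: [1, 2, 5, 9]
        int pos = Arrays.binarySearch(a, 5);  // the array must already be sorted; finds index 2

        int[] b = new int[4];
        Arrays.fill(b, 7);                    // whole array: [7, 7, 7, 7]
        Arrays.fill(b, 1, 3, 0);              // range [1, 3): [7, 0, 0, 7]

        String[] words = {"pear", "Fig", "apple"};
        Arrays.sort(words, String.CASE_INSENSITIVE_ORDER); // comparator-based sort

        boolean same = Arrays.equals(a, new int[] {1, 2, 5, 9}); // element-wise comparison

        return Arrays.toString(a) + " " + pos + " " + Arrays.toString(b)
                + " " + Arrays.asList(words) + " " + same;
    }

    public static void main(String[] args) {
        System.out.println(demo()); // [1, 2, 5, 9] 2 [7, 0, 0, 7] [apple, Fig, pear] true
    }
}
```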