CSC 213 – Large Scale Programming
Lecture 10: Searching & Mapping
Today’s Goal
• Consider the basics of searchable data
• How do we search using a computer?
• What are our goals while searching?
• ADTs used for search & how would they work?
• Most critically, where the $&*#%$# are my keys?
• How do Map & Dictionary ADT work and search?
• Methods to add, remove, and access data?
• How a Sequence is used to implement these
• When & why would we use a Sequence-based approach?
Searching
• Search for unknown data in most cases
• Consider the reverse: why search for what you have?
• Seek data related to terms used
• Already have idea, want web pages containing terms
• Get encoded proteins given set of amino acids
• Given “borrowed” credit cards, get credit limits
• Exacting, but boring, work doing these searches
• Makes this work ideal for computers & students
Map-Based Bartender
“I’ll have a Manhattan.” “No problem.”
¾ oz sweet vermouth, 2½ oz bourbon, 1 dash bitters, 1 maraschino cherry, 1 twist orange peel
Map-Based Bartender
“I’ll have a Manhattan.” “That’ll be $2 billion.”
Map-Based Bartender
“I’ll have a Manhattan.”
[Figure labels: key, value]
Search Terms
• Key gets valuables
• We already have key
• Want value as a result of this
• Map works similarly
• Give it key, value returned
• Uses Entry to do this work
Entry Interface
• Need a key to get valuables
• key used to search – it is what we already have
• What we want is the result of search – value
    interface Entry<K,V> {
        K key();
        V value();
    }
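
As a minimal sketch of how the Entry interface above might be implemented, the block below pairs one key with one value. The class names SimpleEntry and EntryDemo are assumptions made for illustration; the slides only declare the interface.

```java
// Entry as declared on the slide: a key paired with the value it leads to.
interface Entry<K, V> {
    K key();
    V value();
}

// Hypothetical immutable implementation (the class name is an assumption).
final class SimpleEntry<K, V> implements Entry<K, V> {
    private final K key;
    private final V value;

    SimpleEntry(K key, V value) {
        this.key = key;
        this.value = value;
    }

    @Override public K key() { return key; }
    @Override public V value() { return value; }
}

class EntryDemo {
    public static void main(String[] args) {
        // The key is what we already have; the value is what we want back.
        Entry<String, String> drink =
            new SimpleEntry<>("Manhattan", "bourbon, sweet vermouth, bitters");
        System.out.println(drink.key() + " -> " + drink.value());
    }
}
```

Keeping the fields final means a stored Entry cannot change behind the Map's back; a Map that wants to replace a value would swap in a new Entry.
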
Map Method Madness, Mmmm…
• Describes a searchable Collection
• put(K key, V value) adds data as an Entry
• remove(K key) removes Entry containing key
• get(K key) returns value associated with key
• Several Iterable methods are also defined
• Methods to use are entries(), keys(), & values()
• Iterates over expected data so can use in for(-each) loops
• Also defines usual Collection methods
• isEmpty() & size()
Searching Through a Map
• Map is a Collection of key-value pairs
• Give it key & get value in return from ADT
• Now we have ADT to work with searchable data
• Many searches unsuccessful
• Unsuccessful search is normal, not unusual
• Expected events should NOT throw exceptions
• This is normal; return null when nothing found
At Most 1 Value Per Key
• Entrys have unique keys in a Map
• If key exists, put(key, value) replaces existing Entry
• Returns prior value for key in the Map so it’s not lost
• If key was not in Map before the call, null is returned
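
The put/get/remove rules above can be sketched with an unordered list. This is only an illustration under stated assumptions: the class name ListMap and the nested Pair type are hypothetical, and java.util.ArrayList stands in for the course's Sequence.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical Sequence-backed Map; ArrayList stands in for the course's Sequence.
class ListMap<K, V> {
    // Minimal key-value holder playing the role of an Entry.
    private static final class Pair<K, V> {
        final K key;
        V value;
        Pair(K key, V value) { this.key = key; this.value = value; }
    }

    private final List<Pair<K, V>> entries = new ArrayList<>();

    // Linear scan for the pair holding key; null if no pair matches.
    private Pair<K, V> find(K key) {
        for (Pair<K, V> p : entries) {
            if (p.key.equals(key)) return p;
        }
        return null;
    }

    // Adds a pair, or overwrites the value of the existing one.
    // Returns the prior value, or null if key was not present before the call.
    public V put(K key, V value) {
        Pair<K, V> p = find(key);
        if (p == null) {
            entries.add(new Pair<>(key, value));
            return null;
        }
        V old = p.value;
        p.value = value;
        return old;
    }

    // An unsuccessful search is normal: return null rather than throw.
    public V get(K key) {
        Pair<K, V> p = find(key);
        return (p == null) ? null : p.value;
    }

    // Removes the pair for key; returns its value, or null if key was absent.
    public V remove(K key) {
        Pair<K, V> p = find(key);
        if (p == null) return null;
        entries.remove(p);
        return p.value;
    }

    public int size() { return entries.size(); }
    public boolean isEmpty() { return entries.isEmpty(); }
}
```

Because the list is unordered, every operation scans linearly, so each takes O(n) time in the worst case.
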
Sequence-Based Map
• Sequence’s perspective of the Map that it holds
[Figure labels: Positions, elements]
Sequence-Based Map
• Outside view of Map and how it is stored
[Figure labels: Positions, Entrys]
Using a Map
• Map is great when you want only one value for a key
• Credit card number goes to one account
• One person has a given social security number
• One definition per word in the dictionary
• Could try associating multiple values per key
• Mapping a key to a Sequence of values is a possible solution
• But this means Map’s user must handle complexity
Dictionary-based Bartender
“I’ll have a Manhattan.” “No problem.”
[Figure labels: key, value]
Dictionary-based Bartender
“Not that Manhattan.” “Sorry.”
[Figure labels: key, value]
Dictionary-based Bartender
“Not that Manhattan.” “Sorry. How about…”
[Figure labels: key, another value]
Dictionary-based Bartender
“That’ll be $2 billion.” “Mmmmm... Manhattan.”
[Figure labels: key, not a, another value]
Dictionary ADT
• Dictionary ADT very similar to Map
• Hold searchable data in each of these ADTs
• Both data structures are collections of Entrys
• Convert key to value using either concept
• Dictionary can have multiple values for one key
• 1 value for key is still legal option
[Figure label: “awesome”]
• Also many Entrys with same key but different value
[Figure labels: “cool”, “cool”]
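
The difference from a Map can be sketched as below: adding never replaces, and a lookup may return several values. The class name ListDictionary, the nested Pair type, and the use of put as the adding method are assumptions for illustration; get and getAll follow the method names used later in these slides.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical unordered Dictionary: several Entrys may share one key.
class ListDictionary<K, V> {
    // Minimal key-value holder playing the role of an Entry.
    private static final class Pair<K, V> {
        final K key;
        final V value;
        Pair(K key, V value) { this.key = key; this.value = value; }
    }

    private final List<Pair<K, V>> entries = new ArrayList<>();

    // Always adds a new pair; an existing pair with the same key is kept.
    public void put(K key, V value) {
        entries.add(new Pair<>(key, value));
    }

    // Returns the value of some pair with this key, or null if none exists.
    public V get(K key) {
        for (Pair<K, V> p : entries) {
            if (p.key.equals(key)) return p.value;
        }
        return null;
    }

    // Returns every value stored under key (an empty list if there are none).
    public List<V> getAll(K key) {
        List<V> matches = new ArrayList<>();
        for (Pair<K, V> p : entries) {
            if (p.key.equals(key)) matches.add(p.value);
        }
        return matches;
    }

    public int size() { return entries.size(); }
}
```

With this sketch, two put calls using the key "cool" leave two pairs in the Dictionary, whereas the same calls on a Map would leave one.
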
Map vs. Dictionary
Map ADT
• Collection of Entrys
• key – searched for
• value – cared about
• Basic implementation: List w/ Entrys in increasing order of keys
• key in at most 1 Entry
Dictionary ADT
• Collection of Entrys
• key – searched for
• value – cared about
• Basic implementation: List w/ Entrys in increasing order of keys
• Entrys can share key
Ordered List-Based Approach
• Idea normally imagined w/ Map & Dictionary
• Maintains ordered list of key-value pairs
• Must maintain Entrys ordered by their key
• Faster searching provides performance win
Q: “Mom, how do I spell _______?” A: “Look it up.”
• Efficiency gains not just for get & getAll
• Entrys with same key stored in any order
• Requires only that keys be in order
Ordered List-Based Approach
• Iterators should respect ordering of Entrys
• Should not be a problem, if Entrys stored in order
• If O(1) access time, search time is O(log n)
• Array-based structure required to hold Entrys
• To get immediate access, needs to access by index
• Requires IndexList-based implementation
Binary Search
• Finds key using divide-and-conquer approach
• First of many times you will be seeing this approach
• Algorithm splits problem into smaller ones solved using recursion
• Base case 1: No Entrys remain to find the key
• Base case 2: At data’s midpoint is matching key
• Recursive Step 1: If midpoint too high, use lower half
• Recursive Step 2: If midpoint too low, use upper half
Binary Search
• low and high params specify range to check
• Would be called with 0 & size() – 1, initially
• If l > h, no match possible in this data
• Compare with key at midpoint of low & high
• Consider steps for find(7) on keys 0 1 3 4 5 7 8 9 11 14 16 18 19 (indices 0 to 12, midpoint rounded down):
• Step 1: l = 0, h = 12, m = 6; key 8 is too high, so use lower half
• Step 2: l = 0, h = 5, m = 2; key 3 is too low, so use upper half
• Step 3: l = 3, h = 5, m = 4; key 5 is too low, so use upper half
• Step 4: l = m = h = 5; key 7 matches
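
The find(7) walk-through can be reproduced with a short recursive sketch over an int array. The class name BinarySearchDemo and the choice of -1 as the "no match" result are assumptions; the slides do not say what the search returns when the range is empty.

```java
class BinarySearchDemo {
    // Returns the index of key within keys[low..high], or -1 if it is absent.
    // keys must be sorted in increasing order.
    static int find(int[] keys, int key, int low, int high) {
        if (low > high) {
            return -1;                              // base case 1: nothing left to check
        }
        int mid = low + (high - low) / 2;           // midpoint, rounded down
        if (keys[mid] == key) {
            return mid;                             // base case 2: match at midpoint
        } else if (keys[mid] > key) {
            return find(keys, key, low, mid - 1);   // midpoint too high: lower half
        } else {
            return find(keys, key, mid + 1, high);  // midpoint too low: upper half
        }
    }

    public static void main(String[] args) {
        int[] keys = {0, 1, 3, 4, 5, 7, 8, 9, 11, 14, 16, 18, 19};
        // Initial call covers the whole array: 0 and length - 1.
        System.out.println(find(keys, 7, 0, keys.length - 1));  // prints 5
    }
}
```

Each call discards half of the remaining range, which is where the O(log n) search time comes from.
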
Using Ordered Sequence
• get uses binary search; takes O(log n) time
• Should also start with binary search for getAll()
• getAll checks neighbors to find all matches
• Add and remove methods could use binary search
• List shifts elements in put to make hole for element
• Would also need to do shift when removing from list
• Each takes O(n) total time in worst case as a result
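
A rough sketch of these points, assuming int keys for brevity: get and getAll start from a binary search, getAll then checks neighbors, and put shifts later elements to open a hole. The class name OrderedIntDictionary and the two parallel lists are assumptions, and put finds its position with a simple scan here since the shift already costs O(n).

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical ordered Dictionary keyed by int; two parallel lists hold the pairs.
class OrderedIntDictionary<V> {
    private final List<Integer> keys = new ArrayList<>();  // kept in increasing order
    private final List<V> values = new ArrayList<>();      // values.get(i) pairs with keys.get(i)

    // Binary search: index of some pair with this key, or -1 if none.
    private int indexOf(int key) {
        int low = 0, high = keys.size() - 1;
        while (low <= high) {
            int mid = low + (high - low) / 2;
            int k = keys.get(mid);
            if (k == key) return mid;
            if (k > key) high = mid - 1; else low = mid + 1;
        }
        return -1;
    }

    // Inserts after any equal keys; ArrayList shifts later elements right.
    public void put(int key, V value) {
        int i = 0;
        while (i < keys.size() && keys.get(i) <= key) i++;
        keys.add(i, key);
        values.add(i, value);
    }

    // O(log n): one value stored under key, or null if the search fails.
    public V get(int key) {
        int i = indexOf(key);
        return (i < 0) ? null : values.get(i);
    }

    // Binary search for one match, then check neighbors for the rest.
    public List<V> getAll(int key) {
        List<V> matches = new ArrayList<>();
        int hit = indexOf(key);
        if (hit < 0) return matches;
        int first = hit;
        while (first > 0 && keys.get(first - 1) == key) first--;  // walk left
        for (int i = first; i < keys.size() && keys.get(i) == key; i++) {
            matches.add(values.get(i));                           // walk right
        }
        return matches;
    }
}
```

Since equal keys sit next to each other in an ordered list, the neighbor check costs time proportional to the number of matches on top of the O(log n) search.
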
Comparing Keys
• For all searching, must find matching keys
• Cannot rely upon equals() when ordering
• Want to be lazy, write code for all types of key
• Use <, >, == if keys numeric type, but this is limiting
• String also has simple method: compareTo()
• Is there a general way of comparing keys based upon this idea?
Comparable<E> Interface
• Standard part of Java, found in java.lang
• Defines single method used for comparison
• compareTo(E obj) compares instance with obj
• Returns int which is either negative, zero, or positive
Ordered Sequence Example
• Easiest to require that keys be Comparable
• Now reuse class anywhere by adding interface
• Also use standard types like String & Integer
• compareTo() in binary search makes it simple
    int c = k.compareTo(list.get(m).getKey());
    if (c > 0) {
        return binarySearch(k, m + 1, h);
    } else if (c < 0) {
        return binarySearch(k, l, m - 1);
    } else {
        return m;
    }
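
To show the fragment above in a complete method, here is a sketch that supplies the missing pieces. The class name OrderedSearch, the nested Entry class, the extra list parameter, the midpoint computation, and the -1 result for "no match" are all assumptions added for illustration; only the three-way compareTo branch comes from the slide.

```java
import java.util.ArrayList;
import java.util.List;

class OrderedSearch {
    // Minimal Entry using the accessor name from the slide's snippet.
    static final class Entry<K, V> {
        private final K key;
        private final V value;
        Entry(K key, V value) { this.key = key; this.value = value; }
        K getKey() { return key; }
        V getValue() { return value; }
    }

    // Returns the index in list[l..h] of an Entry whose key matches k, or -1 if none.
    // list must already be ordered by key.
    static <K extends Comparable<K>, V> int binarySearch(
            List<Entry<K, V>> list, K k, int l, int h) {
        if (l > h) {
            return -1;                          // no Entrys remain to check
        }
        int m = l + (h - l) / 2;                // midpoint of l & h
        int c = k.compareTo(list.get(m).getKey());
        if (c > 0) {
            return binarySearch(list, k, m + 1, h);  // k is larger: upper half
        } else if (c < 0) {
            return binarySearch(list, k, l, m - 1);  // k is smaller: lower half
        } else {
            return m;                           // keys match
        }
    }

    public static void main(String[] args) {
        List<Entry<String, Integer>> list = new ArrayList<>();
        list.add(new Entry<>("bitters", 1));
        list.add(new Entry<>("bourbon", 2));
        list.add(new Entry<>("cherry", 3));
        list.add(new Entry<>("vermouth", 4));
        System.out.println(binarySearch(list, "cherry", 0, list.size() - 1));  // prints 2
    }
}
```

Bounding K by Comparable<K> lets the same method work for String, Integer, or any class of your own that implements compareTo().
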
What is a Map/Dictionary?
• At simplest level, both are collections of Entrys
• Focus on transforming data (or so it appears)
• Add data with key and value to which it is transformed
• Accessor transforms key to value associated with key
• remove() used to delete an Entry
• At most one value per key using a Map
• With Dictionary, multiple values per key possible
Before Next Lecture…
• Week #4 assignment due Tuesday at 5PM
• Continue to do reading in your textbook
• Learn more about hash & what it means in CSC
• How can we tell if a hash is any good?
• Hash tables sound cool, but how do we make them?
• Monday is when lab project phase #1 is due
• Will have time in lab, but then will be the weekend
• Project #1 available tonight after lab
• Will be due in parts to “encourage” good habits