maps & dictionaries

maps & dictionaries Go&Ta 9.3

map
• A map models a searchable collection of key-value entries
• The main operations of a map are for searching, inserting, and deleting items
• Multiple entries with the same key are not allowed
• Applications:
  • address book (key = name, value = address)
  • student-record database (key = student id, value = student record)

map The map ADT requires that each key is unique, so the association of keys to values defines a mapping

example map
Operation    Output   Map
isEmpty()    true     Ø
put(5,A)     null     (5,A)
put(7,B)     null     (5,A),(7,B)
put(2,C)     null     (5,A),(7,B),(2,C)
put(8,D)     null     (5,A),(7,B),(2,C),(8,D)
put(2,E)     C        (5,A),(7,B),(2,E),(8,D)
get(7)       B        (5,A),(7,B),(2,E),(8,D)
get(4)       null     (5,A),(7,B),(2,E),(8,D)
get(2)       E        (5,A),(7,B),(2,E),(8,D)
size()       4        (5,A),(7,B),(2,E),(8,D)
remove(5)    A        (7,B),(2,E),(8,D)
remove(2)    E        (7,B),(8,D)
get(2)       null     (7,B),(8,D)
isEmpty()    false    (7,B),(8,D)
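The trace above can be replayed against java.util.HashMap, whose put, get and remove return values behave the same way (previous value, or null if there was none). This is only a sketch to check the table by hand; the class name MapTrace and the use of String values are assumptions made for the illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class MapTrace {
    // Replays the operations of the example and collects each output in order.
    static List<Object> trace() {
        Map<Integer, String> m = new HashMap<>();
        List<Object> out = new ArrayList<>();
        out.add(m.isEmpty());
        out.add(m.put(5, "A"));
        out.add(m.put(7, "B"));
        out.add(m.put(2, "C"));
        out.add(m.put(8, "D"));
        out.add(m.put(2, "E"));   // key 2 is already present: value replaced, old value returned
        out.add(m.get(7));
        out.add(m.get(4));        // no entry with key 4
        out.add(m.get(2));
        out.add(m.size());
        out.add(m.remove(5));
        out.add(m.remove(2));
        out.add(m.get(2));        // key 2 was removed above
        out.add(m.isEmpty());
        return out;
    }

    public static void main(String[] args) {
        System.out.println(trace());
    }
}
```

Running main prints [true, null, null, null, null, C, B, null, E, 4, A, E, null, false], which is the Output column read top to bottom.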

map
A Simple List-Based Map
• We can implement a map using an unsorted list
• We store the items of the map in a list S (based on a doubly-linked list), in arbitrary order
[Figure: doubly-linked list running from head to tail, each node holding one of the entries]

map
Performance of a List-Based Map
• put, get and remove take O(n) time, since in the worst case (the item is not found) we traverse the entire sequence looking for an item with the given key
• The unsorted list implementation is effective only for maps of small size
• All of the fundamental operations take O(n) time. We would like something faster …
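A minimal sketch of the list-based map, assuming java.util.LinkedList as the doubly-linked list S. The class name ListMap is made up for the illustration and is not from the slides. Each of put, get and remove walks the list, which is where the O(n) comes from.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.Iterator;
import java.util.LinkedList;
import java.util.Map.Entry;

class ListMap<K, V> {
    // Entries are kept in arbitrary (here: insertion) order.
    private final LinkedList<Entry<K, V>> entries = new LinkedList<>();

    int size() { return entries.size(); }
    boolean isEmpty() { return entries.isEmpty(); }

    // O(n): linear search for the key; null if it is not there.
    V get(K key) {
        for (Entry<K, V> e : entries)
            if (e.getKey().equals(key)) return e.getValue();
        return null;
    }

    // O(n): we must search first, because keys have to stay unique.
    V put(K key, V value) {
        for (Entry<K, V> e : entries)
            if (e.getKey().equals(key)) return e.setValue(value); // returns the old value
        entries.addLast(new SimpleEntry<>(key, value));
        return null;
    }

    // O(n): find the entry, unlink it, return its value (or null).
    V remove(K key) {
        Iterator<Entry<K, V>> it = entries.iterator();
        while (it.hasNext()) {
            Entry<K, V> e = it.next();
            if (e.getKey().equals(key)) {
                it.remove();
                return e.getValue();
            }
        }
        return null;
    }
}
```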

An example of using a map: HashMap

These were some experiments with Rebecca. We have a number of patches (centres of population) that may be infected by some disease. There’s a probability that the disease can travel from one location to another (a function of distance and size of patch). A state is therefore a set of infected patches: if we have n patches there are 2^n states. I can represent a state as a BitSet (n bits), where a bit is on iff that patch is infected.
So, I took on the task of building a harness to check out the feasibility of capturing every state visited by a simulation and how often we visited each of these states. Rebecca was trying to identify quasi-stationary states. This empirical study would then be compared to an analytical model to measure the difference between theory and practice.
So, I thought I’d use a map, where entries were states, each state with a counter of the number of times visited. I could use the BitSet as the key.

I forgot to say, Simon was doing the theory bit …

So, here’s a state

… and I can create a state from a BitSet or an integer

… and when I revisit a state I increment its counter

… convert a BitSet to an integer … convert an integer to a BitSet … classic stuff (show off?)
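The code shown on these slides is not in the text, so here is a hedged reconstruction of what such a state might look like. Only BitSet, toInt and toBitSet are named in the slides. The class name State, the fields patches and count, the method visit, and the choice to start the counter at 1 are all assumptions. The conversions also assume fewer than 32 patches, so that a state fits in a non-negative int.

```java
import java.util.BitSet;

class State {
    final BitSet patches;  // bit i is on iff patch i is infected
    int count;             // number of times this state has been visited

    // Create a state from a BitSet (copied, so later changes to the argument do not leak in).
    State(BitSet patches) {
        this.patches = (BitSet) patches.clone();
        this.count = 1;    // assumption: creating the state counts as the first visit
    }

    // Create a state from its integer code.
    State(int code) {
        this(toBitSet(code));
    }

    // Revisiting a state increments its counter.
    void visit() {
        count++;
    }

    // BitSet -> integer: OR in 2^i for every bit i that is on.
    static int toInt(BitSet b) {
        int x = 0;
        for (int i = b.nextSetBit(0); i >= 0; i = b.nextSetBit(i + 1))
            x |= 1 << i;
        return x;
    }

    // Integer -> BitSet: peel off the low bit, shift right, repeat.
    static BitSet toBitSet(int x) {
        BitSet b = new BitSet();
        for (int i = 0; x != 0; i++, x >>>= 1)
            if ((x & 1) == 1) b.set(i);
        return b;
    }
}
```

For example, State.toBitSet(13) turns on bits 0, 2 and 3 (13 = 1101 in binary), and State.toInt maps that BitSet back to 13.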

So, now I want to pretend I have n possible patches and I’m going to run my simulation for m iterations and capture all states with frequency of occurrence, and I wanted it to be quick, and I wanted it to be compact, and I wanted it to be simple … and I used a HashMap

… and this is me generating m states and testing to see if I have visited them before

… and then I print them out
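A sketch of the kind of harness described above, under loud assumptions: the real simulation is not in the text, so states here are simply drawn uniformly at random, which is a stand-in and not the disease model. The names VisitCounter and simulate and the seed parameter are made up. The point is only the HashMap idiom: BitSet defines equals and hashCode, so it can serve as the key, and merge either starts a counter at 1 or increments the existing one.

```java
import java.util.BitSet;
import java.util.HashMap;
import java.util.Map;
import java.util.Random;

class VisitCounter {
    // Generates m random states over n patches and counts how often each state is seen.
    static Map<BitSet, Integer> simulate(int n, int m, long seed) {
        Random rnd = new Random(seed);
        Map<BitSet, Integer> visits = new HashMap<>();
        for (int i = 0; i < m; i++) {
            BitSet state = new BitSet(n);          // a fresh key each time, never mutated afterwards
            for (int p = 0; p < n; p++)
                if (rnd.nextBoolean()) state.set(p);
            visits.merge(state, 1, Integer::sum);  // first visit -> 1, revisit -> count + 1
        }
        return visits;
    }

    public static void main(String[] args) {
        Map<BitSet, Integer> visits = simulate(4, 1000, 42L);
        // HashMap iteration order is unspecified, so the states come out in no particular order.
        for (Map.Entry<BitSet, Integer> e : visits.entrySet())
            System.out.println(e.getKey() + " visited " + e.getValue() + " times");
    }
}
```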

… but Rebecca wanted to do a statistical test, where we rank states … The results aren’t in a structure that’s quite right.

Could you sort states?

… now keys have to be comparable, and that’s why we have toInt and toBitSet

… and now it’s in order 
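BitSet does not implement Comparable, so it cannot be used directly as a TreeMap key without supplying a comparator. One way round this, sketched here, is to key the TreeMap on the integer code of each state. The names SortedVisits and sortByCode are assumptions, and toInt is a local copy of the conversion idea (fewer than 32 patches assumed).

```java
import java.util.BitSet;
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

class SortedVisits {
    // BitSet -> integer code, so that states become comparable keys.
    static int toInt(BitSet b) {
        int x = 0;
        for (int i = b.nextSetBit(0); i >= 0; i = b.nextSetBit(i + 1))
            x |= 1 << i;
        return x;
    }

    // Copies the visit counts into a TreeMap keyed by integer code; iteration is then in key order.
    static TreeMap<Integer, Integer> sortByCode(Map<BitSet, Integer> visits) {
        TreeMap<Integer, Integer> sorted = new TreeMap<>();
        for (Map.Entry<BitSet, Integer> e : visits.entrySet())
            sorted.put(toInt(e.getKey()), e.getValue());
        return sorted;
    }

    public static void main(String[] args) {
        Map<BitSet, Integer> visits = new HashMap<>();
        visits.put(BitSet.valueOf(new long[] { 5L }), 2);  // state 101 visited twice
        visits.put(BitSet.valueOf(new long[] { 1L }), 7);  // state 001 visited seven times
        visits.put(BitSet.valueOf(new long[] { 3L }), 1);  // state 011 visited once
        System.out.println(sortByCode(visits));
    }
}
```

Running main prints {1=7, 3=1, 5=2}: the same counts, now in ascending order of state code.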

… and that raises another question.

How do they implement a TreeMap?

… and then I wonder …

Am I going round in circles?

dictionaries

a reality check

dictionaries
• The dictionary abstract data type stores key-element pairs (k,v), which we call entries, where k is the key and v is the value
• A dictionary allows for multiple entries with the same key, much like an English dictionary, where we can have multiple definitions of the same word
• The primary use of a dictionary is to store values so that they can be located quickly using keys

We might have an ordered or unordered dictionary

api dictionaries
Given a dictionary D
• size()
• isEmpty()
• find(k): k is a key; returns a matching entry, or null
• insert(k,v): k is a key with value v; inserts the entry into D and returns it
• remove(k): removes from D an entry with key k
• findAll(k): returns an iterable collection of the entries with key k
Note: find(k) finds an arbitrary entry (k,v)

example dictionary
Operation     Output   Dictionary
isEmpty()     true     Ø
insert(5,A)   (5,A)    (5,A)
insert(7,B)   (7,B)    (5,A),(7,B)
insert(2,C)   (2,C)    (5,A),(7,B),(2,C)
insert(5,D)   (5,D)    (5,A),(7,B),(2,C),(5,D)
insert(2,C)   (2,C)    (5,A),(7,B),(2,C),(5,D),(2,C)
find(7)       (7,B)    (5,A),(7,B),(2,C),(5,D),(2,C)
find(4)       null     (5,A),(7,B),(2,C),(5,D),(2,C)
find(2)       (2,C)    (5,A),(7,B),(2,C),(5,D),(2,C)
size()        5        (5,A),(7,B),(2,C),(5,D),(2,C)
remove(5)     (5,A)    (7,B),(2,C),(5,D),(2,C)
remove(2)     (2,C)    (7,B),(5,D),(2,C)
size()        3        (7,B),(5,D),(2,C)
isEmpty()     false    (7,B),(5,D),(2,C)

dictionaries unordered

unordered list dictionaries
• size() is O(1) … maintain a counter
• isEmpty() is O(1) … test on size
• find(k) is O(n) … linear search
• insert(k,v) is O(1) … addLast or addFirst
• remove(k) is O(n) … we need to find the entry
• findAll(k) is O(n) … search the entire list!!!
Summary: fast insertion; slow find, findAll, & remove
BUT: if we want to use the “move to front” heuristic … maybe good
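A sketch of the unordered-list dictionary, again assuming java.util.LinkedList underneath. The class name ListDictionary and the moveToFront flag on find are assumptions made for the illustration. The flag shows one way the “move to front” heuristic could be wired in: an entry that has just been found is relinked at the head, so frequently requested keys tend to be found early.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.LinkedList;
import java.util.List;
import java.util.Map.Entry;

class ListDictionary<K, V> {
    private final LinkedList<Entry<K, V>> entries = new LinkedList<>();

    int size() { return entries.size(); }            // O(1): the list keeps a counter
    boolean isEmpty() { return entries.isEmpty(); }  // O(1)

    // O(1): duplicates are allowed, so there is no need to search before appending.
    Entry<K, V> insert(K k, V v) {
        Entry<K, V> e = new SimpleEntry<>(k, v);
        entries.addLast(e);
        return e;
    }

    // O(n): first entry with key k, or null. Optionally moves the hit to the front.
    Entry<K, V> find(K k, boolean moveToFront) {
        Iterator<Entry<K, V>> it = entries.iterator();
        while (it.hasNext()) {
            Entry<K, V> e = it.next();
            if (e.getKey().equals(k)) {
                if (moveToFront) { it.remove(); entries.addFirst(e); }
                return e;
            }
        }
        return null;
    }

    // O(n): removes and returns the first entry with key k, or null.
    Entry<K, V> remove(K k) {
        Iterator<Entry<K, V>> it = entries.iterator();
        while (it.hasNext()) {
            Entry<K, V> e = it.next();
            if (e.getKey().equals(k)) { it.remove(); return e; }
        }
        return null;
    }

    // O(n): has to look at every entry, since matches can be anywhere.
    List<Entry<K, V>> findAll(K k) {
        List<Entry<K, V>> hits = new ArrayList<>();
        for (Entry<K, V> e : entries)
            if (e.getKey().equals(k)) hits.add(e);
        return hits;
    }

    @Override
    public String toString() { return entries.toString(); }
}
```

Replaying the inserts and removes of the example dictionary leaves [7=B, 5=D, 2=C]; a following find(2, true) reorders it to [2=C, 7=B, 5=D].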

dictionaries hash table

hash table dictionaries
• size()
• isEmpty()
• find(k)
• insert(k,v)
• remove(k)
• findAll(k)
Need to deal with duplicates … i.e. natural collisions!
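One possible way to deal with duplicate keys, sketched on top of java.util.HashMap rather than a hand-built hash table: each key maps to a list of all the values inserted under it. The class name HashDictionary is an assumption, and for brevity this version returns values rather than entries, which differs from the api slide.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class HashDictionary<K, V> {
    // Every stored list is non-empty; a key is dropped when its last value is removed.
    private final Map<K, List<V>> buckets = new HashMap<>();
    private int size = 0;

    int size() { return size; }
    boolean isEmpty() { return size == 0; }

    // Appends v to the list for k, creating the list on first use.
    void insert(K k, V v) {
        buckets.computeIfAbsent(k, key -> new ArrayList<>()).add(v);
        size++;
    }

    // Returns one value stored under k (here: the oldest), or null.
    V find(K k) {
        List<V> vs = buckets.get(k);
        return vs == null ? null : vs.get(0);
    }

    // Returns a read-only view of all values stored under k (empty if none).
    List<V> findAll(K k) {
        List<V> vs = buckets.get(k);
        if (vs == null) return Collections.emptyList();
        return Collections.unmodifiableList(vs);
    }

    // Removes and returns one value stored under k (here: the oldest), or null.
    V remove(K k) {
        List<V> vs = buckets.get(k);
        if (vs == null) return null;
        V v = vs.remove(0);
        if (vs.isEmpty()) buckets.remove(k);
        size--;
        return v;
    }
}
```

With this layout findAll(k) costs one expected O(1) lookup plus the number of matches, instead of a scan of the whole collection.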

dictionaries ordered array

ordered array dictionaries
• size() is O(1) … maintain a counter
• isEmpty() is O(1) … test on size
• find(k) is O(log(n)) … binary search
• insert(k,v) is O(n) … binary search for the position, then shift entries to make room
• remove(k) is O(n) … as above: binary search, then shift entries to close the gap
• findAll(k) is O(n) … mix of binary search then left & right linear search
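The “binary search then left & right” idea for findAll can be sketched on a sorted array of int keys. The class name OrderedArraySearch and the choice to return the first and last index of the run are assumptions. Arrays.binarySearch makes no promise about which of several equal keys it lands on, which is exactly why the scan in both directions is needed.

```java
import java.util.Arrays;

class OrderedArraySearch {
    // Returns {first, last} indices of the run of entries equal to key, or null if key is absent.
    static int[] findAll(int[] sortedKeys, int key) {
        int hit = Arrays.binarySearch(sortedKeys, key);  // O(log n); any matching index
        if (hit < 0) return null;                        // negative result: key not present
        int lo = hit;
        int hi = hit;
        while (lo > 0 && sortedKeys[lo - 1] == key) lo--;                      // scan left
        while (hi < sortedKeys.length - 1 && sortedKeys[hi + 1] == key) hi++;  // scan right
        return new int[] { lo, hi };
    }

    public static void main(String[] args) {
        int[] keys = { 2, 2, 5, 5, 5, 7 };
        System.out.println(Arrays.toString(findAll(keys, 5)));
    }
}
```

Running main prints [2, 4]. The cost is O(log n) for the search plus the length of the run, which is O(n) in the worst case when every key is the same.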

ordered array dictionaries
• In some way it’s a bag!
• Could we “count” how many times something is in the bag?
• Could we then use a binary search tree with nodes having multiple entries?
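One hedged answer to the last question, using java.util.TreeMap (a red-black tree, i.e. a balanced binary search tree) in place of a hand-built tree: each key appears once in the tree and carries a list of all its values, so the “count” is just the length of that list. The class name TreeDictionary and its method names are assumptions.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

class TreeDictionary<K extends Comparable<K>, V> {
    // One tree node per distinct key; the node's value is the list of entries for that key.
    private final TreeMap<K, List<V>> tree = new TreeMap<>();

    // O(log n) to reach the node, then append to its list.
    void insert(K k, V v) {
        tree.computeIfAbsent(k, key -> new ArrayList<>()).add(v);
    }

    // How many times k is "in the bag".
    int count(K k) {
        List<V> vs = tree.get(k);
        return vs == null ? 0 : vs.size();
    }

    // A copy of all values stored under k (empty if none).
    List<V> findAll(K k) {
        List<V> vs = tree.get(k);
        if (vs == null) return new ArrayList<V>();
        return new ArrayList<V>(vs);
    }

    // Distinct keys in ascending order.
    List<K> keysInOrder() {
        return new ArrayList<K>(tree.keySet());
    }
}
```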

chocolate

No. You don’t deserve it
