Mapping Functions and Iterators
Eric Roberts
CS 106B
May 8, 2009
Where Are We Now?
• In our last episode, we implemented the generic Map class based on the idea of a hash table.
• Our implementation had all of the features of the library version of Map, with two exceptions:
    • There was no support for associative array selection.
    • There was no support for iterating through the keys in a map.
• The game plan for today is to
    • Implement associative array selection by defining the [] operator
    • Implement mapping functions
    • Describe iterator strategies
Interface Entry for the [] Operator

    /*
     * Method: operator[]
     * Usage: map[key] = newValue;
     * ---------------------------
     * This method overloads [] to access values from this
     * map by key. The argument inside the brackets is the key (a
     * string). This allows the client to use "associative-array"
     * notation to get/set the value associated with a key.
     * If the key is already present in the map, this function returns
     * a reference to its associated value. If the key is absent, a new
     * entry is created and a reference to its value is returned.
     * Because this function returns the value by reference, it allows
     * in-place modification of the value.
     */
    ValueType & operator[](string key);
Implementation of the [] Operator

    /*
     * Implementation notes: operator []
     * ---------------------------------
     * This method looks very much like put except that it doesn't
     * store a new value and returns the position in the cell by
     * reference.
     */
    template <typename ValueType>
    ValueType & Map<ValueType>::operator[](string key) {
        int index = hash(key) % nBuckets;
        cellT *cell = findCell(buckets[index], key);
        if (cell == NULL) {
            cell = new cellT;
            cell->key = key;
            cell->link = buckets[index];
            buckets[index] = cell;
            nEntries++;
        }
        return cell->value;
    }
Functions as Data
• Up to this point in CS106, we’ve thought of functions as part of the control structure and as completely separate from the data structure.
• That view, however, is limiting and does not reflect what goes on inside the machine. One of the foundational ideas of modern computing (usually attributed to John von Neumann, although there are other valid claims to the idea) is that code is stored in the same memory as data. This concept is called the stored-program model.
• If you go on to take CS 107, you will learn a little about how code is represented inside the computer. The details of that representation, however, are not important for the moment. What is important is that every C++ function lives somewhere in memory and therefore has an address. It is therefore possible to refer to a function in a data structure by storing its address as a pointer value.
Function Pointers in C++
• In keeping with its predecessor language, C++ makes it possible for programmers to use function pointers explicitly.
• The syntax for declaring function pointers is consistent with the syntax for other pointer declarations, although it takes some getting used to. Consider the following declarations:

    double x;                Declares x as a double.
    double *px;              Declares px as a pointer to a double.
    double f();              Declares f as a function returning a double.
    double *g();             Declares g as a function returning a pointer to a double.
    double (*proc)();        Declares proc as a pointer to a function taking no arguments and returning a double.
    double (*fn)(double);    Declares fn as a pointer to a function taking and returning a double.
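As a small illustration of the double (*fn)(double) form, the sketch below stores a function's address in a pointer parameter and calls through it. The names applyTwice and square are hypothetical, chosen only for this example.

```cpp
#include <cassert>

// Hypothetical client function with the signature double -> double.
double square(double x) {
    return x * x;
}

// Hypothetical helper: takes a pointer to any double -> double function
// and applies it to x twice, computing fn(fn(x)).
double applyTwice(double (*fn)(double), double x) {
    assert(fn != nullptr);   // a null function pointer cannot be called
    return fn(fn(x));        // calling through the pointer looks like a normal call
}
```

Calling applyTwice(square, 3.0) passes the address of square (a function name used as a value converts to a pointer) and evaluates square(square(3.0)), which is 81.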
Exercise: Plotting a Function
Write a function Plot that takes a function (double → double) as a parameter along with the limits of the domain and range and then plots that function on the graphics window.
For example, if you were to call

    Plot(cos, -2 * PI, 2 * PI, -1.0, 1.0);

you would expect to see something like this:
[Figure: graph of the cosine function for x from -2π to 2π and y from -1 to 1]
Exercise: Two Related Questions
1. What is the prototype for the Plot function?

    void Plot(double (*fn)(double), double minX, double maxX,
                                    double minY, double maxY);

2. How would you convert values x and y in the mathematical domain into screen points sx and sy?
Hint: As x moves from minX to maxX, sx must move from 0 to GetWindowWidth(); as y moves from minY to maxY, sy must move in the opposite direction, from GetWindowHeight() to 0.

    double width = GetWindowWidth();
    double height = GetWindowHeight();
    double sx = (x - minX) / (maxX - minX) * width;
    double sy = height - (y - minY) / (maxY - minY) * height;
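The two conversion formulas can be checked without a graphics window by packaging them as functions. This is a sketch only: PlotFrame, toScreenX, and toScreenY are hypothetical names, and the window size is passed in rather than read from GetWindowWidth() and GetWindowHeight().

```cpp
#include <cassert>

// Hypothetical helper type: the mathematical region being plotted plus
// the window dimensions that the graphics library would report.
struct PlotFrame {
    double minX, maxX;
    double minY, maxY;
    double width, height;
};

// Maps x in [minX, maxX] linearly onto [0, width].
double toScreenX(const PlotFrame & f, double x) {
    assert(f.maxX != f.minX);   // an empty domain would divide by zero
    return (x - f.minX) / (f.maxX - f.minX) * f.width;
}

// Maps y in [minY, maxY] linearly onto [height, 0]; the subtraction
// from height flips the direction, as the hint describes.
double toScreenY(const PlotFrame & f, double y) {
    assert(f.maxY != f.minY);   // an empty range would divide by zero
    return f.height - (y - f.minY) / (f.maxY - f.minY) * f.height;
}
```

With a 400 by 200 window, x from -2 to 2, and y from -1 to 1, the point (0, 0) lands at the window center (200, 100), and the endpoints of each interval land on the window edges.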
The Overflow Problem
• The graph of the exponential function and the bug in the code that it revealed provide a good illustration of just how quickly exponential-time algorithms degrade as N increases.
• When I talked about overflow problems a few weeks ago in class, Katie Dektar pointed me to the following XKCD comic that had appeared earlier that week:
[Comic: "Can’t Sleep", Randall Munroe, XKCD, April 2009]
Mapping Functions
• The ability to work with pointers to functions offers one solution to the problem of iterating through the elements of a map. All you need to do is specify a function that can be applied to a given element and then have the implementation of the Map class apply that function to each element in turn.
• To support that strategy, the Map class could export a method

    void mapAll(void (*fn)(string));

  that calls fn on each key in the map.
• For example, you could then print out every key in myMap by calling

    myMap.mapAll(PrintKey)

  where PrintKey is

    void PrintKey(string key) {
        cout << key << endl;
    }
Exercise: Implement mapAll
Implement the simple version of

    void mapAll(void (*fn)(string));

as part of the Map class.
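One possible approach to the exercise is sketched below. It is not the library's code: ChainedMap is a hypothetical, stripped-down stand-in for the lecture's Map that keeps only the bucket array and cell chains, uses std::hash in place of the course's hash function, and fixes the bucket count at 7. The callback collectKey and the global collectedKeys are also assumptions made for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

// Hypothetical stand-in for Map<ValueType>: a chained hash table with
// just enough machinery to demonstrate mapAll.
template <typename ValueType>
class ChainedMap {
public:
    ChainedMap() : buckets(7, nullptr) {}
    ChainedMap(const ChainedMap &) = delete;
    ChainedMap & operator=(const ChainedMap &) = delete;

    ~ChainedMap() {
        for (cellT *cell : buckets) {
            while (cell != nullptr) {
                cellT *next = cell->link;
                delete cell;
                cell = next;
            }
        }
    }

    // Stores value under key, replacing any existing entry for that key.
    void put(const std::string & key, const ValueType & value) {
        std::size_t index = std::hash<std::string>()(key) % buckets.size();
        for (cellT *cell = buckets[index]; cell != nullptr; cell = cell->link) {
            if (cell->key == key) {
                cell->value = value;
                return;
            }
        }
        buckets[index] = new cellT{key, value, buckets[index]};
    }

    // Calls fn once per key: the outer loop visits each bucket and the
    // inner loop walks that bucket's chain until it reaches the null link.
    void mapAll(void (*fn)(std::string)) const {
        assert(fn != nullptr);
        for (cellT *cell : buckets) {
            for (; cell != nullptr; cell = cell->link) {
                fn(cell->key);
            }
        }
    }

private:
    struct cellT {
        std::string key;
        ValueType value;
        cellT *link;
    };
    std::vector<cellT *> buckets;
};

// Example callback. With only a key parameter, the callback has no place
// to put results other than global state such as this vector.
std::vector<std::string> collectedKeys;

void collectKey(std::string key) {
    collectedKeys.push_back(key);
}
```

After m.mapAll(collectKey), collectedKeys holds every key exactly once, in whatever order the buckets happen to store them. The reliance on a global variable is the limitation that motivates passing extra data to the callback.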
The Bucket Hash Structure
[Figure: a hash table with seven buckets, numbered 0 to 6. Each bucket points to a linked chain of cells ending in null; each cell maps a two-letter state abbreviation to the state's name (for example, CA to California and SD to South Dakota).]
Passing Values and Keys Together
• Suppose, however, that myMap is a Map<int> and that you want to print the keys and the values together. You could not achieve that goal with the current definition of mapAll because there is no way for the PrintKey function to get access to the values or even to the map itself.
• You can get closer by having the mapping function take both a key and a value, where the type of the value is determined by the template parameter of the map. The version of mapAll exported by the library looks like this:

    void mapAll(void (*fn)(string, ValueType));

• Given this definition, you could change PrintKey to

    void PrintKey(string key, int value) {
        cout << key << "=" << value << endl;
    }
Passing Data to Mapping Functions
• Even this change, however, is not sufficient to write anything at all complicated using the mapping function approach. In almost all cases, you need to pass additional information to the mapping function. That data must pass from the client, through the implementation, and back into the function the client supplied. For this reason, these functions are often referred to as callback functions.
• The template facility makes it easier to pass data to callback functions than it would be otherwise. In addition to the mapAll method shown on the preceding slide, the Map class also exports the following template method:

    template <typename ClientDataType>
    void mapAll(void (*fn)(string, ValueType, ClientDataType &),
                ClientDataType & data);
Exercise: Find Longest State Name
Suppose that you have a Map<string> named stateNames that maps two-letter abbreviations to state names. Write a function that returns the longest state name in the map.
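A sketch of one way to approach this exercise with the client-data form of mapAll follows. Because the CS106 Map class is not available outside the course libraries, StubMap is a hypothetical stand-in built on std::map that reproduces only the mapAll signature quoted earlier; keepLongest and longestStateName are names invented for the example.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical stand-in for the library Map<ValueType>. Only put and the
// client-data version of mapAll are provided.
template <typename ValueType>
class StubMap {
public:
    void put(const std::string & key, const ValueType & value) {
        entries[key] = value;
    }

    // Same shape as the template method on the slide: the data argument
    // travels from the client, through the map, and back into fn.
    template <typename ClientDataType>
    void mapAll(void (*fn)(std::string, ValueType, ClientDataType &),
                ClientDataType & data) const {
        assert(fn != nullptr);
        for (const auto & entry : entries) {
            fn(entry.first, entry.second, data);
        }
    }

private:
    std::map<std::string, ValueType> entries;
};

// Callback: the client data is the longest name seen so far.
void keepLongest(std::string key, std::string name, std::string & longest) {
    (void) key;   // the abbreviation is not needed here
    if (name.length() > longest.length()) {
        longest = name;
    }
}

// Returns the longest state name, or the empty string for an empty map.
std::string longestStateName(const StubMap<std::string> & stateNames) {
    std::string longest;
    stateNames.mapAll(keepLongest, longest);
    return longest;
}
```

The local variable longest plays the role of the client data, so no global state is needed. If two names tie for longest, this version keeps whichever one the map visits first.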
Iterators
• Hardly anyone today uses mapping functions in practice because they have been superseded by iterators, which are far more convenient to use.
• Unfortunately, the strategy used to implement iterators in the CS106 libraries would be extremely difficult to explain to students at this level. Over the summer, I’ll rewrite the iterator code so that it makes sense to explain it at the 106B level. For now, the best strategy is to talk about how one might implement them.
• The basic strategy behind any iterator is the same as the one used in the implementation of a mapping function. Instead of going through the entire set of values all at once, however, an iterator must maintain enough internal state to keep track of how to return the next value.
Offline vs. Online Iterators
• One simple way to implement an iterator is to adopt the following strategy:
    • Create an empty vector of the same element type.
    • Use a mapping function to store each element in the vector.
    • Store the vector and current index in the iterator object.
    • Implement next() by returning the current element and then advancing the index.
    • Implement hasNext() by checking whether the index is still short of the end of the vector.
  Such an iterator is called an offline iterator.
• Offline iterators tend to be easy to write, but the fact that they have to precompute the entire list of elements makes them so inefficient that no one really uses them.
• The online iterator model keeps enough state information in the iterator so that precomputation is not required.
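The offline strategy listed above can be sketched in a few lines. Everything here is an assumption made for illustration: KeySet is a hypothetical collection whose only traversal mechanism is a client-data mapping function, and OfflineIterator is not the CS106 iterator class.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <string>
#include <vector>

// Hypothetical collection that exposes traversal only through a
// mapping function with client data.
class KeySet {
public:
    void add(const std::string & key) { keys.insert(key); }

    template <typename ClientDataType>
    void mapAll(void (*fn)(std::string, ClientDataType &),
                ClientDataType & data) const {
        for (const std::string & key : keys) {
            fn(key, data);
        }
    }

private:
    std::set<std::string> keys;
};

class OfflineIterator {
public:
    // Starts with an empty vector, fills it through the mapping
    // function, and positions the index at the first element.
    explicit OfflineIterator(const KeySet & source) : index(0) {
        source.mapAll(appendKey, elements);
    }

    // True while the index is still short of the end of the vector.
    bool hasNext() const { return index < elements.size(); }

    // Returns the current element and then advances the index.
    std::string next() {
        assert(hasNext());
        return elements[index++];
    }

private:
    // Callback used to copy each key into the snapshot vector.
    static void appendKey(std::string key, std::vector<std::string> & out) {
        out.push_back(key);
    }

    std::vector<std::string> elements;   // snapshot taken at construction
    std::size_t index;
};
```

Because the iterator copies every element up front, its cost is proportional to the size of the collection even if the client reads only one element. The same copy is what makes it immune to later changes in the collection.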
Checking for Modifications
• The one area in which offline iterators have some value over their more common online counterparts is that precomputing the element list means that the client is free to change the structure of the collection class while the iterator is running. By contrast, adding or removing elements from a collection during iteration is likely to cause errors when using online iterators.
• Modern iterator packages protect against this kind of error by checking whether the structure has been modified whenever next or hasNext is called. If so, the iterator can generate an appropriate error message instead of failing unpredictably.
• The easiest way to check for modification is to include a timestamp in both the structure and the iterator. Changing the structure increments the timestamp, so the iterator code can check whether it has changed.
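The timestamp idea might look like the following sketch. StampedList and its nested Iterator are hypothetical classes written for this example, and reporting the error by throwing std::runtime_error is an assumption; the slides do not say how the library reports it.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical collection that increments a timestamp on every
// structural change.
class StampedList {
public:
    StampedList() : timestamp(0) {}

    void add(const std::string & value) {
        elements.push_back(value);
        timestamp++;   // any change to the structure bumps the stamp
    }

    // Online iterator: keeps only a position and the timestamp it saw
    // when it was created.
    class Iterator {
    public:
        explicit Iterator(const StampedList & list)
            : owner(&list), index(0), expected(list.timestamp) {}

        bool hasNext() const {
            checkUnmodified();
            return index < owner->elements.size();
        }

        std::string next() {
            checkUnmodified();
            assert(index < owner->elements.size());
            return owner->elements[index++];
        }

    private:
        // Fails predictably if the collection changed after this
        // iterator was created.
        void checkUnmodified() const {
            if (owner->timestamp != expected) {
                throw std::runtime_error("collection modified during iteration");
            }
        }

        const StampedList *owner;
        std::size_t index;
        unsigned long expected;
    };

    Iterator iterator() const { return Iterator(*this); }

private:
    std::vector<std::string> elements;
    unsigned long timestamp;
};
```

An iterator created before a call to add throws on its next use, while an iterator created afterward works normally because it records the new timestamp.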