Algorithmically Adversarial Input Design
Brian C. Dean, Chad Waters
School of Computing, Clemson University
“Making Mathematical Reasoning Fun” Workshop, ACM SIGCSE, 2013
Intro / Motivation
• I teach several algorithms and computational problem-solving classes, from the undergraduate through the graduate level.
• I also direct the USA Computing Olympiad, which provides algorithmic problem-solving tutorials and competitions to thousands of top high-school CS students worldwide.
• In both cases, a primary challenge is not so much making algorithmic problem-solving fun as helping students realize the intrinsic fun-ness of algorithmic problem-solving.
Algorithmic Problem Solving Fun
• Articulate problem-solving concepts in a more concrete, tangible medium: games, robots, cell phones, unplugged activities, multimedia.
• Teach problem-solving concepts that let students re-create cutting-edge computing technology they know and appreciate: recommendation systems, data mining, predictive text completion, web search with Google PageRank, handheld GPS-based automobile navigation.
• Team exercises; collaboration / competition.
It’s More Fun to be the Bad Guy…
• In security / software engineering classes, students often study vulnerabilities in software or security mechanisms from a “bad guy” perspective.
• This adversarial perspective is much less common in algorithms classes, though.
• However, it made for a very successful homework exercise in my undergraduate algorithms / data structures course…
The Exercise
• I give students a bit of code that has some sort of algorithmic weakness.
• Students need to examine this code and then submit a program that generates a bad input for my program.
• Success is defined by:
  - Student program runs fast.
  - Student program generates an input that makes my program run slow.
Example: Simplistic Hash Table
Hash(x) = (3x + 17) % table_size
• By reverse-engineering the mathematics of the hash function my program uses, students can provide a set of input elements that all hash to the same entry of the hash table!
• This makes a program that should have run in O(N) time take O(N^2) time instead.
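To make the collision idea concrete, here is a minimal sketch in Python. It assumes the instructor's table uses separate chaining and scans the chain for duplicates on every insert; the slides do not say this, and the table size 1009 and the helper names are hypothetical, chosen only for the demonstration. Adding any multiple of table_size to x leaves (3x + 17) % table_size unchanged, so the keys 0, table_size, 2·table_size, … all share one bucket.

```python
def weak_hash(x, table_size):
    """The hash function from the slide: (3x + 17) % table_size."""
    return (3 * x + 17) % table_size


def colliding_keys(n, table_size):
    """Return n distinct keys that all hash to the same bucket.

    Multiples of table_size contribute 0 to 3x modulo table_size,
    so every key lands in bucket 17 % table_size.
    """
    return [k * table_size for k in range(n)]


def insert_all(keys, table_size):
    """Insert keys into a chained table (ASSUMED design, not from the slides).

    Each insert scans its chain for a duplicate first. Returns the total
    number of key comparisons performed.
    """
    table = [[] for _ in range(table_size)]
    comparisons = 0
    for x in keys:
        chain = table[weak_hash(x, table_size)]
        for y in chain:
            comparisons += 1
            if y == x:
                break
        else:
            chain.append(x)  # not found in the chain, so add it
    return comparisons


TABLE_SIZE = 1009  # hypothetical size for the demo
N = 1000

print(insert_all(range(N), TABLE_SIZE))                       # 0 comparisons
print(insert_all(colliding_keys(N, TABLE_SIZE), TABLE_SIZE))  # 499500 = N(N-1)/2
```

With well-spread keys the chains stay short, while the colliding keys force every insert to walk the whole chain, which is where the quadratic total comes from.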
Example: Randomized Quicksort with Weak Random # Generator
• To sort an array A[0…N-1]:
  - Choose “random” index i = 123456789 % N.
  - Partition the array on the value of A[i].
  - Recursively sort the left and right sides.
• We can make this run slowly (O(N^2) versus O(N log N)) by making sure that A[i] is always the minimum / maximum of the array being partitioned…
[Figure: A[0…N-1] after partitioning: elements < A[i], then A[i], then elements > A[i]]
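The weak quicksort described above might look roughly like the following Python sketch. The instructor's actual code is not shown in the slides, so several details are assumptions: the partition is order-preserving (it builds new lists rather than swapping in place), input values are distinct, and an explicit stack replaces recursion so that very unbalanced inputs do not hit Python's recursion limit. The function name and the comparison counter are hypothetical additions for illustration.

```python
import random

SEED = 123456789  # the fixed "random" number from the slide


def weak_quicksort(values):
    """Sort distinct values; return (sorted_list, comparison_count).

    The pivot of every sub-array of length n sits at index SEED % n.
    One comparison is charged per non-pivot element in each partition.
    """
    result = []
    comparisons = 0
    stack = [list(values)]
    while stack:
        part = stack.pop()
        if len(part) <= 1:
            result.extend(part)
            continue
        pivot = part[SEED % len(part)]
        left = [x for x in part if x < pivot]    # keeps relative order
        right = [x for x in part if x > pivot]   # keeps relative order
        comparisons += len(part) - 1
        # Push in reverse so that left is handled first, then pivot, then right.
        stack.append(right)
        stack.append([pivot])
        stack.append(left)
    return result, comparisons


data = random.sample(range(10_000), 10_000)
sorted_data, count = weak_quicksort(data)
assert sorted_data == sorted(data)
print(count)  # on shuffled input this stays far below N(N-1)/2 = 49,995,000
```

Because the pivot position depends only on the sub-array length, anyone who can read the code can predict every pivot choice in advance.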
Constructing an Adversarial Input…
• Ensure the max is in position i = 123456789 % N.
• When partitioning happens, the max gets pulled to the end of the array, leaving the other 999,999 elements (for N = 1,000,000) in the same order as before.
[Figure: A[0…N-1] after partitioning: 999,999 elements < A[i], then A[i] = max]
• Those remaining elements should now themselves be an adversarial input for size N = 999,999!
Constructing an Adversarial Input…
• We now have the insight to construct an adversarial input of size N by working backwards.
• Starting from a bad input of size N - 1, insert a new maximum element at position i = 123456789 % N.
• Repeating this step from size 1 upward generates a bad input of size N in O(N^2) time, since each array insertion can shift O(N) elements.
• However, using augmented balanced binary search trees, one can implement the algorithm above in only O(N log N) time, so it runs much faster than the weak quicksort algorithm provided by the instructor…
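Both constructions can be sketched in Python as follows. The naive builder follows the slide directly. The fast builder is not the augmented balanced binary search tree the slide mentions; as a stand-in, it uses a Fenwick (binary indexed) tree and replays the insertions in reverse, placing each value into the k-th still-free slot of the final array, which also takes O(N log N) time. All function names are hypothetical, and the checker assumes an order-preserving partition as described on the previous slides.

```python
SEED = 123456789  # the fixed "random" number from the slide


def build_naive(n):
    """O(N^2) worst case: insert each new maximum at index SEED % size."""
    a = []
    for size in range(1, n + 1):
        a.insert(SEED % size, size)  # 'size' is the new maximum value
    return a


def build_fast(n):
    """O(N log N): same output as build_naive, built back to front."""
    # tree is a Fenwick tree counting free slots 1..n; initially all are free,
    # so node i covers (i & -i) free slots.
    tree = [0] + [i & -i for i in range(1, n + 1)]
    top = 1 << n.bit_length()
    result = [0] * n
    for size in range(n, 0, -1):
        k = SEED % size + 1          # 1-based rank among the free slots
        pos, step = 0, top
        while step:                  # binary-lifting search for the k-th free slot
            nxt = pos + step
            if nxt <= n and tree[nxt] < k:
                pos = nxt
                k -= tree[nxt]
            step >>= 1
        slot = pos + 1
        result[slot - 1] = size
        while slot <= n:             # mark the slot as taken
            tree[slot] -= 1
            slot += slot & -slot
    return result


def pivot_is_always_max(a):
    """Check the adversarial property by simulating the pivot choices."""
    a = list(a)
    while a:
        i = SEED % len(a)
        if a[i] != max(a):
            return False
        del a[i]                     # the rest keeps its order
    return True


bad = build_fast(2000)
assert bad == build_naive(2000)
print(pivot_is_always_max(bad))      # True
print(build_fast(3))                 # [3, 1, 2]
```

The reverse replay works because later insertions never change the relative order of earlier elements, so the value inserted at step n ends up in the free slot whose rank equals its insertion index.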
Automated Grading
• An assignment of this sort is ideal from an instructor’s perspective since it can be automatically graded.
• Score is based on:
  - How fast student’s program runs (faster is better).
  - How slow the instructor’s program runs on the input generated by the student program (slower is better).
• For example, one could give full credit if the student program runs in < 5 seconds and causes the instructor program to run in > 5 seconds, for a large input size.
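A grading harness along these lines could be sketched as below. This is only an illustration of the full-credit rule from the example above, not the grader actually used in the course: the function names, the 60-second cutoff, and the convention that the student program writes its input to standard output are all assumptions.

```python
import subprocess
import time

TIME_LIMIT = 5.0    # seconds; the threshold from the slide's example
HARD_TIMEOUT = 60.0  # hypothetical cutoff so grading always terminates


def timed_run(cmd, stdin_data=b"", timeout=HARD_TIMEOUT):
    """Run cmd with stdin_data; return (elapsed_seconds, stdout_bytes).

    A run that exceeds the timeout is reported as taking 'timeout' seconds.
    """
    start = time.perf_counter()
    try:
        proc = subprocess.run(cmd, input=stdin_data,
                              capture_output=True, timeout=timeout)
    except subprocess.TimeoutExpired:
        return timeout, b""
    return time.perf_counter() - start, proc.stdout


def full_credit(student_seconds, instructor_seconds, limit=TIME_LIMIT):
    """Full credit: the generator is fast AND the instructor program is slow."""
    return student_seconds < limit and instructor_seconds > limit


def grade(student_cmd, instructor_cmd):
    """Time the student's generator, then the instructor's program on its output."""
    student_seconds, generated_input = timed_run(student_cmd)
    instructor_seconds, _ = timed_run(instructor_cmd, generated_input)
    return full_credit(student_seconds, instructor_seconds)
```

A call such as `grade(["./student_gen"], ["./weak_quicksort"])` (both paths hypothetical) would then return True or False; a graded scale between the two extremes is equally possible.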
Thanks! Questions / Discussion?