COM1721: Freshman Honors Seminar. A Random Walk Through Computing Rajmohan Rajaraman Tuesdays, 5:20 PM, 149 CN. Introduction. Explore a potpourri of concepts in computing. Theory, examples, and applications Readings: Handouts and WWW Grading: Quizzes, homework, and class participation.
Introduction • Explore a potpourri of concepts in computing • Theory, examples, and applications • Readings: Handouts and WWW • Grading: Quizzes, homework, and class participation • Potpourri: (1) a mixture of flowers, herbs, and spices that is usually kept in a jar and used for scent; (2) a miscellaneous collection. Etymology: French pot pourri, literally "rotten pot"
Sample Concepts • Abstraction • Modularity • Randomization • Recursion • Representation • Self-reference • …
Sample Topics • Dictionary search • Structure of the Web • Self-reproducing programs • Undecidability • Private communication • Relational databases • Quantum computing, bioinformatics,…
Abstraction • A view of a problem that extracts the essential information relevant to a particular purpose and ignores inessential details • Driving a car: • We are provided a particular abstraction of the car in which we only need to know certain controls • Building a house: • Different levels of abstraction for house owner, architect, construction manager, real estate agent • Related concepts: information hiding, encapsulation, representation
Modularity • Decomposition of a system into components, each of which can be implemented independently of the others • Foundation for good software engineering • Design of a basic processor from scratch
Representation • To portray things or relationships between things • Knowledge representation: model relationships among objects as an edge-labeled graph • Data representation: bar graphs, histograms for statistics • Querying a dictionary; Web as a graph
Randomization • An algorithmic technique that uses probabilistic (rather than deterministic) selection • A simple and powerful tool to provide efficient solutions for many complex problems • Has a number of applications in security • Cryptography and private communication
Recursion • A way of specifying a process by means of itself • Complicated instances are defined in terms of simpler instances, which are given explicitly • Closely tied to mathematical induction • Fibonacci numbers
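As a minimal sketch of the Fibonacci example named on the slide: the two simplest instances are given explicitly and every larger instance is defined in terms of smaller ones. The function name `fib` and the indexing convention fib(0) = 0, fib(1) = 1 are assumptions made for illustration; the slide does not fix either.

```python
def fib(n: int) -> int:
    """Return the n-th Fibonacci number, assuming fib(0) = 0 and fib(1) = 1."""
    if n < 0:
        raise ValueError("n must be non-negative")
    if n < 2:
        # Simplest instances, given explicitly.
        return n
    # A complicated instance, defined in terms of two simpler ones.
    return fib(n - 1) + fib(n - 2)


print([fib(i) for i in range(10)])
```

This direct translation recomputes the same values many times, so it is only practical for small n; it is meant to show the shape of a recursive definition, not an efficient method.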
Self-reference • A statement/program that refers to itself • Examples: • “This statement contains five words” • “This statement contains six words” • “This statement is not self-referential” • “This statement is false” • Important concept in computing theory • Undecidability of the halting problem, self-reproducing programs • Gödel, Escher, Bach: An Eternal Golden Braid, Douglas Hofstadter
Illustration: Representation • Problem: Derive an expression for the sum of the first n natural numbers • 1 + 2 + 3 + … + n-2 + n-1 + n = ?
Sum of First n Natural Numbers
1 + 2 + 3 + … + 98 + 99 + 100 = S
100 + 99 + 98 + … + 3 + 2 + 1 = S
101 + 101 + 101 + … + 101 + 101 + 101 = 2S (100 terms)
S = 100*101/2 = 5050
In general, S = n(n+1)/2
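The pairing argument can be checked numerically. The sketch below compares the closed form against a term-by-term sum; the function names `sum_by_pairing` and `sum_by_adding` are hypothetical and chosen only for this illustration.

```python
def sum_by_pairing(n: int) -> int:
    """Closed form: writing the sum forwards and backwards gives n columns of n + 1."""
    return n * (n + 1) // 2


def sum_by_adding(n: int) -> int:
    """Add 1 + 2 + ... + n term by term."""
    return sum(range(1, n + 1))


print(sum_by_pairing(100))  # 5050
print(all(sum_by_pairing(n) == sum_by_adding(n) for n in range(1, 1000)))  # True
```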
Other Equalities • Sum of first n odd numbers • 1 + 3 + 5 + … + (2n-1) = ? • Sum of first n cubes • 1 + 8 + 27 + 64 + … + n^3 = ?
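The slide leaves both sums as questions. The well-known closed forms are n^2 for the first n odd numbers and (n(n+1)/2)^2 for the first n cubes; the sketch below only checks these numerically for small n and is not a proof. The function names are hypothetical.

```python
def sum_first_odds(n: int) -> int:
    """1 + 3 + 5 + ... + (2n - 1), added term by term."""
    return sum(2 * k - 1 for k in range(1, n + 1))


def sum_first_cubes(n: int) -> int:
    """1^3 + 2^3 + ... + n^3, added term by term."""
    return sum(k ** 3 for k in range(1, n + 1))


for n in range(1, 50):
    assert sum_first_odds(n) == n * n
    assert sum_first_cubes(n) == (n * (n + 1) // 2) ** 2
```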
Representation and Programming • “Representation is the essence of programming” (Brooks, “The Mythical Man-Month”) • Data structures
Dictionary • A collection of words with a specified ordering • Dictionary of English words • Dictionary of IP addresses • Dictionary of NU student names
Searching a Dictionary • Suppose we have a dictionary of 100,000 words • Consider different operations • Search for a word • List all anagrams of a word • Find the word matching the largest prefix • What representation (data structure) should we choose?
Search for a Word • Store the words in sorted order in a linear array • Scan the array from the beginning (linear search) • Unsuccessful search: • compare with up to 100,000 words • Successful search: • on average, compare with about 50,000 words
Twenty Questions • Compare with 50,000th word • If match, then done • If further in dictionary order, search right half • If earlier in dictionary order, search left half • Until word found, or search space empty • Recursion • Binary search
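The halving procedure above can be sketched as a recursive binary search. This is one possible rendering, assuming the words are held in a sorted Python list; the function name, the half-open range `[lo, hi)`, and the sample word list are illustrative choices, not taken from the slides.

```python
from typing import Optional, Sequence


def binary_search(words: Sequence[str], target: str,
                  lo: int = 0, hi: Optional[int] = None) -> Optional[int]:
    """Return the index of target in the sorted sequence words, or None."""
    if hi is None:
        hi = len(words)
    if lo >= hi:
        # Search space is empty: the word is not in the dictionary.
        return None
    mid = (lo + hi) // 2
    if words[mid] == target:
        return mid
    if words[mid] < target:
        # Target is further in dictionary order: search the right half.
        return binary_search(words, target, mid + 1, hi)
    # Target is earlier in dictionary order: search the left half.
    return binary_search(words, target, lo, mid)


words = sorted(["slate", "race", "acre", "least", "care"])
print(binary_search(words, "race"))   # 3
print(binary_search(words, "zebra"))  # None
```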
How Many Questions?
Question #    Search space (approx.)
0             100,000
1             50,000
2             25,000
3             12,500
5             3,125
10            100
15            4
17            1
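The count of 17 can be reproduced by repeated halving. The sketch assumes that each question leaves, at worst, the larger half of the current search space (rounded up); the function name `questions_needed` is hypothetical.

```python
def questions_needed(size: int) -> int:
    """Number of halvings until at most one candidate word remains."""
    questions = 0
    while size > 1:
        size = (size + 1) // 2  # keep the larger half, rounded up
        questions += 1
    return questions


print(questions_needed(100_000))  # 17
```

Equivalently, 17 is the smallest k with 2^k >= 100,000, since 2^16 = 65,536 and 2^17 = 131,072.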
Anagrams • An anagram of a word is another word with the same distribution of letters, placed in a different order • Input: deposit Output: posited, topside, dopiest • Anagrams: subessential suitableness
Detecting Anagrams • How do you determine whether two words X and Y are anagrams? • Compare the letter distributions • Time proportional to number of letters in each word • Suppose this subroutine anagram(X,Y) is fast
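One way to compare letter distributions is to count the letters of each word and compare the counts. The sketch below uses `collections.Counter` from the Python standard library; the name `anagram` follows the slide, while the implementation is an assumption about how such a subroutine might look.

```python
from collections import Counter


def anagram(x: str, y: str) -> bool:
    """Return True if x and y have the same distribution of letters.

    Time is proportional to the number of letters in the two words.
    Note: a word compared with itself also returns True.
    """
    return Counter(x) == Counter(y)


print(anagram("deposit", "topside"))  # True
print(anagram("least", "beast"))      # False
```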
Listing Anagrams of a Word • Dictionary of 100,000 English words • List all anagrams of “least” • How should we represent the dictionary? • Linear array • Loop through dictionary: if anagram(X, “least”), include X in list • Running time = 100,000 calls to anagram()
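The linear-array approach can be sketched as a single loop that calls the anagram test once per dictionary word. The tiny word list stands in for the 100,000-word dictionary, and `list_anagrams` is a hypothetical name; excluding the query word itself is also an assumption, based on the definition of an anagram as another word.

```python
from collections import Counter


def anagram(x: str, y: str) -> bool:
    """Same letter distribution (illustrative implementation)."""
    return Counter(x) == Counter(y)


def list_anagrams(dictionary: list[str], word: str) -> list[str]:
    """Scan the whole dictionary: one anagram() call per stored word."""
    return [x for x in dictionary if x != word and anagram(x, word)]


dictionary = ["least", "slate", "stale", "steal", "tales", "teals", "race", "care"]
print(list_anagrams(dictionary, "least"))
# ['slate', 'stale', 'steal', 'tales', 'teals']
```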
A Different Data Structure • If X and Y are anagrams of each other, they are equivalent; the list of anagrams of X is the same as the list for Y • This indicates an equivalence class of anagrams! • Examples: {deposit, posited, topside, dopiest}; {race, care, acre}; {adroitly, dilatory, idolatry}
Anagram Signatures • Would like to store anagrams in the same class together • How do we identify a class? • Assign a signature! • Sort all the letters in the anagram word(s) • Same for each word in a class! • acre race care: acer • deposit posited topside dopiest: deiopst • subessential suitableness: abeeilnssstu
Anagram Program • Pipeline (figure): sign each word, sort by signature, merge words with equal signatures
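The three stages (sign, sort, merge) can be sketched as follows. This is an illustrative reconstruction under stated assumptions, not the program from the slides: the function names and the output shape (a list of signature and word-list pairs, sorted by signature) are hypothetical.

```python
from itertools import groupby


def sign(word: str) -> str:
    """Signature of a word: its letters in sorted order."""
    return "".join(sorted(word))


def build_anagram_table(words: list[str]) -> list[tuple[str, list[str]]]:
    # Sign: pair every word with its signature.
    signed = [(sign(w), w) for w in words]
    # Sort: equal signatures become adjacent.
    signed.sort()
    # Merge: collapse each run of equal signatures into one class.
    return [(sig, [w for _, w in group])
            for sig, group in groupby(signed, key=lambda pair: pair[0])]


table = build_anagram_table(["race", "post", "care", "stop", "acre", "pots"])
print(table)
# [('acer', ['acre', 'care', 'race']), ('opst', ['post', 'pots', 'stop'])]
```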
Listing Anagrams for Given Word X • Compute sign(X) and look up sign(X) in the dictionary using binary search • List all words stored with sign(X) • Example: post → sign → opst → lookup
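A lookup against the signature-keyed dictionary can be sketched with the standard-library `bisect` module, which performs the binary search. The parallel lists `signatures` and `classes` are a hypothetical storage layout chosen for illustration, and the small table is sample data.

```python
from bisect import bisect_left


def sign(word: str) -> str:
    """Signature of a word: its letters in sorted order."""
    return "".join(sorted(word))


def lookup(signatures: list[str], classes: list[list[str]], word: str) -> list[str]:
    """Return the anagram class stored under sign(word), or [] if absent.

    signatures must be sorted; classes[i] holds the words for signatures[i].
    """
    sig = sign(word)
    i = bisect_left(signatures, sig)  # binary search over the signatures
    if i < len(signatures) and signatures[i] == sig:
        return classes[i]
    return []


signatures = ["acer", "deiopst", "opst"]
classes = [["acre", "care", "race"],
           ["deposit", "dopiest", "posited", "topside"],
           ["post", "pots", "stop"]]
print(lookup(signatures, classes, "post"))  # ['post', 'pots', 'stop']
```

The returned class includes the query word itself when it is in the dictionary; a caller that wants only the other anagrams can filter it out.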
Efficiency of Anagram Program • Once dictionary has been stored in new representation: • Lookup takes at most 17 queries • Listing time is proportional to number of anagrams in the class • What about the cost of new representation? • Sign each word, sort, and merge • Expensive, but need to do it only once! • Preprocessing
References • Programming Pearls, Jon Bentley, Addison-Wesley • Great Ideas in Theoretical Computer Science, Steven Rudich (a course at CMU)