Last class: summary, Goggles, ICES. 04/30/13. Discrete Structures (CS 173). Derek Hoiem, University of Illinois. Image: http://darksideofthecatalogue.wordpress.com/2011/11/22/light-at-the-end-of-the-tunnel-is-glowing-thing-23-12/
Final exam: Tuesday, May 7, 7-10pm • DCL 1320: students with last names Afridi to Mehta • Siebel 1404: students with last names Melvin to Zmick • You should have already contacted us about conflicts. Note that there is no dedicated conflict exam; most conflicts should be resolved by the other classes, unless the conflict is due to three exams in one day. http://admin.illinois.edu/policy/code/article3_part2_3-201.html
What to expect on the final • Focus on material after midterm 2, but material from earlier in the semester will also appear • Almost certainly at least one question on each of: • Big-O/Algorithms • Proof by Contradiction • State Diagrams • Induction • Countability (not too complicated)
Today’s class • Summary of concepts learned • Fast image retrieval, Google Goggles, and the relevance to discrete structures • ICES forms
What you learned in CS 173 • How to model the world • How to prove things • How to model computational behavior • How to think formally and computationally
How to model the world • Logic, propositions, and relations • Used in natural language processing, machine learning, programming languages • Sets • Used for data mining, groups (e.g., with social networks), image processing (e.g., sets of pixels), clustering • Functions, algorithms • Programming languages, most programming • Graphs and trees • Used in search, machine learning, social networks, path planning, menu design • State diagrams • Used for design of automated systems, AI planning, map building, robotics
How to prove things • Direct proof • Use of cases • Indirect proofs such as by contrapositive • Changing into logically equivalent form sometimes makes the proof easier • Proof by example or counter-example • Useful to show something exists • Induction • Proof for unbounded set of integers • Contradiction • Useful to show something can’t exist
How to model computational behavior • Algorithm analysis • Recursion trees • Unroll recursive functions to analyze cost at root, internal calls, and leaves • Big-O and Big-Theta • Analyze running time independent of implementation details and compute power
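A minimal sketch of the recursion-tree idea, assuming a hypothetical recurrence T(n) = 2T(n/2) + n with n a power of two (a merge-sort-like cost, chosen here for illustration and not taken from the slides). The function name `work_per_level` is also an assumption. It lists the total work at each level of the tree, from the root down to the leaves.

```python
def work_per_level(n):
    """Total work at each level of the recursion tree for the assumed
    recurrence T(n) = 2 T(n/2) + n, where n is a power of two."""
    levels = []
    calls, size = 1, n           # root: one call on an input of size n
    while True:
        levels.append(calls * size)   # each call at this level does `size` work
        if size == 1:                 # leaves reached
            break
        calls *= 2                    # every call spawns two children
        size //= 2                    # each child gets half the input
    return levels

levels = work_per_level(8)
total = sum(levels)   # every level costs n, and there are log2(n) + 1 levels
```

Under this assumed recurrence every level costs the same, so the total is n times the number of levels, which is where an O(n log n) bound comes from.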
How to think formally and computationally • Formal proof methods • Number theory • Important for numerical methods • P vs. NP • Important classes of algorithms • Comparing cardinality of infinite sets • Fundamental implications, such as halting problem
Google Goggles demo: http://www.google.com/mobile/goggles/#text
How to quickly find images in a large database that match a given image?
Basic representation: interest points (also called keypoints) Describe appearance of distinctive image patches Thousands of these per image, each is a 128 dimension vector of numbers
Simple idea • For each database image, count how many query keypoints are close to keypoints in that image • Similar images give lots of matches; dissimilar images give few or no matches • But this will be really, really slow: about 10 images per second.
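The simple idea can be sketched as a brute-force match count. This is an illustrative sketch only: the function name `count_matches`, the distance threshold `max_dist`, and the 2-dimensional toy descriptors are all assumptions (real descriptors here are 128-dimensional).

```python
import math

def count_matches(desc_a, desc_b, max_dist=0.5):
    """Count descriptors in desc_a whose nearest descriptor in desc_b
    lies within max_dist (Euclidean distance)."""
    matches = 0
    for a in desc_a:
        nearest = min(math.dist(a, b) for b in desc_b)  # compare against every b
        if nearest <= max_dist:
            matches += 1
    return matches

# Toy 2-D descriptors (hypothetical data).
query = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0)]
similar = [(0.1, 0.0), (1.0, 1.2), (9.0, 9.0)]
different = [(20.0, 20.0), (30.0, 30.0)]

lots = count_matches(query, similar)     # two of the three keypoints match
few = count_matches(query, different)    # no keypoints match
```

Every query descriptor is compared against every descriptor of every database image, which is why this approach is so slow.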
110,000,000 Images in 5.8 Seconds. Slide credit: Nister. “Scalable Recognition with a Vocabulary Tree”, Nister and Stewenius, CVPR 2006.
Structure 1: “Visual Words” • Group points (descriptors) into sets of similar points (called “clustering”) • Represent image as the number of points you see in each set • Images are similar if they have a lot of sets in common • Concepts from class: a set of 128-dimensional real vectors is partitioned into sets, and new vectors are assigned to a set index, i.e., by a function from 128-dimensional real vectors to {1, ..., K}
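A minimal sketch of the visual-words representation, assuming the cluster centers are already known. The names `nearest_center`, `word_histogram`, and `shared_words`, and the 2-dimensional toy data, are assumptions for illustration.

```python
import math
from collections import Counter

def nearest_center(point, centers):
    """Map a descriptor to the index of its nearest cluster center."""
    return min(range(len(centers)), key=lambda k: math.dist(point, centers[k]))

def word_histogram(descriptors, centers):
    """Represent an image as the number of descriptors falling in each set."""
    return Counter(nearest_center(d, centers) for d in descriptors)

def shared_words(hist_a, hist_b):
    """Number of sets (visual words) that appear in both images."""
    return len(set(hist_a) & set(hist_b))

# Hypothetical centers and descriptors.
centers = [(0.0, 0.0), (10.0, 10.0), (20.0, 20.0)]
image_a = [(0.5, 0.2), (9.0, 11.0), (0.1, 0.1)]
image_b = [(19.0, 21.0), (1.0, 0.0)]

hist_a = word_histogram(image_a, centers)
hist_b = word_histogram(image_b, centers)
common = shared_words(hist_a, hist_b)
```

Counting shared set indices is only one simple similarity measure; it stands in for the comparison the slides describe.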
K-means algorithm 1. Randomly select K centers 2. Assign each point to nearest center 3. Compute new center (mean) for each cluster Illustration: http://en.wikipedia.org/wiki/K-means_clustering
K-means algorithm 1. Randomly select K centers 2. Assign each point to nearest center 3. Compute new center (mean) for each cluster, then go back to step 2 Demo: http://home.dei.polimi.it/matteucc/Clustering/tutorial_html/AppletKM.html
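The three steps above can be sketched in plain Python. The stopping rule (stop when assignments no longer change), the `max_iters` cap, the fixed `seed`, and the choice to keep a center unchanged when its cluster is empty are assumptions added so the sketch terminates and is repeatable; the slides do not specify them.

```python
import math
import random

def kmeans(points, k, max_iters=100, seed=0):
    """Cluster points (tuples of floats) into k groups."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)              # 1. randomly select K centers
    assignment = None
    for _ in range(max_iters):
        # 2. assign each point to its nearest center
        new_assignment = [
            min(range(k), key=lambda j: math.dist(p, centers[j]))
            for p in points
        ]
        if new_assignment == assignment:         # nothing changed: converged
            break
        assignment = new_assignment
        # 3. compute the new center (mean) of each cluster
        for j in range(k):
            members = [p for p, a in zip(points, assignment) if a == j]
            if members:                          # keep old center if cluster is empty
                centers[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return centers, assignment

# Two well-separated toy groups (hypothetical data).
points = [(0.0, 0.0), (0.0, 2.0), (2.0, 0.0), (2.0, 2.0),
          (10.0, 10.0), (10.0, 12.0), (12.0, 10.0), (12.0, 12.0)]
centers, assignment = kmeans(points, k=2)
```

On this toy data the two centers settle at the means of the two groups. In general K-means can converge to different clusterings depending on the random initial centers.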
Efficiency from clustering Time to match images in database: • Previous matching time for two images, each with N descriptors of D dimensions: O(N^2 D) • Post-cluster matching time: O(K) to compare the per-cluster counts of two images, for K clusters Time to assign points to clusters: O(N K D)
Structure 2: trees for nested partitions • For points within a set to be very similar, need many sets (1,000,000) • Slow time to assign points to sets • Need to compare each point to each cluster center: O(N K D) for N points of D dimensions and K cluster centers • Solution: create nested sets • Discrete structures concepts: collections of sets, trees Following slides by David Nister (CVPR 2006)
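Much faster processing of query image • Old time to assign N points of D dimensions to sets: O(N K D) for K clusters • New time with trees: O(N D b log_b K) for a tree with branching factor b, since each point is compared to only b centers at each of log_b K levels • In practice: 10,000+ times speed up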
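A minimal sketch of assignment with nested sets, assuming a tree in which every internal node stores the centers of its child clusters and every leaf carries a set index. The `Node` class, `assign_word`, `tree_comparisons`, and the tiny two-level tree are hypothetical illustrations, not the structure used in the cited paper.

```python
import math

class Node:
    """One cluster in the tree: a center plus child clusters (none for a leaf)."""
    def __init__(self, center, children=(), word_id=None):
        self.center = center
        self.children = list(children)
        self.word_id = word_id

def assign_word(point, root):
    """Descend from the root, moving to the nearest child center at each level.
    Returns the leaf's set index and the number of distance comparisons made."""
    node, comparisons = root, 0
    while node.children:
        comparisons += len(node.children)
        node = min(node.children, key=lambda c: math.dist(point, c.center))
    return node.word_id, comparisons

def tree_comparisons(num_leaves, branching):
    """Comparisons per point for a full tree: branching * depth."""
    depth, leaves = 0, 1
    while leaves < num_leaves:
        leaves *= branching
        depth += 1
    return branching * depth

# Hypothetical two-level tree with four leaf sets.
root = Node(None, [
    Node((0.0, 0.0), [Node((-1.0, -1.0), word_id=0), Node((1.0, 1.0), word_id=1)]),
    Node((10.0, 10.0), [Node((9.0, 9.0), word_id=2), Node((11.0, 11.0), word_id=3)]),
])
word, comparisons = assign_word((10.8, 11.2), root)

flat = 10**6                               # compare against every one of 1,000,000 centers
nested = tree_comparisons(10**6, 10)       # assumed branching factor of 10
```

With the assumed branching factor of 10, a point needs 60 comparisons instead of 1,000,000. The greedy descent may occasionally land in a leaf that is not the globally nearest center, which is the price of the speedup.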
Structure 3: Inverse document file • Like a book index: keep a list of all the words (keypoints) and all the pages (images) that contain them. • Rank database images based on tf-idf measure. tf-idf: Term Frequency – Inverse Document Frequency • tf-idf = (# times word appears in document / # words in document) × log(# documents / # documents that contain the word)
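A minimal sketch of the book-index idea with tf-idf weighting, where "documents" are images and "words" are set indices. The function names and toy data are assumptions, and the ranking score (a sum of tf-idf weights over the query's words) is a simplification chosen for illustration rather than the scoring used in the cited paper.

```python
import math
from collections import Counter, defaultdict

def build_inverted_file(images):
    """images: dict image_id -> list of word ids. Returns word -> set of image ids."""
    index = defaultdict(set)
    for image_id, words in images.items():
        for w in words:
            index[w].add(image_id)
    return index

def tf_idf(word, image_id, images, index):
    """(# times word appears in image / # words in image)
       * log(# images / # images that contain the word)."""
    words = images[image_id]
    tf = words.count(word) / len(words)
    idf = math.log(len(images) / len(index[word]))
    return tf * idf

def rank(query_words, images, index):
    """Score only the images that share at least one word with the query."""
    scores = Counter()
    for w in set(query_words):
        for image_id in index.get(w, ()):
            scores[image_id] += tf_idf(w, image_id, images, index)
    return [image_id for image_id, _ in scores.most_common()]

# Hypothetical database of three images described by word ids.
images = {"a": [1, 1, 2], "b": [2, 3], "c": [3, 3, 3, 4]}
index = build_inverted_file(images)
ranking = rank([1, 2], images, index)
```

Image "c" shares no words with the query, so it is never touched during ranking. That skipping is the source of the speedup from this structure.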
Speedups • Matching based on set membership • Tree for faster clustering • Inverse document file for only checking images with same sets as query • Overall (in practice 100,000+ times speedup)
Summary • Clever data structures and efficient algorithms make the difference between 10 images per second and 20 million images per second • Clustering (partitioning) for faster comparison • Trees for faster clustering • Lookup table for faster matching • In this class, you learned how to model, analyze, and prove things about discrete structures
Next steps • CS 225: implementing and using data structures such as linked lists, trees, graphs, etc. • CS 241/242: experience writing code and structuring programs and dealing with OS • CS 373: grammars, finite automata, languages, Turing machines, decidability • Research or project experience
ICES forms • Important for course evaluation and feedback • Please provide comments about both positive aspects and ways to improve