Visual Scene Understanding (CS 598), Derek Hoiem • Course Number: 46411 • Instructor: Derek Hoiem • Room: Siebel Center 1109 • Class Time: Tuesday and Thursday 11:00am – 12:15pm • Office Hours: Tuesday and Thursday 12:15-1pm; by appointment • Contact: dhoiem@uiuc.edu, Siebel 3312
Today • Introductions • Overview of logistics • Overview of class material
Vision: What is it good for? Biological (Humans) 1. 2. 3. 4. 5. 6. 7. 8. 9. 10. Technological (Computers) 1. 2. 3. 4. 5. 6. 7. 8. 9. 10. Note: Unfortunately, these got erased when my computer crashed
Class Content Overview • Tutorials and Perspectives • Paper reading • Spatial Inference • Objects • Actions • Context and Integration
Visual Scene Understanding Visual scene understanding is the ability to infer general principles and current situations from imagery in a way that helps achieve goals.
Spatial Inference: applications • Automated Vehicles • Household Robots • Graphics Applications • Predict object size/position
Spatial Inference: open questions • How do we represent space? • Surface orientations, depth maps, voxels? • How do we infer it from available sensory data (image, stereo, motion, laser range finder)?
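To make two of the candidate representations above concrete, here is a minimal sketch that converts a depth map into 3D points and then into an occupied-voxel set. It assumes an ideal pinhole camera; the function names, the `focal`, `cx`, `cy` parameters, and the convention that non-positive depth means "missing" are all illustrative assumptions, not something the slides specify.

```python
import math
from typing import List, Set, Tuple

Point = Tuple[float, float, float]
Voxel = Tuple[int, int, int]


def depth_to_points(depth: List[List[float]], focal: float,
                    cx: float, cy: float) -> List[Point]:
    """Back-project each pixel (u, v) with depth z into camera coordinates.

    Assumes a pinhole model with focal length `focal` (in pixels) and
    principal point (cx, cy). Non-positive depths are skipped as missing.
    """
    points = []
    for v, row in enumerate(depth):
        for u, z in enumerate(row):
            if z <= 0:
                continue
            x = (u - cx) * z / focal
            y = (v - cy) * z / focal
            points.append((x, y, z))
    return points


def points_to_voxels(points: List[Point], voxel_size: float) -> Set[Voxel]:
    """Quantize 3D points into the set of occupied voxel indices."""
    return {
        (math.floor(x / voxel_size),
         math.floor(y / voxel_size),
         math.floor(z / voxel_size))
        for x, y, z in points
    }


# Tiny 2x2 depth map; the zero entry stands for a missing measurement.
depth_map = [[2.0, 2.0],
             [0.0, 4.0]]
points = depth_to_points(depth_map, focal=1.0, cx=0.5, cy=0.5)
voxels = points_to_voxels(points, voxel_size=1.0)
```

The depth map is dense and tied to one viewpoint, while the voxel set is viewpoint-independent but loses resolution below `voxel_size`. That trade-off is one way to read the "how do we represent space?" question.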
Finding Things and Observing Them Image classification: Are there any dogs? Photo credit: iansand – flickr.com
Finding Things and Observing Them Object Localization: Where are the dog(s)?
Finding Things and Observing Them Verification: Is this a dog?
Finding Things and Observing Them Description: Furry, small, nice, side view
Finding Things and Observing Them Identification: My friend Sally?
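One way to see how the first three tasks above differ is by what each one takes in and returns. The sketch below assumes a hypothetical detector has already produced scored boxes; the `Detection` type, the box convention, and the thresholds are illustrative assumptions rather than anything defined in the course.

```python
from dataclasses import dataclass
from typing import List, Tuple

# Hypothetical box convention: (x_min, y_min, x_max, y_max) in pixels.
Box = Tuple[int, int, int, int]


@dataclass
class Detection:
    label: str
    box: Box
    score: float


def classify(dets: List[Detection], label: str, thresh: float = 0.5) -> bool:
    """Image classification: are there any instances of `label`?"""
    return any(d.label == label and d.score >= thresh for d in dets)


def localize(dets: List[Detection], label: str, thresh: float = 0.5) -> List[Box]:
    """Object localization: where are the instances of `label`?"""
    return [d.box for d in dets if d.label == label and d.score >= thresh]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    iw = max(0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union > 0 else 0.0


def verify(dets: List[Detection], label: str, box: Box,
           thresh: float = 0.5, min_iou: float = 0.5) -> bool:
    """Verification: is the thing inside this given box a `label`?"""
    return any(iou(b, box) >= min_iou for b in localize(dets, label, thresh))


dets = [
    Detection("dog", (0, 0, 10, 10), 0.9),
    Detection("cat", (20, 20, 30, 30), 0.8),
    Detection("dog", (50, 50, 60, 60), 0.2),  # low score, filtered out
]
```

Classification returns a single yes/no for the image, localization returns regions, and verification answers a yes/no question about one region supplied by the caller. Description and identification would need richer outputs (attributes, a specific identity) that this sketch does not attempt.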
Recognizing Stuff • Sky • Water • Sand
Object Recognition: applications • Photo Search • Security • Robots
Object Recognition: open questions • How many examples does it take to learn one category well? • How many examples does it take to learn 100 categories well? • How do these answers depend on the level of supervision? • Can recognition be solved with simple methods and massive amounts of data? • How can we quickly recognize an object? • How can we scale up to deal with thousands of categories?
Taking Action [Saxena et al. 2008]
Recognizing Actions KTH Dataset Figure from Laptev et al. 2008
Recognizing Actions Figure from Laptev et al. 2008
Reading Emotions Photo credit: Comstok
Actions: applications • Video Search • Security
Actions: open questions • How are actions defined? • Does it make sense to categorize them? • If not, how do we recognize them? • What are good visual representations for inferring actions? • How can we recognize activities?
IV. Context and Integration [Hoiem et al. 2008]
Context and Integration • Objects + scene categories → better detection • Movement + objects → action/activity recognition • Space + objects → navigation [Hoiem et al. 2008]
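As a toy illustration of the first pairing above (objects plus scene categories giving better detection), the sketch below rescores a detector's output using a scene-conditional prior. It assumes that appearance and scene evidence are conditionally independent given whether the object is present, which is a simplification chosen for this example and not the model of Hoiem et al. 2008. The function name and all probabilities are hypothetical.

```python
def odds(p: float) -> float:
    """Convert a probability in (0, 1) to odds."""
    if not 0.0 < p < 1.0:
        raise ValueError("probability must be strictly between 0 and 1")
    return p / (1.0 - p)


def rescore_with_scene(p_appearance: float, p_prior: float,
                       p_given_scene: float) -> float:
    """Combine appearance and scene evidence for one candidate detection.

    p_appearance:  P(object | appearance), from a calibrated detector
    p_prior:       P(object), the base rate the detector was calibrated with
    p_given_scene: P(object | scene category)

    Under the conditional-independence assumption, the posterior odds are
    odds(p_appearance) * odds(p_given_scene) / odds(p_prior).
    """
    posterior_odds = odds(p_appearance) * odds(p_given_scene) / odds(p_prior)
    return posterior_odds / (1.0 + posterior_odds)


# An ambiguous candidate (0.5) in a scene where the object is more likely
# than its base rate is pushed up; in an uninformative scene it is unchanged.
boosted = rescore_with_scene(0.5, p_prior=0.1, p_given_scene=0.5)
unchanged = rescore_with_scene(0.5, p_prior=0.1, p_given_scene=0.1)
```

This is an explicit, hand-specified form of context. A feature-based alternative would instead feed scene descriptors directly into the detector and let training discover the interaction.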
Context and Integration: applications • Everything that vision is good for
Context and Integration: open questions • Should context be explicit (e.g., “cars drive on the road”) or implicit (feature-based)? • How do we model and learn the interactions between different processes and scene characteristics? • How do we deal with the growing complexity as more and more pieces are put together?
General Problems in Computer Vision • Better understanding of limitations and their sources • Need new experimental paradigms • Improve generalization • Aim to generalize across datasets, categories, and tasks • Work on knowledge sharing and transfer • Vision as a way of learning about the world • Integration into AI • Systems that acquire knowledge over time
Successes of Computer Vision • Point matching (e.g. 2d3) • Tracking • Structure from motion • Stitching • Product inspection • Multiview 3d reconstruction • Face recognition and modeling • Object recognition on pre-2000 datasets • Interactive segmentation (ongoing)
To Do • Register on bulletin board • Post comments on Thursday's reading (due tomorrow) • Look over schedule and decide which days to present (due next Tues) • Start thinking about projects • Let me know if you want a specific pairing (due Tues)
Goals • Make you a better researcher (esp. in vision) • More knowledge • Better critical thinking skills • Improved communication skills • Improved research skills
Grades • Participation: 25% • Posting • Class discussion • Presentation: 25% • Projects: 50% • Proposal, progress report, final paper, and oral
Policies • Attendance required (see syllabus) • Give credit where due • No formal prerequisites • Everything needs to be on time
Reading • Read well • Post comments to bulletin board at least 24 hours before class
Presentations • Presenter • Everyone does two • Good quality coverage of topic (40 min) • See syllabus for guidelines • Sign up by next Tuesday (at latest) • TBAs are your choice (decide at least 4 weeks in advance) • Demonstrator • If all days are taken, pair up • One person’s job will be to demonstrate some aspect of the algorithm (e.g., where it succeeds and fails) by running it on many examples • May require implementation • Note taker
Projects • Timeline • Proposal: Feb 12 (3 ½ weeks!) • Progress report: Mar 19 • Presentation: paper May 5, oral later • Progress report • Presentation • Paper • Oral • In pairs • Can choose partner or be randomly paired • Suggestions on web • Potentially will lead to publication (e.g. NIPS)