
16-721: Learning-based Methods in Vision

This course introduces the concept of learning-based methods in computer vision and explores their applications in solving complex problems. Topics covered include image datasets, projects and challenges in the field.





Presentation Transcript


  1. 16-721: Learning-based Methods in Vision • Staff: • Instructor: Alexei (Alyosha) Efros (efros@cs), 4207 NSH • TA: Jean-Francois Lalonde (jlalonde@cs), A521 NSH • Web Page: • http://www.cs.cmu.edu/~efros/courses/LBMV07/

  2. Today • Introduction • Why This Course? • Administrative stuff • Overview of the course • Image Datasets • Projects / Challenges

  3. A bit about me • Alexei (Alyosha) Efros • Relatively new faculty (RI/CSD) • Ph.D 2003, from UC Berkeley (signed by Arnie!) • Research Fellow, University of Oxford, ’03-’04 • Teaching • I am still learning… • The plan is to have fun and learn cool things, both you and me! • Social warning: I don’t see well • Research • Vision, Graphics, Data-driven “stuff”

  4. PhD Thesis on Texture and Action Synthesis • Smart Erase button in Microsoft Digital Image Pro • Antonio Criminisi’s son cannot walk but he can fly

  5. Why this class? • The Old Days™: • 1. Graduate Computer Vision • 2. Advanced Machine Perception

  6. Why this class? • The New and Improved Days: • 1. Graduate Computer Vision • 2. Advanced Machine Perception • Physics-based Methods in Vision • Geometry-based Methods in Vision • Learning-based Methods in Vision

  7. The Hip & Trendy Learning • Describing Visual Scenes using Transformed Dirichlet Processes. E. Sudderth, A. Torralba, W. Freeman, and A. Willsky. NIPS, Dec. 2005.

  8. Learning as Last Resort

  9. Learning as Last Resort • EXAMPLE: • Recovering 3D geometry from single 2D projection • Infinite number of possible solutions! • (figure from [Sinha and Adelson 1993])

  10. Learning-based Methods in Vision • This class is about trying to solve problems that do not have a solution! • Don’t tell your mathematician friends! • This will be done using Data: • E.g. what happened before is likely to happen again • Google Intelligence (GI): The AI for the post-modern world! • Why is this even useful? • Even a decade ago at ICCV99 Faugeras claimed it wasn’t!

  11. The Vision Story Begins… • “What does it mean, to see? The plain man's answer (and Aristotle's, too) would be, to know what is where by looking.” • -- David Marr, Vision (1982)

  12. Vision: a split personality • “What does it mean, to see? The plain man's answer (and Aristotle's, too) would be, to know what is where by looking. In other words, vision is the process of discovering from images what is present in the world, and where it is.” • Answer #1: pixel of brightness 243 at position (124,54) • …and depth 0.7 meters (figure: depth map) • Answer #2: looks like bottom edge of whiteboard showing at the top of the image • Which do we want? • Is the difference just a matter of scale?

  13. Measurement vs. Perception

  14. Brightness: Measurement vs. Perception

  15. Brightness: Measurement vs. Perception Proof!

  16. Lengths: Measurement vs. Perception • Müller-Lyer Illusion • http://www.michaelbach.de/ot/sze_muelue/index.html

  17. Vision as Measurement Device • Real-time stereo on Mars • Physics-based Vision • Virtualized Reality • Structure from Motion

  18. …but why do Learning for Vision? • “What if I don’t care about this wishy-washy human perception stuff? I just want to make my robot go!” • Small Reason: • For measurement, other sensors are often better (in DARPA Grand Challenge, vision was barely used!) • For navigation, you still need to learn! • Big Reason: • The goals of computer vision (what + where) are in terms of what humans care about.

  19. So what do humans care about? • slide by Fei-Fei, Fergus & Torralba

  20. Verification: is that a bus? • slide by Fei-Fei, Fergus & Torralba

  21. Detection: are there cars? • slide by Fei-Fei, Fergus & Torralba

  22. Identification: is that a picture of Mao? • slide by Fei-Fei, Fergus & Torralba

  23. Object categorization • (image labels: sky, building, flag, face, banner, wall, street lamp, bus, bus, cars) • slide by Fei-Fei, Fergus & Torralba

  24. Scene and context categorization • outdoor • city • traffic • … • slide by Fei-Fei, Fergus & Torralba

  25. Rough 3D layout, depth ordering

  26. Challenges 1: viewpoint variation • Michelangelo 1475-1564

  27. Challenges 2: illumination slide credit: S. Ullman

  28. Challenges 3: occlusion Magritte, 1957

  29. Challenges 4: scale • slide by Fei-Fei, Fergus & Torralba

  30. Challenges 5: deformation Xu, Beihong 1943

  31. Challenges 6: background clutter Klimt, 1913

  32. Challenges 7: object intra-class variation slide by Fei-Fei, Fergus & Torralba

  33. Challenges 8: local ambiguity slide by Fei-Fei, Fergus & Torralba

  34. Challenges 9: the world behind the image

  35. In this course, we will: Take a few baby steps…

  36. Goals • Read some interesting papers together • Learn something new: both you and me! • Get up to speed on a big chunk of vision research • understand 70% of CVPR papers! • Use learning-based vision in your own work • Try your hand in a large vision project • Learn how to speak • Learn how to think critically about papers

  37. Course Organization • Requirements: • Paper Presentations (50%) • Paper Presenter • Paper Evaluator • Class Participation (20%) • Keep annotated bibliography • Ask questions / debate / fight / be involved! • Final Project (30%) • Do something with lots of data (at least 500 images) • Groups of 1 or 2

  38. Paper Advocate • Pick a paper from the list • That you like and are willing to defend • Sometimes I will make you do two papers, or background • Meet with me before starting, to talk about how to present the paper(s) • Prepare a good, conference-quality presentation (20-45 min, depending on difficulty of material) • Meet with me again 2 days before class to go over the presentation • Office hours at end of each class • Present and defend the paper in front of class

  39. Paper Evaluator • For some papers, we will have Evaluators • Sign up for a paper you find interesting • Get the code online (or implement if easy) • Run it on a toy problem, play with parameters • Run it on a new dataset • Prepare a short 10-15 min presentation detailing results • Discuss the paper critically

  40. Class Participation • Keep an annotated bibliography of papers you read (always a good idea!). The format is up to you. At least, it needs to have: • Summary of key points • A few interesting insights, “aha moments”, keen observations, etc. • Weaknesses of approach. Unanswered questions. Areas of further investigation, improvement. • Submit your thoughts for current paper(s) at the end of each class (printout)

  41. Class Participation • Be active in class. Voice your ideas, concerns. • You need to participate • JF will be watching and keeping track!

  42. Final Project • Can grow out of paper presentation, or your own research • But it needs to use large amounts of data! • 1-2 people per project. • Project proposals in a few weeks. • Project presentations at the end of semester. • Results presented as a CVPR-format paper. • Hopefully, a few papers may be submitted to conferences.

  43. End of Semester Awards • We will vote for: • Best Paper Presenter • Best Paper Evaluator • Best Project • Prize: dinner in a nice restaurant

  44. Course Outline

  45. Datasets • See web page
