
CS 764 Seminar in Computer Vision



Presentation Transcript


  1. CS 764 Seminar in Computer Vision. Ramin Zabih. Fall 1998.

  2. Course mechanics • Meeting time will be Tue/Thu 11-12, here • Starting a week from today • Home page is now up www/CS764 • Assignment: present one paper • You’ll have a lot of freedom, but you need to talk to me in advance • Some possible papers will be posted shortly

  3. Topic of this seminar • The use of “knowledge” in the analysis of visual data • Sometimes called “context” • Clearly this is vital • On both psychological and technical grounds • But how? No one has much of an idea… • What is the interface between reasoning and perception? (Or, mind and body?)

  4. What is the visual system’s “contract”? • Two standard (bad) answers • Answer 1: describe the scene in terms of surfaces [low-level vision] • There is a green patch 2" wide, 1' away • Answer 2: describe the scene in terms of objects [model-based recognition] • Start with a set of 3D models (modelbase) • Determine position and pose

  5. Why are these answers wrong? • They are almost purely data-driven • Bottom-up (from the data) versus top-down (from somewhere else) • They report “objective fact”, with no room for the task at hand • For a given image, there is only one right answer • Other problems as well • Not very useful, etc.

  6. Technical and psychological arguments • There are technical arguments against these answers • Vision is an inverse problem • Many 3D scenes could explain a single 2D image • On engineering grounds, this makes no sense • Ultimately, perception is used for some task • The human perceptual system has both top-down and bottom-up elements • Various optical illusions • Two people can look at the same picture and see something completely different
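
The claim that many 3D scenes could explain a single 2D image can be made concrete with a small sketch. It assumes an idealized pinhole camera at the origin looking down the z axis; the function name `project`, the focal length `f`, and the sample points are illustrative assumptions, not taken from the slides.

```python
from fractions import Fraction


def project(point, f=1):
    """Perspective-project a 3D point (x, y, z) with z > 0 onto the image plane z = f.

    Assumes an ideal pinhole camera at the origin (an assumption of this sketch).
    Exact fractions are used so that equal image points compare equal.
    """
    x, y, z = point
    return (Fraction(f * x, z), Fraction(f * y, z))


# Hypothetical scene points. Each is a scaled copy of the first,
# so all three lie on one ray through the camera center.
near = (1, 2, 4)
far = (3, 6, 12)
farther = (25, 50, 100)

# Collect the distinct image points produced by the three scene points.
images = {project(p) for p in (near, far, farther)}
print(images)
```

All three points land on the same image location, so the image alone cannot say which depth is the true one. Recovering the scene requires assumptions that come from somewhere other than the data.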

  7. Your vision system doesn’t listen

  8. It makes “reasonable” assumptions

  9. Low-level vision has its solution • Inverse problems require assumptions • The assumptions for low-level vision are extremely general (i.e., weak) • Reflect the physics of the visible world • For example, motion, depth, and intensity tend to be “coherent” • Saying that every pixel is moving differently from its neighbors is a very unlikely answer • The world we live in tends not to do that • Helmholtz’s “unconscious inference”
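
One common way to express a coherence assumption is to score a candidate answer by how much neighboring pixels disagree, and to prefer low scores. The sketch below is an illustration of that general idea, not the method of any particular paper in the seminar; the function name `incoherence` and both sample motion fields are hypothetical.

```python
def incoherence(field):
    """Sum of squared differences between horizontally and vertically adjacent pixels.

    A low value means neighboring pixels mostly agree (a "coherent" field).
    """
    rows, cols = len(field), len(field[0])
    total = 0
    for r in range(rows):
        for c in range(cols):
            if c + 1 < cols:  # right-hand neighbor
                total += (field[r][c] - field[r][c + 1]) ** 2
            if r + 1 < rows:  # neighbor below
                total += (field[r][c] - field[r + 1][c]) ** 2
    return total


# Hypothetical 3x3 fields of per-pixel horizontal motion (pixels per frame).
coherent = [[2, 2, 2], [2, 2, 2], [2, 2, 3]]     # nearly uniform motion
chaotic = [[2, -5, 7], [-3, 9, -1], [6, -8, 4]]  # every pixel moves differently

print(incoherence(coherent), incoherence(chaotic))
```

A solver that penalizes this score would favor the nearly uniform field over the one where every pixel moves differently from its neighbors. The assumption is weak in the sense of the slide: it says nothing about which objects are present, only that the visible world tends to vary smoothly.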

  10. We’ll need high-level vision • Most of the field is low-level vision or model-based recognition • Partly to avoid the confusion that CS 764 is about • Key question: how to avoid brittleness? • Can make the visual system compute just what we need for our task (e.g., berries) • But how to handle the unexpected (e.g., lions)?

  11. A short historical perspective • In the 1960s, vision was completely task-specific • A black blob in the center of the image is a telephone • These efforts are now considered “hacks” • In the 1970s, vision became completely general • Marr pushed the field towards precise technical questions • Low-level vision and recognition became dominant

  12. Tasks strike back • In the mid-1980s, several attempts were made to re-introduce a notion of task • Active/animate/purposive vision • These attempts are widely viewed as failures, for good reasons • We’ll look at them a bit next week • It’s not enough to have good intuitions • There needs to be technical merit as well

  13. Desiderata • Technical solutions (algorithms) that are very roughly consistent with human data • Goal is not AI, psychology or philosophy • Provide visual summaries useful for tasks, but degrade gracefully • Handle open/unstructured environments • Deal with expectations and breakdown

  14. Our path for 764 • No good computational work to read • Perhaps Vera will fix this? • We will examine papers along these lines: • Computational approaches that failed • Psychological data that is highly suggestive • Neurologically inspired architectures • Cognitive scientists and philosophers • Their goal is argument, not algorithm! • They’ve thought the most about these issues
