4054 Machine Vision Dr. Simon Prince Dept. Computer Science University College London http://www.cs.ucl.ac.uk/staff/s.prince/4054.htm
Course Mailing List IMPORTANT! You must sign up to the course mailing list. Regardless of the actual module code that you are taking this under, please sign up by sending a mail to 4054-request@cs.ucl.ac.uk with the word “join” in the subject line. You should receive a confirmation that you have joined. If you do not receive this then please contact the helpdesk. Course announcements will be on this mailing list. All announcements to this list will be assumed to have been received.
Schedule
LECTURES
• Tuesday: 12-1pm, MPEB 1.13
• Thursday: 10-12pm, Wilkins JBR Meeting Room
• Thursday: 2-3pm, South Wing Committee Room
• Thursday: 4-5pm, MPEB 113
• Friday: 11-12, Roberts 110
PRACTICAL SESSIONS
• Thursday: 3-4pm, MPEB 4.17
• Friday: 3-5pm, MPEB 1.05
Demonstrator: Alastair Moore. Information via: http://www.cs.ucl.ac.uk/staff/A.Moore/teaching.htm
Materials • Lecture Notes • Slides • Key Papers • Practicals
Time Commitment 30 one-hour lectures 15 hours problem classes – do the programming practicals in these. Expect to have to spend ten hours a week on • Reading papers, reviewing class notes • Working through proofs and implementing algorithms
Books Essentially, there are no good books, but these are the best of a bad lot
Books Good for tracking (free online) Good for geometry (1 free chapter online)
Books Good for geometry (1 free chapter online)
Assumed Knowledge • Linear Algebra up to and including the Singular Value Decomposition • Probability and random variables (course 3006, ongoing) • Familiarity with the multivariate normal distribution
Problems "In mathematics you don't understand things. You just get used to them." John von Neumann, 1903-1957 How to get used to them? Do problems from the appropriate book. Implement the algorithms in the lecture notes. Work through the proofs in the text and appendices. Familiarity with and understanding of mathematics come only with use.
Exam • 2.5 Hours • Choose 3 questions from 5 • No restrictions • The exam is in February.
Coursework To complete COMP4054 Machine Vision, you must complete two assignments. One comprises a Matlab implementation and is taken from the first half of the course. The second comprises a critical literature review and is taken from the second half of the course. In both cases, you have a choice of topic. Practical 1: Programming Assignment Deadline: 21/12/07 Practical 2: Literature Review Deadline: 8/1/08 No extensions except in the most serious circumstances
Practical #1 • In practical 1 you are required to complete one of five short Matlab assignments – you can choose from these assignments. Although you are only required to submit one assignment, you are strongly recommended to complete all of these projects nonetheless. They are designed to help you understand aspects of the course and will help with your revision. Supporting materials are available from the main COMP4054 website. • Topics: • Geometry of a single camera • Geometry of multiple cameras • Dense stereo vision • Background subtraction • Face detection
Practical #1 • Regardless of which of the five projects you choose, the format of the report will be the same. It should consist of 2-5 pages containing: • a short literature review of the project area • a description of the techniques in succinct mathematical terms • a description of what was done • the results obtained • relevant figures to explain your method and results • an analysis and critique of the results • suggestions for further improvements to the method • code (suitably commented) should be included in an appendix
Practical #2 • Write a 2000 word literature review on one of the following topics: • Object class recognition • Face Detection (i.e. finding faces in images) • Facial Identity Recognition • Object Tracking • Representations of Shape in Computer Vision • Your literature review should include: • An overview of the history of the topic • An in-depth description of 3-4 critical papers of your choosing • A discussion of other papers and relation to these critical papers • A discussion of how success is quantified in this area, and a description of what you think the state of the art is • A discussion of the ways in which current methods are deficient • Suggestions for likely directions of future work in this area
Website, Mail, Office Hours Website: http://www.cs.ucl.ac.uk/staff/s.prince/4054.htm Mail: s.prince@cs.ucl.ac.uk With 4054 in the subject line Problems: I will be available on Monday afternoons, 3-5pm if you have serious problems. Room 5.06
What is computer vision? Computer vision is concerned with developing artificial systems that extract information from image or video data. Computer vision is an expanding academic field with increasing attendance at major conferences. There is also considerable interest from industry, with many major players (Intel, Mitsubishi, Microsoft etc.) establishing computer vision laboratories.
Computer Vision Tasks Tasks for computer vision might include: Reconstruction - the attempt to build a three-dimensional model of the scene from one or more images Camera Tracking - identifying the movement of the camera relative to the scene in a fixed image sequence Object Detection - establishing that a certain type of object (e.g. a dog) is in the scene Segmentation - establishing exactly which pixels belong to a certain object Scene Parsing - establishing a complete understanding of a scene where we have information about what object is at each pixel and how these objects occlude each other
Computer Vision Tasks Identity Recognition - having found an object (e.g. a face) draw inferences about whether it is a particular face Image Enhancement - increase the resolution of the image (super-resolution), remove noise (denoising), or fill in missing areas (in-painting) Generation - having learnt something about a type of object or scene, use the model to generate new images of this type of object Object Tracking - apply a dynamic model to follow an object and monitor changes in its appearance over time Object Description - establish characteristics of an object once we have found it. For example, establish the sex, age or expression of a human face.
Applications of Computer Vision Optical Character Recognition Robotics Security
Applications of Computer Vision Augmented Reality Image Retrieval
Applications of Computer Vision Medical Image Analysis Industrial Inspection Model Building
Applications of Computer Vision Autonomous Vehicles Military Applications
Relationship with other fields Also a very close relationship with graphics – vision is inverse graphics!
Brief History of Computer Vision • 1970s: the lack of computing power dictated algorithms, emphasis on low-level vision, and binary images • 1980s: close relationship with researchers investigating animal vision • 1990s: geometry of multiple views of the same scene, and estimation of camera pose and scene geometry
Brief History of Computer Vision 2000s: several trends have emerged • Machine learning and vision have grown much closer • Discrete optimization techniques have found widespread use • Much more emphasis on benchmarking and quantitative evaluation • Trend towards larger training datasets
Why is computer vision hard? 1. Dimensionality of Input Space Consider an RGB image at VGA (640x480) resolution with 8 bits per colour channel (256 levels per channel). The total number of possible images is 256^(640x480x3), or roughly 10^2,200,000. Conclusion: Almost none of the possible images have ever been seen!
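The size of this input space can be checked with a few lines of arithmetic. The sketch below is only an illustration: it assumes each of the 640x480x3 values is an independent 8-bit number, and it works with the base-10 logarithm because the count itself is far too large to print. The variable names are made up for this example.

```python
import math

# Image dimensions from the slide: VGA resolution, three colour channels,
# 8 bits per channel (assumed independent).
width, height, channels, bits_per_channel = 640, 480, 3, 8

values_per_image = width * height * channels   # number of 8-bit values in one image
levels = 2 ** bits_per_channel                 # 256 possible levels per value

# The number of distinct images is levels ** values_per_image.
# Take log10 so the result is a manageable exponent.
log10_num_images = values_per_image * math.log10(levels)

print(f"values per image: {values_per_image}")
print(f"possible images: about 10^{log10_num_images:,.0f}")
```

The exponent comes out at about 2.2 million, so the number of possible images has over two million decimal digits.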
Why is computer vision hard? 2. Vision is an inverse problem The mapping from scenes to images is many to one: many different scenes can produce the same image, so the image alone does not determine the scene. How can we possibly establish what is out there? 3. Speed A VGA colour camera at 30Hz delivers roughly 27 MB of data per second – how can we process all this?
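The data rate quoted for point 3 follows from simple multiplication. This is a rough sketch that assumes uncompressed 640x480 RGB frames with one byte per colour channel; a real camera may compress or subsample its output, so treat the figure as an upper-end estimate.

```python
# Assumed camera parameters: uncompressed VGA colour video at 30 frames per second.
width, height, channels = 640, 480, 3
bytes_per_value = 1          # 8 bits per colour channel
frames_per_second = 30

bytes_per_frame = width * height * channels * bytes_per_value
bytes_per_second = bytes_per_frame * frames_per_second

print(f"bytes per frame:  {bytes_per_frame}")
print(f"bytes per second: {bytes_per_second}")
print(f"megabytes per second: {bytes_per_second / 1e6:.1f}")
print(f"megabits per second:  {bytes_per_second * 8 / 1e6:.1f}")
```

Under these assumptions the rate is a little under 28 million bytes per second, which is the megabyte (not megabit) figure.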
Some Mitigating Factors • We know a lot about the generative model (graphics) • We usually have a lot of prior knowledge about what we expect to see in the image (helps with non-uniqueness) • There is a large amount of training data readily available.
Course Overview • 1. Geometry of a Single Camera • Image transformations • How 3D points project to pixels • Special cases of imaging Augmented Reality Tracking Image Mosaicing
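As a preview of how 3D points project to pixels, here is a minimal sketch of the standard pinhole camera model. It is not taken from the course notes: the function name `project`, the focal length of 500 pixels and the principal point (320, 240) are all hypothetical values chosen for illustration. The example also shows why vision is an inverse problem: two different 3D points on the same ray through the camera centre land on the same pixel.

```python
def project(point_3d, focal_length=500.0, principal_point=(320.0, 240.0)):
    """Project a 3D point given in camera coordinates to pixel coordinates.

    Pinhole model: u = f * x / z + cx, v = f * y / z + cy.
    The intrinsics here are illustrative assumptions, not calibrated values.
    """
    x, y, z = point_3d
    if z <= 0:
        raise ValueError("point must lie in front of the camera (z > 0)")
    cx, cy = principal_point
    u = focal_length * x / z + cx
    v = focal_length * y / z + cy
    return u, v


# Two points on the same ray through the camera centre:
# the second is the first scaled by two, so it is twice as far away.
near = (0.5, 0.25, 2.0)
far = (1.0, 0.5, 4.0)

pixel_near = project(near)
pixel_far = project(far)

print(pixel_near)
print(pixel_far)
print("same pixel:", pixel_near == pixel_far)
```

Because depth is divided out in the projection, a single image cannot tell these two points apart; recovering depth needs extra information such as a second camera or prior knowledge of the scene.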
Course Overview • 2. Geometry of multiple cameras • Stereo vision • Epipolar geometry • Finding and matching distinctive keypoints • Shape from silhouette Models from Sparse Stereo Vision Shape from silhouette
Course Overview • 3. Inference at Individual Pixels • Generative approach • Parametric vs. non parametric models • Mixture Models Colour Based Segmentation
Course Overview • 4. Markov Random Fields – Connecting Pixels • MCMC Methods to solve MRFs • Exact MAP Inference in MRFs (graph cuts) • Binary vs. Multi-label cases Dense stereo vision
Course Overview • 5. Models of Texture • Models of small (~ 5x5) patches of pixels • Repairing natural images • Texture synthesis Image In-painting
Course Overview • 6. Models of Objects • Model larger regions of the image • Generative models for pixel covariance • Factor analysers, mixtures of factor analysers Face Detection
Course Overview • 7. Sparse Models of Objects • Model only sparse, but distinctive features • Bag of words model • Constellation models Object Class Recognition
Course Overview • 8. Face Recognition • Subspace models for recognition • Within- and between- individual variance • Recognition across pose vs. Face Models
Course Overview • 9. Models of Shape • Point distribution model • Active Shape Models • Active appearance models Active Appearance Models Active Shape Models
Course Overview • 10. Tracking • The Kalman Filter • Extensions of the Kalman Filter • Particle Filtering Examples of Tracking Objects