This presentation contains video demos of 2 Computer Vision-based applications - a Document Scanner and Video Lecture Note Taker. The Computer Vision components used in each are outlined and the libraries used to build each component are detailed.
Developing Computer Vision Applications for Android: Video Lecture Notes Generator and Document Scanner
Teyvonia Thomas
Document Scanner Features
●Automatic Document Border Detection
●Automatic Rectification (converts the image of the document to a fronto-parallel view)
●Automatic Document Enhancement
●Other Document Enhancements: Black/White, Grayscale, Brightness, and Contrast
●Automatic Text Detection: the user can then share the text in documents
●PDF Generation
●Share PDF or multiple images of a Document
●Multi-page Scanning
●Multi-document Generation
Document Scanning Pipeline: Input RGB Image → Canny Edge Detection → Contour Extraction → Crop and Brighten → Output T
Document Enhancement
●Black/White Enhancement (using Thresholding): Simple Binary Thresholding, Adaptive Thresholding
●Brightness (β) and Contrast (α) Enhancement: output_img(i,j) = α • input_img(i,j) + β
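The enhancement operations above can be sketched in a few lines. This is a minimal illustration on a grayscale image stored as nested lists, not the app's actual code: the function names, the clamping to 0..255, and the mean-based neighbourhood rule (with its `radius` and `offset` parameters) are assumptions made for the example.

```python
def adjust_brightness_contrast(img, alpha, beta):
    """Apply output = alpha * input + beta per pixel (clamping to 0..255 is assumed)."""
    return [[max(0, min(255, int(round(alpha * p + beta)))) for p in row]
            for row in img]


def binary_threshold(img, thresh):
    """Simple binary thresholding: pixels above `thresh` become 255, the rest 0."""
    return [[255 if p > thresh else 0 for p in row] for row in img]


def adaptive_threshold(img, radius, offset):
    """Mean adaptive thresholding (illustrative variant).

    Each pixel is compared with the mean of its (2*radius+1)-wide
    neighbourhood, clipped at the image borders, minus `offset`.
    """
    h, w = len(img), len(img[0])
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            ys = range(max(0, y - radius), min(h, y + radius + 1))
            xs = range(max(0, x - radius), min(w, x + radius + 1))
            window = [img[j][i] for j in ys for i in xs]
            local_mean = sum(window) / len(window)
            row.append(255 if img[y][x] > local_mean - offset else 0)
        out.append(row)
    return out


# Hypothetical 2x3 grayscale image used only for illustration.
gray = [[10, 100, 200], [50, 150, 250]]
brighter = adjust_brightness_contrast(gray, alpha=1.5, beta=20)
black_white = binary_threshold(gray, thresh=127)
```

A single global threshold works when lighting is even; the adaptive variant tends to cope better with shadows across a page because each pixel is judged against its own neighbourhood. OpenCV provides comparable operations (`Imgproc.threshold` and `Imgproc.adaptiveThreshold` in the Java bindings); the slides do not state which parameters the app uses.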
Video Note Taker Features
●Automatically extracts unique pages of notes from videos with thousands of frames
●e.g. the 2 unique pages of notes were extracted in a few seconds from the 8-minute video lecture on the previous slide
●Pages are displayed as thumbnails below the video lecture
●Clicking on a page thumbnail takes the user to a full page view
●Clicking on any line in the notes (from the page view) takes the user back to the point in the video where that line was being written and discussed by the lecturer (dynamic video lecture seeking)
Feature Detection, Descriptor Extraction, Descriptor Matching
(Figures: FAST Feature, FREAK Descriptor, BRISK Descriptor)
1. Feature Detection using FAST (Features from Accelerated Segment Test)
2. Descriptor Extraction using BRISK/ORB/FREAK
A binary descriptor is composed of three parts:
●A sampling pattern: where to sample points in the region around the feature.
●Orientation compensation: some mechanism to measure the orientation of the keypoint and rotate it to compensate for rotation changes.
●Sampling pairs: the pairs to compare when building the final descriptor.
3. Binary Descriptor Matching
Hamming Distance = sum(XOR(string1, string2))
●e.g. the Hamming distance between 1011101 and 1001001 is 2.
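The Hamming distance formula above can be checked with a short sketch. Descriptors are represented here as bit strings purely for readability, and `hamming_distance` and `best_match` are hypothetical helper names; a real matcher would operate on packed bytes.

```python
def hamming_distance(a: str, b: str) -> int:
    """Count the differing bits between two equal-length bit strings."""
    if len(a) != len(b):
        raise ValueError("descriptors must have the same length")
    # XOR leaves a 1 wherever the two strings disagree; count those 1s.
    return bin(int(a, 2) ^ int(b, 2)).count("1")


def best_match(query: str, candidates: list) -> tuple:
    """Brute-force matching: return (index, distance) of the closest candidate."""
    distances = [hamming_distance(query, c) for c in candidates]
    best = min(range(len(candidates)), key=distances.__getitem__)
    return best, distances[best]


# The example from the slide: 1011101 XOR 1001001 = 0010100, which has two 1s.
print(hamming_distance("1011101", "1001001"))  # 2
```

Because the comparison is an XOR followed by a bit count, binary descriptors such as BRISK, ORB, and FREAK can be matched much faster than floating-point descriptors that need a Euclidean distance.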
Video Note Taking Method
●Feature Detection and Matching in Consecutive Frames
●New Page Detection and Generation based on the ratio of features matched between consecutive frames and the average displacement of corresponding features (for boards that “move” during lectures)
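One way to combine the two signals named above is sketched below. The slides do not give the decision rule or any thresholds, so everything here is an assumption for illustration: the rule itself (a low match ratio or a large average shift signals a new page), the helper names, and the default values `min_ratio=0.3` and `max_shift=40.0`.

```python
from math import hypot


def match_ratio(num_matches, num_prev_features):
    """Fraction of the previous frame's features that were matched in the current frame."""
    return num_matches / num_prev_features if num_prev_features else 0.0


def average_displacement(matched_pairs):
    """Mean Euclidean distance between matched keypoints.

    `matched_pairs` is a list of ((x_prev, y_prev), (x_cur, y_cur)) tuples.
    """
    if not matched_pairs:
        return 0.0
    total = sum(hypot(x2 - x1, y2 - y1) for (x1, y1), (x2, y2) in matched_pairs)
    return total / len(matched_pairs)


def is_new_page(matched_pairs, num_prev_features, min_ratio=0.3, max_shift=40.0):
    """Assumed rule: a new page starts when few features still match,
    or when the matched features have moved far (a board that "moves")."""
    if match_ratio(len(matched_pairs), num_prev_features) < min_ratio:
        return True
    return average_displacement(matched_pairs) > max_shift


# Hypothetical matches between two consecutive frames.
pairs = [((0, 0), (3, 4)), ((10, 10), (13, 14))]
print(is_new_page(pairs, num_prev_features=4))    # False: half the features match, small shift
print(is_new_page(pairs, num_prev_features=100))  # True: very few features matched
```

In practice the thresholds would need tuning per video, since handwriting added to the same page also introduces unmatched features.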
Computer Vision Libraries for Android used to create the 2 Apps
●OpenCV4Android SDK
●Tesseract (“tess-two”)
What's in OpenCV?
●Image Segmentation
●Face Detection
●People Detection
●Image Stitching
●Object Detection and Matching
●Image Inpainting
●Background Subtraction
●Motion Tracking