Learn about the algorithmic approach to decode CAPTCHA images using multivalued image decomposition. Explore the steps involved in text extraction, letter extraction, and image recognition phases. Discover the uses and benefits of this algorithm in various applications.
Captcha Decoding Using Multivalued Image Decomposition Algorithm Group Members: • Faizan Zahid • Fubha Burney • Ibrahim Ajmal • Syed Haider Raza
Table of Contents • Motivation for the Project • Introduction to CAPTCHAs • Breaking CAPTCHAs • Algorithm Methodology • Text extraction phase • Letter extraction phase • Image Recognition phase • Algorithm Analysis • Demo • Uses and Benefits • References & Future aspects • Queries
Algorithmic Motivation • To learn & know about : • Latest algorithms & their practical usage • Solving real world problems • Optical Character Recognition (OCR) • Artificial Intelligence (AI) • Curiosity & Awareness • CAPTCHAs behind the scenes • Google’s image search
Introduction to CAPTCHAs • CAPTCHA stands for Completely Automated Public Turing test to tell Computers and Humans Apart. • In simple terms, an “Are you a human?” test. • Used by many websites to block bots and stop spam. • Examples: http://www.slideshare.net/avinash2008/captchappt
Types: • Visual • Audio (Alternative) More examples: http://solvecaptchas.com/wp-content/uploads/2013/05/Captcha_Creator_PHP_Script-358301.gif
Is it possible to crack CAPTCHAs? • “Every defeat is also a victory” [1] [1] http://computer.howstuffworks.com/captcha5.htm
Breaking CAPTCHAs • Why? • Ensure & improve the effectiveness of CAPTCHAs • Ethical hacking purposes • Basic steps: • Convert image to grayscale (Text extraction) • Apply pattern detection (OCR and AI) • Match against a dictionary or reference characters
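The first basic step above can be sketched in a few lines of plain Python. This is an illustration only: the slides do not say which grayscale formula is used, so the ITU-R BT.601 luma weights (299, 587, 114 per thousand) and the cutoff of 128 are assumptions, and the function names `to_grayscale` and `threshold` are made up for this sketch.

```python
def to_grayscale(rgb_pixels):
    """Convert rows of (r, g, b) tuples to rows of 0-255 gray levels.

    Uses the BT.601 luma weights (an assumption, not taken from the slides).
    """
    return [[(299 * r + 587 * g + 114 * b) // 1000 for (r, g, b) in row]
            for row in rgb_pixels]


def threshold(gray, cutoff=128):
    """Map gray levels below the cutoff to 0 (black) and the rest to 255 (white)."""
    return [[0 if value < cutoff else 255 for value in row] for row in gray]


# Hypothetical 1x3 image: white, black, and pure red pixels.
sample = [[(255, 255, 255), (0, 0, 0), (255, 0, 0)]]
gray = to_grayscale(sample)
binary = threshold(gray)
```

A fixed cutoff only works when the text is clearly darker or lighter than the background, which is one reason the color-histogram approach in the later slides is used instead.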
Algorithm • Extraction Technique : • Color extraction • Algorithm : • Multivalued image decomposition Phases : => Text Extraction Phase => Letter Extraction Phase => Recognition Technique
Methodology • Steps : • Text Extraction Phase: build histogram of colors; create new B/W image • Letter Extraction Phase: slice characters; build disjoint sets of pixels • Recognition Technique: vector space & image recognition; build training set
Text Extraction Phase How the algorithm solves a given problem, step by step. • Build histogram of colors (output: each color’s ID and its number of pixels) • Create new B/W image (output: text-only version of the CAPTCHA) http://www.wausita.com/captcha/
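The two steps of this phase might look roughly like the sketch below. It assumes the image has already been loaded as rows of palette color IDs, and that the text color has been picked by inspecting the histogram; the sample image, the color IDs 7, 220 and 3, and the function names are all hypothetical.

```python
from collections import Counter


def color_histogram(pixels):
    """Count how many pixels use each color ID."""
    return Counter(value for row in pixels for value in row)


def extract_text_mask(pixels, text_colors):
    """Build a B/W image: 0 (black) where a pixel has a text color, 255 elsewhere."""
    return [[0 if value in text_colors else 255 for value in row]
            for row in pixels]


# Hypothetical 4x6 image of color IDs: 7 = background, 220 = text, 3 = noise.
image = [
    [7, 7, 220, 7, 7, 7],
    [7, 220, 220, 7, 3, 7],
    [7, 7, 220, 7, 7, 7],
    [7, 7, 220, 7, 7, 7],
]

hist = color_histogram(image)
for color_id, count in hist.most_common():
    print(color_id, count)  # color ID and number of pixels

mask = extract_text_mask(image, {220})
```

The background usually dominates the histogram, so the text color tends to be one of the next most frequent IDs; choosing it is left to the reader here.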
Letter Extraction Phase • Slicing characters • Building Disjoint Sets of Pixels http://www.wausita.com/captcha/ Output :
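One simple way to slice characters is to scan the B/W image column by column and treat each run of columns that contains ink as one letter. The sketch below assumes letters do not touch or overlap horizontally, which the slides do not state; the names `slice_letters` and `crop` and the sample mask are hypothetical.

```python
def slice_letters(mask, ink=0):
    """Return (start, end) column ranges, end exclusive, one per run of inked columns."""
    width = len(mask[0]) if mask else 0
    spans = []
    start = None
    for x in range(width):
        has_ink = any(row[x] == ink for row in mask)
        if has_ink and start is None:
            start = x                      # a letter begins at this column
        elif not has_ink and start is not None:
            spans.append((start, x))       # the letter ended at the previous column
            start = None
    if start is not None:                  # a letter touching the right edge
        spans.append((start, width))
    return spans


def crop(mask, span):
    """Cut the columns of one letter out of the mask."""
    start, end = span
    return [row[start:end] for row in mask]


# Hypothetical 3x7 B/W mask holding two letters separated by blank columns.
mask = [
    [255, 0, 0, 255, 255, 0, 255],
    [255, 0, 255, 255, 255, 0, 0],
    [255, 0, 0, 255, 255, 0, 255],
]
spans = slice_letters(mask)
letters = [crop(mask, span) for span in spans]
```

Each cropped letter is a disjoint set of pixels that can be handed to the recognition step on its own.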
Recognition Phase • Vector space & Image recognition • Build Training set http://www.wausita.com/captcha/ Output :
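A vector-space comparison can be sketched as follows: each letter image becomes a sparse vector of its inked pixel positions, and an unknown letter is assigned the label of the training example with the highest cosine similarity. This is a minimal illustration that assumes all glyphs share the same dimensions; the tiny 3x3 training set and the function names are invented for the example.

```python
import math


def to_vector(glyph, ink=0):
    """Represent a glyph as a sparse vector: {(row, col): 1} for every inked pixel."""
    return {(y, x): 1
            for y, row in enumerate(glyph)
            for x, value in enumerate(row)
            if value == ink}


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * b[key] for key, weight in a.items() if key in b)
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def classify(glyph, training_set):
    """Return (label, score) of the best-matching training example."""
    vector = to_vector(glyph)
    scored = [(label, cosine(vector, to_vector(example)))
              for label, example in training_set]
    return max(scored, key=lambda pair: pair[1])


# Hypothetical training set of 3x3 glyphs (0 = ink, 255 = background).
training_set = [
    ("I", [[255, 0, 255], [255, 0, 255], [255, 0, 255]]),
    ("L", [[0, 255, 255], [0, 255, 255], [0, 0, 0]]),
]
unknown = [[255, 0, 255], [255, 0, 255], [255, 0, 0]]
label, score = classify(unknown, training_set)
```

A larger training set with several examples per letter generally makes the match more robust, at the cost of more comparisons per glyph.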
Algorithm Analysis • Language: Python 2.5 (http://www.python.org/) with the Python Imaging Library (http://www.pythonware.com/products/pil/) • Technologies: Image recognition & Artificial Intelligence • Running time (48 CAPTCHAs): 0.42 sec per CAPTCHA • 432,000 cracks per day • 95,040 successful cracks per day (a 22% success rate)
Uses and Benefits • This algorithm helps extract text from pictures. • It can be used to make computers smarter and more efficient. • Many scanned documents can be converted into Word files using this algorithm. • Software such as Evernote applies this kind of algorithm to convert scans and snapshots to text. • “Scanthing” is another Android application that uses OCR (Optical Character Recognition) to extract text from pictures and documents and lets you search them by keyword on your phone. https://play.google.com/store/apps/details?id=com.evernote
Future Aspects • It can be integrated with speed cameras so that when a driver exceeds the speed limit, the camera captures the car’s license plate, converts the picture to characters, and automatically looks the plate up in the database. • This technology could replace bar codes, requiring only that the product’s name be scanned. http://www.princeton.edu/~achaney/tmve/wiki100k/docs/Traffic_enforcement_camera.html
References • Images & Content : • http://www.wausita.com/captcha/ • http://www.howstuffworks.com/captcha.htm • Research & Study : • FYP thesis of BIT-5 • http://la2600.org/talks/files/20040102/Vector_Space_Search_Engine_Theory.pdf • http://stackoverflow.com/questions/1752305/breaking-captchas-for-a-noble-purpose
Demo Live Demonstration of Algorithm
Queries ? Thank you for your patience www.faizanzahid.me