
Overview


julie

Presentation Transcript


  1. Overview. Market Leader: “Intelligent Capture & Exchange” Solutions

  2. Information comes in many forms… Structured Content • Information is predictable • Location of information is predictable • Examples: • Waybill • Traffic Citations • Tax Forms • Mail Order Forms • Applications • Insurance Claims

  3. Information comes in many forms… Semi-Structured Content • Information is predictable • Location of information is NOT predictable • Examples: • Accounts Payable • Accounts Receivable • Transportation • Bills of Lading • Medical Billing

  4. Information comes in many forms… Unstructured Content • Information is NOT predictable • Location of information is NOT predictable • Examples: • Mortgage Folders • Medical Records • Email Classification • Digital Mailroom • Litigation Support

  5. Where Did Kofax Classification / Separation Originate? It was funded by In-Q-Tel, the venture capital group backed by the CIA.

  6. Enabling the automation of Document Classification Processes • Processing millions of captured foreign documents • Automating the categorization of content to expedite linguistic activities • Connecting to an internal content management solution Transformation Modules

  7. Kofax Transformation - Advanced Document Separation • Automatically identify document type and individual document boundaries (start/end) within a batch of multiple documents • Goal: Perform separation/recognition just as if physical separator sheets were inserted between each document • Applies multiple classification and separation techniques in sequence, as a waterfall.

  8. KTM Advanced Document Separation Process. [Diagram: typical process flow, with stages labeled Scan/Import, Separate, Classify & Review, Extraction, Data Validation, and Release]

  9. Vector Space Machines: Under the Hood. Warning: The following slides may require pocket protectors.

  10. Automatic Document ID and Indexing. [Figure: each page in a batch carries confidence scores labeled S, M, and E (apparently start, middle, and end pages), for example S 90%, E 90%, M 70%; the scores resolve to the page sequence S E S M E S E]

  11. Automatic Document ID and Indexing. [Figure: (Automatic) Document ID & Index. Page identification and document separation yield the page sequence S E S M E S E; index fields shown are Date, SSN, and Last Name]

  12. Automatic Document ID and Indexing. [Figure: (Automatic) Document ID & Index. Page identification and document separation yield the page sequence S E S M E S E; index fields shown are Date, SSN, and Last Name]

  13. Classification “Waterfall” Technique. Using multiple classification engines: • Performance is optimized by attempting the fastest classification techniques first, accepting results only if very confident • Mohomine text classification is used as a “catch-all” method: very accurate with the widest reach, but dependent on full-page OCR. [Figure: pages 1 to 8 pass through four engines in turn: INDICIUS Barcode Recognition, INDICIUS Image Classification, INDICIUS Pattern Matching, and mohoClassifier (mC). Each engine's result row shows “?” or “N/A” per page, apparently marking pages still unclassified and pages already resolved by an earlier engine. Example results are First, Middle, and Last pages of Forms X, Y, and Z; timing labels read 1 ms, 20 ms, 200 ms, and 1000 ms.]
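The waterfall idea on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Kofax's implementation: the engine functions, the page dictionary keys (`barcode`, `image_guess`, `text_guess`), and the 0.9 threshold are all hypothetical stand-ins.

```python
from typing import Callable, List, Optional, Tuple

# Each (hypothetical) engine returns (label, confidence), or None if it has no opinion.
Result = Optional[Tuple[str, float]]
Engine = Tuple[str, Callable[[dict], Result]]

def barcode_engine(page: dict) -> Result:
    # Fastest engine: a decoded barcode is treated as near-certain.
    code = page.get("barcode")
    return (code, 0.99) if code else None

def image_engine(page: dict) -> Result:
    # Stand-in for an image/layout classifier; reads a precomputed guess.
    return page.get("image_guess")

def text_engine(page: dict) -> Result:
    # Stand-in for the slow, OCR-dependent catch-all text classifier.
    return page.get("text_guess", ("Unknown", 0.0))

def classify_waterfall(page: dict, engines: List[Engine], threshold: float = 0.9):
    """Try engines fastest-first; accept the first result at or above threshold.
    The last engine is the catch-all, so its answer is returned regardless."""
    for name, engine in engines[:-1]:
        result = engine(page)
        if result is not None and result[1] >= threshold:
            return name, result[0], result[1]
    name, engine = engines[-1]
    label, confidence = engine(page)
    return name, label, confidence

ENGINES: List[Engine] = [
    ("barcode", barcode_engine),
    ("image", image_engine),
    ("text", text_engine),
]

if __name__ == "__main__":
    page = {"image_guess": ("Middle Form X", 0.4), "text_guess": ("Middle Form Z", 0.7)}
    print(classify_waterfall(page, ENGINES))
```

In this sketch a page with a readable barcode never reaches the slower engines, which is the performance argument the slide makes.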

  14. How do we actually build a model? Dictionary. Business: • SAN JOSE, Calif. (AP) -- One week after firing its top executive, Hewlett-Packard Co. reported quarterly earnings that were essentially flat, and the interim chief executive acknowledged, “There is work to be done.” For the three months ended Jan. 31, HP reported a profit of $943 million, or 32 cents per share, only 0.7 percent higher than the $936 million, or 30 cents per share, it earned in the first fiscal quarter… • NEW YORK (Reuters) - Former WorldCom Inc. finance chief Scott Sullivan, who has become the star witness against Bernard Ebbers, admitted on Wednesday to a history of lies, saying he had deceived shareholders, analysts and the board while his staff undertook an $11 billion accounting fraud. Sharply questioned by the lead attorney for Ebbers, the one-time chief executive officer … Sports: • Saying this was a "sad, regrettable day," Commissioner Gary Bettman announced today that the National Hockey League was canceling the season because negotiators had failed to come to an agreement with the players' union on salary caps. With his announcement, the N.H.L. becomes the first major pro sports league in North America to lose an entire season to a labor dispute… • PARIS (AP) -- Still hungry to race but wary he is not in the best shape, Lance Armstrong wants to take his Tour de France record to even mightier heights: He will try for a seventh straight title this summer. Armstrong had left open the possibility he wouldn't compete this year in cycling's showcase event to pursue other races. But in an announcement Wednesday on the Web site of his Discovery Channel team the Tour's only six-time winner… Technology: • A new battery-powered Etch A Sketch will rely on digital electronics for a speedy interpretation of each knob twist. It is designed, its makers say, to transmit data along a wire plugged into a television set that will display every line and detail in real time, with accompanying sounds and optional color. It will cost $20, twice the price of the traditional Etch A Sketch. "I think the kids are becoming more advanced in… • SAN FRANCISCO, Feb. 15 - Late in the summer of 1973, two young scientists in the nascent field of computer networks hunkered down in a conference room of the Cabana Hyatt Hotel in Palo Alto, Calif., a clean but bland stopping place for salesmen and the parents of students at nearby Stanford University. Their goal was to thrash out a way to make different, isolated computer networks talk to each other….
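As a rough illustration of building a model from labeled examples like these, the sketch below trains a bag-of-words centroid per category and classifies new text by cosine similarity. It is a generic vector-space technique swapped in for illustration, not the Mohomine or Kofax algorithm, and the training snippets are shortened paraphrases of the slide's sample articles. The stopword list is an arbitrary assumption.

```python
import math
import re
from collections import Counter, defaultdict

# Small, arbitrary stopword list (an assumption for this sketch).
STOPWORDS = {"the", "a", "an", "and", "or", "of", "in", "to", "for", "on",
             "with", "was", "will", "that", "its", "each", "other"}

def tokenize(text):
    """Lowercase the text and keep alphabetic tokens that are not stopwords."""
    return [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOPWORDS]

def train(examples):
    """Sum term counts per category, giving one centroid vector per label."""
    centroids = defaultdict(Counter)
    for category, text in examples:
        centroids[category].update(tokenize(text))
    return dict(centroids)

def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(count * b[term] for term, count in a.items() if term in b)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

def classify(model, text):
    """Return (best_category, similarity) for the given text."""
    vector = Counter(tokenize(text))
    scores = {category: cosine(vector, centroid) for category, centroid in model.items()}
    best = max(scores, key=scores.get)
    return best, scores[best]

EXAMPLES = [
    ("Business", "HP reported quarterly earnings and a profit of 32 cents per share"),
    ("Business", "finance chief deceived shareholders, analysts and the board in an accounting fraud"),
    ("Sports", "the National Hockey League was canceling the season over salary caps with the players' union"),
    ("Sports", "Lance Armstrong will try for a seventh straight Tour de France title"),
    ("Technology", "a battery-powered Etch A Sketch will rely on digital electronics to transmit data"),
    ("Technology", "two young scientists made isolated computer networks talk to each other"),
]

MODEL = train(EXAMPLES)

if __name__ == "__main__":
    print(classify(MODEL, "hockey players and the league season"))
```

A production text classifier would use far more training data and term weighting; the point here is only that labeled examples become vectors, and new pages are assigned to the nearest category.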

  15. The Problem: Document Separation. Separation of unstructured documents is a significant expense for a high-volume capture system • Typical ‘structured’ recognition technologies are not applicable • Manual insertion of separator sheets is the primary solution today • 50% of document preparation labor is spent sorting documents and inserting separator pages. Where does one document stop and the next begin? Here? Here? Here?

  16. How Document Separation Works. FSM Constraints: • A “First” page must be followed by a “Middle” or “Last” page of the same type • After a “Last” page must come a “First” page • Custom Business Rules. [Figure: separation over pages 1 to 5. Listed mC results: Middle Form X (92%), Last Form Y (95%), First Form X (97%), First Form Y (84%), Last Form X (95%). Other candidates shown: Middle Form Y (53%), Last Form R (81%), First Form C (85%), Middle Form X (69%), Last Form E (98%), Middle Form C (17%), First Form Y (75%), Middle Form X (92%), First Form C (27%), Last Form R (92%). Best Path Analysis groups the pages into Form X and Form Y.]
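The combination of per-page candidates, FSM constraints, and best path analysis can be sketched as a small dynamic program. This is a minimal illustration, not the product's algorithm. It assumes that a batch must open with a “First” page and close with a “Last” page, and that a “Middle” page behaves like a “First” page for what may follow; those rules are not stated on the slide. The assignment of candidates to pages in `PAGES` is also invented for the example, reusing confidence values that appear on the slide.

```python
from typing import List, Optional, Tuple

Candidate = Tuple[str, str, float]  # (role, form type, confidence)

def allowed(prev: Candidate, cur: Candidate) -> bool:
    """FSM transition rules between consecutive pages."""
    prev_role, prev_form, _ = prev
    cur_role, cur_form, _ = cur
    if prev_role == "Last":
        return cur_role == "First"          # a new document must begin
    # After First (or Middle, by assumption): stay in the same form type.
    return cur_form == prev_form and cur_role in ("Middle", "Last")

def best_path(pages: List[List[Candidate]]) -> Optional[Tuple[List[Candidate], float]]:
    """Pick one candidate per page, maximizing the product of confidences
    over all sequences that satisfy the FSM rules. Returns None if no
    sequence is valid."""
    if not pages:
        return None
    # best[i][j] = (best score ending in candidate j of page i, backpointer)
    best = [[(c[2] if c[0] == "First" else 0.0, None) for c in pages[0]]]
    for i in range(1, len(pages)):
        row = []
        for cur in pages[i]:
            score, back = 0.0, None
            for j, prev in enumerate(pages[i - 1]):
                s = best[i - 1][j][0] * cur[2]
                if allowed(prev, cur) and s > score:
                    score, back = s, j
            row.append((score, back))
        best.append(row)
    # Assumption: the batch must end on a "Last" page.
    finals = [(best[-1][j][0], j) for j, c in enumerate(pages[-1]) if c[0] == "Last"]
    score, j = max(finals, default=(0.0, None))
    if score == 0.0:
        return None
    path = []
    for i in range(len(pages) - 1, -1, -1):
        path.append(pages[i][j])
        j = best[i][j][1]
    path.reverse()
    return path, score

# Hypothetical page-to-candidate assignment using values from the slide.
PAGES = [
    [("First", "X", 0.97), ("Middle", "Y", 0.53)],
    [("Middle", "X", 0.92), ("Last", "R", 0.81)],
    [("Last", "X", 0.95), ("First", "C", 0.85)],
    [("First", "Y", 0.84), ("Middle", "X", 0.69)],
    [("Last", "E", 0.98), ("Last", "Y", 0.95)],
]

if __name__ == "__main__":
    path, score = best_path(PAGES)
    print([f"{role} Form {form}" for role, form, _ in path], round(score, 3))
```

Note that on the last page the sketch rejects “Last Form E” despite its higher raw confidence, because no valid sequence leads to it; that is the kind of correction the FSM constraints are meant to provide.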

  17. Customer Success Story • Residential mortgage processing, 12 million images/month • Each customer folder: ~100 pages, 60-80 doc types • Before automatic document separation: • 60 people doing document separation and preparation • 16 people to review (QC) a customer folder • 8.25 minutes per folder to review • With automatic document separation: • 10 people doing document separation and preparation • 3 people to review (exceeded goal to reduce staff to 8) • 2 minutes per folder to review • Exceeded processing goal targets at each step • $420,000 annual savings in labor • $100,000 annual savings in separator sheet consumables

  18. Capabilities Overview • Classification • Content (text) • Layout (topography) • Combination of the above • Extraction • Rules (format, database) • Learn-by-example • Templates • Any document • Structured (inc. legacy forms) • Semi-structured, e.g. invoices • Unstructured documents, e.g. correspondence

  19. Key Applications/Use Cases • Invoices (AP automation) • Speed up AP process and reduce manual keying • Pre-configured solution already available • Sales Orders • Improve sales order process and accuracy • ‘Mailroom’ applications/Workflow automation • Automatic classification and routing • Indexing (<= 3 fields) for archive • No need for pre-sorting • Image to archive automation • Automatic classification and indexing for storage in a document management (DM) system • ‘Better, quicker, more accurate batch capture’ • Business process automation • Full data capture • Straight-through processing • Semi-structured and unstructured documents • Invoices and credit notes • Correspondence • Reports

  20. Kofax KTM Differentiators • Integrated with Kofax Capture (offering HA, xx) • Learn-by-example extraction • Learn-by-example classification • Continuous supervised learning in production • Single product for all document types that is upgradable

  21. Kofax Solution Strengths • Market leader • Out-of-the-box • Unlimited import options • VRS integrated with “QC Later” • Better Recognition/Multiple Document Types • API Integrated export • Secure handling of images & data • Out-of-the-box reports • You won’t outgrow it Kofax Capture Overview
