
Emerging technologies 2010 Censuses Challenges

Learn about the advancements in data capture technology for census projects, including the eFLOW platform by Top Image Systems. Discover the benefits of automated data capture and how it improves efficiency, accuracy, and flexibility.


Presentation Transcript


  1. UN Workshop Thailand 2008 Emerging technologies: 2010 Censuses Challenges Shoshani Eli, Managing Director Asia Pacific

  2. Agenda • Introduction • Who are we? • Data capture methods • eFLOW Platform • Summary

  3. “Counted” by eFLOW worldwide: 1,374,026,304

  4. TIS’s Experience in Census Projects India 2001 Turkey 1997 Brazil 2000 South Africa 2001 Ireland 2001 Germany (DP) 1999 Cyprus 2002 Turkey 2000 Kenya 2001 Slovak Republic 2001 Hong Kong 2001 Italy 2002 Slovak Republic 2006 Hong Kong 2006 South Africa Pilot 2007 Ireland 2006 Largest market share worldwide in census projects information capture

  5. Won in 2008: Belarus, Argentina, Thailand

  6. Overview - Top Image Systems • Founded 1991 • Data extraction and workflow solutions, specialized in census projects • Since 1996, traded on NASDAQ (TISA) • ~250 employees

  7. Local Offices in the Region: Asia: Shanghai, Japan, Singapore, Hong Kong, Guangzhou (R&D) and Australia. Europe: United Kingdom, Germany, Italy, Spain, France, Benelux. Americas: Boston, Rio de Janeiro • Present in approx. 40 countries • Strong partner network worldwide • Around 800 installed systems worldwide

  8. The evolution of data capture in census projects eFLOW From OCR into IDR Solution

  9. The evolution of data capture in census projects: Key From Paper, Key From Image • Manual data entry (key from paper): • Slow • High error rate in the data entry process • Recruitment, training and management of personnel • Key from image: • Archive • Approx. 30-40% faster than key from paper

  10. The evolution of data capture in census projects OMR OMR (hardware readers for checkbox) • Requires specially printed forms and special scanners • Cannot handle handwritten/printed data • Forms are not user-friendly • OMR requires more answers => more space => increased paper expenditures => more handling and printing costs • Not flexible, difficult to adjust to other applications once census is over • No possibility to add business rules: computation, validations, coding

  11. TIS’s Experience in Census Projects India 2001 Turkey 1997 Brazil 2000 South Africa 2001 Ireland 2001 Germany (DP) 1999 Cyprus 2002 Turkey 2000 Kenya 2001 Slovak Republic 2001 Hong Kong 2001 Italy 2002 Slovak Republic 2006 Hong Kong 2006 South Africa Pilot 2007 Ireland 2006 Largest market share worldwide in census projects information capture

  12. The evolution of data capture in census projects: Automated data capture (eFLOW) • Requires less human intervention, so census data capture can be completed much faster (less space, lower salary costs, less hardware) • Ensures data integrity: enables both automatic and manual online validations, exception handling, and coding • The most advanced and proven technology for censuses, recommended by the UN and used by all modern countries for census projects • Full flexibility in the type of data gathered (checkbox, handwritten, alpha and numeric, barcode…) • Provides all the capabilities of OMR and much more • Creates a correlation between the image and the actual form • Remote capabilities enable all forms to be scanned locally and then sent to a central site for processing
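The validation and exception-handling idea on this slide can be sketched roughly as follows. This is a minimal illustration only: the field names, the rules, and the queue names are assumptions made for the example and are not taken from eFLOW.

```python
# Hypothetical business rules applied to one captured person record.
# Field names ("age", "marital_status") and limits are illustrative assumptions.

def validate_person(record):
    """Return a list of rule violations; an empty list means the record passed."""
    errors = []
    age = record.get("age")
    if not isinstance(age, int) or not 0 <= age <= 120:
        errors.append("age out of range")
    if record.get("marital_status") == "married" and isinstance(age, int) and age < 10:
        errors.append("marital status inconsistent with age")
    return errors


def route(record):
    """Send failing records to manual review, passing records to export."""
    return "exception_queue" if validate_person(record) else "export"
```

Records that fail an automatic rule are routed to an operator instead of being rejected, which mirrors the "automatic and manual" combination the slide describes.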

  13. The evolution of data capture in census projects: Intelligent Data Capture. Intelligent data capture platform using OCR/ICR/barcode/PDA/Web/email: • Automated data capture + • Smart: automatic classification of documents; understands and differentiates between various types of documents and languages, based on state-of-the-art machine learning algorithms • Freedom: artificial intelligence algorithms provide enough information for the system to find the location of the fields on its own

  14. Unified content Platform Census Data base Suggest a Single platform for all enterprise content

  15. Lessons learned India 2001 Turkey 1997 Brazil 2000 South Africa 2001 Ireland 2001 Germany (DP) 1999 Cyprus 2002 Turkey 2000 Kenya 2001 Slovak Republic 2001 Hong Kong 2001 Italy 2002 Slovenia 2006 Hong Kong 2006 South Africa Pilot 2007 Ireland 2006

  16. The customer says it best… Saving of 25% Saving of 12% (Source: CSO – Central Statistic Office Ireland)

  17. The customer says it best… (Source: CSO – Central Statistic Office Ireland)

  18. The customer says it best… Benefits of the eFlow Technology (Source: CSO – Central Statistic Office Ireland)

  19. First, several general lessons… Invest in creating the right application for the project • System design • High-level business process • Functional design • Technical/detailed design • Code guidelines and conventions • Technical DR, with the R&D • Development • Project DR • Code review • Budget control • Bi-weekly reports • …

  20. First, several general lessons… • Spend time on getting the form right • Meet organization standards • Form Design • Prepare and optimize with a pilot • Training & support

  21. Indian Census 2001 • TIS partnered with CMC, an Indian governmental agency with years of experience and offices all over India • Form processing technology: • Around 500 million A3 images • More than 2 million enumerators • The technology was implemented at 15 processing centers in major state capitals • Data was captured using only 25 high-end Kodak 7520DS scanners • 16 languages • The advanced technology in 2001: eFLOW ver. 1.0 • Two phases

  22. eFLOW 5.0 – Next Generation… Presenting new advanced technologies to meet the 2010 census challenges

  23. Main improvements in eFLOW to meet Census Challenges • Architectural changes • Core changes • Recognition technologies • Modules • Features

  24. eFLOW Architectural Improvements • Core redesigned, built on .NET technology • Microsoft .NET is the Microsoft strategy for connecting systems, information, and devices through Web services so people can collaborate and communicate more effectively • Customization with embedded .NET • Speeds up runtime: 200x faster • Custom code is now part of the CAB • No need to manage DLLs separately • Debug inside eFLOW • No need to install a development environment

  25. .NET allows an object-oriented design approach: House Batch, Person Batch

  26. eFLOW Architectural Improvements: Improved flexibility • Multiple active applications on the same server (run phases in parallel) • Balance workload and personnel • Ensures ongoing work for all team members • Multiple sites • Support for multiple servers and clusters

  27. FormID Export New eFLOW Architecture - Sites

  28. Monitoring and Management

  29. Architectural Improvements (cont.) • Easier management of the application: • Control all stations from any location • Automatic stations similar to Windows Services • Remote activation of stations, with no need to physically access the server room • Restart/Start/Control of stations from a centralized place (remotely) using the eFLOW Controller and Enterprise Manager

  30. Controller

  31. Architectural Improvements • Handling Huge batches: • Ability to handle huge batches of 300-3000 pages each • Ability to process lots of batches in parallel • A stable, robust platform (Pic from eFLOW’s performance test)

  32. Architectural Changes (cont.) • Load balancing • Load balancing between stations (get notifications automatically and better allocation of employees) • Automatic load balancing according to the number of batches in a queue • Priority handling: using the eFLOW capabilities for automatic prioritization by code (for example according to county, region, etc.)
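As a rough illustration of prioritization by code and of balancing by queue length, here is a minimal sketch using Python's standard `heapq` module. The region names, the priority table, and the class and function names are hypothetical and do not describe the eFLOW API.

```python
import heapq
from itertools import count

# Hypothetical priority table: a lower number is processed first.
REGION_PRIORITY = {"capital": 0, "north": 1, "south": 2}


class BatchQueue:
    """Priority queue of batch ids, ordered by region priority, then arrival."""

    def __init__(self):
        self._heap = []
        self._order = count()  # tie-breaker keeps FIFO order within one priority

    def push(self, batch_id, region):
        priority = REGION_PRIORITY.get(region, 99)  # unknown regions go last
        heapq.heappush(self._heap, (priority, next(self._order), batch_id))

    def pop(self):
        return heapq.heappop(self._heap)[2]

    def __len__(self):
        return len(self._heap)


def least_loaded(queues):
    """Return the name of the station whose queue holds the fewest batches."""
    return min(queues, key=lambda name: len(queues[name]))
```

A dispatcher built on this sketch would call `least_loaded` to pick a station for each new batch, while each station pops its most urgent batch first.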

  33. Architectural Changes (cont.) Improved security mechanism

  34. Advanced approaches • Automatic EFI Matching • Improving the speed of the template recognition station via the “Force EFI” mechanism, a unique barcode printed on each page

  35. Advanced approaches (cont.) • Auto Coding • Coding tasks and data validations performed on the data capture platform: a ‘cost-effective’ solution • Use one of the statistical software packages on the market, such as ACTR (Canadian statistical software for coding some fields) • Use approximate search tools to improve results via the DB (Exorbyte)
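The idea of approximate search for auto coding can be sketched with Python's standard `difflib` module. This stands in for the tools named on the slide and does not reproduce ACTR or Exorbyte; the dictionary contents, the codes, and the cutoff value are assumptions made for the example.

```python
from difflib import get_close_matches

# Hypothetical coding dictionary: free-text answer -> classification code.
OCCUPATION_CODES = {"teacher": "2330", "farmer": "6110", "nurse": "2221"}


def auto_code(raw, cutoff=0.8):
    """Return the code for a recognized answer, or None if no entry is close enough."""
    key = raw.strip().lower()
    if key in OCCUPATION_CODES:  # exact hit, no fuzzy search needed
        return OCCUPATION_CODES[key]
    matches = get_close_matches(key, list(OCCUPATION_CODES), n=1, cutoff=cutoff)
    return OCCUPATION_CODES[matches[0]] if matches else None
```

Answers that return `None` would typically be passed to a manual coding operator, so the cutoff trades automation rate against the risk of a wrong code.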

  36. Advanced approaches (cont.) • Dynamic Dictionary update • Lookup and dictionaries via DB (and not txt files) • Export • Reconstruct the original form according to the template

  37. Advanced features (cont.) • Splitting & Merging: using the built-in eFLOW4 splitting/merging mechanism • Handling problematic batches through improved Split/Merge abilities • Physically taking out bad pages (or a bad household) and continuing to work with the rest of the batch • Split/Merge automatically, without the need to build a specific station for merging data • Additional powerful interfaces exposed in the CSM for faster development time • Priority (for example according to county, region, etc.) • Load balancing between stations (get notifications automatically and better allocation of employees)
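Splitting bad pages out of a batch and merging repaired pages back in can be pictured with the following minimal sketch. The page representation (a dict with a `page_no` key) and the function names are assumptions for illustration, not the eFLOW splitting/merging interface.

```python
def split_batch(pages, is_bad):
    """Separate a batch into (good pages, bad pages) using a caller-supplied test."""
    good = [p for p in pages if not is_bad(p)]
    bad = [p for p in pages if is_bad(p)]
    return good, bad


def merge_batches(*batches):
    """Combine several page lists into one batch, restoring page order."""
    merged = [p for batch in batches for p in batch]
    return sorted(merged, key=lambda p: p["page_no"])
```

In this sketch the good pages keep flowing through the remaining stations while the bad ones are rescanned, and `merge_batches` reunites them afterwards.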

  38. Modules • Statistical report • Statistical report to monitor the daily, weekly, and monthly rate per user/station • Quality checking using • Licenses • Flexible license policy • Per station • Per number of pages processed
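A per-user daily rate of the kind this report monitors could be computed along these lines. The event format (user, day, pages processed) is an assumption made for the example and is not the eFLOW reporting schema.

```python
from collections import defaultdict


def pages_per_user_per_day(events):
    """Sum processed pages for each (user, day) pair.

    `events` is an iterable of (user, day, pages) tuples (hypothetical format).
    """
    totals = defaultdict(int)
    for user, day, pages in events:
        totals[(user, day)] += pages
    return dict(totals)
```

Weekly or monthly rates follow from the same grouping by replacing the day with a week or month key.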

  39. Statistic Reporter (e.g. Crystal Reports)

  40. Recognition technologies OCR/ICR Engines RICOH (Japanese) PENPOWER (Chinese) LIGATURE JUSTICR ABBYY KADMOS OCE INLITE EXPERVISION OMNIPAGE A2IA TIS NESTOR

  41. Custom stations approach

  42. eFLOW Receives Everything • Mobile Devices • MNIC • Web Completion • Remote scanning

  43. Web Completion

  44. eFLOW 4.x Web Completion

  45. Summary • A data capture and IDR platform (paper, electronic, mobile), not just a recognition product • A proven solution in census data capture: no need to invest time and money in a new technology and vendor, which minimizes risk • Extensive experience in the design, development and implementation of real census and other high-volume form processing projects. Largest market share worldwide in the processing of census projects • Extensive experience based on long research into the special needs of the Indian Census • Maximum flexibility, redundancy and a robust platform, ensuring you meet the project timetable for releasing census results

  46. Summary India 2001 Turkey 1997 Brazil 2000 South Africa 2001 Ireland 2001 Germany (DP) 1999 Cyprus 2002 Turkey 2000 Kenya 2001 Slovak Republic 2001 Hong Kong 2001 Italy 2002 Slovenia 2006 Hong Kong 2006 South Africa Pilot 2007 Ireland 2006

  47. Thank you
