
DATA COLLECTION AND IMPROVING DATA QUALITY

By: Lissy Verma and Shraddha Gupta


Presentation Transcript


  1. BY: LISSY VERMA, SHRADDHA GUPTA • DATA COLLECTION AND IMPROVING DATA QUALITY

  2. OUTLINE • Data Collection • ODK : Open Data Kit • Demo • Usher : Improving Data Quality • Purpose • Implementation • Results

  3. DATA COLLECTION • Data collection in developing areas is difficult. • None of the existing tools suffices. • New features are required to meet the needs of these deployments.

  4. OPEN DATA KIT • ODK is a tool suite for collection and management of data on mobile phones. • The main objective is to provide open source tools.

  5. OPEN DATA KIT • ODK Collect: collects data. • ODK Aggregate: stores, views and exports data. • ODK Manage: remote device management.

  6. A QUICK DEMO

  7. AMPATH • AMPATH deployed ODK to collect medical data. • The deployment was found to be successful: it minimized delays and improved the lives of healthcare workers and other people.

  8. Data Collection is Challenging • Expertise in form design • Double Entry : Costly • Data Cleaning

  9. Past Work Constraints • Combo-boxes: reduce entry time • Automatically filled leave forms

  10. USHER: Improving Data Quality ESCORTER : Guide towards correct entries. • Question Ordering in form. • Greedy Information Gain • Dynamically Reorder Questions • Predict Errors to Re-ask. • Contextualized Error Likelihood Principle.
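The "Greedy Information Gain" ordering above can be sketched as follows. This is a minimal illustration, not the presenters' implementation: it assumes the ordering repeatedly picks the question whose answer is least predictable (highest empirical conditional entropy) given the questions already chosen, and it estimates that entropy directly from counts rather than from a learned Bayesian network. The function names and the tiny training set are hypothetical.

```python
from collections import Counter
from math import log2


def conditional_entropy(records, target, given):
    """Empirical H(target | given) in bits, estimated from a list of dict records."""
    n = len(records)
    # Count each (context, answer) pair and each context on its own.
    joint = Counter((tuple(r[g] for g in given), r[target]) for r in records)
    context = Counter(tuple(r[g] for g in given) for r in records)
    h = 0.0
    for (ctx, _value), count in joint.items():
        h -= (count / n) * log2(count / context[ctx])
    return h


def greedy_order(records, questions):
    """Pick, at each step, the question with the highest entropy given those chosen so far."""
    ordered, remaining = [], list(questions)
    while remaining:
        best = max(remaining, key=lambda q: conditional_entropy(records, q, ordered))
        ordered.append(best)
        remaining.remove(best)
    return ordered


# Hypothetical training records, invented for illustration only.
records = [
    {"sex": "F", "pregnant": "yes", "district": "A"},
    {"sex": "F", "pregnant": "no", "district": "B"},
    {"sex": "M", "pregnant": "no", "district": "C"},
    {"sex": "M", "pregnant": "no", "district": "D"},
]
order = greedy_order(records, ["sex", "pregnant", "district"])
print(order)
```

Asking the least predictable questions first means that, if entry stops early, the unanswered questions are the ones a model can most easily guess. Dynamic reordering would rerun the same selection after each answer, conditioning on the values actually entered.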

  11. CURBSTONING • Concept: An unscrupulous door-to-door surveyor shirks work and asks only the important questions. • Greedy Information Gain • Uniform Prior: equally likely inputs • Training Set • Context-specific model required • Bayesian Learning
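One way to read "Uniform Prior", "Training Set" and "Bayesian Learning" together is as a posterior estimate for a categorical answer: start from equally likely inputs and update with counts from the training set. The sketch below assumes a symmetric Dirichlet prior with one pseudo-count per category (Laplace smoothing); the slides do not state which estimator was actually used, and the function name and data are hypothetical.

```python
from collections import Counter


def posterior_probabilities(observations, categories, pseudo_count=1.0):
    """Posterior mean of a categorical distribution under a uniform Dirichlet prior."""
    counts = Counter(observations)
    # Only answers in the declared category list contribute to the estimate.
    total = sum(counts[c] for c in categories) + pseudo_count * len(categories)
    return {c: (counts[c] + pseudo_count) / total for c in categories}


# With no training data, every input is equally likely (the uniform prior).
prior_only = posterior_probabilities([], ["yes", "no"])

# Hypothetical training answers shift the estimate away from uniform.
learned = posterior_probabilities(["no", "no", "no", "yes"], ["yes", "no"])
print(prior_only, learned)
```

The pseudo-counts keep rare but valid answers from being assigned zero probability, which matters when the model is later used to judge whether an entered value is suspicious.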

  12. DATASETS • Patient dataset: collected at a rural HIV/AIDS clinic in Tanzania. • Survey dataset: responses from a 1986 poll about race and politics.

  13. Probabilistic Relation : Form Questions Bayesian Network for the patient dataset

  14. Question layout generated by the algorithm

  15. Re-ask Questions Approximates Double Entry • Uncertainty : High Entropy • Outliers
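The two re-ask signals named above, high entropy and outliers, can be sketched as a simple filter. This is an illustrative simplification under stated assumptions: each question already has a predicted answer distribution (for example, from a model conditioned on the other entries), and the thresholds `max_entropy` and `min_probability` are hypothetical values chosen for the example, not figures from the presentation.

```python
from math import log2


def entropy(dist):
    """Shannon entropy in bits of a dict mapping answers to probabilities."""
    return -sum(p * log2(p) for p in dist.values() if p > 0)


def questions_to_reask(predicted, entered, max_entropy=1.5, min_probability=0.05):
    """Flag questions whose prediction is uncertain or whose entered value is an outlier."""
    flagged = []
    for question, dist in predicted.items():
        uncertain = entropy(dist) > max_entropy
        outlier = dist.get(entered[question], 0.0) < min_probability
        if uncertain or outlier:
            flagged.append(question)
    return flagged


# Hypothetical predicted distributions and entered values.
predicted = {
    "sex": {"F": 0.5, "M": 0.5},
    "age_group": {"child": 0.25, "teen": 0.25, "adult": 0.25, "senior": 0.25},
    "pregnant": {"yes": 0.02, "no": 0.98},
}
entered = {"sex": "F", "age_group": "adult", "pregnant": "yes"}
flagged = questions_to_reask(predicted, entered)
print(flagged)
```

Re-asking only the flagged questions approximates double entry at a fraction of its cost, since most fields are entered once.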

  16. Data-entry Feedback

  17. Usher Components And Data-flow

  18. Error Modeling

  19. Accurate Prediction Results

  20. THANK YOU

  21. SUPPLEMENTARY SLIDES

  22. DATA COLLECTION: PROBLEMS • Because of the digital divide between developing and developed areas, it is very difficult to collect and use data in developing regions. • The main problems are: lack of reliable infrastructure, lack of proper connectivity, and inadequate expertise. • Currently available data collection tools such as Pendragon Forms, Nokia Data Gathering, JavaRosa and RapidSMS are difficult to deploy, hard to use, complicated to scale and rarely customizable.

  23. OPEN DATA KIT • The Open Data Kit, or simply ODK, is a suite of data collection tools that uses Google’s Android platform. • The main objectives of the technology are: modularising and customising tools; use of open interfaces and standards; long-term survival of tools. • The components of ODK are: 1. ODK Collect: collects data using forms. 2. ODK Aggregate: ready-to-deploy online repository to store, view and export collected data. 3. ODK Build: enables users to generate forms. 4. ODK Voice: maps forms to sound snippets. 5. ODK Clinic: mobile medical record system. 6. ODK Manage: maintains a database of all phones for remote device management. 7. ODK Validate: validates forms. • Other tools include ODK Dropbox, ODK Rangefinder, ODK Tasks, ODK Listen and ODK Visualise.
