
Hackathon Challenge: (Semi-) Automating DNA Collection

Sara Farmer, Noah Hofmann-Smith, Jonathan Undy. Country preparedness needs to be assessed quickly at the onset of a disaster. Many sources exist, but they are not machine-accessible. Source websites publish data as HTML, XLS, CSV, APIs, etc.


Presentation Transcript


  1. Hackathon Challenge: (Semi-) Automating DNA Collection. Sara Farmer, Noah Hofmann-Smith, Jonathan Undy

  2. Outline. Country preparedness needs to be assessed QUICKLY at the onset of a disaster. There are lots of sources, but they are not machine-accessible.

  3. Motivation. Websites: HTML, XLS, CSV, APIs, etc. [Diagram labels: Template Creator; Analyst; DNA; Partially-filled indicators spreadsheet; Researchers; Completed indicators spreadsheet]

  4. Outline 2. Process for automation: Noah and Jonathan; Sara and team (partially completed already)

  5. Scraping data and CSV files (Sara)

  6. Scrapers

  7. CSV Data Files

  8. Loading from CSV files to Excel (Noah & Jonathan). Challenges:
     • Key indicators referred to differently by different sources
     • Several years’ worth of data
     • Countries not included in all datasets
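The three challenges on this slide can be illustrated with a minimal Python sketch. It is not the team's actual loader: the alias table, the column names (`country`, `indicator`, `year`, `value`), the country labels, and the sample values are all hypothetical placeholders. It assumes that "several years' worth of data" is handled by keeping the most recent year per country and indicator, and that countries missing from a dataset get an empty cell.

```python
import csv
import io

# Hypothetical alias table: maps source-specific indicator names
# to one standard name. A real list would be maintained by the analysts.
INDICATOR_ALIASES = {
    "pop": "population",
    "total population": "population",
    "gdp (current us$)": "gdp",
}


def standard_name(raw):
    """Normalise an indicator name; unknown names pass through lowercased."""
    key = raw.strip().lower()
    return INDICATOR_ALIASES.get(key, key)


def latest_values(rows):
    """Keep only the most recent year for each (country, indicator) pair."""
    latest = {}
    for row in rows:
        key = (row["country"].strip(), standard_name(row["indicator"]))
        year = int(row["year"])
        if key not in latest or year > latest[key][0]:
            latest[key] = (year, row["value"])
    return latest


def build_sheet(rows, countries, indicators):
    """Build a country-by-indicator grid; missing entries become empty cells."""
    latest = latest_values(rows)
    sheet = [["country"] + list(indicators)]
    for country in countries:
        line = [country]
        for indicator in indicators:
            entry = latest.get((country, indicator))
            line.append(entry[1] if entry else "")
        sheet.append(line)
    return sheet


# Made-up sample data, for illustration only.
SAMPLE = """country,indicator,year,value
Country A,Total population,2009,100
Country A,pop,2011,110
Country A,GDP (current US$),2010,5
Country B,pop,2010,20
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE)))
sheet = build_sheet(
    rows,
    countries=["Country A", "Country B", "Country C"],
    indicators=["population", "gdp"],
)

# Write the grid as CSV text, which Excel can open directly.
out = io.StringIO()
csv.writer(out).writerows(sheet)
print(out.getvalue())
```

The sketch writes CSV rather than a native Excel workbook to stay within the standard library; producing an .xlsx file would need a third-party package.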

  9. Challenges going forward:
     • Improve data quality (e.g. unpack compound data items stored in the same field).
     • Continue to develop the standard list of indicators.
     • “Close the loop”: eliminate manual cleaning of the scraped data.
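As an illustration of "unpacking compound data items from the same field", here is a small Python sketch. The slides do not say what the compound fields look like, so the format handled here (a number followed by a four-digit year in parentheses, such as `110 (2011)`) is purely an assumption, and `unpack_value_year` is a hypothetical helper name.

```python
import re

# Assumed compound format: "<number> (<4-digit year>)", e.g. "110 (2011)".
COMPOUND = re.compile(r"^\s*([-+]?\d+(?:\.\d+)?)\s*\((\d{4})\)\s*$")


def unpack_value_year(field):
    """Split a compound field into (value, year).

    Returns None when the field does not match the assumed format,
    so that unrecognised entries can be flagged for manual review.
    """
    match = COMPOUND.match(field)
    if match is None:
        return None
    return float(match.group(1)), int(match.group(2))


print(unpack_value_year("110 (2011)"))
print(unpack_value_year("n/a"))
```

Returning `None` instead of guessing keeps unparsed entries visible, which matters while manual cleaning is still part of the process.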
