Hackathon Challenge: (Semi-) Automating DNA Collection
Sara Farmer, Noah Hofmann-Smith, Jonathan Undy
Outline
Need to assess country preparedness at the onset of a disaster QUICKLY. Lots of sources, but the data is not machine accessible.
Motivation
[Diagram labels: Websites (HTML, XLS, CSV, APIs, etc.); Template Creator; Analyst; DNA; Partially-filled indicators spreadsheet; Researchers; Completed indicators spreadsheet]
Outline 2
Process for automation:
• Noah and Jonathan
• Sara and team (partially completed already)
Loading from CSV files to Excel (Noah & Jonathan)
Challenges:
• Key indicators referred to differently by different sources
• Several years’ worth of data
• Countries not included in all datasets
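The three challenges above can be handled in one pass over a source file. The following is a minimal sketch, not the team's actual loader: the column names (`country`, `indicator`, `year`, `value`), the alias table, the placeholder country names and the sample values are all assumptions made up for illustration. It maps each source's indicator label to a standard name, keeps the most recent year per cell, and leaves cells empty for countries a dataset does not cover.

```python
import csv
import io

# Hypothetical alias table: maps each source's label to one standard indicator name.
INDICATOR_ALIASES = {
    "population, total": "population",
    "total population": "population",
    "gdp per capita (current us$)": "gdp_per_capita",
    "gdp per capita": "gdp_per_capita",
}

# Invented sample input; column names and values are placeholders, not real data.
SAMPLE = """country,indicator,year,value
Country A,"Population, total",2010,40.5
Country A,Total population,2011,41.6
Country A,GDP per capita,2011,987
Country B,"Population, total",2011,10.0
"""


def load_latest_values(csv_text, countries):
    """Return {country: {indicator: (year, value)}}, keeping the newest year per cell."""
    table = {country: {} for country in countries}
    for row in csv.DictReader(io.StringIO(csv_text)):
        indicator = INDICATOR_ALIASES.get(row["indicator"].strip().lower())
        country = row["country"].strip()
        if indicator is None or country not in table:
            continue  # unknown indicator label, or country outside the template
        year = int(row["year"])
        current = table[country].get(indicator)
        if current is None or year > current[0]:
            table[country][indicator] = (year, row["value"])
    return table


def to_rows(table, indicators):
    """Flatten to spreadsheet rows; cells with no data stay empty for researchers."""
    rows = [["country"] + list(indicators)]
    for country in table:
        cells = [table[country].get(ind, (None, ""))[1] for ind in indicators]
        rows.append([country] + cells)
    return rows


table = load_latest_values(SAMPLE, ["Country A", "Country B", "Country C"])
rows = to_rows(table, ["population", "gdp_per_capita"])
```

The resulting `rows` can be written out with `csv.writer` for opening in Excel; producing a native .xlsx file would need a third-party library such as openpyxl.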
Challenges going forward
• Improving data quality (e.g. unpacking compound data items from the same field).
• Continuing to develop the standard list of indicators.
• “Closing the loop”: eliminating manual cleaning of the scraped data.
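As an illustration of unpacking compound data items, here is a small sketch that splits a single scraped cell into separate fields. The cell format it handles, a number with an optional percent sign and an optional year in parentheses such as `45.2% (2009)`, is an assumed example; real sources will need their own patterns, and `unpack_cell` is a hypothetical helper name.

```python
import re

# Assumed cell format: number, optional "%", optional "(YYYY)".
COMPOUND = re.compile(
    r"^\s*(?P<value>-?\d+(?:\.\d+)?)\s*(?P<unit>%)?\s*(?:\((?P<year>\d{4})\))?\s*$"
)


def unpack_cell(cell):
    """Split a compound cell into value, unit and year; return None if it does not match."""
    match = COMPOUND.match(cell)
    if match is None:
        return None  # leave unmatched cells for manual cleaning
    year = match.group("year")
    return {
        "value": float(match.group("value")),
        "unit": match.group("unit") or "",
        "year": int(year) if year else None,
    }
```

Returning `None` for cells that do not match keeps the remaining manual cleaning work visible rather than silently guessing at a value.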