Kiva Dataset. XDATA Summer Session 2013. Haesun Park, Georgia Institute of Technology; Jeff Baumes, Kitware, Inc.
Kiva visualization challenges: How do we deal with data heterogeneity? How do we make geospatial correlations? How do we obtain meaningful insight?
Visualization challenge: Heterogeneous data
Fast ingest. MDA, JPL, ISI. Tested fast ingestion of raw JSON files. Ingestion to a faceted database took 36 min, with > 99.999% of the data automatically and successfully ingested.
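The slide does not describe the ingestion pipeline itself, so the following is only a minimal sketch of how a success rate like the one above could be measured. It assumes one JSON document per line, which may not match the raw Kiva files; the function names and the sample records are hypothetical.

```python
import json


def ingest_json_lines(lines):
    """Parse one JSON document per line; return (records, failures)."""
    records, failures = [], []
    for line_number, line in enumerate(lines, start=1):
        line = line.strip()
        if not line:
            continue  # blank lines are skipped, not counted as failures
        try:
            records.append(json.loads(line))
        except json.JSONDecodeError as error:
            # Keep the line number so failed records can be inspected later.
            failures.append((line_number, str(error)))
    return records, failures


def success_rate(records, failures):
    """Fraction of non-blank lines that parsed successfully."""
    total = len(records) + len(failures)
    return len(records) / total if total else 0.0


# Hypothetical sample: two valid records and one truncated line.
sample = [
    '{"id": 1, "status": "paid"}',
    '{"id": 2, "status": ',
    '{"id": 3, "status": "defaulted"}',
]
records, failures = ingest_json_lines(sample)
rate = success_rate(records, failures)
```

Collecting failures instead of raising on the first bad line is what makes an "automatically ingested" percentage measurable in the first place.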
Kiva data to data table. Labels: Categorical Variable, Continuous Variable, Missing Values. Many pieces of information about the lenders, borrower, partner, and loan are related to each lender-to-loan transaction.
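One way to picture the data table is as one flat row per lender-to-loan transaction, with loan-level fields repeated on each row and gaps left as missing values. The sketch below assumes a nested loan record; every field name in it is hypothetical and is not taken from the Kiva schema.

```python
def flatten_transactions(loan):
    """Return one flat row per lender-to-loan transaction in a nested loan record."""
    rows = []
    for lender in loan.get("lenders", []):
        rows.append({
            "loan_id": loan.get("id"),
            "sector": loan.get("sector"),             # categorical variable
            "loan_amount": loan.get("loan_amount"),   # continuous variable
            "partner_id": loan.get("partner_id"),     # None when absent
            "lender_id": lender.get("lender_id"),
            "lender_country": lender.get("country"),  # None when absent
        })
    return rows


# Hypothetical record: no partner_id, and the second lender has no country.
loan = {
    "id": 101,
    "sector": "Food",
    "loan_amount": 500.0,
    "lenders": [{"lender_id": "a", "country": "US"}, {"lender_id": "b"}],
}
rows = flatten_transactions(loan)
```

Using `dict.get` keeps missing values explicit as `None` rather than dropping the row, so later analysis can decide how to treat them.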
Column dependency matrix. Country & lender loan count; participation & rate of default; significant dates & status; amounts & number of lenders. BayesDB identifies dependencies among the 40 attributes. BayesDB – SSCI, D3/Tangelo – Kitware. http://localhost:8080/xdata/ssci/
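The slide does not say how BayesDB scores dependencies, and the sketch below does not reproduce it. As a plainly swapped-in stand-in, it fills a column-by-column matrix with empirical mutual information between discrete columns, which is zero for independent columns and larger for dependent ones. The toy table and its column names are hypothetical.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Empirical mutual information (in nats) between two discrete columns."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    px, py = Counter(xs), Counter(ys)
    mi = 0.0
    for (x, y), count in joint.items():
        p_xy = count / n
        mi += p_xy * math.log(p_xy / ((px[x] / n) * (py[y] / n)))
    return mi


def dependency_matrix(table):
    """Map each pair of column names to their mutual information."""
    names = list(table)
    return {a: {b: mutual_information(table[a], table[b]) for b in names}
            for a in names}


# Hypothetical toy table: status follows country exactly; sector does not.
table = {
    "country": ["US", "US", "BE", "BE"],
    "status": ["paid", "paid", "defaulted", "defaulted"],
    "sector": ["Food", "Retail", "Food", "Retail"],
}
matrix = dependency_matrix(table)
```

Continuous attributes would need binning before this measure applies, which is one reason a model-based approach may be preferable on real data.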
Conditional predictive distributions. If we only know that a loan transaction is from an entity making a particular number of loans, what is the entity’s likely country of origin? For most loan counts, not surprisingly, the USA is most likely. However, if the loan count is very large, Belgium is even more likely than the USA as the entity’s country of origin. [Figure: probability of country of origin for entities making 10, 100, 10,000, and 100,000 loans; labeled countries: USA, Belgium, Norway; annotation: "Confidence begins to decrease".]
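The question above is a conditional distribution, P(country | loan count). The sketch below estimates it by simple counting within order-of-magnitude buckets. This is an assumption made for illustration and not the predictive model behind the slide; the observations and the placeholder countries "A" and "B" are made up.

```python
from collections import Counter, defaultdict


def loan_count_bucket(loan_count):
    """Order-of-magnitude bucket for a positive integer: 10-99 -> 1, 100-999 -> 2, ..."""
    return len(str(loan_count)) - 1


def conditional_distribution(pairs):
    """Estimate P(country | bucket) from (bucket, country) observations."""
    counts = defaultdict(Counter)
    for bucket, country in pairs:
        counts[bucket][country] += 1
    return {
        bucket: {country: k / sum(counter.values()) for country, k in counter.items()}
        for bucket, counter in counts.items()
    }


# Hypothetical (loan count, country) observations with placeholder countries.
observations = [(12, "A"), (15, "A"), (18, "B"),
                (20000, "B"), (15000, "B"), (30000, "A")]
pairs = [(loan_count_bucket(n), country) for n, country in observations]
distribution = conditional_distribution(pairs)
```

With few observations in a bucket the estimated probabilities become unreliable, which is a counting-based analogue of the decreasing confidence noted in the figure.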
Linked interaction. The linked attribute view reveals that Togo’s loans mainly concern food. Sorting by loan description length shows that loans from Togo have zero-length descriptions. Jigsaw – Kitware (Georgia Tech)
Word trees. Jigsaw – Kitware (Georgia Tech)
Community finding. MDA, JPL, ISI. The workflow (left) includes analytic and visualization steps built with state-of-the-art open source software; the result (right) highlights specific individuals and subcommunities of interest.
Visualization challenge: Geospatial correlation
Lender-borrower map. Vega is a grammar to aid non-programmers in developing visual applications. Vega - Kitware (Stanford)
Lender-borrower map. D3 + Google Maps has more flexibility but takes much more experience to develop. D3 - Kitware
3D globe. Visualization Toolkit - Kitware
Visualization challenge: Meaningful insight
Dynamic lender ranking. LineUp - Kitware (Harvard)
Following money flow. Interactive, analyst-driven transaction app for big data. Alternative to node-link diagrams. Influent - Oculus
What would encourage lender participation? Plot the number of lenders with at least one loan over time, then filter down to high-volume regions. Recommendation: advertise around International Women’s Day. Neon - NextCentury
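The quantity plotted here, the number of lenders with at least one loan in a period, can be sketched as a distinct count per month. This is an illustrative aggregation only, not how Neon computes it; the function name and the transactions are hypothetical.

```python
from collections import defaultdict
from datetime import date


def active_lenders_per_month(transactions):
    """Count distinct lenders with at least one loan in each (year, month)."""
    lenders_by_month = defaultdict(set)
    for lender_id, loan_date in transactions:
        lenders_by_month[(loan_date.year, loan_date.month)].add(lender_id)
    # Sort by (year, month) so the result is ready to plot as a time series.
    return {month: len(lenders)
            for month, lenders in sorted(lenders_by_month.items())}


# Hypothetical (lender id, loan date) transactions.
transactions = [
    ("a", date(2012, 3, 8)),
    ("a", date(2012, 3, 9)),   # same lender, same month: counted once
    ("b", date(2012, 3, 8)),
    ("a", date(2012, 4, 1)),
]
monthly = active_lenders_per_month(transactions)
```

Filtering `transactions` by region before calling the function would give the per-region series that the slide filters down to.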
Where are defaulted high-value loans? Plot high-value loans on a map, then look at details in a sorted table. Surprisingly, the top defaulted high-value loans are from the United States. Neon - NextCentury
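The sorted-table step can be sketched as a filter followed by a descending sort. The threshold, the field names, and the sample loans below are all hypothetical, and the placeholder countries "X" and "Y" are deliberately not real ones.

```python
def top_defaulted_high_value(loans, min_amount, limit=10):
    """Return defaulted loans of at least min_amount, largest first."""
    defaulted = [
        loan for loan in loans
        if loan["status"] == "defaulted" and loan["amount"] >= min_amount
    ]
    return sorted(defaulted, key=lambda loan: loan["amount"], reverse=True)[:limit]


# Hypothetical loans with placeholder countries.
loans = [
    {"id": 1, "country": "X", "amount": 9000.0, "status": "defaulted"},
    {"id": 2, "country": "Y", "amount": 12000.0, "status": "paid"},
    {"id": 3, "country": "Y", "amount": 10000.0, "status": "defaulted"},
    {"id": 4, "country": "X", "amount": 300.0, "status": "defaulted"},
]
top = top_defaulted_high_value(loans, min_amount=5000.0, limit=2)
```

What counts as "high value" is a choice for the analyst, so the threshold is a parameter and not a constant.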
Parsons New School Placeholder: talk about their feedback on Vega UI and discuss improvements made
Accomplishments Summary of accomplishments