Progress Report Presentation
Recovery.gov Visualization: Using Dual Treemap for Bi-Hierarchical Exploration
Rachel Schwartz, Puneet Sharma, Miguel Rios, Tak Yeon Lee
• Introduction
• Recovery Act
• Existing Visualization
• Task Analysis
• Newspaper Headlines
• Spotfire Tryout
• Concept
• Demo
• Schedule
About Recovery Act
• Signed into law on Feb. 17, 2009 by President Barack Obama
• Total: $787 billion
• 28 agencies
• Contracts, grants, and loans
• 130,362 recipient reports:
  • 13,080 reports on contracts
  • 116,675 on grants
  • 607 on loans
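What is a Recovery Act Report?
Agency Report (one per agency):
• Plan
• How the money is DISTRIBUTED (contract / grant / loan)
Recipient Report (prime recipients, sub recipients, vendors):
• Who received the money
• How the money is USED
• How many jobs are created
[Diagram: agencies award contracts, grants, and loans to prime recipients, who pass money on to sub recipients and vendors; the reports are collected on Recovery.gov]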
Existing Visualizations
• Recovery.gov
• Eye On the Stimulus
• Wall Street Journal
• msnbc.com
Existing Visualizations: Summary
• Geographical maps and tables are the most common views
• Browsing is the main activity supported
• Comparison is often considered but not fully supported
• Not suitable for analytic tasks
Examples of Analytic Tasks on Recovery Act
Investigate: journalists do find headlines in Recovery Act reports.
Examples of Analytic Tasks on Recovery Act
• How do journalists find headlines?
• State-wise and county-wise comparison is common
• Census data is useful for:
  • Finding states and counties in a similar context
  • Validating the fairness of funding (is the money given to the places that need it?)
• Two hierarchies, the agency hierarchy and the spatial hierarchy, are equally important
Bi-Hierarchical Structure
1. Agency Tree: Agency > Projects > Prime / Sub Recipients
* Suitable for tracking how the money is distributed
* Associated with industry and recipient information
2. Spatial Tree: State > County (with Census data)
* Intuitive
* Associated with Census data
[Diagram: the two trees side by side, with arrows labeled AGGREGATION and DISTRIBUTION]
How to support EXPLORATION within and across both hierarchies is the key point.
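The two trees can be read as two aggregations of the same flat award records. The sketch below is only an illustration of that idea, not the project's implementation: the field names (`agency`, `project`, `state`, `county`, `amount`), the project and county names, and the amounts are all made up.

```python
from collections import defaultdict

# Hypothetical flat award records; every value here is illustrative only.
records = [
    {"agency": "Department of Energy", "project": "Grid upgrade",
     "state": "Maryland", "county": "Howard", "amount": 300.0},
    {"agency": "Department of Energy", "project": "Grid upgrade",
     "state": "Tennessee", "county": "Knox", "amount": 200.0},
    {"agency": "Department of Energy", "project": "Weatherization",
     "state": "Maryland", "county": "Baltimore", "amount": 100.0},
    {"agency": "Department of Health and Human Services", "project": "Clinic funding",
     "state": "Maryland", "county": "Howard", "amount": 50.0},
]


def aggregate(rows, path):
    """Sum award amounts for every node along one hierarchy.

    `path` lists the record fields from the root level down, for example
    ("agency", "project") or ("state", "county"). Each node is keyed by the
    tuple of field values leading to it.
    """
    totals = defaultdict(float)
    for row in rows:
        for depth in range(1, len(path) + 1):
            key = tuple(row[field] for field in path[:depth])
            totals[key] += row["amount"]
    return dict(totals)


# The same records feed both trees; only the grouping path differs.
agency_tree = aggregate(records, ("agency", "project"))
spatial_tree = aggregate(records, ("state", "county"))
```

Because both trees are derived from one record list, a filter applied to the records changes both treemaps at once, which is what the synchronized design relies on.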
Spotfire
Purpose:
• Understanding the dataset
• Exploring the capability of current visualization techniques
• Modeling the task flow
• Finding our contribution
Why Spotfire?
• Spotfire has the most features for multivariate comparison tasks
US Census Bureau data:
• Population
• Population over 65
• Female percentage
• White people percentage
• Black people percentage
• Hispanic percentage
• Infant deaths
• High school graduate percentage
• Bachelor degree percentage
• Housing units
• Housing unit percent change
• Median household income
• People in poverty
• Labor force
• Unemployment rate
• Number of firms
• Women-owned firms percentage
[Diagram labels: Agency Report; Recipient Report; Contract, Grant, Loan]
Spotfire / findings. The most effective job creators are questionable.
• Treemap hierarchies: Agency – Recipient and State – County
• Size by award amount
• Colored by money per job (how much money was spent to create each job)
• Dataset filtered by money per job: $10~$2000
[Treemap callouts: $98.84 per job, $20.94 per job, $417.61 per job]
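The money-per-job measure and the $10~$2000 filter can be sketched as follows. This is a minimal illustration under assumptions: the field names `award_amount` and `jobs_created` are hypothetical, the sample rows are made up, and treating the range as inclusive is a guess, since the slide does not say.

```python
def money_per_job(award_amount, jobs_created):
    """Return dollars spent per created job, or None if no jobs were reported."""
    if jobs_created is None or jobs_created <= 0:
        return None
    return award_amount / jobs_created


def filter_by_money_per_job(rows, low=10.0, high=2000.0):
    """Keep rows whose money-per-job value lies in [low, high] (assumed inclusive)."""
    kept = []
    for row in rows:
        value = money_per_job(row["award_amount"], row["jobs_created"])
        if value is not None and low <= value <= high:
            kept.append({**row, "money_per_job": value})
    return kept


# Hypothetical sample rows, for illustration only.
sample_rows = [
    {"recipient": "A", "award_amount": 1000.0, "jobs_created": 10},   # 100 per job
    {"recipient": "B", "award_amount": 5.0, "jobs_created": 1},       # below the range
    {"recipient": "C", "award_amount": 1000000.0, "jobs_created": 2}, # above the range
    {"recipient": "D", "award_amount": 500.0, "jobs_created": 0},     # no jobs reported
]
filtered_rows = filter_by_money_per_job(sample_rows)
```

Rows with zero reported jobs are dropped rather than given an infinite cost, which is one reason very low or very high per-job figures deserve a second look before they become a headline.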
Spotfire / findings. Florida, the biggest senior town in the US, gets most money from the military.
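Critique of Spotfire
• Brushing is great, but the highlighted cells do not show how large the related portion is
[Figure: selecting LOS ANGELES highlights whole cells in the other view; the actual portion related to the selection is much smaller than the highlighted area]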
Critique of Spotfire
• The color scheme and the filter cannot disclose data inconsistency
• The filter can only filter out empty values
• The color scheme provides only a basic linear spectrum
• How can we highlight elements that have missing or invalid values?
Critique of Spotfire
• The basic color scheme works poorly with an exponential distribution
• The basic color scheme supports only a linear spectrum
• Only the two most extreme values are distinguished
Critique of Spotfire
• Using the filter takes much time and effort
• Trellis needs more flexibility
• Comparison between a state and a county is not easy
• (Future work) Sort-by-Attribute, Cluster-by-Feature, and more possibilities
Concept. Synchronized Dual Treemap for Exploring Bi-Hierarchical Data
• Contribution to Recovery Act accountability and transparency:
  • Providing an analytic tool for citizen watchdogs
  • Supporting the sense-making process for the dataset
• Contribution to treemap visualization and Spotfire:
  • Improving brushing with proportional highlight
  • Improving filter interaction
  • New features for Trellis
Task flow: Exploring Bi-Hierarchical Data
• OVERVIEW: understanding general trends on the dual treemap
• ZOOM IN & DYNAMIC FILTER: narrowing down by zoom-in filter; dynamic filter by project and regional attributes
• TAKE SNAPSHOTS: keep treemap snapshots for comparison
• DETAIL COMPARISON: find patterns and outliers among the treemaps in the shoebox
Basic UI Layout
[UI mockup: an agency treemap headed "[All Agency] | Project" (General Services Administration, Corps of Engineers, Department of Health and Human Services, Department of the Army, Dept. of Energy) next to a spatial treemap headed "[All State] | County" (Tennessee, New York, Washington, Texas, Colorado, Idaho, South Carolina, California); a Filter / Color Scheme panel; a Details panel; and a Shoebox holding snapshots such as D. ENERGY, Washington, Tennessee, D. Health …, Exec. President, NASA]
Narrowing Down by Zoom-In Filter
• When zooming into one treemap, the other treemap is redrawn with filtered data
• Zooming out removes the filter
[Diagram: the Agency Tree and the Spatial Tree draw on shared data]
• Start: Agency Tree shows AGENCY > PROJECT > PRIME/SUB RECIPIENT; Spatial Tree shows STATE > COUNTY >; shared data is the entire data
• ZOOM IN on Department of Energy: Agency Tree shows PROJECT > PRIME/SUB RECIPIENT; Spatial Tree (STATE > COUNTY >) is REDRAWN; shared data is filtered by [Department of Energy]
• ZOOM IN on Maryland: Spatial Tree shows COUNTY >; Agency Tree (PROJECT > PRIME/SUB RECIPIENT) is REDRAWN; shared data is filtered by [Department of Energy] & [Maryland]
• ZOOM OUT to All Agency: Agency Tree shows AGENCY > PROJECT > PRIME/SUB RECIPIENT; Spatial Tree (COUNTY >) is REDRAWN; shared data is filtered by [Maryland]
• ZOOM OUT to All State: Spatial Tree shows STATE > COUNTY >; Agency Tree is REDRAWN; shared data is the entire data
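The zoom sequence above amounts to a small piece of shared state: each treemap contributes at most one zoom filter, and both treemaps are redrawn from the records that pass every active filter. The class below is a hypothetical sketch of that state, not the actual tool; the class name, method names, and sample records are assumptions.

```python
class DualTreemapState:
    """Tracks the zoom level of both treemaps and derives the shared data."""

    def __init__(self, records):
        self.records = records
        self.agency = None  # None stands for "All Agency"
        self.state = None   # None stands for "All State"

    def zoom_in_agency(self, agency):
        self.agency = agency

    def zoom_out_agency(self):
        self.agency = None

    def zoom_in_state(self, state):
        self.state = state

    def zoom_out_state(self):
        self.state = None

    def shared_data(self):
        """Records passing every active zoom filter; both treemaps draw from these."""
        return [
            r for r in self.records
            if (self.agency is None or r["agency"] == self.agency)
            and (self.state is None or r["state"] == self.state)
        ]


# Hypothetical records, for illustration only.
view = DualTreemapState([
    {"agency": "Department of Energy", "state": "Maryland"},
    {"agency": "Department of Energy", "state": "Tennessee"},
    {"agency": "Department of Health and Human Services", "state": "Maryland"},
])
counts = [len(view.shared_data())]    # entire data
view.zoom_in_agency("Department of Energy")
counts.append(len(view.shared_data()))
view.zoom_in_state("Maryland")
counts.append(len(view.shared_data()))
view.zoom_out_agency()
counts.append(len(view.shared_data()))
view.zoom_out_state()
counts.append(len(view.shared_data()))
```

Keeping the two zoom filters independent is what lets the user zoom out of the agency tree while the Maryland filter stays in place, as in the fourth step of the sequence.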
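Brushing and Proportional Highlight
• Only the portion of each element that is related to the selection is highlighted
[Figure: selecting a cell in one treemap highlights the related portion of each cell in the other]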
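One way to compute a proportional highlight is to divide, for each cell, the award amount related to the selection by the cell's total amount. The sketch below assumes that cell size encodes award amount, as in the Spotfire tryout; the function name, field names, and sample records are hypothetical.

```python
from collections import defaultdict


def highlight_fractions(records, cell_field, selected_field, selected_value):
    """Return, per cell, the fraction of its amount related to the selection.

    `cell_field` names the attribute that defines cells in the treemap being
    highlighted; `selected_field` and `selected_value` describe the cell the
    user selected in the other treemap.
    """
    total = defaultdict(float)
    related = defaultdict(float)
    for rec in records:
        total[rec[cell_field]] += rec["amount"]
        if rec[selected_field] == selected_value:
            related[rec[cell_field]] += rec["amount"]
    return {
        cell: (related[cell] / amount if amount else 0.0)
        for cell, amount in total.items()
    }


# Hypothetical records, for illustration only.
sample = [
    {"agency": "Department of Energy", "state": "California", "amount": 300.0},
    {"agency": "Department of Energy", "state": "Texas", "amount": 100.0},
    {"agency": "Department of Health and Human Services", "state": "Texas", "amount": 50.0},
]
fractions = highlight_fractions(sample, "agency", "state", "California")
```

A renderer could then fill only that fraction of each cell's area, so a cell that is barely related to the selection no longer looks the same as one that is fully related.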
Keep Treemap Snapshots for Comparison
[UI mockup: the dual treemap from the basic layout, with the Shoebox below holding saved snapshots such as D. ENERGY, Washington, Tennessee, D. Health …, Exec. President, NASA]
Find Patterns and Outliers of Treemaps
• More advanced features are planned, but their implementation is not guaranteed:
  • Cluster by treemap features (size-color correlation, uniformity, diversity, …)
  • Sort by attributes (award amount, population, job creation, …)
  • Snapshot as a bookmark of settings
[UI mockup: Shoebox]
Extended Color Scheme
• Based on a set of predefined rules, treemap elements that have empty or invalid values are highlighted. (This overrides the standard color scheme based on other attributes.)
• Users are assumed to have no idea of any inconsistency pattern; otherwise a normal filter would be enough.
Rule 1. Zip code not found in the standard zip code table
Rule 2. Congressional District code not found in the state's CD code table
Rule 3. Agency code not found in the standard agency code table
[UI mockup: "Highlight Invalid Data" options for zip code, Congressional District code, and Agency code, alongside other filters]
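The three rules can be sketched as lookups against reference tables, with the highlight color overriding the standard color whenever any rule is violated. Everything below is an assumption for illustration: the field names, the function names, the highlight color, and the placeholder codes in the tables are made up and are not real zip, district, or agency codes.

```python
def violated_rules(record, zip_table, cd_table_by_state, agency_table):
    """Return the rules a record violates; missing values count as invalid."""
    violations = []
    if record.get("zip") not in zip_table:
        violations.append("Rule 1")
    state_cds = cd_table_by_state.get(record.get("state"), set())
    if record.get("cd") not in state_cds:
        violations.append("Rule 2")
    if record.get("agency_code") not in agency_table:
        violations.append("Rule 3")
    return violations


def cell_color(record, standard_color, tables, highlight_color="red"):
    """Override the standard color when any rule is violated."""
    return highlight_color if violated_rules(record, *tables) else standard_color


# Placeholder reference tables and records, for illustration only.
tables = ({"00001", "00002"}, {"XX": {"01", "02"}}, {"A1"})
good = {"zip": "00001", "state": "XX", "cd": "01", "agency_code": "A1"}
bad = {"zip": "99999", "state": "XX", "cd": "07", "agency_code": None}

good_color = cell_color(good, "blue", tables)
bad_color = cell_color(bad, "blue", tables)
```

Returning the list of violated rules, rather than a single flag, would also let the interface enable each rule separately, in line with the per-field options in the mockup.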
Extended Color Scheme
• A linear color scheme is not suitable for an exponential distribution.
• The Extended Color Scheme uses statistical percentiles to separate outliers from the main distribution.
[Figure: color scales with break points at 10%, 50%, 90%, and Max]
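A percentile-based scheme can be sketched by assigning each value to a color bin according to the percentile break points it falls between. The sketch below uses the 10%, 50%, and 90% break points from the figure; the nearest-rank percentile definition, the function names, and the sample values are assumptions.

```python
import bisect
import math


def percentile(sorted_values, pct):
    """Nearest-rank percentile of an ascending, non-empty list (pct in 0..100)."""
    rank = max(1, math.ceil(pct * len(sorted_values) / 100))
    return sorted_values[rank - 1]


def percentile_bins(values, cut_points=(10, 50, 90)):
    """Assign each value a bin index based on percentile thresholds.

    Bin 0 holds values at or below the first threshold, and the last bin
    holds values above the final threshold (the top outliers).
    """
    ordered = sorted(values)
    thresholds = [percentile(ordered, p) for p in cut_points]
    return [bisect.bisect_left(thresholds, v) for v in values]


# Hypothetical exponentially growing values, for illustration only.
amounts = [1, 2, 4, 8, 16, 32, 64, 128, 256, 512]
bins = percentile_bins(amounts)
```

With a linear scale over the same values, most cells would share the lowest color because a few large values stretch the range; percentile bins spread the colors across the main distribution and reserve the extreme bins for outliers.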