A TEDMED Data Reveal: Big and Little
Dr. Brand Niemann, Director and Senior Data Scientist, Semantic Community http://semanticommunity.info/
AOL Government Blogger http://gov.aol.com/bloggers/brand-niemann/
April 22, 2013
http://semanticommunity.info/A_TEDMED_Data_Reveal
Background
• I did a story about TEDMED 2012 for the 2012 Health Datapalooza III and was invited to TEDMED 2013 as a journalist!
• Session 2, “How Can Big Data Become Real Wisdom?”, and Session 6, “Going Farther while Staying Closer”, were the most interesting and motivating to me. See next slide.
• I heard about Big and Little Data and saw an opportunity to help TEDMED with a taxonomy that is a semantic index to a knowledge base for improved search, and with examples of big and little data science.
• The best data source for my work was Professor Christopher Murray’s (IHME/GBD) presentation and demonstration on “What does a $100 million public health data revolution look like?” IHME/GBD is funded by the Bill and Melinda Gates Foundation to prioritize global health research and help.
• It made me think of Monica Rogati’s tweet at Strata 2012: More data beats clever algorithms, but better data beats more data.
• I tweeted: @TEDMED @Storify Yes, and working on IHME/GBD (Global Burden of Disease) Visualizations like: http://semanticommunity.info/Census_Data_Visualization
• But I wanted to volunteer to help TEDMED 2013 and 2014 as a data scientist/data journalist and saw on their Web site: “If you are a talented designer and/or illustrator with experience in bringing presentations to life, you could help with our speaker presentation materials.”
• I attended the first Great Challenges Day, participated in the Inventing Wellness Programs Breakout Session, and learned the importance of scientists storifying with “and, but, and therefore”.
• Therefore my story is a TEDMED Data Reveal: Big (IHME/GBD) and Little (TEDMED Web Site) with “and, but, and therefore.”
My TEDMED 2013 Highlights
• SESSION 2: How Can Big Data Become Real Wisdom?
  • Jay Walker: Introduction.
    • We need a macro-scope to gather, network, store, and access data and to go from data to wisdom by finding patterns in the data.
  • Larry Smarr: Can you coordinate the dance of your body's 100 trillion microorganisms?
    • The quantified-self movement with medical detail in real time, presented by an astrophysicist turned computer scientist.
• SESSION 6: Going Farther while Staying Closer
  • Christopher Murray: What does a $100 million public health data revolution look like?
    • Talk and live demo of the Global Burden of Disease Treemap, Map, Time Plot, Age Plot, and Stacked Bar Chart by Age and Sex.
See: http://blog.tedmed.com/
TEDMED 2013 My Note: I decided to make this a Searchable Knowledge Base. http://www.tedmed.com/
TEDMED Knowledge Base Google Chrome: Find http://semanticommunity.info/A_TEDMED_Data_Reveal
TEDMED Speakers My Note: I decided to make this a little data set for faceted search. http://www.tedmed.com/speakers
TEDMED Speakers Spreadsheet My Note: The facets are Year, Keywords, and Tags. http://semanticommunity.info/@api/deki/files/23881/TEDMED.xlsx
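As a rough illustration of what faceted search over the Year, Keywords, and Tags facets could look like, here is a minimal Python sketch. The rows, field names, and the functions `facet_filter` and `facet_counts` are all hypothetical stand-ins; they are not taken from TEDMED.xlsx or from the Spotfire workbook.

```python
from collections import Counter

# Hypothetical rows standing in for the speakers spreadsheet. The names and
# values are placeholders, not data from TEDMED.xlsx.
SPEAKERS = [
    {"name": "Speaker A", "year": 2012, "keywords": {"data", "genomics"}, "tags": {"research"}},
    {"name": "Speaker B", "year": 2013, "keywords": {"data", "public health"}, "tags": {"policy"}},
    {"name": "Speaker C", "year": 2013, "keywords": {"design"}, "tags": {"research"}},
]


def facet_filter(rows, year=None, keyword=None, tag=None):
    """Keep the rows that match every facet value that was supplied."""
    return [
        r for r in rows
        if (year is None or r["year"] == year)
        and (keyword is None or keyword in r["keywords"])
        and (tag is None or tag in r["tags"])
    ]


def facet_counts(rows, facet):
    """Count how many rows carry each value of one facet."""
    counts = Counter()
    for r in rows:
        value = r[facet]
        # Set-valued facets (keywords, tags) contribute one count per member.
        counts.update(value if isinstance(value, set) else [value])
    return counts


if __name__ == "__main__":
    print([r["name"] for r in facet_filter(SPEAKERS, year=2013, keyword="data")])
    print(facet_counts(SPEAKERS, "year"))
```

Selecting a value in one facet narrows the rows, and the counts for the remaining facets can then be recomputed on the narrowed list.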
Institutions hosting TEDMEDLive 2013 My Note: I decided to make this a little data set for mapping, but it was difficult to get the geo-referenced data set. http://www.tedmed.com/event/tedmedlive?ref=participating
TEDMEDLive 2013 Institutions Spreadsheet My Note: Simple Geo-referencing of Institutions. http://semanticommunity.info/@api/deki/files/23881/TEDMED.xlsx
TEDMED 2013: Spotfire https://silverspotfire.tibco.com/us/library#/users/bniemann/Public?TEDMED2013-Spotfire
TEDMED 2009-2012: Spotfire https://silverspotfire.tibco.com/us/library#/users/bniemann/Public?TEDMED2013-Spotfire
Institute for Health Metrics and Evaluation (IHME) My Note: There are three Web sites. My Note: I heard this talk and decided to work with this big data. http://www.healthmetricsandevaluation.org/
Press Release
• The Global Burden of Disease (GBD) is a first-of-its-kind study of health around the world.
• The GBD findings present a new way to look at health, allowing countries to track progress against diseases ranging from malaria to cancer to diabetes, identify risks including smoking and poor diet, see how people in 187 countries are faring in terms of health and gauge emerging health challenges. The GBD is a collaboration of nearly 500 researchers in 50 countries, and is led by IHME, part of the University of Washington.
• Some of the countries included in the GBD, such as the UK and Indonesia, already have started to produce their own policy recommendations as a result of the study. Australia and China are also planning to produce studies that use GBD to drill down and develop local-level health data.
• IHME is working with three localities in the US to produce GBD-type data at the community level as well.
• Efforts are underway to provide continuous updates to the GBD and expand the range of health issues included in the study.
• The GBD measures health issues around the world through more than 1 billion pieces of data that can also be explored through interactive visualization tools online.
http://semanticommunity.info/@api/deki/files/23885/TEDMED-media-advisory-Chris-Murray.docx
GHDx Catalog of Demographic and Health Data by IHME My Note: Download Data http://ghdx.healthmetricsandevaluation.org/
Global Burden of Disease Study 2010 Data Downloads My Note: I downloaded 17 files totaling 1.13 GB. Two Codebook files were damaged and I repaired them. http://ghdx.healthmetricsandevaluation.org/global-burden-disease-study-2010-gbd-2010-data-downloads
GBD Compare My Note: Treemap and Map. http://viz.healthmetricsandevaluation.org/gbd-compare/
GBD Cause Patterns My Note: Stacked Bar Chart. http://www.healthmetricsandevaluation.org/gbd/visualizations/gbd-cause-patterns
GBD Cause Patterns: Reports http://www.healthmetricsandevaluation.org/gbd/visualizations/gbd-cause-patterns#/publications-presentations/reports
IHME-GBD Causes of Death: Spotfire https://silverspotfire.tibco.com/us/library#/users/bniemann/Public?TEDMED2013-Spotfire
IHME-GBD Life Expectancy: Spotfire https://silverspotfire.tibco.com/us/library#/users/bniemann/Public?TEDMED2013-Spotfire
IHME-GBD Mortality: Spotfire https://silverspotfire.tibco.com/us/library#/users/bniemann/Public?TEDMED2013-Spotfire
IHME-GBD Risk Factors: Spotfire https://silverspotfire.tibco.com/us/library#/users/bniemann/Public?TEDMED2013-Spotfire
IHME-GBD Breast and Cervical Cancer: Spotfire. Panels: Navigation and Metadata; Bar Chart; World Map; Data Set. My Note: Data Visualizations are Linked. https://silverspotfire.tibco.com/us/library#/users/bniemann/Public?TEDMED2013-Spotfire
Data Ecosystem: Spotfire https://silverspotfire.tibco.com/us/library#/users/bniemann/Public?TEDMED2013-Spotfire
IHME-GBD Life Expectancy by Country: Spotfire. Panels: Filters; Navigation and Metadata; Code Book; Life Expectancy by Region; Life Expectancy (LE) Versus Health Adjusted Life Expectancy (HALE); Details-on-Demand; Data Set. My Note: The Visualizations Are Linked to One Another. My Note: 19 files totaling 1.13 GB of data in a Spotfire file of only 0.5 GB! https://silverspotfire.tibco.com/us/library#/users/bniemann/Public?IHME-GBD-Spotfire https://silverspotfire.tibco.com/us/library#/users/bniemann/Public?IHME-GBD1-Spotfire
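One way to read an LE versus HALE comparison is through the gap between the two, which approximates the expected years lived in less than full health. The sketch below assumes a hypothetical table with `le` and `hale` columns. The figures are placeholders for illustration only and are not GBD 2010 values.

```python
# Placeholder figures for illustration only; they are not GBD 2010 values.
COUNTRIES = [
    {"country": "Country A", "le": 80.0, "hale": 70.0},
    {"country": "Country B", "le": 65.0, "hale": 57.5},
]


def years_in_reduced_health(row):
    """Gap between life expectancy and health-adjusted life expectancy."""
    return row["le"] - row["hale"]


def rank_by_gap(rows):
    """Order rows from the largest LE-HALE gap to the smallest."""
    return sorted(rows, key=years_in_reduced_health, reverse=True)


if __name__ == "__main__":
    for row in rank_by_gap(COUNTRIES):
        print(row["country"], years_in_reduced_health(row))
```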
Conclusions and Recommendations
• My story is a TEDMED Data Reveal: Big (IHME/GBD) and Little (TEDMED) with “and, but, and therefore.”
• I have done as Jay Walker suggested: We need a macro-scope to gather, network, store, and access data and to go from data to wisdom by finding patterns in the data.
• But to do that, TEDMED needs a taxonomy that is a semantic index to a knowledge base for improved search, and help with examples of big and little data science.
• I found the best big data source for my work was Professor Christopher Murray’s IHME/GBD, funded by the Bill and Melinda Gates Foundation to prioritize global health research and help.
• But I found I could improve access to the IHME/GBD data and simplify its visualizations.
• Therefore, I did both of the above and volunteered to help TEDMED 2013 and 2014 as a data scientist/data journalist.
Data Visualizations http://www.healthmetricsandevaluation.org/tools/data-visualizations?page=3
GBD Data Visualizations Spreadsheet My Note: See All 13 Tabs. http://semanticommunity.info/@api/deki/files/23881/TEDMED.xlsx
GBD Data Visualizations Inventory My Note: Downloaded 36 files totaling 19 MB and selected a few for visualization.
Diabetes Prevalence by County (US) Maps My Note: I used this in my 2013 Health Datapalooza IV Submission and the Sanofi US 2013 Data Design Diabetes Innovation Challenge – Prove It! http://www.healthmetricsandevaluation.org/tools/data-visualization/diabetes-prevalence-county-us-maps#/overview/explore
Research Articles My Note: Research Article. http://www.healthmetricsandevaluation.org/tools/data-visualization/diabetes-prevalence-county-us-maps#/publications-presentations/publications
Research Articles My Note: Included this in the Knowledge Base. http://www.pophealthmetrics.com/content/8/1/26
Datasets My Note: Downloaded this dataset. http://www.healthmetricsandevaluation.org/publications/summaries/novel-framework-validating-and-applying-standardized-small-area-measurement-s#/data-methods
Diabetes prevalence rates by age, sex, and county, 2008 (21KB* xls) *My Note: Actual size is 556KB. My Note: Needed to be separated into county and state. http://www.healthmetricsandevaluation.org/sites/default/files/datasets/diabetes_prevalence_by_county_rank_age_and_sex_2008_US_IHME_1010.xls http://ghdx.healthmetricsandevaluation.org/sites/ghdx/files/record-attached-files/IHME_USA_DIABETES_BY_COUNTY_2008.xls
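Separating a mixed sheet into county and state tables can be sketched as below. This assumes a hypothetical layout in which state-level summary rows leave the `county` field empty. The real IHME workbook may mark state rows differently, and the sample values are placeholders, not figures from the file.

```python
# Hypothetical layout: state-level summary rows have an empty "county" field.
# The actual IHME file may differ; values below are placeholders.
ROWS = [
    {"state": "State X", "county": "", "prevalence": 9.0},
    {"state": "State X", "county": "County 1", "prevalence": 8.5},
    {"state": "State X", "county": "County 2", "prevalence": 9.5},
    {"state": "State Y", "county": "  ", "prevalence": 7.0},
    {"state": "State Y", "county": "County 3", "prevalence": 7.2},
]


def split_state_county(rows):
    """Return (state_rows, county_rows) based on whether 'county' is blank."""
    states = [r for r in rows if not r["county"].strip()]
    counties = [r for r in rows if r["county"].strip()]
    return states, counties


if __name__ == "__main__":
    states, counties = split_state_county(ROWS)
    print(len(states), "state rows;", len(counties), "county rows")
```

Keeping the two levels in separate tables avoids double counting when county rows are aggregated or mapped.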
Metadata My Note: Another Excel file name, but same file. http://ghdx.healthmetricsandevaluation.org/record/united-states-diabetes-prevalence-county-2008
IHME Diabetes County 2009: Spotfire. Panels: Navigation and Metadata; Higher Female Than Male Diabetes Prevalence Map; Top 10 Counties With High Prevalence of Diabetes; Data Set. https://silverspotfire.tibco.com/us/library#/users/bniemann/Public?Diabetes-Spotfire
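The two derived views named on this slide (counties where female prevalence exceeds male, and a top-N ranking) amount to a filter and a sort. Here is a minimal sketch, assuming hypothetical `male` and `female` prevalence columns. The rows are placeholders and do not come from the IHME dataset or the Spotfire workbook.

```python
# Placeholder county rows; not values from the IHME diabetes dataset.
COUNTIES = [
    {"county": "County 1", "state": "State X", "male": 8.0, "female": 9.5},
    {"county": "County 2", "state": "State X", "male": 10.0, "female": 9.0},
    {"county": "County 3", "state": "State Y", "male": 7.0, "female": 7.5},
]


def higher_female(rows):
    """Counties where female prevalence is strictly above male prevalence."""
    return [r for r in rows if r["female"] > r["male"]]


def top_n(rows, key, n=10):
    """The n rows with the highest value in the given column."""
    return sorted(rows, key=lambda r: r[key], reverse=True)[:n]


if __name__ == "__main__":
    print([r["county"] for r in higher_female(COUNTIES)])
    print([r["county"] for r in top_n(COUNTIES, "female", n=2)])
```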
IHME-GBD Mortality by Country: Spotfire https://silverspotfire.tibco.com/us/library#/users/bniemann/Public?IHME-GBD1-Spotfire
IHME-GBD Disability Factors by Health State: Spotfire https://silverspotfire.tibco.com/us/library#/users/bniemann/Public?IHME-GBD1-Spotfire
IHME-GBD Risk Factors by Region: Spotfire https://silverspotfire.tibco.com/us/library#/users/bniemann/Public?IHME-GBD-Spotfire
IHME-GBD Cause of Death by Region: Spotfire https://silverspotfire.tibco.com/us/library#/users/bniemann/Public?IHME-GBD1-Spotfire