
A TEDMED Data Reveal: Big and Little

A TEDMED Data Reveal: Big and Little. Dr. Brand Niemann Director and Senior Data Scientist Semantic Community http://semanticommunity.info/ AOL Government Blogger http://gov.aol.com/bloggers/brand-niemann/ April 22, 2013 http://semanticommunity.info/A_TEDMED_Data_Reveal. Background.



Presentation Transcript


  1. A TEDMED Data Reveal: Big and Little. Dr. Brand Niemann, Director and Senior Data Scientist, Semantic Community http://semanticommunity.info/ AOL Government Blogger http://gov.aol.com/bloggers/brand-niemann/ April 22, 2013 http://semanticommunity.info/A_TEDMED_Data_Reveal

  2. Background
     • I did a story about TEDMED 2012 for the 2012 Health Datapalooza III and was invited to go to TEDMED 2013 as a Journalist!
     • Session 2: “How Can Big Data Become Real Wisdom?” and Session 6: “Going Farther while Staying Closer” were the most interesting and motivating to me. See next slide.
     • I heard about Big and Little Data and saw an opportunity to help TEDMED with a taxonomy that is a semantic index to a knowledge base for improved search, and to help TEDMED with examples of big and little data science.
     • And the best data source for my work was Professor Christopher Murray’s (IHME/GBD) presentation and demonstration on “What does a $100 million public health data revolution look like?”, funded by the Bill and Melinda Gates Foundation to prioritize global health research and help.
     • It made me think of Monica Rogati’s Tweet @ Strata 2012: More data beats clever algorithms but better data beats more data.
     • I Tweeted: @TEDMED @Storify Yes, and working on IHME/GDB (Global Burden of Disease) Visualizations like: http://semanticommunity.info/Census_Data_Visualization
     • But I wanted to volunteer to help TEDMED 2013 and 2014 as a data scientist/data journalist and saw on their Web site: If you are a talented designer and/or illustrator with experience in bringing presentations to life, you could help with our speaker presentation materials.
     • I attended the First Great Challenges Day, participated in the Inventing Wellness Programs Breakout Session, and learned the importance of scientists storifying with “and, but, and therefore”.
     • Therefore my story is a TEDMED Data Reveal: Big (IHME/GBD) and Little (TEDMED Web Site) with “and, but, and therefore.”

  3. My TEDMED 2013 Highlights
     • SESSION 2: How Can Big Data Become Real Wisdom?
       • Jay Walker: Introduction.
         • Need a macro-scope to gather, network, store, and access data and to go from data to wisdom by finding patterns in the data.
       • Larry Smarr: Can you coordinate the dance of your body's 100 trillion microorganisms?
         • How to quantify self movement with medical detail in real time by an astrophysicist turned computer scientist.
     • SESSION 6: Going Farther while Staying Closer
       • Christopher Murray: What does a $100 million public health data revolution look like?
         • Talk and live demo of Global Burden of Disease Treemap, Map, Time Plot, Age Plot, and Stacked Bar Chart by Age and Sex.
     See: http://blog.tedmed.com/

  4. TEDMED 2013 My Note: I decided to make this a Searchable Knowledge Base. http://www.tedmed.com/

  5. TEDMED Knowledge Base Google Chrome: Find http://semanticommunity.info/A_TEDMED_Data_Reveal

  6. TEDMED Speakers My Note: I decided to make this a little data set for faceted search. http://www.tedmed.com/speakers

  7. TEDMED Speakers Spreadsheet My Note: The facets are Year, Keywords, and Tags. http://semanticommunity.info/@api/deki/files/23881/TEDMED.xlsx
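
A faceted search over a small speakers table of this kind can be sketched as follows. This is only an illustration, assuming each row carries a year plus lists of keywords and tags; the rows, field names, and the helpers `facet_counts` and `filter_rows` are hypothetical and are not taken from TEDMED.xlsx.

```python
from collections import Counter

# Hypothetical rows standing in for the speakers spreadsheet.
# Names and values are invented for illustration only.
SPEAKERS = [
    {"name": "Speaker A", "year": 2012, "keywords": ["genomics"], "tags": ["big data"]},
    {"name": "Speaker B", "year": 2013, "keywords": ["public health", "visualization"], "tags": ["big data"]},
    {"name": "Speaker C", "year": 2013, "keywords": ["microbiome"], "tags": ["quantified self"]},
]


def facet_counts(rows, facet):
    """Count how many rows carry each value of the given facet."""
    counts = Counter()
    for row in rows:
        value = row[facet]
        # A facet may hold one value (year) or several (keywords, tags).
        counts.update(value if isinstance(value, list) else [value])
    return counts


def filter_rows(rows, **selected):
    """Keep only the rows that match every selected facet value."""
    def matches(row, facet, wanted):
        value = row[facet]
        return wanted in value if isinstance(value, list) else value == wanted

    return [r for r in rows if all(matches(r, f, w) for f, w in selected.items())]


if __name__ == "__main__":
    print(facet_counts(SPEAKERS, "year"))
    print([r["name"] for r in filter_rows(SPEAKERS, year=2013, tags="big data")])
```

Counting values per facet gives the numbers shown beside each filter, and combining selections with a logical AND narrows the result set the way a faceted interface does.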

  8. Institutions hosting TEDMEDLive 2013 My Note: I decided to make this a little data set for mapping, but it was difficult to get the geo-referenced data set. http://www.tedmed.com/event/tedmedlive?ref=participating

  9. TEDMEDLive 2013 Institutions Spreadsheet My Note: Simple Geo-referencing of Institutions. http://semanticommunity.info/@api/deki/files/23881/TEDMED.xlsx

  10. TEDMED 2013: Spotfire https://silverspotfire.tibco.com/us/library#/users/bniemann/Public?TEDMED2013-Spotfire

  11. TEDMED 2009-2012: Spotfire https://silverspotfire.tibco.com/us/library#/users/bniemann/Public?TEDMED2013-Spotfire

  12. Institute for Health Metrics and Evaluation (IHME) My Note: There are three Web sites. My Note: I heard this talk and decided to work with this big data. http://www.healthmetricsandevaluation.org/

  13. Press Release
      • The Global Burden of Disease (GBD) is a first-of-its-kind study of health around the world.
      • The GBD findings present a new way to look at health, allowing countries to track progress against diseases ranging from malaria to cancer to diabetes, identify risks including smoking and poor diet, see how people in 187 countries are faring in terms of health and gauge emerging health challenges. The GBD is a collaboration of nearly 500 researchers in 50 countries, and is led by IHME, part of the University of Washington.
      • Some of the countries included in the GBD, such as the UK and Indonesia, already have started to produce their own policy recommendations as a result of the study. Australia and China are also planning to produce studies that use GBD to drill down and develop local-level health data.
      • IHME is working with three localities in the US to produce GBD-type data at the community level as well.
      • Efforts are underway to provide continuous updates to the GBD and expand the range of health issues included in the study.
      • The GBD measures health issues around the world through more than 1 billion pieces of data that can also be explored through interactive visualization tools online.
      http://semanticommunity.info/@api/deki/files/23885/TEDMED-media-advisory-Chris-Murray.docx

  14. GHDx Catalog of Demographic and Health Data by IHME My Note: Download Data http://ghdx.healthmetricsandevaluation.org/

  15. Global Burden of Disease Study 2010 Data Downloads My Note: I downloaded 17 files totaling 1.13 GB. Two Codebook files were damaged and I repaired them. http://ghdx.healthmetricsandevaluation.org/global-burden-disease-study-2010-gbd-2010-data-downloads
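
A download inventory like the one noted above (a file count and a total size) can be checked with a few lines of standard-library Python. This is a minimal sketch: the helper names `inventory` and `to_gb` are hypothetical, and the folder path is whatever location the files were saved to.

```python
from pathlib import Path


def inventory(folder):
    """Return (file_count, total_bytes) for all regular files under folder."""
    sizes = [p.stat().st_size for p in Path(folder).rglob("*") if p.is_file()]
    return len(sizes), sum(sizes)


def to_gb(num_bytes):
    """Convert bytes to gigabytes, taking 1 GB as 1024**3 bytes."""
    return num_bytes / 1024**3


if __name__ == "__main__":
    # "." is a placeholder; point this at the download folder.
    count, total = inventory(".")
    print(f"{count} files totaling {to_gb(total):.2f} GB")
```

Comparing the measured size of each file with the size advertised on the download page is also a quick way to spot truncated or damaged downloads.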

  16. GBD Compare My Note: Treemap and Map. http://viz.healthmetricsandevaluation.org/gbd-compare/

  17. GBD Cause Patterns My Note: Stacked Bar Chart. http://www.healthmetricsandevaluation.org/gbd/visualizations/gbd-cause-patterns

  18. GBD Cause Patterns: Reports http://www.healthmetricsandevaluation.org/gbd/visualizations/gbd-cause-patterns#/publications-presentations/reports

  19. IHME-GBD Causes of Death: Spotfire https://silverspotfire.tibco.com/us/library#/users/bniemann/Public?TEDMED2013-Spotfire

  20. IHME-GBD Life Expectancy: Spotfire https://silverspotfire.tibco.com/us/library#/users/bniemann/Public?TEDMED2013-Spotfire

  21. IHME-GBD Mortality: Spotfire https://silverspotfire.tibco.com/us/library#/users/bniemann/Public?TEDMED2013-Spotfire

  22. IHME-GBD Risk Factors: Spotfire https://silverspotfire.tibco.com/us/library#/users/bniemann/Public?TEDMED2013-Spotfire

  23. IHME-GBD Breast and Cervical Cancer: Spotfire. Panels: Navigation and Metadata; Bar Chart; World Map; Data Set. My Note: Data Visualizations are Linked. https://silverspotfire.tibco.com/us/library#/users/bniemann/Public?TEDMED2013-Spotfire

  24. Data Ecosystem: Spotfire https://silverspotfire.tibco.com/us/library#/users/bniemann/Public?TEDMED2013-Spotfire

  25. IHME-GBD Life Expectancy by Country: Spotfire. Panels: Filters; Navigation and Metadata; Code Book; Life Expectancy by Region; Life Expectancy (LE) Versus Health Adjusted Life Expectancy (HALE); Details-on-Demand; Data Set. My Note: The Visualizations Are Linked to One Another. My Note: 19 files totaling 1.13 GB of data in a Spotfire file of only 0.5 GB! https://silverspotfire.tibco.com/us/library#/users/bniemann/Public?IHME-GBD-Spotfire https://silverspotfire.tibco.com/us/library#/users/bniemann/Public?IHME-GBD1-Spotfire

  26. Conclusions and Recommendations
      • My story is a TEDMED Data Reveal: Big (IHME/GBD) and Little (TEDMED) with “and, but, and therefore.”
      • I have done as Jay Walker suggested: We need a macro-scope to gather, network, store, and access data and to go from data to wisdom by finding patterns in the data.
      • But to do that, TEDMED needs a taxonomy that is a semantic index to a knowledge base for improved search, and help with examples of big and little data science.
      • I found the best big data source for my work was Professor Christopher Murray’s IHME/GBD, funded by the Bill and Melinda Gates Foundation to prioritize global health research and help.
      • But I found I could improve access to the IHME/GBD data and simplify its visualizations.
      • Therefore, I did both of the above and volunteered to help TEDMED 2013 and 2014 as a data scientist/data journalist.

  27. Data Visualizations http://www.healthmetricsandevaluation.org/tools/data-visualizations?page=3

  28. GBD Data Visualizations Spreadsheet My Note: See All 13 Tabs. http://semanticommunity.info/@api/deki/files/23881/TEDMED.xlsx

  29. GBD Data Visualizations Inventory My Note: Downloaded 36 files totaling 19 MB and selected a few for visualizations.

  30. Diabetes Prevalence by County (US) Maps My Note: I used this in my 2013 Health Datapalooza IV Submission and the Sanofi US 2013 Data Design Diabetes Innovation Challenge – Prove It! http://www.healthmetricsandevaluation.org/tools/data-visualization/diabetes-prevalence-county-us-maps#/overview/explore

  31. Research Articles My Note: Research Article. http://www.healthmetricsandevaluation.org/tools/data-visualization/diabetes-prevalence-county-us-maps#/publications-presentations/publications

  32. Research Articles My Note: Included this in the Knowledge Base. http://www.pophealthmetrics.com/content/8/1/26

  33. Datasets My Note: Downloaded this dataset. http://www.healthmetricsandevaluation.org/publications/summaries/novel-framework-validating-and-applying-standardized-small-area-measurement-s#/data-methods

  34. Diabetes prevalence rates by age, sex, and county, 2008 (21KB* xls) *My Note: Actual size is 556KB. My Note: Needed to be separated into county and state. http://www.healthmetricsandevaluation.org/sites/default/files/datasets/diabetes_prevalence_by_county_rank_age_and_sex_2008_US_IHME_1010.xls http://ghdx.healthmetricsandevaluation.org/sites/ghdx/files/record-attached-files/IHME_USA_DIABETES_BY_COUNTY_2008.xls
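
Separating state rows from county rows can be sketched as below. This assumes each row carries a FIPS code, where state codes have at most two digits and county codes have four or five (five when the leading zero is kept); the actual layout of the IHME file may differ. The field names, the helper `split_state_county`, and the prevalence values are hypothetical.

```python
# Hypothetical rows; the prevalence numbers are invented placeholders,
# not values from the IHME spreadsheet.
ROWS = [
    {"fips": "01", "name": "Alabama", "prevalence": 11.0},
    {"fips": "01001", "name": "Autauga County", "prevalence": 12.5},
    {"fips": 1003, "name": "Baldwin County", "prevalence": 10.2},  # leading zero lost
]


def split_state_county(rows):
    """Split rows into (states, counties) by the length of the FIPS code."""
    states, counties = [], []
    for row in rows:
        fips = str(row["fips"]).strip()
        # State FIPS codes have 1-2 digits; county codes have 4-5.
        (states if len(fips) <= 2 else counties).append(row)
    return states, counties


if __name__ == "__main__":
    states, counties = split_state_county(ROWS)
    print(len(states), "state rows;", len(counties), "county rows")
```

Converting the code to a string first keeps the split working even when a spreadsheet has stored FIPS codes as numbers and dropped the leading zero.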

  35. Metadata My Note: Another Excel file name, but same file. http://ghdx.healthmetricsandevaluation.org/record/united-states-diabetes-prevalence-county-2008

  36. IHME Diabetes County 2009: Spotfire. Panels: Navigation and Metadata; Higher Female Than Male Diabetes Prevalence Map; Top 10 Counties With High Prevalence of Diabetes; Data Set. https://silverspotfire.tibco.com/us/library#/users/bniemann/Public?Diabetes-Spotfire
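
The two derived views named on this slide, the top 10 counties by prevalence and the counties where female prevalence exceeds male prevalence, reduce to a sort and a filter. The sketch below is an illustration outside Spotfire; the field names, the helpers `top_n` and `female_higher`, and all numbers are hypothetical.

```python
# Hypothetical county records; all numbers are invented placeholders.
COUNTIES = [
    {"county": "County A", "male": 9.0, "female": 10.5, "total": 9.8},
    {"county": "County B", "male": 12.0, "female": 11.0, "total": 11.5},
    {"county": "County C", "male": 7.5, "female": 8.0, "total": 7.7},
]


def top_n(rows, key, n=10):
    """Return the n rows with the largest value for key, highest first."""
    return sorted(rows, key=lambda r: r[key], reverse=True)[:n]


def female_higher(rows):
    """Return the rows where female prevalence exceeds male prevalence."""
    return [r for r in rows if r["female"] > r["male"]]


if __name__ == "__main__":
    print([r["county"] for r in top_n(COUNTIES, "total", n=2)])
    print([r["county"] for r in female_higher(COUNTIES)])
```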

  37. IHME-GBD Mortality by Country: Spotfire https://silverspotfire.tibco.com/us/library#/users/bniemann/Public?IHME-GBD1-Spotfire

  38. IHME-GBD Disability Factors by Health State: Spotfire https://silverspotfire.tibco.com/us/library#/users/bniemann/Public?IHME-GBD1-Spotfire

  39. IHME-GBD Risk Factors by Region: Spotfire https://silverspotfire.tibco.com/us/library#/users/bniemann/Public?IHME-GBD-Spotfire

  40. IHME-GBD Cause of Death by Region: Spotfire https://silverspotfire.tibco.com/us/library#/users/bniemann/Public?IHME-GBD1-Spotfire
