
Challenges facing data-enabled interdisciplinary training


balley

Presentation Transcript


  1. Challenges facing data-enabled interdisciplinary training

  2. What is DESE? • If your science and engineering is not data enabled… • …you’re not doing it right.

  3. http://drewconway.com/zia/2013/3/26/the-data-science-venn-diagram

  4. Big Data in Agriculture (Today) • Syngenta Challenge: What seed varieties to plant? • Consider expected weather conditions, knowledge about the soil at their farms, and performance studies of candidate soybean varieties from numerous sources.

  5. Tomorrow • Problem becomes Gene (60K) X Environment (?) X Phenotype (thousands) • G X E = P • Visualize results so a farmer can understand them: actionable intelligence

  6. VELOCITY • Up to thousands of respondents at one time • With each choice, back end must update embedding and deliver new query without noticeable delay for user • Serious data-handling and infrastructure design challenge Try it! http://nextml.org/chemistry
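The per-response loop on the VELOCITY slide (update the embedding, then serve a new query) can be illustrated with a minimal sketch. The slide does not say how the back end actually does this, so everything below is an assumption for illustration: responses are treated as triplets ("a is more like b than c"), the update is a single stochastic-gradient step on a hinge loss, and the next query is picked at random. The names `update_embedding`, `triplet_loss`, and `next_query` are hypothetical and are not taken from any real system.

```python
import random


def sq_dist(u, v):
    """Squared Euclidean distance between two points given as lists."""
    return sum((x - y) ** 2 for x, y in zip(u, v))


def triplet_loss(emb, a, b, c, margin=1.0):
    """Hinge loss for the response 'a is more like b than c' (assumed model)."""
    return max(0.0, margin + sq_dist(emb[a], emb[b]) - sq_dist(emb[a], emb[c]))


def update_embedding(emb, a, b, c, lr=0.1, margin=1.0):
    """Apply one gradient step for a single response; return the loss before the step."""
    loss = triplet_loss(emb, a, b, c, margin)
    if loss == 0.0:
        return loss  # response already satisfied with margin; nothing to update
    xa, xb, xc = emb[a], emb[b], emb[c]
    # Gradients of margin + |xa - xb|^2 - |xa - xc|^2 with respect to each point.
    grad_a = [2 * (ci - bi) for bi, ci in zip(xb, xc)]
    grad_b = [2 * (bi - ai) for ai, bi in zip(xa, xb)]
    grad_c = [2 * (ai - ci) for ai, ci in zip(xa, xc)]
    emb[a] = [x - lr * g for x, g in zip(xa, grad_a)]
    emb[b] = [x - lr * g for x, g in zip(xb, grad_b)]
    emb[c] = [x - lr * g for x, g in zip(xc, grad_c)]
    return loss


def next_query(emb, rng):
    """Pick the next triplet to show; random choice is a placeholder policy."""
    return tuple(rng.sample(sorted(emb), 3))


if __name__ == "__main__":
    rng = random.Random(0)
    items = ["item%d" % i for i in range(5)]  # hypothetical item names
    emb = {name: [rng.random(), rng.random()] for name in items}
    a, b, c = next_query(emb, rng)
    update_embedding(emb, a, b, c)  # the respondent said: a is more like b than c
    print(next_query(emb, rng))
```

Each response touches only three points, so the work per response is constant in the number of items. The harder part that the slide points to, serving thousands of simultaneous respondents against shared state, is not addressed by this sketch.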

  7. Variety

  8. Leveling the playing field • Everyone comes in with different skills and tool sets. • How do we get each discipline “up to speed” on critical data-science skills… • …without requiring extensive additional coursework / time to degree?

  9. One-way street problem • Students in computer science and engineering have data-science skills that apply broadly… • …but “apply skillset A to dataset B” != cross-disciplinary science. • What will engage interest from both computational and applied sides to promote true interactions?

  10. Tower of Babel 1 • Data science tools and standards vary considerably across disciplines and even across labs… • …yet for students to interact, a common set of tools is required. • How do such standards get set, and what should they be?

  11. Tower of Babel 2 • Each discipline has its own jargon, which can be efficient within a discipline but a barrier across disciplines. • Talks are hard to follow when (a) you can’t understand the terms and when (b) you have to stop to explain every third word. • How do we promote shared language for data science?

  12. Data science infrastructure • Means of collecting, sharing, and documenting data are proliferating. • Especially with big data, issues arise: • Privacy, data sharing, large data sets, documentation of data, etc. • What are the right tools and infrastructure for managing data storage, documentation, access, etc.? • Open science? Amazon? Wiki? GitHub? Slack? WordPress?

  13. Plan • Small group breakout 1 (15 min): Elaborate and rank order list of challenges • What are we missing? Add any additional challenges to Google Doc • What is most pressing? Rank order listed challenges (last 2 min) • Report back (15 min): Which challenge was your table’s top ranked and why? • Small group breakout 2 (15 min): Top n challenges assigned to tables; regroup at a table that interests you and discuss solutions • Note solutions on Google Doc • Report back: What are your table’s solutions?

  14. These are questions for you! • Teaching basic data science to students who are not in quantitative areas. • What basic skills should scientists have to at least get started? • How should these skills be taught? • How do we promote true interdisciplinary collaboration, rather than partitioning tasks by discipline? • How do we balance the utility of jargon versus its alienating effects? How do we best promote good communication from data science to discipline? • How do we manage variety and promote standards in software use and development? • How do we build infrastructure for big data sharing, security, and documentation?
