
(I have no title)

Joe Hourclé, ESSI Workshop, 2010-08-02


Presentation Transcript


  1. (I have no title) • Joe Hourclé • ESSI Workshop 2010-08-02

  2. Note: When reading this presentation at home, view with the ‘Notes’ window visible; I have talking points and other comments in there.

  3. About Me • Programmer for the Virtual Solar Observatory (VSO) • Sysadmin & DBA for the Solar Data Analysis Center (SDAC) • Likes to complain about things • Has been working for the last 18+ months on integrating SDO data into the VSO.

  4. So, the problem ... • Scientists either don’t know about informatics issues or don’t care about them • We need to work with the scientists to educate them on how to make their work (data, systems, catalogs, etc.) useful to as wide an audience as possible • We need to stop having every data system designed from the ground up

  5. Ignored Issues in e-Science: Collaboration, Provenance and the Ethics of Data

  8. Provenance can't be a bolt-on. It must be part of the data system from the beginning of the mission. Otherwise, people can cast doubt on the data to refute research they don't like. • Uncertainties in some data are not straightforward to include in data files. Software should be seen as an alternative source of uncertainty information • It is impossible to tell in detail exactly how the data was produced: what assumptions were made, what artifacts were introduced, what the absolute accuracy is. • In sensor networks – need annotation of when sensors are swapped out or other discontinuities.
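One possible way to picture the sensor-network point above is a small Python sketch that stores discontinuities next to the readings and tags each reading with the segment it falls in. This is only an illustration under stated assumptions: the `Discontinuity` record, the `segment_index` helper, and all dates, notes, and values are hypothetical and are not taken from the talk or from any real data system.

```python
from bisect import bisect_right
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Discontinuity:
    """Hypothetical record of a known break in a sensor's time series."""
    when: datetime
    note: str


def segment_index(timestamp, discontinuities):
    """Count the recorded discontinuities at or before `timestamp`.

    Assumes `discontinuities` is sorted by time. Readings that share a
    segment index were taken under the same recorded conditions.
    """
    times = [d.when for d in discontinuities]
    return bisect_right(times, timestamp)


# Made-up events and readings, for illustration only.
events = [
    Discontinuity(datetime(2010, 3, 1), "sensor head replaced"),
    Discontinuity(datetime(2010, 6, 15), "recalibrated after power outage"),
]
readings = [
    (datetime(2010, 2, 10), 4.1),
    (datetime(2010, 4, 2), 4.7),
    (datetime(2010, 7, 1), 4.6),
]

# Attach the segment number to every reading so a later user can see
# which values straddle a sensor swap.
tagged = [(t, v, segment_index(t, events)) for t, v in readings]
```

Keeping the annotations as data, rather than in a README, means an analysis can refuse to difference or average values across a segment boundary without a human having to remember the swap.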

  9. How you describe / document time series data is fundamentally different from images & spectra – Collections are hard to define when there isn't a synoptic campaign. • Software engineering point of view for data:

  10. Software Engineering Point of View for Data

  11. Need ways to measure how interoperable systems are; types of interop and levels of compliance. • IRL: Interoperability Readiness Levels. Join the NASA Tech Infusion Working Group. • IPY is working on a cookbook.

  12. Create reward systems for scientists that reward re-usability. (see Townhall Thurs evening) • Different users have different requirements – do you cater to the general user or all specific cases? Quick search vs. advanced search. • How do we determine the value of data? Increase in data value if we can reduce uncertainty or increase interop with other data. • Scale of software – when do you need to bring in a programmer, or a whole team to make it a full project?

  13. (suggestion) YourBadData.org – name and shame the problem data sets. • Need automatization methos [sic] to process Nexrad data products by extracting only certain grids from a time data series of files, by geographic coordinate and/or location transformation files to readible formats. txt, shp, ... • Author identities – using pseudonyms to publish fringe work (blogs) ... might later want to merge identities, or might try to disassociate them when trying to get a new job.
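The Nexrad request above (pull only certain grid cells out of a time series of files, selected by geographic coordinates, and write them to a readable format) might look roughly like the Python sketch below. It assumes each file has already been decoded into a 2-D list of values with matching latitude and longitude axes; it does not read real Nexrad products. The names `subset_grid` and `to_text` and all coordinates and values are hypothetical.

```python
def subset_grid(grid, lats, lons, lat_range, lon_range):
    """Keep only the cells whose coordinates fall inside a bounding box.

    `grid[i][j]` is assumed to be the value at (lats[i], lons[j]).
    """
    rows = [i for i, lat in enumerate(lats)
            if lat_range[0] <= lat <= lat_range[1]]
    cols = [j for j, lon in enumerate(lons)
            if lon_range[0] <= lon <= lon_range[1]]
    return [[grid[i][j] for j in cols] for i in rows]


def to_text(grid):
    """Render a grid as tab-separated text, one row per line."""
    return "\n".join("\t".join(str(v) for v in row) for row in grid)


# Made-up axes and a two-step time series of already-decoded grids.
lats = [38.0, 39.0, 40.0]
lons = [-78.0, -77.0, -76.0]
series = [
    [[1, 2, 3], [4, 5, 6], [7, 8, 9]],
    [[10, 20, 30], [40, 50, 60], [70, 80, 90]],
]

# Apply the same bounding box to every time step.
subsets = [
    subset_grid(g, lats, lons, (38.5, 40.0), (-77.5, -76.0))
    for g in series
]
print(to_text(subsets[0]))
```

Decoding the original files and writing formats such as shapefiles would need format-specific libraries, which this sketch deliberately leaves out.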

  14. Conclusion • We need to raise the informatics issues in ways that the scientists care about • They care about error bars; how can we improve their error tracking? • We need simple guidelines / best practices for good data systems • We need data & system specialists as stakeholders on new data system projects

  15. joseph.a.hourcle@nasa.gov
