
The Rise of Informatics as-a Research Domain

The Rise of Informatics as-a Research Domain. WIRADA Science Symposium, August 2, 2011, Melbourne. Peter Fox (RPI and WHOI) pfox@cs.rpi.edu Tetherless World Constellation. What’s ahead (today). Do you need motivation? If so - Data Science and Informatics An example


Presentation Transcript


  1. The Rise of Informatics as-a Research Domain WIRADA Science Symposium August 2, 2011, Melbourne Peter Fox (RPI and WHOI) pfox@cs.rpi.edu Tetherless World Constellation

  2. What’s ahead (today)
     • Do you need motivation?
     • If so: Data Science and Informatics
     • An example
     • Rising = maturity = repeating it – from technology to methodology
     • Use cases, information models and more …
     • Research topics
     • Where is informatics rising to?

  3. Working premise
     Scientists (actually ANYONE) should be able to access a global, distributed knowledge base of scientific data that:
     • appears to be integrated
     • appears to be locally available
     Data: volume, complexity, mode, scale, heterogeneity, …

  4. Mind the Gap!
     • There is (or was) still a gap between science and the underlying infrastructure and technology that is available
     • Informatics: information science includes the science of (data and) information, the practice of information processing, and the engineering of information systems. Informatics studies the structure, behavior, and interactions of natural and artificial systems that store, process and communicate (data and) information. It also develops its own conceptual and theoretical foundations. Since computers, individuals and organizations all process information, informatics has computational, cognitive and social aspects, including study of the social impact of information technologies. (Wikipedia)
     • Cyberinfrastructure is the new research environment(s) that support advanced data acquisition, data storage, data management, data integration, data mining, data visualization and other computing and information processing services over the Internet.

  5. Application integration! • Smart faceted search Biological and chemical oceanography

  6. Modern informatics enables a new scale-free** framework approach
     • Use cases
     • Stakeholders
     • Distributed authority
     • Access control
     • Ontologies
     • Maintaining Identity

  7. Huh? Scale free? Citation networks, the Web, semantic networks

  8. Use Case • … is a collection of possible sequences of interactions between the system under discussion and its actors, relating to a particular goal.

  9. Real use cases: marine habitat change
     Several disciplines: biology, geology, chemistry, oceanography
     Several applications: science, fishing, habitat change, climate and environmental change, data integration
     Complex inter-relations, questions
     Use case: What is the temperature and salinity of the water, and are these marine specimens usual or part of an ecosystem change?
     Image annotations: Rock; Scallop, shell fragment; Scallop, number, density; Flora or fauna?; What is this?; Scallop, size, shape, color, place; Dirt/mud (one person’s noise is another person’s signal)
     Src: WHOI and the HabCam group

  10. Information Modeling • Conceptual • Logical • Physical

  11. Socio-technical system(s) • Refers to the joint social and technical aspects of ‘systems’ • Sociological – people and groups of people • Technical – more than technology but the two are often conflated – of organization and process

  12. Informatics efforts: ‘These members assume well defined roles and status relationships within the context of the virtual group that may be independent of their role and status in the organization employing them’ (Ahuja et al., 1998).
      Diagram labels: Technology; Organizational Structure; Communication Patterns

  13. Research domain
      • Pulling apart the data/information/knowledge ecosystem
      • Capturing and representing knowledge
      • Closed world/open world
      • Standards – a socio-technical system
      • What, why, how – knowledge provenance ecosystem (yes, another one)
      • Working with multiple information models

  14. Data-Information-Knowledge Ecosystem
      Diagram labels: Producers; Consumers; Experience; Data; Information; Knowledge; Creation; Gathering; Presentation; Organization; Integration; Conversation; Context

  15. Producers / Consumers; Quality Control / Quality Assessment; Fitness for Purpose / Fitness for Use; Trustor / Trustee; Others… / Others…

  16. Working with knowledge Expressivity Implementability Maintainability/ Extensibility

  17. Unit of exchange – the triple - example (linked data) Closed World Open World Heath (2009)
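To make the triple and the two world assumptions on this slide concrete, here is a minimal Python sketch. It is an illustration only and not code from the talk: the `ex:` identifiers, the `Triple` type, and both `ask_*` functions are hypothetical names chosen for the example. It also simplifies the open-world case to "stated" versus "unknown", since plain triples carry no explicit negation.

```python
from typing import NamedTuple, Optional, Set


class Triple(NamedTuple):
    """A subject-predicate-object statement, the unit of exchange in linked data."""
    subject: str
    predicate: str
    obj: str


# Hypothetical facts for illustration; the ex: names are not from the talk.
FACTS: Set[Triple] = {
    Triple("ex:scallop42", "rdf:type", "ex:Scallop"),
    Triple("ex:scallop42", "ex:foundAt", "ex:siteA"),
}


def ask_closed_world(facts: Set[Triple], triple: Triple) -> bool:
    """Closed world: a statement that is not recorded is taken to be false."""
    return triple in facts


def ask_open_world(facts: Set[Triple], triple: Triple) -> Optional[bool]:
    """Open world: a statement that is not recorded is unknown (None), not false."""
    return True if triple in facts else None


if __name__ == "__main__":
    unstated = Triple("ex:scallop42", "ex:foundAt", "ex:siteB")
    print(ask_closed_world(FACTS, unstated))
    print(ask_open_world(FACTS, unstated))
```

The same question about an unstated triple gets a definite "no" under the closed-world reading and "unknown" under the open-world reading, which is the distinction that matters when integrating data from sources that are each incomplete.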

  18. Working with knowledge Query Inference Rule execution
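The three operations named on this slide can be sketched over a small set of triples. This is a hedged illustration, not the speaker's tooling: the facts, the `ex:` class names, and the functions `query`, `apply_subclass_rule`, and `infer` are all assumptions made for the example. The single rule used is type propagation along `rdfs:subClassOf`.

```python
from typing import Optional, Set, Tuple

TripleT = Tuple[str, str, str]

# Hypothetical facts for illustration; the ex: names are not from the talk.
FACTS: Set[TripleT] = {
    ("ex:scallop42", "rdf:type", "ex:Scallop"),
    ("ex:Scallop", "rdfs:subClassOf", "ex:Bivalve"),
    ("ex:Bivalve", "rdfs:subClassOf", "ex:Mollusc"),
}


def query(facts: Set[TripleT], s: Optional[str] = None,
          p: Optional[str] = None, o: Optional[str] = None) -> Set[TripleT]:
    """Query: return the triples matching a pattern, where None is a wildcard."""
    return {
        t for t in facts
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    }


def apply_subclass_rule(facts: Set[TripleT]) -> Set[TripleT]:
    """Rule execution: one pass of (x type C), (C subClassOf D) => (x type D)."""
    derived = set()
    for x, p1, c in facts:
        if p1 != "rdf:type":
            continue
        for c2, p2, d in facts:
            if p2 == "rdfs:subClassOf" and c2 == c:
                derived.add((x, "rdf:type", d))
    # Return only the statements that were not already present.
    return derived - facts


def infer(facts: Set[TripleT]) -> Set[TripleT]:
    """Inference: repeat the rule until no new triples appear (a fixed point)."""
    closed = set(facts)
    while True:
        new = apply_subclass_rule(closed)
        if not new:
            return closed
        closed |= new


if __name__ == "__main__":
    for triple in sorted(query(infer(FACTS), s="ex:scallop42", p="rdf:type")):
        print(triple)
```

Querying the raw facts returns only the stated type, while querying after inference also returns the types implied by the class hierarchy.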

  19. Expressivity/ Implementation Declarative Procedural Linked open data URI/http/RDF * Ontology encoded

  20. Standards - technical Data Systems Credit: B. Rouse (BEVO) 2008

  21. The social side User Group Credit: B. Rouse (BEVO) 2008

  22. What is the ecosystem?
      • Many elements, and they are scattered
      • But these are what enable scientists to explore/confirm/deny their research
      Elements: Accountability; Identity; Explanation; Justification; Verifiability; Proof; Trust; ‘Provenance’; ‘Transparency’ -> Translucency

  23. Provenance
      • Origin or source from which something comes, intention for use, who/what it was generated for, manner of manufacture, history of subsequent owners, sense of place and time of manufacture, production or discovery, documented in detail sufficient to allow reproducibility; or who, what, where, why, when…
      • Knowledge provenance: enrich with ontologies and ontology-aware tools
      • Provenance presentation is a challenge

  24. Provenance Distance Computation
      • Based on provenance “distance”, we tell users how different data products are.
      • Issues:
      • Computing the similarity of two provenance traces is non-trivial
      • Factors in provenance carry varied weight in how comparable the results of processing are
      • Factors in provenance are interdependent in how they affect the final results of processing
      • Need to characterize similarity of external (vs. internal) provenance
      • The number of dimensions/factors that affect comparability is quickly overwhelming
      • Not all of these dimensions are independent; most of them are correlated with each other
      • Numerical studies comparing datasets can be used, when available, and where applicable to the analysis
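One simple way to picture a provenance "distance" with weighted factors is a weighted mismatch score. The sketch below is an assumption-laden illustration and not the method used in the talk: the factor names, the weights, and the function `provenance_distance` are hypothetical, and the score treats factors as independent, which the slide explicitly warns is not true in practice.

```python
from typing import Dict


def provenance_distance(a: Dict[str, str], b: Dict[str, str],
                        weights: Dict[str, float]) -> float:
    """Weighted fraction of provenance factors on which two products differ.

    Returns 0.0 when every weighted factor matches and 1.0 when none do.
    A factor missing from both records counts as a match.
    Interdependence between factors is deliberately ignored in this sketch.
    """
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    differing = sum(
        weight for factor, weight in weights.items()
        if a.get(factor) != b.get(factor)
    )
    return differing / total


# Hypothetical factors and weights, chosen only for illustration.
WEIGHTS = {"instrument": 2.0, "algorithm_version": 1.0, "spatial_resolution": 1.0}
PRODUCT_A = {"instrument": "sensor-A", "algorithm_version": "v1", "spatial_resolution": "1km"}
PRODUCT_B = {"instrument": "sensor-B", "algorithm_version": "v1", "spatial_resolution": "1km"}

if __name__ == "__main__":
    print(provenance_distance(PRODUCT_A, PRODUCT_B, WEIGHTS))
```

Because the two hypothetical products differ only in the most heavily weighted factor, half of the total weight is in disagreement. Handling correlated factors would need a richer model than a per-factor sum.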

  25. Quality, Uncertainty, Bias
      • Quality
        • Is in the eyes of the beholder: worst case scenario… or a good challenge
      • Uncertainty
        • Has aspects of accuracy (how accurately the real-world situation is assessed; it also includes bias) and precision (down to how many digits)
      • Bias has at least two aspects:
        • Systematic error resulting in the distortion of measurement data caused by prejudice or faulty measurement technique
        • A vested interest, or strongly held paradigm or condition, that may skew the results of sampling, measuring, or reporting the findings of a quality assessment:
          • Psychological: for example, when data providers audit their own data, they usually have a bias to overstate its quality.
          • Sampling: sampling procedures that result in a sample that is not truly representative of the population sampled. (Larry English)
      • Semantics – all about meaning in context (see diagram!)
      • Provenance = enabler but knowledge provenance = transformative
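The difference between systematic error and scatter can be shown with a few numbers. This is a minimal sketch under stated assumptions, not part of the talk: the readings and reference value are made up, `bias_and_spread` is a hypothetical helper, and "precision" is read here as the spread of repeated measurements, which is one common interpretation and goes beyond the slide's "how many digits".

```python
import statistics
from typing import Sequence, Tuple


def bias_and_spread(measurements: Sequence[float],
                    reference: float) -> Tuple[float, float]:
    """Return (bias, spread) for repeated measurements of a known reference.

    bias   = mean of the measurements minus the reference (systematic error)
    spread = sample standard deviation of the measurements (scatter)
    """
    if len(measurements) < 2:
        raise ValueError("need at least two measurements to estimate spread")
    bias = statistics.mean(measurements) - reference
    spread = statistics.stdev(measurements)
    return bias, spread


# Made-up readings of a quantity whose true value is assumed to be 10.0.
READINGS = [10.2, 10.4, 10.3]

if __name__ == "__main__":
    bias, spread = bias_and_spread(READINGS, reference=10.0)
    print(f"bias={bias:.3f} spread={spread:.3f}")
```

These readings agree closely with each other but sit consistently above the reference, so a small spread on its own says nothing about whether the measurement is accurate.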

  26. Information models

  27. Integrating, mediating… • At the conceptual level and under an open world assumption Conceptual modeling ontology (McCusker et al. 2011) -> bridging properties to SKOS, IAO, ..

  28. Where to?
      • Balancing research and application
      • Increase emphasis and presence in educational organizations
      • Confront the differences in incentives and inhibitions in different disciplines
      • Further develop peer communities and organizations
      • Journal impact factors have to go up
      • Explore the shift into open-world semantics and data frameworks

  29. Thanks… Questions? • @taswegian • pfox@cs.rpi.edu • http://tw.rpi.edu
