CRMsci: the Scientific Observation Model
Martin Doerr, Chryssoula Bekiari, Athina Kritsotaki, Gerald Hiebel, Maria Theodoridou
Center for Cultural Informatics, Institute of Computer Science, Foundation for Research and Technology - Hellas
CIDOC 2014, Dresden, September 9th, 2014
Current situation
• EU infrastructure projects aim to publish linked open data about scientific observations in geology, biology, archaeological excavations, digital productions and medicine
• Existing standards for scientific observation:
  • INSPIRE: earth-science oriented, promoted by the EU
  • OBOE: life-science oriented, supports semantic annotation
  • SEEK: ecology-oriented framework
  • Darwin Core: a general-use metadata scheme for biodiversity
• Focus on: the semantic annotation process of data sets
Epistemological Considerations
• Theories are formalized sets of concepts that organize observations, predict and explain phenomena, and demand a solid empirical base of evidence
• Raw data provided by the data sets per se are of little use
• Scientific observation forms the basis for understanding the phenomena being studied; it is a process by which we advance our understanding of the world
• Common to all sciences is the workflow of forming a hypothesis to perform and explain observations, gathering data, and drawing conclusions that confirm or deny the original hypothesis
• The sciences differ in what is considered data, and in how data is gathered and processed
• The cultural discourse includes information from all sorts of sciences and products of sciences, i.e. digital productions, biological samples, specimens of physical objects (materials, fluids etc.)
• Scientific data and metadata can be considered as historical records
Common Workflow
• Form a hypothesis to perform an observation (select parameters, properties, signals and the way of converting these to data)
• Perform the observations (these are concerned only with objects or events that are observable, either directly or indirectly)
• Explain the observations made and the gathering of data
• Draw conclusions based on this data (make a scientific hypothesis: a tentative explanation of the observations made)
• Deduce the implications (test them through further observation, compare the results)
• Confirm, deny or re-evaluate the original hypothesis
• Formulate valid theories (allow others to repeat the observations)
Limitations
Problems with the existing standards:
• They model observation in isolation from the actions that precede or follow an observation event
• They leave out information that would allow later assessment of the quality and precision of the results, or re-evaluation of existing measured data in the light of new evidence; with suitable raw data, such re-evaluation would not require redoing the measurement itself
• Even though data are published in repositories using the above standards, they typically lack the information required for effective long-term preservation and interpretation
The CRMsci: overview (1)
The CRMsci:
• has been developed bottom-up from specific metadata examples such as water sampling in aquifer systems, earthquake shock recordings, landslides, excavation processes, species occurrence and detection of new species, tissue sampling in cancer research, and 3D digitization
• takes into account relevant standards, such as INSPIRE, OBOE, Darwin Core, national archaeological standards for excavation, digital provenance models and others
• describes, together with the CIDOC CRM, a discipline-neutral level of genericity, which can be used as a general ontology of human activity, things and events happening in spacetime
• uses the same encoding-neutral formalism of knowledge representation as the CIDOC CRM, and can be implemented in RDFS, OWL, on an RDBMS and in other forms of encoding
• reuses parts of the CIDOC CRM wherever appropriate; all constructs used from ISO 21127, together with their definitions following version 5.1.2 as maintained by CIDOC, are considered part of this model
The CRMsci: overview (2)
Metadata about:
• The human observer
• The object of observation (a "thing", "something", a process or a state?)
• The observation hypothesis (choice of parameters)
• The identity of the object, if any
• The environment, time and location
• The condition of the thing
• The instrumentation and method used
• The identity, authenticity and transmission of the produced records
• The inference making
Events and Activities
[Class hierarchy diagram. Classes shown:]
• CIDOC CRM: E5 Event, E7 Activity, E11 Modification, E12 Production, E13 Attribute Assignment, E16/S21 Measurement, E63 Beginning of Existence, E80 Part Removal
• CRMsci: S1 Matter Removal, S2 Sample Taking, S3 Measurement by Sampling, S4 Observation, S5 Inference Making, S6 Data Evaluation, S7 Simulation-Prediction, S8 Categorical Hypothesis Building, S17 Physical Genesis, S18 Alteration, S19 Encounter Event
Observable Entity
S15 Observable Entity comprises items (E77) or phenomena (E2) that can be observed, such as physical things, their behavior, states and interactions or events, either directly by human sensory impression or enhanced with tools and measurement devices. Inspired by OBOE.
[Class hierarchy diagram. Classes shown:]
• CIDOC CRM: E1 CRM Entity, E2 Temporal Entity, E3 Condition State, E5 Event, E18 Physical Thing, E25 Man-Made Feature, E27 Site, E53 Place, E55 Type, E70 Thing, E77 Persistent Item
• CRMsci: S9 Property Type, S10 Material Substantial, S11 Amount of Matter, S12 Amount of Fluid, S13 Sample, S14 Fluid Body, S15 Observable Entity, S16 State, S20 / E26 Physical Feature, S22 Segment of Matter
Matter Removing and Sampling
[Class and property diagram.]
• Classes: S15 Observable Entity, E2 Temporal Entity, E3 Condition State, E7 Activity, E18 Physical Thing, E53 Place, E55 Type, E57 Material, E70 Thing, E77 Persistent Item, S1 Matter Removal, S2 Sample Taking, S10 Material Substantial, S11 Amount of Matter, S13 Sample, S14 Fluid Body
• CRMsci properties: O1 diminished, O2 removed, O3 sampled from, O4 sampled at, O5 removed, O7 contains or confines, O15 occupied, O20 sampled from type of part
• CIDOC CRM properties: P44 has condition, P45 consists of, P46 is composed of, P156 occupies
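A minimal sketch of how a sample-taking event might be recorded with the properties named on this slide. It assumes that O3 sampled from, O5 removed and O4 sampled at all start at the S2 Sample Taking event and point to the sampled thing, the sample and the place respectively; the slide text alone does not confirm these directions. The `SampleTaking` class and every `ex:` identifier are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class SampleTaking:
    """Hypothetical record of one S2 Sample Taking event."""
    event_id: str
    sampled_from: str  # assumed target of O3 sampled from
    removed: str       # assumed target of O5 removed (the sample)
    sampled_at: str    # assumed target of O4 sampled at (a place)

    def to_triples(self):
        # Emit plain (subject, property, object) tuples.
        return [
            (self.event_id, "rdf:type", "S2 Sample Taking"),
            (self.event_id, "O3 sampled from", self.sampled_from),
            (self.event_id, "O5 removed", self.removed),
            (self.event_id, "O4 sampled at", self.sampled_at),
        ]


# Invented identifiers, loosely following the water-sampling use case.
event = SampleTaking(
    event_id="ex:sampling-1",
    sampled_from="ex:aquifer-A",
    removed="ex:water-sample-1",
    sampled_at="ex:borehole-7",
)

for triple in event.to_triples():
    print(triple)
```

Plain tuples keep the sketch independent of any particular RDF library; the same statements could be serialized to RDFS or stored in a relational table.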
Monitoring observation activities
[Class and property diagram.]
• Classes: E5 Event, E7 Activity, E13 Attribute Assignment, E16 Measurement, E18 Physical Thing, E54 Dimension, E55 Type, E70 Thing, S4 Observation, S5 Inference Making, S6 Data Evaluation, S9 Property Type, S10 Material Substantial, S15 Observable Entity, S19 Encounter Event
• CRMsci properties: O10 observed, O11 observedProperty, O14 assigned dimension, O16 described, O17 has dimension, O32 has found object
• CIDOC CRM properties: P2 has type, P39 measured, P40 observed dimension
S19 Encounter Event
Inspired by Darwin Core
[Instance diagram of one encounter event.]
• Nodes:
  • S19 Encounter Event: urn:catalog:IOL:POLY:Sphaerosyllis-levantina-ALA-IL-7-Oct.2009
  • E18 Physical Thing: Sphaero-levantina-003
  • E21 Person: Sarah Faulwetter
  • E53 Place: Israel
  • E53 Place: Haifa Bay Ecosystem Station 1
  • E55 Type, Ecosystem Type: sandy - muddy sediments
  • E52 Timespan: 7 October 2009
  • Equipment Type: WA265/SS214
  • Equipment Type: Van Veen Grab
• Properties: O32 has found object, O7 contains or confines (is contained or confined), P14 carried out by, O21 has found at (witnessed), P2 has type, P4 has timespan, P125 used object of type, P127 has broader term
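The example can be sketched as a small set of triples. This is an illustration only: the node labels and property names come from the slide, but which property connects which pair of nodes is an assumption inferred from the flattened diagram, and the helpers `objects` and `where_found` are hypothetical.

```python
# Identifier of the encounter event, as given on the slide.
EVENT = "urn:catalog:IOL:POLY:Sphaerosyllis-levantina-ALA-IL-7-Oct.2009"
STATION = "Haifa Bay Ecosystem Station 1"

# (subject, property, object) triples; edge directions are assumptions.
triples = [
    (EVENT, "rdf:type", "S19 Encounter Event"),
    (EVENT, "O32 has found object", "Sphaero-levantina-003"),
    (EVENT, "P14 carried out by", "Sarah Faulwetter"),
    (EVENT, "O21 has found at", STATION),
    (EVENT, "P4 has timespan", "7 October 2009"),
    (EVENT, "P125 used object of type", "WA265/SS214"),
    ("WA265/SS214", "P127 has broader term", "Van Veen Grab"),
    ("Israel", "O7 contains or confines", STATION),
    (STATION, "P2 has type", "sandy - muddy sediments"),
]


def objects(subject, prop):
    """Return all objects linked to `subject` through `prop`."""
    return [o for s, p, o in triples if s == subject and p == prop]


def where_found(event):
    """Return the find place followed by any place recorded as containing it."""
    places = objects(event, "O21 has found at")
    containers = [
        s for s, p, o in triples
        if p == "O7 contains or confines" and o in places
    ]
    return places + containers


print(objects(EVENT, "O32 has found object"))
print(where_found(EVENT))
```

Following O7 contains or confines from the station up to Israel shows how a query for finds in a country could reach an event that only records the sampling station.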
S5 Inference Making
• S5 Inference Making: comprises the action of making propositions and statements about particular states of affairs in reality or in possible realities, or categorical descriptions of reality, by using inferences from other statements based on hypotheses and any form of formal or informal logic
• S6 Data Evaluation: concluding propositions on a respective reality from observational data by making evaluations based on mathematical inference rules and calculations using established hypotheses
• S8 Categorical Hypothesis Building: assumptions developed by "induction" from finite numbers of observations of particular things; based on inference rules and theory
• S7 Simulation-Prediction: executing algorithms or software for simulating reality or not, by using mathematical models
[Class and property diagram.]
• Classes: E1 CRM Entity, E7 Activity, E13 Attribute Assignment, E29 Design or Procedure, E54 Dimension, E70 Thing, S5 Inference Making, S6 Data Evaluation, S7 Simulation-Prediction, S8 Categorical Hypothesis Building, S15 Observable Entity
• Properties: P15 was influenced by (influenced), P16 used specific object (was used for), P17 was motivated by (motivated), P33 used specific technique (was used by), O10 assigned dimension (dimension was assigned by), O11 described (was described by)
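A minimal sketch of how the specializations of S5 Inference Making could be checked programmatically. The subclass table is an assumption read off the slide layout (S6, S7 and S8 under S5, S5 under E13 Attribute Assignment, E13 under E7 Activity) and is not a normative listing; `SUBCLASS_OF`, `superclasses` and `is_a` are hypothetical names.

```python
# Assumed direct-superclass table for the classes on this slide.
SUBCLASS_OF = {
    "S6 Data Evaluation": ["S5 Inference Making"],
    "S7 Simulation-Prediction": ["S5 Inference Making"],
    "S8 Categorical Hypothesis Building": ["S5 Inference Making"],
    "S5 Inference Making": ["E13 Attribute Assignment"],
    "E13 Attribute Assignment": ["E7 Activity"],
}


def superclasses(cls):
    """Return all direct and indirect superclasses of `cls`."""
    seen = []
    stack = list(SUBCLASS_OF.get(cls, []))
    while stack:
        parent = stack.pop()
        if parent not in seen:
            seen.append(parent)
            stack.extend(SUBCLASS_OF.get(parent, []))
    return seen


def is_a(cls, ancestor):
    """True if `cls` equals `ancestor` or specializes it."""
    return cls == ancestor or ancestor in superclasses(cls)


print(superclasses("S7 Simulation-Prediction"))
print(is_a("S6 Data Evaluation", "E7 Activity"))
```

Under these assumptions, a property declared for E7 Activity, such as P16 used specific object, would be inherited by every inference-making class.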
Applications
• Informed by the IAM model (argumentation)
• EU FP7 - PSP InGeoClouds
• European Space Agency: satellite data
• EU FP7-INFRASTRUCTURES-2012-1 ARIADNE
• Supermodel for CRMarchaeo
• EU - FP7 - CP & CSA iMarine
• Informs and complements MarineTLO
• Extended MarineTLO used in LifeWatch Greece, being promoted to LifeWatch
Conclusions
• Our aim is:
  • to open the discussion in CIDOC on subjects concerning the conceptual modelling of products of human activities
  • to suggest that CIDOC approve the modelling of scientific activities as a valid scope for CIDOC and as a possible working item for the CRM-SIG WG
• Still to be done: specializations into analytical methods and reference data sets
• Links: http://www.ics.forth.gr/isl/CRMext/CRMsci.rdfs