CDI Controlled Vocabularies
Roy Lowry, Karen Vickers (BODC)
Michele Fichaut, Catherine Maillard (SISMER)
Reinhard Schwabe (DOD)
4 June 2003
Objective • To provide vocabularies to describe what was measured • Used to restrict CDI search hit count • Vocabulary dynamically generated from existing data/metadata systems • Therefore, bottom-up design rather than top-down
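As a rough illustration of how a controlled vocabulary restricts the search hit count, the sketch below filters a handful of made-up records by grouping code. The record layout (`cdi_id`, `groupings`) and the record IDs are assumptions for illustration only, not the actual CDI schema; the grouping codes are taken from the grouping lists later in these slides.

```python
# Hypothetical CDI-like records: field names and IDs are illustrative only.
records = [
    {"cdi_id": "rec-1", "groupings": {"D025", "C040"}},
    {"cdi_id": "rec-2", "groupings": {"D010"}},
    {"cdi_id": "rec-3", "groupings": {"D025"}},
    {"cdi_id": "rec-4", "groupings": {"B030", "B035"}},
]


def search(records, wanted_groupings):
    """Return records tagged with at least one of the wanted grouping codes."""
    wanted = set(wanted_groupings)
    return [r for r in records if r["groupings"] & wanted]


# Restricting to 'Sea temperature and salinity' (D025) narrows the result set.
hits = search(records, ["D025"])
print(len(records), "records,", len(hits), "hits")
```

Because the user can only pick terms from the vocabulary, every query term is guaranteed to match the tagging scheme used in the records.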
Scope
• CDI requires three vocabularies
  • Platform
  • Instrument
  • Parameter
• Platform and instrument vocabularies developed by DOD
• Parameter vocabulary developed by BODC and SISMER
Platform Vocabulary • GF3 did a fairly good job (except grids!) • Vocabulary based on this • How is this going to be distributed and/or maintained?
Instrument Vocabulary • Vocabulary describes either sample collection or in-situ measuring technique • Compatibility with ROSCOP taken into account • I think we now have an agreed list • Again, how is this to be maintained and distributed?
Parameter Vocabulary • Strategy to develop a set of parameter groups derived from data file parameter codes • Started calling these ‘keywords’ but the word implies a ‘top-down’ design approach • Settled on the name ‘Agreed Parameter Groupings’
Parameter Vocabulary • Initial APG set based on BODC and SISMER dictionaries • Parameter count in each group kept as uniform as possible • Facilitates a list box interface • Almost succeeded but species-linked parameters need further work • Further development possible with current groupings operational
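The bottom-up approach and the uniformity goal might be sketched as follows: start from parameter codes found in data files, assign each to a grouping, then count parameters per grouping to spot groups that would make an unwieldy list box. The parameter codes and their assignments below are invented placeholders, not entries from the BODC or SISMER dictionaries, and the 1.5x threshold is an arbitrary assumption.

```python
from collections import Counter

# Invented placeholder parameter codes mapped to grouping codes from the slides.
param_to_group = {
    "TEMP_A": "D025", "TEMP_B": "D025", "SAL_A": "D025",
    "NITRATE_A": "C040", "PHOSPHATE_A": "C040",
    "SPECIES_0001": "B045", "SPECIES_0002": "B045", "SPECIES_0003": "B045",
    "SPECIES_0004": "B045", "SPECIES_0005": "B045", "SPECIES_0006": "B045",
}


def group_sizes(mapping):
    """Count how many parameters fall into each grouping."""
    return Counter(mapping.values())


def oversized(sizes, factor=1.5):
    """Return groupings whose size exceeds `factor` times the mean group size."""
    mean = sum(sizes.values()) / len(sizes)
    return sorted(g for g, n in sizes.items() if n > factor * mean)


sizes = group_sizes(param_to_group)
print(dict(sizes))
print("candidates for subdivision:", oversized(sizes))
```

In this toy data the species-linked grouping is the one flagged, which mirrors the kind of imbalance the slide says needs further work.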
Parameter Vocabulary
• 36 groupings mapped to the disciplines:
  • Biology
  • Chemistry
  • Physical oceanography
  • Geology and geophysics
  • Meteorology and atmospheric chemistry
  • Multidisciplinary
  • Discipline independent
• Discipline indicated by first byte of code
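The first-byte rule can be sketched as a simple prefix lookup. The prefix letters below are read off the grouping codes listed in the slides that follow (B, C, D, G, M, O, Z); the function name and error handling are illustrative choices, not part of any BODC system.

```python
# Prefix letters as they appear in the grouping codes on the following slides.
DISCIPLINE_BY_PREFIX = {
    "B": "Biology",
    "C": "Chemistry",
    "D": "Physical oceanography",
    "G": "Geology and geophysics",
    "M": "Meteorology and atmospheric chemistry",
    "O": "Multidisciplinary",
    "Z": "Discipline independent",
}


def discipline_of(code: str) -> str:
    """Derive the discipline from the first byte of a grouping code."""
    if not code:
        raise ValueError("empty grouping code")
    try:
        return DISCIPLINE_BY_PREFIX[code[0].upper()]
    except KeyError:
        raise ValueError(f"unknown discipline prefix in {code!r}") from None


print(discipline_of("D025"))
print(discipline_of("O005"))
```

Note that the discipline is never stored anywhere; it exists only as part of the code itself, which is what makes later remapping awkward.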
Parameter Vocabulary
• The groupings are:
  • Biology
    • B005 Bacteria and viruses
    • B015 Birds, mammals and reptiles
    • B020 Fish
    • B025 Microzooplankton
    • B027 Other biological measurements
    • B030 Phytoplankton
    • B035 Pigments
    • B040 Zoobenthos
    • B045 Zooplankton
Parameter Vocabulary
• The groupings are:
  • Chemistry
    • C003 Amino acids
    • C005 Carbon, nitrogen and phosphorus
    • C010 Carbonate system
    • C015 Dissolved gases
    • C017 Fatty acids
    • C020 Halocarbons (including freons)
    • C025 Hydrocarbons
    • C030 Isotopes
    • C035 Metal concentrations
    • C040 Nutrients
    • C045 Other inorganic chemical measurements
    • C050 Other organic chemical measurements
    • C055 PCBs and organic micropollutants
Parameter Vocabulary
• The groupings are:
  • Physical oceanography
    • D005 Acoustics
    • D010 Currents, sea level and waves
    • D015 Optical properties
    • D020 Other physical oceanographic measurements
    • D025 Sea temperature and salinity
  • Geology and geophysics
    • G005 Gravity, magnetics and bathymetry
    • G010 Sediment properties
    • G012 Sonar and seismics
    • G015 Suspended particulate matter
Parameter Vocabulary
• The groupings are:
  • Meteorology and atmospheric chemistry
    • M005 Atmospheric chemistry
    • M010 Meteorology
  • Multidisciplinary
    • O005 Fluxes
    • O010 Rate measurements (including production, excretion and grazing)
  • Discipline independent
    • Z005 Administration and dimensions
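For reference, the full list from the four slides above can be held as a simple code-to-title table and tallied by first byte. This is only a transcription of the slides into a Python dictionary, not an export from the BODC Oracle dictionary.

```python
from collections import Counter

# Grouping codes and titles transcribed from the slides.
APG = {
    "B005": "Bacteria and viruses",
    "B015": "Birds, mammals and reptiles",
    "B020": "Fish",
    "B025": "Microzooplankton",
    "B027": "Other biological measurements",
    "B030": "Phytoplankton",
    "B035": "Pigments",
    "B040": "Zoobenthos",
    "B045": "Zooplankton",
    "C003": "Amino acids",
    "C005": "Carbon, nitrogen and phosphorus",
    "C010": "Carbonate system",
    "C015": "Dissolved gases",
    "C017": "Fatty acids",
    "C020": "Halocarbons (including freons)",
    "C025": "Hydrocarbons",
    "C030": "Isotopes",
    "C035": "Metal concentrations",
    "C040": "Nutrients",
    "C045": "Other inorganic chemical measurements",
    "C050": "Other organic chemical measurements",
    "C055": "PCBs and organic micropollutants",
    "D005": "Acoustics",
    "D010": "Currents, sea level and waves",
    "D015": "Optical properties",
    "D020": "Other physical oceanographic measurements",
    "D025": "Sea temperature and salinity",
    "G005": "Gravity, magnetics and bathymetry",
    "G010": "Sediment properties",
    "G012": "Sonar and seismics",
    "G015": "Suspended particulate matter",
    "M005": "Atmospheric chemistry",
    "M010": "Meteorology",
    "O005": "Fluxes",
    "O010": "Rate measurements (including production, excretion and grazing)",
    "Z005": "Administration and dimensions",
}

# Tally groupings per discipline prefix (first byte of the code).
per_prefix = Counter(code[0] for code in APG)
print(len(APG), dict(per_prefix))
```

The total comes to 36, matching the count quoted on the earlier slide, with chemistry (13) and biology (9) accounting for most of the groupings.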
APG Implementation • Groupings incorporated in BODC Oracle dictionary • Dynamic web interface including plain text descriptions to assist group mappings • System is fully dynamic
Problems
• Biological entity properties
  • Needs further subdivision
  • Further work once BODC dictionary has been mapped to ITIS
• Atmospheric chemistry
  • Very uncomfortable about this
  • Think through mappings of atmospheric pCO2
  • Rename as ‘Other atmospheric gases’ and map to chemistry?
  • Further work as BODC/BADC develop common controlled vocabulary for NERC Data Grid
Problems
• Grouping codes
  • Having discipline defined by first byte is a problem
  • Remapping a grouping between disciplines (e.g. chemistry to multidisciplinary) involves recoding
  • Recoding is an accident waiting to happen
  • Can we drop this rule and manage mapping/ordering through explicit fields?
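One way to picture the 'explicit fields' alternative is a record where the code is an opaque identifier and discipline and ordering are ordinary attributes. The class, field names and sort values below are hypothetical, and the remapping of 'Dissolved gases' is purely an example of the chemistry-to-multidisciplinary case mentioned above, not a proposed change.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Grouping:
    code: str        # opaque identifier; never changes once issued
    title: str
    discipline: str  # explicit field instead of the first-byte rule
    sort_order: int  # explicit ordering for list-box display (made-up values)


dissolved_gases = Grouping("C015", "Dissolved gases", "Chemistry", 40)

# Hypothetical remap from chemistry to multidisciplinary: only the field
# changes, so anything already tagged with 'C015' remains valid.
remapped = replace(dissolved_gases, discipline="Multidisciplinary")

print(remapped.code, "->", remapped.discipline)
```

The trade-off is that the code no longer tells a human reader which discipline it belongs to, so any interface has to read the explicit field rather than the prefix.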
Problems
• Multidisciplinary
  • This is a ‘catch-all’ that dilutes search effectiveness
  • Necessary because the discipline-to-APG mapping is a simple one-to-many
  • Could be replaced by a many-to-many mapping
  • Implications need to be considered for non-BODC systems and CDI interface design
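A many-to-many mapping could be sketched as a list of (discipline, grouping) links indexed in both directions, so that a grouping such as 'Fluxes' appears under each relevant discipline rather than in a catch-all. The particular links below are invented for illustration; which disciplines each grouping should really belong to is exactly the open question.

```python
from collections import defaultdict

# Hypothetical (discipline, grouping code) links, for illustration only.
links = [
    ("Chemistry", "C040"),
    ("Chemistry", "O005"),
    ("Chemistry", "O010"),
    ("Physical oceanography", "D025"),
    ("Physical oceanography", "O005"),
    ("Biology", "B030"),
    ("Biology", "O010"),
]


def build_indexes(links):
    """Index the link table by discipline and by grouping."""
    by_discipline = defaultdict(set)
    by_grouping = defaultdict(set)
    for discipline, code in links:
        by_discipline[discipline].add(code)
        by_grouping[code].add(discipline)
    return by_discipline, by_grouping


by_discipline, by_grouping = build_indexes(links)
print(sorted(by_discipline["Chemistry"]))
print(sorted(by_grouping["O005"]))
```

Under this scheme a 'Multidisciplinary' entry is no longer needed, but any system that assumes one discipline per grouping would have to change, which is the implication flagged for non-BODC systems and the CDI interface.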