EOL and DwC-Archives Patrick Leary pleary@eol.org
Brief Background • Darwin Core ratified by TDWG October 2009 • Consists of a vocabulary of terms • Multiple representations in XML, RDF • Documentation includes Text Guide • Text archives called Darwin Core Archives
DwC-Archive Structure source: http://www.gbif.org/resources/2554
Meta File source: http://yuml.me/
Validating • fileType has a dateFormat attribute • DD-MM-YYYY, MM-DD-YYYY • field cannot specify the data type to expect • field has a vocabulary attribute • URI for a vocabulary; should be machine readable • The format of the vocabulary is unclear • Recommendations: • Add dataType attribute to field (string, float, integer, date, boolean, uri) • Add values, optionalValues attributes to field; delimited choices
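A validator could use the recommended dataType attribute roughly as follows. This is a minimal sketch, not part of the Darwin Core Text Guide: dataType exists only as a recommendation here, and the function name validate_value, the ISO 8601 date check, the accepted boolean spellings and the "has a scheme" test for URIs are all assumptions made for illustration.

```python
import re
from datetime import datetime
from urllib.parse import urlparse


def _parses(parser, value):
    """Return True if parser(value) succeeds without a ValueError."""
    try:
        parser(value)
        return True
    except ValueError:
        return False


# One checker per proposed dataType value
# (string, float, integer, date, boolean, uri).
CHECKERS = {
    "string": lambda v: True,
    "float": lambda v: _parses(float, v),
    "integer": lambda v: re.fullmatch(r"[+-]?[0-9]+", v) is not None,
    # Assumption: dates are ISO 8601 calendar dates (YYYY-MM-DD).
    "date": lambda v: _parses(lambda s: datetime.strptime(s, "%Y-%m-%d"), v),
    # Assumption: only "true" and "false" (any case) count as booleans.
    "boolean": lambda v: v.lower() in {"true", "false"},
    # Assumption: a URI is any value that carries a scheme such as "http".
    "uri": lambda v: urlparse(v).scheme != "",
}


def validate_value(value, data_type):
    """Check one cell against the hypothetical dataType of its field."""
    if data_type not in CHECKERS:
        raise ValueError(f"unknown dataType: {data_type}")
    return CHECKERS[data_type](value)
```

A real implementation would presumably take the date pattern from the fileType dateFormat attribute instead of hard-coding one.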
Handling Multiple Values • Some DwC terms recommend multiple values • 10% of all terms suggest “A list (concatenated and separated)” • Neither DwC nor the archive meta file specifies a delimiter • Recommendations: • Add multiValueDelimiter attribute to field • Add allowsMultiValue attribute to field
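With the two recommended attributes in place, a reader could split list-valued cells without guessing the delimiter. The sketch below assumes allowsMultiValue takes the literal value "true" and that multiValueDelimiter holds the separator string; neither attribute exists in the current meta file schema, and split_cell is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Hypothetical field declaration using the two proposed attributes.
FIELD_XML = (
    '<field index="3" '
    'term="http://rs.tdwg.org/dwc/terms/higherClassification" '
    'allowsMultiValue="true" multiValueDelimiter="|"/>'
)


def split_cell(field, raw):
    """Split a raw cell into values if the field declares it multi-valued."""
    allows_multi = field.get("allowsMultiValue") == "true"
    delimiter = field.get("multiValueDelimiter")
    if not allows_multi or not delimiter:
        # No declaration: treat the cell as a single opaque value.
        return [raw]
    # Trim whitespace around each value and drop empty entries.
    return [part.strip() for part in raw.split(delimiter) if part.strip()]


field = ET.fromstring(FIELD_XML)
values = split_cell(field, "Animalia|Chordata | Aves")
```

A field without the attributes is left untouched, so existing archives would keep their current behaviour.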
Original Meta File source: http://yuml.me/
If Recommendations Were Applied source: http://yuml.me/
DwC-Archive Structure source: http://www.gbif.org/resources/2554
EOL Partial Data Model source: http://yuml.me/
Adding Structured Data source: http://yuml.me/
Extending Extensions • core can have extensions • extensions do not have to be linked to core • index attribute of coreid is optional • extensions have no explicit id • extensions cannot be linked to each other
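The star-shaped layout described above can be seen by reading a meta file directly. The following sketch parses a small meta file and lists where each extension points; the file names taxa.txt and vernacular.txt and the helper describe_links are made up for illustration.

```python
import xml.etree.ElementTree as ET

NS = {"dwca": "http://rs.tdwg.org/dwc/text/"}

# Small example meta file; the data file names are hypothetical.
META_XML = """\
<archive xmlns="http://rs.tdwg.org/dwc/text/">
  <core rowType="http://rs.tdwg.org/dwc/terms/Taxon">
    <files><location>taxa.txt</location></files>
    <id index="0"/>
    <field index="1" term="http://rs.tdwg.org/dwc/terms/scientificName"/>
  </core>
  <extension rowType="http://rs.gbif.org/terms/1.0/VernacularName">
    <files><location>vernacular.txt</location></files>
    <coreid index="0"/>
    <field index="1" term="http://rs.tdwg.org/dwc/terms/vernacularName"/>
  </extension>
</archive>
"""


def describe_links(meta_xml):
    """Return (extension file, coreid column, core file) for each extension."""
    root = ET.fromstring(meta_xml)
    core_file = root.find("dwca:core/dwca:files/dwca:location", NS).text
    links = []
    for ext in root.findall("dwca:extension", NS):
        ext_file = ext.find("dwca:files/dwca:location", NS).text
        coreid = ext.find("dwca:coreid", NS)
        # coreid is the only link an extension carries, and its target
        # is always the core; there is no way to name another extension.
        index = coreid.get("index") if coreid is not None else None
        links.append((ext_file, index, core_file))
    return links


links = describe_links(META_XML)
```

Every tuple ends in the core file, which is the limitation the following slides try to work around.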
Possible Workarounds • Flatten and repeat data • works for non-structured extension data • don’t want to end up with JSON values • Create multiple archives • Create multiple meta files • Modify the structure of the meta file • Create alternate meta file • Modify the meta file XSD
Changing Meta File • Minimal change • Add id element to extension / fileType • Add extensionid element to fileType • With rowType attribute • Possibly some indication of hasMany • Larger change • Unify core and extension • Change coreid accordingly
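One possible reading of the minimal change is sketched below: each extension gains an id element, and an extensionid element names the rowType of the extension it links to. The XML shape, the example.org row types, the file names and the helper extension_links are all assumptions; the slides fix only the element and attribute names.

```python
import xml.etree.ElementTree as ET

NS = {"dwca": "http://rs.tdwg.org/dwc/text/"}

# Hypothetical meta file with the proposed id and extensionid elements.
META_XML = """\
<archive xmlns="http://rs.tdwg.org/dwc/text/">
  <core rowType="http://rs.tdwg.org/dwc/terms/Taxon">
    <files><location>taxa.txt</location></files>
    <id index="0"/>
  </core>
  <extension rowType="http://example.org/terms/Media">
    <files><location>media.txt</location></files>
    <id index="0"/>
    <coreid index="1"/>
  </extension>
  <extension rowType="http://example.org/terms/Agent">
    <files><location>agents.txt</location></files>
    <id index="0"/>
    <extensionid index="1" rowType="http://example.org/terms/Media"/>
  </extension>
</archive>
"""


def extension_links(meta_xml):
    """Map each extension rowType to the extension rowTypes it points at."""
    root = ET.fromstring(meta_xml)
    links = {}
    for ext in root.findall("dwca:extension", NS):
        targets = [
            link.get("rowType")
            for link in ext.findall("dwca:extensionid", NS)
        ]
        links[ext.get("rowType")] = targets
    return links


links = extension_links(META_XML)
```

Here the Agent extension links to Media rather than to the core, which the current schema cannot express.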
Original Meta File source: http://yuml.me/
Diagram of minimal change source: http://yuml.me/
Diagram of larger change source: http://yuml.me/
Summary of Recommendations • dataType attribute to field (string, float, integer, date, boolean, uri) • values, optionalValues attributes to field • multiValueDelimiter attribute to field • allowsMultiValue attribute to field
Open Questions • Are these recommendations worth pursuing? • How to proceed with extending extensions? • How to update Darwin Core Text Guide with respect to Darwin Core terms? • Should Darwin Core Text Guide be separated? • Should meta file schema be separated from Text Guide?
Thank You Patrick Leary pleary@eol.org