This workshop explores the challenges and opportunities in delivering forest-related information to non-foresters, highlighting case studies and discussing ways to ensure optimal use of available data. It also addresses issues of intervention, explanation, and training that fall on the library and information sector.
Expertise for the non-specialist: delivering forest-related information to non-foresters Chair and organiser: Roger Mills, Oxford University Library Services Co-ordinator, IUFRO Research Group 6.03.00 Information Services and Knowledge Organization
Forest Research • Projects over many decades have produced a wealth of data, published and unpublished • Now finding uses in other disciplines • environmental management • climate change assessment • biodiversity conservation • economic planning • economics • politics • social science • law • Easy to access with modern technologies • data frequently needs processing or harmonisation to make it usable • Raises many issues of intervention, explanation and training which fall partly or wholly on the library and information sector
Today’s workshop • Highlight some of the issues • Present case studies • Discuss what we can do to ensure that users unfamiliar with the forestry subject area can make best use of available data • Make a ‘wish list’ for future action – in IUFRO, IAALD, other fora
Trees grow slowly • Not like cabbages – generations needed for controlled study • No equivalent to Rothamsted experiments – started in 1843 and still going • Majority of forest studies carried out for a particular end and data collection not primary purpose
Data gathering • Traditionally: • Field trials • Gather data • Analyse on paper • Publish conclusions • Data stays in a drawer
Early computing • Data on tapes, punched cards etc • Physically managed by central computing units • Data preserved though may not be fully catalogued or readable long term
Modern computing • Gathered on portable devices • Analysed on PC • Stored on removable media • No central responsibility, existence known only to researcher • Unknown, unreachable, unreadable • So data is recompiled
Forest data • Time dependent, not repeatable • Time series important: significant variations may occur over relatively short periods • Essential to preserve all historical data we can
Impact of web • Preserving data in a mediated library allows delivery with health warnings • Making it web-accessible leaves it open to misinterpretation • But harmonised data useful in many non-forestry contexts • Problem lies in the harmonisation
DBH • Diameter at Breast Height • How high is your breast? • 1.3m (4’3”) (USA etc) • 1.4m (4’6”) (UK etc) • 1.5m (for ornamental trees). • Decimal conversions also introduce variations: 4’6” is more accurately 1.37m.
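The conversion arithmetic on this slide can be checked with a short sketch. It assumes only the exact definition of the inch (0.0254 m); the function names are illustrative and not part of any forestry toolkit.

```python
INCH_IN_METRES = 0.0254  # exact, by international definition of the inch


def feet_inches_to_metres(feet, inches=0.0):
    """Convert a height given in feet and inches to metres."""
    return (feet * 12 + inches) * INCH_IN_METRES


def rounding_gap_cm(nominal_m, feet, inches=0.0):
    """Gap in centimetres between a rounded metric figure and the exact conversion."""
    return (nominal_m - feet_inches_to_metres(feet, inches)) * 100


if __name__ == "__main__":
    # Pairs taken from the slide: rounded metric height, then feet and inches.
    for nominal_m, feet, inches in [(1.3, 4, 3), (1.4, 4, 6)]:
        exact = feet_inches_to_metres(feet, inches)
        gap = rounding_gap_cm(nominal_m, feet, inches)
        print(f"{feet}'{inches}\" = {exact:.4f} m; quoted as {nominal_m} m (gap {gap:.2f} cm)")
```

The second pair shows the point made on the slide: 4’6” is 1.3716 m, so quoting it as 1.4 m moves the measuring point by nearly 3 cm.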
A little knowledge is a dangerous thing • Adding stats for DBH from different areas without conversion will be misleading • Can lead to bad decision making • Eg in climatology, basing estimates of carbon incorporation on forest volume
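One way to stop DBH figures from different conventions being added together silently is to carry the measurement height with every record and refuse to pool mixed groups. The sketch below is a hypothetical illustration: the record layout and function names are assumptions, and it only detects the mismatch, it does not convert between heights.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class DbhRecord:
    region: str
    dbh_cm: float         # stem diameter in centimetres
    measured_at_m: float  # height above ground at which the diameter was taken


def group_by_measurement_height(records):
    """Group records by the breast-height convention used, rounded to the centimetre."""
    groups = defaultdict(list)
    for record in records:
        groups[round(record.measured_at_m, 2)].append(record)
    return dict(groups)


def pooled_mean_dbh(records):
    """Mean DBH in cm, but only if every record shares one measurement height."""
    if not records:
        raise ValueError("no records to pool")
    groups = group_by_measurement_height(records)
    if len(groups) > 1:
        heights = ", ".join(f"{h} m" for h in sorted(groups))
        raise ValueError(f"DBH measured at different heights ({heights}); convert before pooling")
    return sum(r.dbh_cm for r in records) / len(records)
```

Raising an error rather than averaging anyway is a deliberate choice here: the user is told why the figures cannot be combined, which is the kind of "health warning" a mediated service would give.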
What’s that got to do with librarians? • Aim to make data readily available to all who can use it, without restriction or censorship • Internet helps, but aids unintentional – or intentional – misuse • Answer: better metadata and user education
GFIS • Data harmonization originally an aim of Global Forest Information Service • Not achieved because of manpower required to generate extra metadata defining conversion requirements, or just warning of incompatibilities • Most data not compiled for international use, no funding to provide metadata at source
EU to the rescue • 1989 Regulation to set up European Forestry Information and Communication System (EFICS) • “well-structured and reliable forest information at European level” • NEFIS: Network for a European Forest Information Service 2003-5 • http://www.efi.int/portal/project/nefis
Into operation • European Forest Information and Communication Platform (EFICP) • http://eficp-info.jrc.it/ • Long gestation common • Political requirement • Development of prototype • Study problems • Development of production system • Now 19 years since original Regulation
Use it or lose it • Communicate existence of system • Make it easy to use and reliable • Must save user’s time • NEFIS project illuminates problems • Many relate to librarians’ traditional expertise • Terminology • Classification • Quality assessment • Searchability • Interoperability • High-quality metadata
Iterative development • Distributed technology favours new uses/users for existing data • Infrastructure needs: • Advanced spatio-temporal data collection and information management • Dissemination and fusion of heterogeneous distributed information • Sophisticated analysis, modeling and visualization of information • Designed to outlive current software
Cf Bioinformatics • Single information system holds: • Sequencing data • Tools for annotation • Tools for analysis • Publications resulting from analysis • E.g. NCBI http://www.ncbi.nlm.nih.gov/
An integrated system for forestry? • Much wider variety of data types • Much wider community of users • And of technical infrastructure • NCBI model bridges data acquisition, analysis and curation • Publishing models increasingly incorporate raw data source with peer-reviewed research
Publishing data • Author compiles dataset containing forest cover statistics spanning multiple jurisdictions and century-long time series • Data acquisition and harmonisation methods recorded in metadata • Publishes package so data remains available long-term for use or further analysis by others, retrievable alongside journal articles
Open Access • Non-subscription environment to ensure wide availability • Requires new approach to research funding • And long-term funding for data curation • That role likely to fall on library community • Business and technical expertise in archiving • Developing and supporting integration and interoperability tools • Online repositories
Developing standards • NEFIS datasets too different to achieve interoperability • Demonstrated need • EU European Interoperability Framework 2004 • Technical • Semantic [precise meaning] • Organizational • Last two most challenging
Semantic interoperability • Descriptive metadata • Controlled vocabularies • Ontologies • User-nominated terms – requires editor • Tagging • Quality • Accuracy • Logical consistency • Completeness • Positional accuracy • Lineage • Non-censorious indication – ‘quality report’
Data location • Provider’s server • Or central? • If local, owner responsible for metadata management • Interoperability requires metadata on: • Protocols for query translation • Mapping of field labels • Field contents • Background information • Associated files • Related IPR • Required executables • Language and character set • Access control mechanisms • Standards to be agreed so all new compilations and reloaded legacy data have this information
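The metadata items listed on this slide could be carried as one record per dataset, with a simple check for anything left blank. This is a minimal sketch under stated assumptions: the class, its field names and the completeness check are hypothetical and do not reproduce any NEFIS or EU schema.

```python
from dataclasses import dataclass, field, fields


@dataclass
class InteroperabilityMetadata:
    """One dataset's interoperability metadata; field names follow the slide, not a standard."""
    query_protocol: str = ""                                  # protocol for query translation
    field_label_mapping: dict = field(default_factory=dict)   # local label -> agreed label
    field_contents: dict = field(default_factory=dict)        # agreed label -> units or definition
    background_information: str = ""
    associated_files: list = field(default_factory=list)
    related_ipr: str = ""                                     # intellectual property rights statement
    required_executables: list = field(default_factory=list)
    language: str = ""
    character_set: str = ""
    access_control: str = ""


def missing_fields(record):
    """Names of metadata items left empty, in declaration order."""
    return [f.name for f in fields(record) if not getattr(record, f.name)]


if __name__ == "__main__":
    record = InteroperabilityMetadata(
        query_protocol="hypothetical-protocol",
        field_label_mapping={"bhd": "dbh_cm"},
        language="en",
        character_set="UTF-8",
    )
    print("Still to be supplied:", ", ".join(missing_fields(record)))
```

A check like this reports gaps instead of rejecting the dataset, so legacy data can be reloaded first and documented afterwards.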
NEFIS Demonstrator • No data harmonization • Showed feasibility of retrieving and analysing data for a single request to multiple servers in multiple countries • Comprises: • Resource discovery toolkit – searches metadata • Remote search demonstrator – managing data retrieval from multiple sources • Visualisation toolkit (VTK) – naïve and expert modelling of retrieved data
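The idea of sending a single request to several servers and collecting the answers can be sketched as follows. This is not the NEFIS implementation: the servers here are plain Python functions standing in for remote services, all names are hypothetical, and, as on the slide, no harmonisation is attempted, so each row is only tagged with its source.

```python
from concurrent.futures import ThreadPoolExecutor


def federated_search(query, servers):
    """Send one query to every server; return (rows tagged by source, failures by source)."""
    results, failures = [], {}
    with ThreadPoolExecutor(max_workers=max(1, len(servers))) as pool:
        # Submit all requests first so they run concurrently.
        futures = {name: pool.submit(search, query) for name, search in servers.items()}
        for name, future in futures.items():
            try:
                rows = future.result()
            except Exception as exc:  # one unreachable server must not sink the whole request
                failures[name] = str(exc)
                continue
            results.extend({"source": name, **row} for row in rows)
    return results, failures


if __name__ == "__main__":
    def server_a(query):  # stand-in for a remote data provider
        return [{"year": 2000, "forest_cover": 31.0}]

    def server_b(query):  # stand-in for a provider that is offline
        raise ConnectionError("server unavailable")

    rows, errors = federated_search("forest cover", {"A": server_a, "B": server_b})
    print(rows)
    print(errors)
```

Keeping the source on every row lets a later step decide which rows are comparable, rather than hiding the differences at retrieval time.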
EDA • Exploratory Data Analysis • Unbiased examination of data to detect patterns, trends, relationships rather than answer preconceived question • Mirrors bioinformatics approach • NEFIS data specially prepared • Adoption of common standards could allow development of VTK with no need for human intervention in preparing data
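As a small illustration of looking for a trend without a preconceived question, the sketch below fits an ordinary least-squares slope to a time series. The function name and the sample figures are invented for the example; a real exploratory tool would offer many more views of the data.

```python
from statistics import mean


def linear_trend(years, values):
    """Least-squares slope of values against years (change in value per year)."""
    if len(years) != len(values) or len(years) < 2:
        raise ValueError("need two or more paired observations")
    mean_year, mean_value = mean(years), mean(values)
    spread = sum((y - mean_year) ** 2 for y in years)
    if spread == 0:
        raise ValueError("all observations share the same year")
    covariation = sum((y - mean_year) * (v - mean_value) for y, v in zip(years, values))
    return covariation / spread


if __name__ == "__main__":
    # Invented figures, for illustration only.
    years = [1990, 1995, 2000, 2005]
    cover = [30.0, 30.5, 31.5, 32.0]
    print(f"Trend: {linear_trend(years, cover):+.3f} units per year")
```

A slope like this is only meaningful if the series was measured consistently, which is why the slide ties automated analysis to the adoption of common standards.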
Librarians are key • In: • Curating data • Developing and supporting implementation of standards • Ensuring ready access to data • Promoting use • Universal Data Control – UDC… • It’s classification, Captain, but not as we know it… or maybe it is! • So let’s do it….