AVH - Australia’s Virtual Herbarium Logo Jim Croft Centre for Plant Biodiversity Research Australian National Herbarium
Australia’s Virtual Herbarium: storing and interchanging botanical data on-line Jim Croft Centre for Plant Biodiversity Research Australian National Herbarium jrc@anbg.gov.au http://www.anbg.gov.au/jrc/
Storage Industry speak • Role of Industry / Cost of storage infrastructure relative to project / Business competitiveness / Strategic business advantage / Justify investment / Return on Information (ROI) / Ever-changing environment / Interoperability / Linking geographically dispersed sites / Availability all the time, everywhere, forever / Storage strategy / Effective storage management / Right storage architecture / Faster storage environment / Advanced database technologies / Optimising availability, performance, placement, recovery / Flexible future-proof IT infrastructure / Data storage to assist R&D / Replication for disaster recovery / Benefits of central disaster recovery / Necessity to plan ahead / Staged implementation approach / Lead organizational change
Herbarium botany speak • Herbarium / Plants / Regional floristics / Preserved botanical specimens / Taxonomic hierarchy and taxon ranks / Nomenclature, synonymy / Alternative phylogenies and classifications / International Code of Botanical Nomenclature / Identification keys / Original, derived and modified data / Localities, geocodes / Data accuracy and precision / Habitat, altitude, depth, substrate / Biological images / Geospatial modeling and visualization / Environmental modeling and prediction / Species distributions and occurrence / Biological descriptive frameworks / Interactive identification / Flora Information systems / Landcover species / Curatorial standards / Cryptogams / Gymnosperms / Platyzomataceae / Eucalyptus camaldulensis
AVH - The Big Questions The 6 Ws: Who? What? Where? When? Why? hoW?
AVH - The Big Questions What is the AVH? Why should the AVH happen? Where does the AVH happen? Who does the AVH happen for? When does the AVH happen? hoW does the AVH happen? Whence the AVH?
What is a Herbarium? • A physically and administratively secure building • A managed archival scientific collection of preserved plant specimens • A research environment and resource for botanical systematic and taxonomic research • A taxonomic, spatial and temporal information base for botanical research, environmental decision-making and public information
What is a Virtual Herbarium? • The physical resources and biological information of a herbarium represented digitally • On-line access to herbaria and to botanical information managed by herbaria • Integrated access to botanical information from various sources in a herbarium and other on-line botanical information
What is the AVH? • A collaborative project of the Australian herbarium community, providing: • Partnership and shared access to each other's data • Real-time access to current working data • Shared access to common authority files • A shared development environment • Opportunity for shared data hosting, archiving and off-site backup • Co-ownership of the final product
Acacia aneura: Distribution of specimens from each herbarium
Geocode accuracy Survey data
Why is there an AVH? • Pressure on Herbaria to work more efficiently • Demand for access to larger amounts of data • Demand to access data more quickly • Demand to view data in different ways • Pressure on herbaria to be and appear more responsive to community needs
What is the Problem? • > 18,000 species of higher plants • > 64,000 available names • Extensive synonymy (about 4 names per plant) • 8 major government-funded herbaria • Similar number of university herbaria • > 6,500,000 specimens in Australian herbaria • 50-100 data elements per specimen • Several Kb per specimen
Where is the data? • In each herbarium (largest 1.3 million specimens) • Pooling data centrally not acceptable for operational, political and emotional reasons. • Therefore we need a distributed data management and access solution, maintaining and ensuring custodial responsibility
Where is the data? • Images compound the problem • Several Kb and up for plant images (possibly 100,000 available) • Specimen images need high resolution, up to 20 Mb or more • Need to be sub-sampled for web display • At least 100,000 type specimens • Ideally all 6.5 million should be done
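As a rough back-of-envelope check, the specimen and image figures quoted on the slides can be multiplied out. This is only a sketch: the 3 KB per record value is an assumption standing in for "several Kb", "Mb" is read as megabytes, and decimal units are used throughout.

```python
# Back-of-envelope storage estimate from the slide figures.
SPECIMENS = 6_500_000      # specimens in Australian herbaria (from the slides)
TYPE_SPECIMENS = 100_000   # "at least 100,000 type specimens" (from the slides)
KB_PER_RECORD = 3          # ASSUMPTION: stand-in for "several Kb per specimen"
MB_PER_IMAGE = 20          # high-resolution image, "up to 20 Mb or more"

def text_records_gb(n=SPECIMENS, kb=KB_PER_RECORD):
    """Size of the textual specimen records in gigabytes (1 GB = 1,000,000 KB)."""
    return n * kb / 1_000_000

def images_tb(n, mb=MB_PER_IMAGE):
    """Size of n high-resolution specimen images in terabytes (1 TB = 1,000,000 MB)."""
    return n * mb / 1_000_000

print("Text records:       ", text_records_gb(), "GB")
print("Type specimens only:", images_tb(TYPE_SPECIMENS), "TB")
print("All specimens:      ", images_tb(SPECIMENS), "TB")
```

Under these assumptions the text records are small next to the images, which is why the images "compound the problem".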
Where is the AVH? • Spread across Australian herbaria • Data distributed; resides with custodians • Each herbarium has a portal to receive requests to and deliver data from its database • Each herbarium hosts a common AVH query interface that polls all herbaria and integrates and returns data as a single query
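The "poll all herbaria and integrate" idea can be sketched in a few lines. This is a minimal illustration, not the AVH implementation: the in-memory `PORTALS` dictionary, the record fields and the coordinates are all hypothetical stand-ins for each herbarium's real portal and database.

```python
from concurrent.futures import ThreadPoolExecutor

# HYPOTHETICAL stand-ins for each herbarium's portal. In a real deployment each
# entry would be a network endpoint in front of that herbarium's own database.
PORTALS = {
    "Australian National Herbarium": [
        {"taxon": "Acacia aneura", "lat": -23.7, "lon": 133.9},
    ],
    "Queensland Herbarium": [
        {"taxon": "Acacia aneura", "lat": -26.4, "lon": 146.2},
        {"taxon": "Eucalyptus camaldulensis", "lat": -27.5, "lon": 152.0},
    ],
    "Western Australian Herbarium": [],
}

def query_portal(name, taxon):
    """Ask one herbarium for matching records and tag each with its custodian."""
    return [dict(rec, source=name) for rec in PORTALS[name] if rec["taxon"] == taxon]

def query_all(taxon):
    """Poll every portal in parallel and merge the answers into one result list."""
    with ThreadPoolExecutor() as pool:
        per_portal = pool.map(lambda name: query_portal(name, taxon), PORTALS)
        return [rec for records in per_portal for rec in records]

results = query_all("Acacia aneura")
for rec in results:
    print(rec["source"], rec["lat"], rec["lon"])
```

The data never leaves its custodian except as the answer to a query, and every returned record carries a `source` tag so the merged result still shows which herbarium holds each specimen.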
Who are the participants? • State Herbarium of South Australia • Queensland Herbarium • Australian National Herbarium • Northern Territory Herbarium • Tasmanian Herbarium • Industry Partner: KE Software • National Herbarium of Victoria • National Herbarium of New South Wales • Western Australian Herbarium • Australian Biological Resources Study
Who runs the AVH? • The Council of Heads of Australian Herbaria (CHAH). • The Herbarium Information Systems Committee (HISCOM) • IT staff at each herbarium (technology) • Botanical staff at each herbarium (content) • Scientific staff at each herbarium (validation)
Aust. & NZ Environment & Conservation Council • Government committee of Commonwealth and State/Territory Environment Ministers • Accepted that the community wanted the product • Funding options and regional support • Working group • Project design input - new name
“The Agreement” • $10 million project over five years • Capture new data and validate old • State/Territory to contribute amount relative to specimens to be databased/validated • $4 million Commonwealth + $4 million State/Territory + $2 million private • Sharing data critical to cost (cf. $16 million)
Who uses the AVH? • The participating herbaria get access to all the data at the highest precision. • Public access filter restricts access to work in progress, sensitive locality data, etc. • Access to conservation agencies, environmental decision makers • Research and education • Public general interest
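One way to picture the public access filter is as a function applied to every record before it leaves a herbarium. The sketch below is an assumption-laden illustration: the `in_progress` and `sensitive` flags, the role names, and the choice to coarsen sensitive coordinates to one decimal degree are all hypothetical, not details taken from the AVH.

```python
def public_view(record):
    """Return the public version of a record, or None if it must be withheld."""
    if record.get("in_progress"):
        return None  # work in progress is not shown to the public
    public = dict(record)
    if record.get("sensitive"):
        # ASSUMPTION: sensitive localities are coarsened rather than hidden outright.
        public["lat"] = round(record["lat"], 1)
        public["lon"] = round(record["lon"], 1)
    return public

def filter_for(user_role, records):
    """Participating herbaria see everything; everyone else gets the public view."""
    if user_role == "herbarium":
        return list(records)
    return [view for view in map(public_view, records) if view is not None]

# Illustrative records only.
records = [
    {"taxon": "Acacia aneura", "lat": -23.6980, "lon": 133.8807},
    {"taxon": "Rare orchid", "lat": -35.2781, "lon": 149.1093, "sensitive": True},
    {"taxon": "Undetermined", "lat": -12.4634, "lon": 130.8456, "in_progress": True},
]

print(len(filter_for("herbarium", records)), "records for herbarium staff")
print(len(filter_for("public", records)), "records for the public")
```

Keeping the filter at the point of delivery means the full-precision data stays intact for the participating herbaria.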
When did the AVH happen? • Basically this year • But we have been working towards it for over 12 years • And there have been the occasional dead ends and setbacks, waiting for technology, capacity, support, etc.
Brief History of the AVH • 1995 - HISCOM recommends the AVH concept (a distributed database) to CHAH • 1997 - Canvassed at Systematics meeting • 1999 - Proof of concept with Acacia • 2000 - Government Minister shows interest • 2000 - Interest from industry/foundations • 2000 - Negotiating cost & lobbying
Recent Activity • Major item at October CHAH meeting: agreement on what information we provide to community; priority groups and ‘Who does what?’ • Trust to oversee financial arrangements • Liaison and Advisory Committee
Evolution of the AVH (diagram labels) • Need for common semantic schema recognized • Standard syntax • Race to database • HISPID • Botanical ontology? • Need for semantic standard recognized • Exchange • Distributed query
hoW does the AVH work? • On a number of different levels • Politically • Administratively • Technically • Scientifically
Standards • URL / XSLT / XPATH / RDF / XML / SVG / BNF / ITF / UML / UDDI / Z39.50 / URI / XHTML / SOAP / Dublin Core / Z39.19 / DOM / HTTP / RDFS / PNG / HISPID / ASN.1 / SAX / CSS / WAIS / XML schema / RMI / CGI
Whence the AVH? • A new era of integrated access to botanical information • New ways of visualizing data from different sources • New ways of managing and validating data across remote databases • More automation, more speed, higher throughput
Added extras - the real AVH • Stage 1: databasing (dots on maps) • Plus map overlays, precision flags, spatial queries, pretty interfaces, etc. • Conflicting taxonomies - towards a National Census • Stage 2+: images, descriptions, identification tools • Multiple resources and options (cf. library)