THE US NATIONAL VIRTUAL OBSERVATORY Publishing and Resource Discovery with Registries Kevin Benson Sebastien Derriere Pierre Fernique Matthew Graham Gretchen Greene Bob Hanisch Paul Harrison Martin Hill Jeongin Lee Gerard Lemson Tony Linde Tom McGlynn Wil O’Mullane Keith Noddle Ramon Williamson Ray Plante Visit the NVO Demo Booth ADASS 2004 - Pasadena
Summary (2003) • We built a working prototype registry system to support an end-user VO service • Distributed Publishing and Searchable components • Encoded descriptions using emerging VO XML standard schemas • OAI Harvesting Standard deployed easily • Used to discover Cone Search and SIA services • What’s next: Interoperable registries IVOA-wide • Stabilize XML metadata standard • Standardize registry interfaces
Summary (2004) • We built a working production registry system to support end-user VO services • DataScope: discovers Cone Search, Simple Image Access services • OpenSkyQuery Portal: discovers OpenSkyNodes • What’s next: Interoperable registries IVOA-wide • Stabilize XML metadata standard • Standardize registry interfaces => IVOA: Frozen working draft standard for January ’05 releases
Registries 2004 • Review of Registry architecture • Resource Metadata Model • IVOA Registry Interface Standard • Harvesting • Searching • The NVO Publishing Process • Searching for Resources • Curation Issues
The role of Resource Registries • Used to discover and locate resources (data and services) that can be used in a VO application • Resource: anything that is describable and identifiable. • Besides data and services: organizations, projects, software, … • Presently concerned with a simple set of resource types • Registry: a list of resource descriptions • Expressed as structured metadata to enable automated processing and searching
Selected Requirements • Allow user to select resources that are likely to pertain to a scientific question • Select resources based on characteristics… • Type of resource: catalogs, image archives, EPO, services • Coverage in space, time, and frequency • Where data comes from, who curates it • Dynamic: resources will come and go • Distributed: Should not depend on a single point of failure or single view of the VO. • Preserve the data providers’ control over their data • Curators control what gets registered, content, updates • Allow integration with existing resource management • Allow extension to new types of resources
IVOA Registry Working Group (RWG) IVOA = International Virtual Observatory Alliance • Common, global approach to registries • Towards a standard framework • Registry Model • Resource Identifiers • Metadata schemas • Registry Interface • Distributed model for registries
Registry Model [Diagram, built up over several slides. Components: Local Publishing Registries, Full Searchable Registries, and a Local Searchable Registry, shown alongside VO Projects, Data Centers, Specialized Portals & Services, and Client Applications. Labeled flows: harvest (pull), replicate, selective harvesting, and search queries.]
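The "harvest (pull)" and "selective harvesting" flows named in the Registry Model diagram can be sketched as a toy in-memory model. This is only an illustration under stated assumptions: the `Record` and `Registry` classes, their fields, and the integer timestamps are hypothetical and are not taken from any NVO or IVOA software.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass(frozen=True)
class Record:
    """A minimal stand-in for a resource description (hypothetical fields)."""
    identifier: str      # illustrative identifier string
    resource_type: str   # e.g. "ConeSearch" or "Organisation"
    updated: int         # toy timestamp: larger means newer


@dataclass
class Registry:
    """One class plays every role; the role depends on how it is used."""
    records: Dict[str, Record] = field(default_factory=dict)

    def publish(self, record: Record) -> None:
        # A provider registers (or updates) a description locally.
        self.records[record.identifier] = record

    def list_records(self, since: int = 0) -> List[Record]:
        # Answer a harvester's pull request with records newer than `since`.
        return [r for r in self.records.values() if r.updated > since]

    def harvest(self, source: "Registry", since: int = 0,
                keep: Optional[Callable[[Record], bool]] = None) -> int:
        # Pull records from `source`; `keep` models selective harvesting.
        count = 0
        for rec in source.list_records(since):
            if keep is None or keep(rec):
                self.records[rec.identifier] = rec
                count += 1
        return count


# Usage: a publishing registry, a full searchable registry, and a
# local searchable registry that keeps only Cone Search records.
publisher = Registry()
publisher.publish(Record("ivo://example/cone", "ConeSearch", 5))
publisher.publish(Record("ivo://example/org", "Organisation", 3))

full = Registry()
full.harvest(publisher)

local = Registry()
local.harvest(full, keep=lambda r: r.resource_type == "ConeSearch")
```

Because the harvester pulls, the publisher keeps control of what it exposes and when it changes, which matches the requirement that data providers retain control over their records.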
Registries in Use: DataScope [Diagram, built up over several slides. Local Publishing Registries and Full Searchable Registries are shown with the site labels JHU/STScI, AstroGrid, CDS, Caltech, HEASARC, and NCSA. Data Providers are shown with Cone Search and Simple Image Access services. Labeled flows: harvest (pull), and DataScope (DS) searching for services.]
Registries in Use • Registries in the NVO are currently operating and functional • DataScope: discovers Cone Search, Simple Image Access (SIA) services • OpenSkyQuery Portal: discovers OpenSkyNodes • CDS Aladin/GLU: (Pierre Fernique) • harvests Cone Search and SIA services • converts them into GLU dictionary records • Accessible directly by the Aladin image and catalog viewer • AstroGrid Registry: foundation for building workflows • Portal uses descriptions to stitch services together • (Previous talk by Keith Noddle) • Cross-project harvesting • NVO, AstroGrid, AVO (VizieR, GLU) • Registries are at the leading edge of VO development
Resource Metadata Model • IVOA Recommendation: Resource Metadata • Core metadata as XML: IVOA Working Draft: VOResource [Diagram, built up over several slides, of schemas and the resource types grouped under them. VOResource: Resource, Organisation, Service • VORegistry: Authority, Registry • VODataService: DataCollection, SkyService, TabularSkyService • ConeSearch: ConeSearch • SIA: SimpleImageAccess • VOCEA: CEAApplication, CEAService]
IVOA Working Draft: Registry Interface (RI) Standard Kevin Benson (AstroGrid), Editor • Harvesting: delivering resource descriptions from publishers to searchable registries • Adoption of the Open Archives Initiative (OAI) standard: Protocol for Metadata Harvesting http://www.openarchives.org/ • RI defines application of OAI to VO resource records • Plug in VOResource as metadata format • Optional SOAP version to augment HTTP GET standard • Searching • Returns XML VOResource records • Keyword search • Advanced search • Uses the Astronomical Dataset Query Language (ADQL) • Refer to metadata items via a simplified XPath • Easily mapped to either SQL for an RDBMS implementation or XQuery for an XML DB implementation
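As a rough illustration of the harvesting side, the sketch below builds an OAI `ListRecords` request over HTTP GET and pulls record identifiers out of a response. The verb and the `metadataPrefix`, `from`, and `set` arguments come from the OAI Protocol for Metadata Harvesting; the base URL, the sample response, and the prefix value `ivo_vor` for VOResource records are assumptions made for the example, so check the RI draft and the target registry before relying on them.

```python
import xml.etree.ElementTree as ET
from typing import List, Optional
from urllib.parse import urlencode

# Namespace of OAI-PMH 2.0 response documents.
OAI_NS = "{http://www.openarchives.org/OAI/2.0/}"


def list_records_url(base_url: str, metadata_prefix: str = "ivo_vor",
                     from_date: Optional[str] = None,
                     set_spec: Optional[str] = None) -> str:
    """Build a ListRecords request; `ivo_vor` is an assumed prefix value."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if from_date:
        params["from"] = from_date      # harvest only records changed since
    if set_spec:
        params["set"] = set_spec        # selective harvesting by OAI set
    return base_url + "?" + urlencode(params)


def record_identifiers(response_xml: str) -> List[str]:
    """Return the identifier found in each record header of a response."""
    root = ET.fromstring(response_xml)
    return [h.findtext(OAI_NS + "identifier", default="")
            for h in root.iter(OAI_NS + "header")]


# Hypothetical response, trimmed to the parts the parser reads.
SAMPLE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record><header>
      <identifier>ivo://example.org/cone</identifier>
      <datestamp>2004-10-02</datestamp>
    </header></record>
    <record><header>
      <identifier>ivo://example.org/sia</identifier>
      <datestamp>2004-10-03</datestamp>
    </header></record>
  </ListRecords>
</OAI-PMH>"""

url = list_records_url("http://registry.example.org/oai",
                       from_date="2004-10-01")
ids = record_identifiers(SAMPLE)
```

A real harvester would also follow the protocol's `resumptionToken` to page through large result sets; that is left out here to keep the sketch short.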
Publishing to the NVO http://www.us-vo.org/publish.cfm • Resources are published if one can use VO facilities to find them. • Multiple layers of publishing • Starts with registry description of resource • Data Access Services • Incremental exposure for incremental effort • Who are you? How you publish depends on what you want to publish. • An individual with a small data collection • An archive center • Someone with a cool service • Extinction Correction Service • Developed by C. Miller, K. S. Krughoff • In one day of the NVO Summer School using VO tools
Small collections: VO-ready Repositories • Repositories that allow users to deposit data to share with community • Guarantee long-term storage, availability • Automatic support for VO publishing mechanisms • Entries into NVO Registry • Support for standard services: Cone Search, SIA, SSA, SkyNode • Currently available Repositories • Images: NCSA Astronomy Digital Image Library http://adil.ncsa.uiuc.edu/ • Spectra: Spectrum Services for the VO http://voservices.net/spectrum/ • More public repositories are expected to emerge • Check NVO website (http://us-vo.org/) for latest
Persistent Archives: Tools for Federation • Registering your resources with a public VO publishing registry [Screenshots labeled: Choose resource type, STScI Registry, Edit Form, NCSA Registry] • Enter description into registration form at one of the available NVO registries: • STScI/JHU Registry: http://nvo.stsci.edu/voregistry/ • NCSA Registration Portal: http://nvo.ncsa.uiuc.edu/nvoregistration.html • Caltech Carnivore: http://mercury.cacr.caltech.edu:8080/carnivore/ • If you have a large number of resources to register, you can run your own registry on your own site • NCSA VORegistry-in-a-Box http://nvo.ncsa.uiuc.edu/VO/software/ • Caltech Carnivore: http://mercury.cacr.caltech.edu:8080/carnivore/
Persistent Archives: Tools for Federation • What can/should you register? • Should: your Organization • Declares yourself as a publisher with an ID • Should: your Collection • Users at least know how to access it via a Browser • Can: your existing services • Browser-based services: e.g. search page • Traditional CGI services • Web Services • The next level… • Implement and register one or more standard services • Cone Search • Simple Image Access • SkyNode* • Simple Spectral Access* (*standard still in development) • NVO Summer School Software package: server-side templates and toolkits http://www.us-vo.org/summer-school/
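To give a sense of what a registered Cone Search service looks like from the client side, here is a minimal sketch that builds a query URL. It assumes the usual Cone Search convention of `RA`, `DEC`, and `SR` parameters in decimal degrees appended to the service's registered base URL; the base URL and the function name are made up for the example.

```python
from urllib.parse import urlencode


def cone_search_url(base_url: str, ra_deg: float, dec_deg: float,
                    radius_deg: float) -> str:
    """Build a Cone Search query URL from a registered base URL."""
    if not 0.0 <= ra_deg <= 360.0:
        raise ValueError("RA must be within 0..360 degrees")
    if not -90.0 <= dec_deg <= 90.0:
        raise ValueError("DEC must be within -90..90 degrees")
    if radius_deg < 0.0:
        raise ValueError("search radius must be non-negative")
    # A base URL may already carry a query string, with or without a
    # trailing separator; pick the joining character accordingly.
    if base_url.endswith(("?", "&")):
        sep = ""
    elif "?" in base_url:
        sep = "&"
    else:
        sep = "?"
    query = urlencode({"RA": ra_deg, "DEC": dec_deg, "SR": radius_deg})
    return base_url + sep + query


# Hypothetical service: a 0.1 degree cone around RA=180, DEC=-30.5.
url = cone_search_url("http://example.org/cone", 180.0, -30.5, 0.1)
```

The service is expected to answer with a VOTable of matching sources, which is what lets portals such as DataScope query many registered services in the same way.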
Cool Services: Integrating with the VO • Register your service at a registry • Integrate support for standard VO formats, schemas • FITS and VOTable • Enable integration with existing tools & visualizers • Standard Data Model schemas (emerging) • VOResource, Space-time Coordinates, Spectra • Enable integration with other services using these models • Implement Standard Support Interface • a standard in development for: Self-description, tracking health and usage
Searching the Registry • Use a searchable registry to find data and services • NVO has two searchable registries available: • STScI/JHU Registry: http://nvo.stsci.edu/voregistry/ • Caltech Carnivore: http://mercury.cacr.caltech.edu:8080/carnivore/ • Two types of searches: • Simple keyword-based search • Advanced search • STScI/JHU: SQL-based • Caltech: XQuery-based • Currently working on user-oriented improvements to interactive interface G. Greene & W. O’Mullane @ STScI • Help with advanced searches • Improved organization of returned results
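The two advanced-search back ends differ mainly in how a constraint on a metadata item is translated. The sketch below shows one way a single "contains" constraint on a simplified path could be rendered for each. The path names, the table and column names, and the `//Resource` root are assumptions chosen for the example; they do not describe the actual STScI/JHU or Carnivore implementations.

```python
from typing import Tuple

# Hypothetical mapping from simplified metadata paths to table columns.
COLUMN_FOR_PATH = {
    "title": "title",
    "curation/publisher": "publisher",
    "content/subject": "subject",
}


def to_sql(path: str, value: str) -> Tuple[str, Tuple[str, ...]]:
    """Render the constraint as a parameterized SQL query."""
    column = COLUMN_FOR_PATH[path]  # KeyError for unknown paths
    # The value travels as a bound parameter, never inside the SQL text.
    return (f"SELECT xml FROM resources WHERE {column} LIKE ?",
            (f"%{value}%",))


def to_xquery(path: str, value: str) -> str:
    """Render the same constraint as an XQuery path expression."""
    if path not in COLUMN_FOR_PATH:
        raise KeyError(path)
    # Escape characters that are special inside an XQuery string literal.
    escaped = value.replace("&", "&amp;").replace('"', '""')
    return f'//Resource[contains({path}, "{escaped}")]'


sql, params = to_sql("curation/publisher", "NCSA")
xq = to_xquery("title", "quasar")
```

Restricting paths to a known table keeps user input out of the generated query text, which matters for either back end.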
Accessing the Registry from Applications • Custom Web Service Interfaces available • keyword and advanced search functions • Currently used by DataScope and SkyPortal • IVOA Standard Web Service interface • Full support targeted for January 2005 roll-out • Beta support available from Caltech Carnivore • Available Java client software • Currently available via NVO Summer School software distribution • Zip file: http://chart.stsci.edu/twiki/bin/view/Main/Software • HowTos: http://chart.stsci.edu/twiki/bin/view/Main/NVOSummerSchoolCourseNotes • Includes: • Client library for IVOA Standard search interface • Sample client code for both custom and standard interfaces
Curation Issues • NVO Registries now contain over 3000 records • Lots of problematic metadata: • Missing information, incorrect usage, truncated values • Duplicates, deprecated records, missing resources • Broken/non-compliant services • People need to assume responsibility for curation • Software can help, but is not sufficient • Role of Registry administrator?
A practical approach to Curation • Proposal: “VerificationLevel” tag attached to resource descriptions by a registry curator • 3 levels: • Unverified • Verified by software • Verified by human curator • Tag exposed to users/apps: e.g. select only highly verified resources • Tag is specific to a registry; can be overridden when harvested by another registry. • Software verification • NCSA: building a suite of software verifiers • Can be incorporated directly into registries, either locally or by calling a remote web service • First example: Cone Search Verifier http://nvo.ncsa.uiuc.edu/services/csvalidate.html
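The proposed tag could behave roughly as follows. The three levels come from the proposal above; the numeric values, the dictionary-based records, and the function names are assumptions for illustration only, since the proposal does not fix an encoding.

```python
from enum import IntEnum
from typing import Dict, List


class VerificationLevel(IntEnum):
    """The three proposed levels; the numeric values are assumed."""
    UNVERIFIED = 0
    SOFTWARE = 1   # verified by software
    HUMAN = 2      # verified by a human curator


def select_verified(records: List[Dict], minimum: VerificationLevel) -> List[Dict]:
    """Let a user or application keep only sufficiently verified records."""
    return [r for r in records if r["verification"] >= minimum]


def harvest_record(record: Dict,
                   local_level: VerificationLevel = VerificationLevel.UNVERIFIED) -> Dict:
    """Copy a harvested record, replacing the source registry's tag.

    The tag is specific to a registry, so the harvesting registry
    assigns its own level instead of trusting the incoming one.
    """
    copied = dict(record)
    copied["verification"] = local_level
    return copied


# Hypothetical records held by one registry.
records = [
    {"id": "ivo://example/a", "verification": VerificationLevel.HUMAN},
    {"id": "ivo://example/b", "verification": VerificationLevel.SOFTWARE},
    {"id": "ivo://example/c", "verification": VerificationLevel.UNVERIFIED},
]
trusted = select_verified(records, VerificationLevel.SOFTWARE)
harvested = harvest_record(records[0])
```

Ordering the levels makes "select only highly verified resources" a simple threshold comparison, and resetting the tag on harvest keeps each registry responsible for its own curation claims.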
Summary 2004 • NVO is operating production registries • serving end-user applications • greater emphasis on user interfaces • registry searches easily integrated into applications • Full release of latest improvements by January 2005 • Interoperable exchange between IVOA registries • Extensible Resource Metadata model • IVOA Registry Interface Standard is emerging • What’s next: shift from development to curation • Finalize RI standard • Address curation issues • No talk on registries next year