US NATIONAL VIRTUAL OBSERVATORY Data Access Layer Doug Tody (NRAO) NVO Summer School, Aspen 9-Sep-2005
Data Access Layer • What does it do? • Provides access to data • data discovery • mediation to a standard model • data retrieval • on-demand data generation • server-side computation (subsetting, filtering) • What is it for? • Supports client data analysis • distributed, multiwavelength • How does it work? • Object (dataset) oriented • catalog, image, spectrum, time series, SED, etc. • Services • cone search (also SkyNode), SIA, SSA
Cone Search
Cone Search • Provides basic catalog access • Query by position and aperture (cone in space) • Query consists of base-URL (service endpoint) plus parameters • e.g., http://base-url?RA=12.0&DEC=0.0&SR=1.0 • Catalog returned as a VOTable • Advantages • Simple but powerful, provides standard interface • Easy to implement and use • Limitations • Catalog metadata is not defined • No data model support • Future • Supplanted by basic SkyNode (Greene, Saturday) • Supports metadata discovery, SQL-like syntactical queries • However, we will continue to support the basic cone search query!
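As a minimal sketch of the "base-URL plus parameters" idea, the Python snippet below assembles a cone search query from a service endpoint and the RA, DEC, and SR parameters named on the slide. The function name `cone_search_url` and the endpoint `http://example.org/cone` are hypothetical; a real endpoint would come from a registry.

```python
from urllib.parse import urlencode


def cone_search_url(base_url: str, ra: float, dec: float, sr: float) -> str:
    """Build a cone search query URL (RA, DEC, SR in decimal degrees)."""
    if not -90.0 <= dec <= 90.0:
        raise ValueError("DEC must lie in [-90, 90] degrees")
    if sr < 0:
        raise ValueError("SR must be non-negative")
    # Choose the separator so the parameters extend any existing query string.
    if base_url.endswith(("?", "&")):
        sep = ""
    elif "?" in base_url:
        sep = "&"
    else:
        sep = "?"
    return base_url + sep + urlencode({"RA": ra, "DEC": dec, "SR": sr})


# Hypothetical endpoint, used only for illustration.
print(cone_search_url("http://example.org/cone", 12.0, 0.0, 1.0))
```

The response to such a request would be a VOTable; fetching and parsing it is left out here.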
Simple Image Access
Simple Image Access (SIA) • Basic Usage, Highest Level • Client queries Registry to find interesting services • Each service is queried (in turn or simultaneously) for data • Client collates and analyzes results • Selected datasets are retrieved
Simple Image Access (SIA) • Basic Usage, Single Service • Query • find data of interest from a single service • http://base-url?POS=12.0,0.0&SIZE=0.2&FORMAT=image/fits • Query response • VOTable, one row per candidate dataset • "access reference" (a URL) points to data • Data selection • Performed by the client using query response metadata • Dataset retrieval • Retrieve actual datasets, if any
Service Capabilities • Types of Services • Atlas: precomputed survey image (entire image) • Pointed: image from pointed observation (entire image) • Cutout: cutout of existing image (pixels unchanged) • Mosaic: reprojected image (pixels resampled) • Virtual Data • Data model mediation • Subsetting, filtering, etc. on the fly • Possible to view same data in different ways • Interface • RESTful interface currently (HTTP GET) • Document oriented (VOTable, FITS, JPEG, etc.)
Data Model • SIA data model is the familiar "astronomical image" • Generally this means a 2D sky projection • Data array is logically a regular grid of pixels • Encoded as a FITS image, GIF/JPEG, etc. • Standardized dataset metadata • Provenance • Image geometry • Scale • Format • Position, WCS • Time of observation • Spectral bandpass • Access information
Input Parameters • Required parameters • POS: center of ROI (RA, Dec in decimal degrees, ICRS) • SIZE: width; or width, height • FORMAT: ALL, GRAPHIC, image/fits, image/jpeg, text/html, … • Optional parameters • INTERSECT: covers, enclosed, center, or overlaps • VERB: table verbosity • Service-defined parameters • used to further refine queries, but not yet standardized • e.g., BAND, SURVEY, etc. • Image generation parameters • NAXIS, CFRAME, EQUINOX, CRPIX, CRVAL, CDELT, ROTANG, PROJ • used for cutout/mosaic services to specify image to be generated
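The parameter list above can be sketched as a small query builder. This is an illustration, not a reference client: the function name `sia_query_url` and the endpoint are hypothetical, INTERSECT is checked case-insensitively as an assumption, and service-defined parameters such as BAND are passed through unvalidated because the slide notes they are not yet standardized.

```python
from urllib.parse import urlencode

# INTERSECT values as listed on the slide.
INTERSECT_VALUES = {"covers", "enclosed", "center", "overlaps"}


def sia_query_url(base_url, ra, dec, size, fmt="image/fits",
                  intersect=None, **service_defined):
    """Build an SIA query URL.

    size is a single width or a (width, height) pair, in degrees.
    service_defined holds non-standard parameters such as BAND or SURVEY.
    """
    if isinstance(size, (int, float)):
        size = (size,)
    params = {
        "POS": f"{ra},{dec}",
        "SIZE": ",".join(str(s) for s in size),
        "FORMAT": fmt,
    }
    if intersect is not None:
        # Assumption: the value is matched case-insensitively.
        if intersect.lower() not in INTERSECT_VALUES:
            raise ValueError(f"unknown INTERSECT value: {intersect!r}")
        params["INTERSECT"] = intersect
    params.update(service_defined)
    sep = "&" if "?" in base_url else "?"
    # Keep ',' and '/' unescaped so the URL reads like the slide's example.
    return base_url + sep + urlencode(params, safe=",/")


# Hypothetical endpoint, used only for illustration.
print(sia_query_url("http://example.org/sia", 12.0, 0.0, 0.2))
```

Image generation parameters (NAXIS, CRVAL, and so on) could be passed the same way as the service-defined ones.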
Query Response • Output is a VOTable • Must contain a RESOURCE element with type="results", containing the results of the query. • The ‘results’ resource contains a single table • Each row of the table describes a single data object which can be retrieved. • The fields of the table describe the attributes of the dataset • These are the attributes of the SIA data model • In SIA 1.0, the UCD is used to identify the data model attribute • e.g., POS_EQ_RA_MAIN, VOX:Image_Scale, etc.
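To make the "fields identified by UCD" idea concrete, here is a rough sketch that reads a query response with Python's standard XML parser and keys each row by UCD. The sample document is invented for illustration, omits the XML namespace a real VOTable normally declares, and assumes the results resource is marked with `type="results"`; the helper name `rows_by_ucd` is hypothetical.

```python
import xml.etree.ElementTree as ET

# Invented, namespace-free sample response (illustration only).
SAMPLE = """<VOTABLE version="1.0">
  <RESOURCE type="results">
    <TABLE>
      <FIELD name="title" ucd="VOX:Image_Title" datatype="char" arraysize="*"/>
      <FIELD name="ra" ucd="POS_EQ_RA_MAIN" datatype="double"/>
      <FIELD name="dec" ucd="POS_EQ_DEC_MAIN" datatype="double"/>
      <DATA><TABLEDATA>
        <TR><TD>Field A</TD><TD>12.01</TD><TD>0.02</TD></TR>
        <TR><TD>Field B</TD><TD>11.98</TD><TD>-0.01</TD></TR>
      </TABLEDATA></DATA>
    </TABLE>
  </RESOURCE>
</VOTABLE>"""


def rows_by_ucd(votable_xml: str) -> list:
    """Return one dict per table row, keyed by the UCD of each FIELD."""
    root = ET.fromstring(votable_xml)
    resource = root.find("RESOURCE[@type='results']")
    if resource is None:
        raise ValueError("no RESOURCE element with type='results'")
    table = resource.find("TABLE")
    if table is None:
        raise ValueError("results resource contains no TABLE")
    ucds = [field.get("ucd") for field in table.findall("FIELD")]
    rows = []
    for tr in table.iter("TR"):
        cells = [td.text for td in tr.findall("TD")]
        rows.append(dict(zip(ucds, cells)))
    return rows


for row in rows_by_ucd(SAMPLE):
    print(row["VOX:Image_Title"], row["POS_EQ_RA_MAIN"], row["POS_EQ_DEC_MAIN"])
```

Looking values up by UCD rather than by column name is what lets a client treat responses from different services uniformly. Cell values are left as strings here.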
Query Response • Image metadata • Describes the image object (required) • Coordinate system metadata • Image WCS • Spectral bandpass metadata • Prototype data model describing spectral bandpass of image • Processing metadata • Tells whether the service modified the image data • Access metadata • Tells client how to access the dataset (required) • Resource-specific metadata • Additional optional service-defined metadata describing image
Image Metadata • VOX:Image_Title: Brief description of image • POS_EQ_RA_MAIN: RA (ICRS) • POS_EQ_DEC_MAIN: Dec (ICRS) • INST_ID: Instrument name • VOX:Image_MJDateObs: MJD of observation • VOX:Image_Naxes: Number of image axes • VOX:Image_Naxis: Length of each axis • VOX:Image_Scale: Image scale, deg/pix • VOX:Image_Format: Image file format
Image Retrieval • Completely optional • Typically only a fraction of the available images are retrieved • Query response • If an access reference is provided, the data can be retrieved • SIAP can also be used to describe data which is not online • The same data may be available in multiple formats • Image retrieval • Very simple; access reference is a URL • Standard tools can be used to fetch the data • (browser, wget, curl, i/o library, etc.) • Data is often computed on-the-fly • All retrieval is synchronous (currently) • No provision for restricting access (currently)
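Because the access reference is just a URL, retrieval with an I/O library can be sketched in a few lines. The helper name `retrieve` is hypothetical, and the demonstration uses a local `file://` URL and a made-up payload so that it runs without a network; a real access reference would come from the query response.

```python
import shutil
import tempfile
import urllib.request
from pathlib import Path


def retrieve(access_reference: str, dest: Path) -> int:
    """Synchronously fetch the dataset behind an access reference URL.

    Returns the number of bytes written to dest.
    """
    with urllib.request.urlopen(access_reference) as resp, open(dest, "wb") as out:
        shutil.copyfileobj(resp, out)
    return dest.stat().st_size


# Stand-in for a served image: a local file addressed by a file:// URL.
payload = b"SIMPLE  =                    T"
with tempfile.TemporaryDirectory() as tmp:
    source = Path(tmp) / "image.fits"
    source.write_bytes(payload)
    copy = Path(tmp) / "copy.fits"
    n_bytes = retrieve(source.as_uri(), copy)
    fetched = copy.read_bytes()

print(n_bytes, "bytes retrieved")
```

The call blocks until the data arrives, which matches the synchronous retrieval described on the slide.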
Service Registration
Future Development • SIA V1.1 • Based on work done on SSA • Expanded query interface • no longer limited to positional queries • Much richer query response • generic dataset identification, characterization, etc. • metadata extension mechanism • Selected features • VOTable 1.1 with UCD 1+, GROUP, UTYPE • query response can be ordered by "score" • logical groupings of related query records • compression support • Versioning • required to make protocol upgrades manageable
Future Development • Service verification • for testing at development time • when registered; level of compliance metric • Grid capabilities • Data staging • asynchronous image generation (long running jobs) • batch generation of images (multiple images) • Data management • support for single sign-on authentication, authorization • network data caching, third party delivery (VOStore etc.) • Web service interface • resource metadata • service availability (etc.) • ADQL integration • Capability to use query language for queries
Simple Spectral Access
Simple Spectral Access (SSA) • What is it? • Provides access to 1D spectra, time series, SEDs • Tabular spectrophotometric data (photometry points) • Represents second generation, data model-based DAL interfaces • Status • Draft V0.9 query interface reviewed in Kyoto (May 05) • Revisions in progress; draft PR targeted for Madrid (Oct 05) • Much work done on data models; however, these are still being revised • Some initial prototypes already exist (services, client apps) • IVOA/Madrid discussions will be held immediately after ADASS and are open to all
Basic Usage • SSA specification may be complex, but basic usage is simple • Simple query • POS, SIZE, FORMAT - like cone search, SIA • Possibly refined by spectral or time bandpass, etc. • Most metadata in query response is optional • Data retrieval • Simple retrieval is again URL-based • Get back a dataset "document" (VOTable, FITS, JPEG, etc.) • In simplest case could be wavelength, flux as text (for Spectrum) • Pass-through of external data is permitted • Data Analysis • Standard data model isolates application from quirks of external project data
Concepts - Dataset-oriented • Data object type • Spectrum, TimeSeries, SED • Dataset creation type • Atlas: Whole datasets, uniform survey data • Pointed: Whole datasets, variable instrumental data • Cutout: Subset, data samples are not modified • Resampled: Subset, data samples computed by service • Dataset derivation • Observed: An observation • Composite: Combination of several observations • Simulated: Simulated observation made from real data • Synthetic: Data from a theoretical model
Data Models • Data models used in SSA • Spectral data: Spectrum, TimeSeries, SED • Dataset: Generic dataset descriptor • Target: Astronomical target observed • Curation: Origin of data • Characterization: Physical characteristics of data • Provenance: Instrument which generated the data • User defined data models • Metadata extension mechanisms • additional data model attributes (table fields) • additional resources in VOTable, linked back to main table • Provide a mechanism to "subclass" dataset to tailor it for a given data collection
Spectral Data (SED) [figure; labels: photometry point, spectrum segment]
Spectral/SED Data Model
Query Interface • Mandatory query parameters • POS: RA, DEC (ICRS) • SIZE: diameter (decimal degrees) • TIME: date1,date2 (epoch in decimal years UTC) • BAND: wave1,wave2 (meters in vacuum; source or observer) • FORMAT: VOTable, fits, xml, text, graphics, html, external
Query Interface • Recommended query parameters • APERTURE: approx spatial resolution (decimal degrees) • SPECRES: spectral resolution (meters) • TOP: number of top-ranked records to return • OBJTYPE: mandatory if service returns multiple object types • COLLECTION: data collection identifier
Query Interface • Optional parameters • CREATORID: creator-assigned dataset identifier (at most 1) • PUBID: publisher-assigned dataset identifier (at most N) • COMPRESS: enable compression (for both data _and_ queries?) • SNR: signal-to-noise ratio • REDSHIFT: redshift range (dlambda/lambda) • TARGETCLASS: star, galaxy, pulsar, PN, QSO, AGN, etc.
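Since BAND is expressed in meters while wavelengths are often quoted in nanometers, a sketch of an SSA query that does the unit conversion may help. It follows the draft V0.9 parameter list on these slides, which was still under revision, so the parameter set is an assumption; the function name `ssa_query_url`, the nanometer input convention, and the endpoint are hypothetical.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

METERS_PER_NM = 1e-9


def ssa_query_url(base_url, ra, dec, size, years, band_nm,
                  fmt="votable", **more):
    """Build an SSA query URL from the draft's mandatory parameters.

    years   -- (date1, date2) in decimal years
    band_nm -- (wave1, wave2) in nanometers, converted to meters for BAND
    more    -- recommended/optional parameters, e.g. top=10 or snr=5
    """
    lo, hi = sorted(band_nm)
    params = {
        "POS": f"{ra},{dec}",
        "SIZE": size,
        "TIME": f"{years[0]},{years[1]}",
        # Six significant digits avoids floating-point noise in the URL.
        "BAND": f"{lo * METERS_PER_NM:.6g},{hi * METERS_PER_NM:.6g}",
        "FORMAT": fmt,
    }
    params.update({name.upper(): value for name, value in more.items()})
    sep = "&" if "?" in base_url else "?"
    return base_url + sep + urlencode(params, safe=",")


# Hypothetical endpoint; optical band 400-700 nm, epochs 2000.0-2005.5.
url = ssa_query_url("http://example.org/ssa", 12.0, 0.0, 0.2,
                    (2000.0, 2005.5), (400, 700), top=10)
print(url)
print(parse_qs(urlsplit(url).query)["BAND"])
```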
Query Response • Classes of query metadata • Query metadata: Describes the query itself • Dataset metadata: Describes data object; object-specific • Target metadata: Astronomical target • Curation metadata: External identification of dataset • Characterization: Coverage, Accuracy, Frame, etc. • Instrument metadata: Service-defined; hard to standardize • Access metadata: Describes how to access the dataset
Query Response • Query Metadata • Query.Score: How well object matches query • Query.LName: Logical name (identifier) • Query.LNameKey: Logical name key (id-ref) • Example: LName="MyObj123" LNameKey="server,format"
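One way a client might use these fields is to keep the best-scoring record for each logical name and then rank the survivors. The sketch below assumes that records sharing a Query.LName describe the same logical dataset (differing, for example, in server or format); that reading of LName, the helper name, and the sample records are all illustrative assumptions.

```python
def best_per_logical_name(records):
    """Keep the highest-scoring record per Query.LName, best first."""
    best = {}
    for rec in records:
        name = rec["Query.LName"]
        if name not in best or rec["Query.Score"] > best[name]["Query.Score"]:
            best[name] = rec
    return sorted(best.values(), key=lambda r: r["Query.Score"], reverse=True)


# Invented records: two renditions of one dataset plus a second dataset.
records = [
    {"Query.LName": "MyObj123", "Query.Score": 0.7, "Access.Format": "image/jpeg"},
    {"Query.LName": "MyObj123", "Query.Score": 0.9, "Access.Format": "image/fits"},
    {"Query.LName": "MyObj456", "Query.Score": 0.8, "Access.Format": "image/fits"},
]

for rec in best_per_logical_name(records):
    print(rec["Query.LName"], rec["Query.Score"], rec["Access.Format"])
```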
Query Response • Dataset Metadata • Dataset.Type: Spectrum, TimeSeries, SED, etc. • Dataset.DataModel: DM name, e.g., "SSA-V0.90" • Dataset.Title: Brief descriptive title of dataset • Dataset.SSA.NSamples: Total samples in dataset • Dataset.SSA.Aperture: Characteristic aperture diameter • Dataset.SSA.TimeAxis: TimeCoord axis (external data) • Dataset.SSA.SpectralAxis: SpectralCoord axis (external data) • Dataset.SSA.FluxAxis: Flux axis (external data) • Dataset.CreationType: atlas, pointed, cutout, resampled • Dataset.Derivation: observed, composite, simulated, synthetic
Query Response • Target Metadata • Target.Name: Name of astronomical object • Target.Class: Target class (star, galaxy, QSO, etc.) • Target.SpectralClass: Spectral class (e.g., 'O', 'B', etc.) • Target.Redshift: Nominal redshift for object • Derived.VarAmpl: Variability amplitude (fraction 0-1) • Derived.SNR: Observed signal to noise ratio
Query Response • Curation Metadata • Curation.Collection: Data collection name (identifier) • Curation.Creator: Creator identity (identifier) • Curation.CreatorID: Creator-assigned dataset identifier • Curation.PublisherID: Publisher-assigned dataset identifier • Curation.Date: Dataset creation date (ISO date string) • Curation.Version: Dataset version (within same ID)
Query Response • Characterization 1 - Coverage • .Location.Spatial: Position (e.g., RA, DEC) • .Location.Time: Observation time characteristic value • .Location.Spectral: Spectral bandpass characteristic value • .Location.Spectral.BandID: Bandpass ID (band or filter name) • .Bounds.Spatial: Aperture footprint (polygon on sky) • .Bounds.Time: Low/High time values • .Bounds.Spectral: Low/High spectral values • .Bounds.Flux: Limiting flux, saturation limit (Jansky) • .Fill.Spatial: Spatial sampling filling factor (0-1) • .Fill.Time: Time sampling filling factor (0-1) • .Fill.Spectral: Spectral sampling filling factor (0-1)
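Coverage bounds lend themselves to simple client-side selection. As a sketch, the snippet below keeps only the records whose low/high spectral bounds overlap a requested band. The dictionary key `"Bounds.Spectral"`, the tuple representation of the bounds, and the sample values are hypothetical stand-ins for whatever a real response would carry.

```python
def intervals_overlap(a, b):
    """True if closed intervals a and b (each a pair of numbers) intersect."""
    (a_lo, a_hi), (b_lo, b_hi) = sorted(a), sorted(b)
    return a_lo <= b_hi and b_lo <= a_hi


def select_by_band(records, band):
    """Keep records whose spectral bounds overlap band (same units, e.g. meters)."""
    return [r for r in records if intervals_overlap(band, r["Bounds.Spectral"])]


# Invented records with low/high spectral bounds in meters.
records = [
    {"title": "optical spectrum", "Bounds.Spectral": (3.8e-7, 7.5e-7)},
    {"title": "near-IR spectrum", "Bounds.Spectral": (1.0e-6, 2.5e-6)},
]

for rec in select_by_band(records, (4e-7, 7e-7)):
    print(rec["title"])
```

The same interval test would apply to the time bounds.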
Query Response • Characterization 2 - Accuracy • Accuracy.*.Calibrated: uncalibrated, relative, absolute • Accuracy.*.Resolution: Resolution of measured signal • Accuracy.*.StatErr: Statistical error (measured) • Accuracy.*.SysErr: Systematic error (estimated) • ('*' = Spatial, Time, Spectral, Flux)
Query Response • Characterization 3 - Reference Frames • Frame.Spatial.Type: Coordinate frame (default ICRS) • Frame.Spatial.Equinox: Coordinate system equinox (J2000) • Frame.Time.System: Timescale (TT) • Frame.Time.SIDim: SI factor and dimension • Frame.Spectral.SIDim: SI factor and dimension • Frame.Flux.SIDim: SI factor and dimension • Frame.Flux.UCD: UCD of flux value (flux type) • (These apply only to the query response) • (SIDim metadata still under construction)
Query Response • Instrument Metadata • Instrument.Name: Instrument name (identifier) • Instrument.Exposure: Total exposure time (seconds) • Instrument.<other>: Service-defined • Notes • Optional; provided for instrumental data collections • In general, Collection, Bounds.Time, etc. are preferred • In general Instrument metadata is service-defined • Use Observation model as a starting point
Query Response • Access Metadata • Access.Reference: Data access URL • Access.Format: MIME type of returned dataset • Access.Size: Approximate dataset size (bytes) • Access.Server: Server endpoint URL • Staging support goes here in the future • e.g., will dataset access require asynchronous staging • estimated cost to construct dataset
Service Metadata • Usage • Describe service type and capabilities • Characterize service (data resources served, coverage, etc.) • Describe interface (optional query parameters) • Interface • Requires new service metadata query method • Returns resource metadata descriptor (XML) • Format • Registry resource descriptor (XML)
Data Retrieval • Based on GET as with SIA • Variety of formats available • Compression supported • Data representation • Data model defines logical content of data • The same data object may be represented in various formats • Hence we need to specify both the data model, and the file format
Data Retrieval • Data models • SSA data model for fully-compliant data • Provider-defined data model for external data • Data formats • VOTable (a container), native XML (direct serialization) • FITS binary table (another container; uses FITS spectral WCS) • Text, e.g., CSV • Graphics (JPEG etc.) • text/html (rendered into browser page)