INTERNATIONAL VIRTUAL OBSERVATORY ALLIANCE
IVOA Data Access Layer
Table Access Protocol Analysis
Doug Tody (NRAO/NVO)
IVOA Interop, Cambridge UK, 2007
TAP Context
• Architecture
  • Cross-match portal/application
  • Table Access Protocol
  • ADQL specification
  • VOSpace, UWS, SSO, etc.
• Role of TAP
  • Direct access to table data at a single site
  • Support for higher level distributed queries
  • Broader future role in DAL (complex data etc.)
Primary TAP Use-Cases
• Complex/Large Table Query
  • Large query (async, vospace, authentication)
  • Multi-table operations (join etc.)
  • Multi-region queries (table upload)
  • Advanced ADQL/SQL capabilities
  Required to support cross-match portal, advanced apps
  Full functionality is required
• Simple Table Query
  • Filter-type operation upon a single table
  • Most basic astronomical catalog access is of this type
  • ADQL, async useful but not required for simple queries
  Probably sufficient for most small data providers
  Cone search is not enough
Primary TAP Use-Cases
• Table Metadata Query
  • Metadata describing stored data is also data
    • can be virtual, subsetted, transformed, etc.
  • Client application queries TAP service for available data
    • tables, table columns, relationships, etc.
    • service metadata (capabilities etc.) is a separate issue
  Basic metadata model should be simple
  Extensibility required for advanced query support
• Data Access Query
  • This is "ADQL integration into DAL"
  • ADQL query against a DAL data model (complex data etc.)
Key Requirements/Issues
• ADQL and Grid capabilities
  • Motivation
    • Required for portals and advanced applications
    • Needs ADQL, multi-region, async, vospace, sso, etc.
  • Issues or Options
    • Not controversial: everyone agrees we need this
    • Not required however for basic usage
    • Complex; will take time to prototype, specify, standardize
    • Unrealistic to expect community implementation w/o frameworks
Key Requirements/Issues
• Simple Query capability
  • Motivation
    • Provide simple basic table access capability
    • Needed anyway for simple table metadata queries
    • Adequate for most simple filter-type queries of single table
    • Supplants cone search; much more powerful but still simple
    • Provide robust implementation while we develop advanced stuff
  • Issues or Options
    • Some want to make ADQL mandatory
      • "all data should be at data centers"
    • Options: legacy cone search plus ADQL-TAP
      • but cone search is too limited
Key Requirements/Issues
• TAP Information Schema
  • Motivation
    • Provide uniform access to both table data and metadata
    • Same query/access interface used for both
    • Supports virtual data, dynamic queries, format options, etc.
    • Easily extended without changing interface
    • Don't do one thing now, another later
  • Issues or Options
    • Need to specify/agree upon minimal core metadata
    • Strategy: Adopt registry table model with minor changes
    • Other options: VOTable with no data, literal registry XML
Key Requirements/Issues
• Proposed Core TAP/Registry Table Schema
  • Table
    • name          [[catalog.]schema.]table
    • type          base table, view, output, etc.
    • description   table description
  • Column
    • name          column name
    • tableName     table name
    • description   column description
    • unit          unit in VO standard format
    • ucd           UCD if any
    • utype         UTYPE if any
    • dataType      dataType as in VOTable/registry
    • arrayShape    array "shape"/size as in VOTable/registry
    • std           standard column (else custom addition)
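The proposed core schema can be written down as two small record types. The following is only a sketch: the class names `TableMeta` and `ColumnMeta`, the helper `columns_of`, the default values, and the example catalog `foo` are assumptions made for illustration; only the field names come from the slide.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class TableMeta:
    name: str              # [[catalog.]schema.]table
    type: str              # "base table", "view", "output", etc.
    description: str = ""  # table description


@dataclass
class ColumnMeta:
    name: str                         # column name
    tableName: str                    # name of the owning table
    description: str = ""             # column description
    unit: Optional[str] = None        # unit in VO standard format
    ucd: Optional[str] = None         # UCD if any
    utype: Optional[str] = None       # UTYPE if any
    dataType: str = "char"            # as in VOTable/registry (default is an assumption)
    arrayShape: Optional[str] = None  # array shape/size as in VOTable/registry
    std: bool = False                 # True for a standard column, False for a custom one


def columns_of(table: TableMeta, columns: List[ColumnMeta]) -> List[ColumnMeta]:
    """Return the column records whose tableName matches the given table."""
    return [c for c in columns if c.tableName == table.name]


# Hypothetical catalog, used only for illustration.
foo = TableMeta(name="foo", type="base table", description="example source catalog")
columns = [
    ColumnMeta("ra", "foo", "right ascension", unit="deg", ucd="pos.eq.ra", dataType="double"),
    ColumnMeta("dec", "foo", "declination", unit="deg", ucd="pos.eq.dec", dataType="double"),
    ColumnMeta("mag", "bar", "magnitude", unit="mag", dataType="float"),
]
print([c.name for c in columns_of(foo, columns)])
```

The Column record points back to its table by name (`tableName`), so both record types can be stored and queried as ordinary flat tables.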
TAP Design Study
• History
  • Based upon work done by ESAC/VOQL-TEG and DAL WG in spring 2007
  • Also NVO tiger team, SkyNode experience, data center experience
• TAP Design Goals
  • Provide capability for ADQL queries to support advanced analysis
  • Define minimal implementation
    • for small data provider, common queries
    • replace legacy cone search with more general facility
  • Both data access and metadata access supported natively by service
  • Provide for scalability, in particular multi-position queries
  • Support Grid capabilities, i.e., async, staging, authentication
  • TAP should be consistent with other DAL interfaces where possible
  • Provide registry integration for automated service discovery
TAP Interface Summary
• Form of interface
  • HTTP GET/POST based (other protocols possible, e.g. SOAP, CEA)
  • Multiple output formats (VOTable, CSV/TSV, XML, VOSpace, etc.)
• Operations
  • AdqlQuery         ADQL-based queries, full functionality
  • SimpleQuery       Simple data queries, metadata queries
  • GetCapabilities   Return metadata describing the service
  • GetAvailability   Monitor runtime service function and health
AdqlQuery Operation
• Scope and Form of Interface
  • General capability for ADQL-based queries
  • Both GET and POST versions are required
    • GET is synchronous, idempotent, simple, RESTful
    • POST required for async, staging, large queries
  • Semantics, e.g., parameters, identical for both versions
  • ADQL query is URL-encoded so use in GET is not a problem
• Parameters
  • QUERY       The query string (ADQL; URL-encoded)
  • FORMAT      Output data format (VOTable, CSV, XML, etc.)
  • <staging>   Only used in POST version; for VOSpace
  • <async>     Only used in POST version; for driving UWS
  • MAXREC      Maximum records in the output table
  • RUNID       Pass-through; used for logging
  (others TBD)
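A GET-form AdqlQuery request could be assembled roughly as follows. This is a sketch under stated assumptions: the service URL, the `REQUEST=AdqlQuery` selector (by analogy with the `REQUEST=SimpleQuery` examples elsewhere in this deck), the `votable` format value, and the table `foo` are all hypothetical. Only the parameter names QUERY, FORMAT, MAXREC and RUNID are taken from the slide.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

BASE_URL = "http://example.org/tap"  # hypothetical service endpoint


def adql_query_url(query, fmt="votable", maxrec=None, runid=None):
    """Build a synchronous GET URL for an ADQL query.

    urlencode() percent-encodes the ADQL text, so spaces, commas and
    comparison operators are safe to carry in the query string.
    """
    params = {"REQUEST": "AdqlQuery", "QUERY": query, "FORMAT": fmt}
    if maxrec is not None:
        params["MAXREC"] = maxrec   # cap on records in the output table
    if runid is not None:
        params["RUNID"] = runid     # pass-through tag, used for logging
    return BASE_URL + "?" + urlencode(params)


adql = "SELECT ra, dec FROM foo WHERE flux > 5"
url = adql_query_url(adql, maxrec=100, runid="job-1")
print(url)

# Decoding the URL gives back the original ADQL text unchanged.
decoded = parse_qs(urlsplit(url).query)
print(decoded["QUERY"][0])
```

Because the parameters are meant to be identical in both forms, the same dictionary could equally be sent as the body of a POST request.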
AdqlQuery Operation
• Field Names, UTYPE and UCD
  • Suggest this be done at level of field rather than by operation
  • Literal field names directly access database table
  • A UTYPE reference resolves into a literal table field name
    • e.g., "ssa:Target.Name" resolves to table field "TargetName"
  • UCD (in this context) is a special case of UTYPE ("ucd:")
• Field name resolution
  • Both literal and UTYPE/UCD field names resolve to table field
  • All queries evaluated equivalently after field name resolution
  • Data models, at the level of TAP, involve only mappings
  • UFI can automate this, or it can be done client side
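Field name resolution as described above amounts to a lookup in a mapping. A minimal sketch, assuming that a colon in the name marks a UTYPE or UCD reference (the slide does not state how the two kinds of name are told apart) and that the mapping is supplied per table. The `ucd:pos.eq.ra` entry and the function name `resolve_field` are hypothetical; the `ssa:Target.Name` entry is the slide's own example.

```python
# Mapping from UTYPE/UCD references to literal table field names.
FIELD_MAP = {
    "ssa:Target.Name": "TargetName",  # example given on the slide
    "ucd:pos.eq.ra": "ra",            # hypothetical UCD reference
}


def resolve_field(name, mapping):
    """Resolve a field reference to a literal table field name.

    Names containing a colon are treated as UTYPE/UCD references
    (an assumption) and looked up; anything else is already literal.
    """
    if ":" in name:
        try:
            return mapping[name]
        except KeyError:
            raise ValueError("no table field mapped to " + repr(name)) from None
    return name


requested = ["ssa:Target.Name", "ucd:pos.eq.ra", "flux"]
print([resolve_field(n, FIELD_MAP) for n in requested])
```

After this step the query contains only literal field names, so the rest of the evaluation does not need to know which form the client used.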
AdqlQuery Operation
• Multi-Position Queries
  • AKA multi-cone search; but doesn't have to be limited to position
  • Common use-case involves user source list with thousands of positions
  • Required for scalability to reduce operation overhead
• How It Works
  • Uses ADQL, REGION, POST form of operation
  • VOTable used to upload source table (ID, POS, SIZE, etc.)
    • other fields are passed through to output
    • output is tagged by source ID
    • can be generalized to any input parameter, not just position
  • POST (e.g., multipart/form-data) used to upload params, VOTable
  • Parameters are common to both GET and POST forms
• Data Scoping
  • Query, Local (DBMS), and VOSpace (Net) tables are equivalent
  • A table uploaded via POST is a Query-space table
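The intended result of a multi-position query can be illustrated in memory, leaving out the VOTable upload and the ADQL/REGION machinery. This is only a sketch of the semantics: the function names, the dictionary layout, and the sample rows are hypothetical, and treating SIZE as the diameter of a circular region in degrees is an assumption, since the slide does not define it.

```python
import math


def angular_sep_deg(ra1, dec1, ra2, dec2):
    """Great-circle separation in degrees (haversine formula)."""
    r1, d1, r2, d2 = (math.radians(v) for v in (ra1, dec1, ra2, dec2))
    a = (math.sin((d2 - d1) / 2) ** 2
         + math.cos(d1) * math.cos(d2) * math.sin((r2 - r1) / 2) ** 2)
    return math.degrees(2 * math.asin(min(1.0, math.sqrt(a))))


def multi_position_query(sources, catalog):
    """Match every uploaded source position against the catalog.

    Each output row is tagged with the source ID; extra source fields
    are passed through unchanged. SIZE is taken as a diameter (assumption).
    """
    out = []
    for src in sources:
        extra = {k: v for k, v in src.items() if k not in ("id", "ra", "dec", "size")}
        for row in catalog:
            sep = angular_sep_deg(src["ra"], src["dec"], row["ra"], row["dec"])
            if sep <= src["size"] / 2:
                out.append({"source_id": src["id"], **extra, **row})
    return out


# Hypothetical uploaded source list and catalog rows.
sources = [
    {"id": "A", "ra": 180.0, "dec": 12.5, "size": 0.2, "note": "target 1"},
    {"id": "B", "ra": 10.0, "dec": -5.0, "size": 0.2, "note": "target 2"},
]
catalog = [
    {"name": "x1", "ra": 180.05, "dec": 12.5},
    {"name": "x2", "ra": 10.0, "dec": -5.05},
    {"name": "x3", "ra": 50.0, "dec": 50.0},
]
matches = multi_position_query(sources, catalog)
print([(m["source_id"], m["name"]) for m in matches])
```

A real service would evaluate the whole source list in a single operation rather than one request per position, which is the scalability argument made above.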
SimpleQuery Operation
• Scope and Form of Interface
  • Provides capability for simple non-ADQL queries
  • Used for both data queries and metadata queries (like ADQL/SQL)
  • Only a synchronous GET version is required
  • Only a single table is queried at a time
• Motivation
  • Simple to implement, easy to use
  • >90% of actual catalog queries are simple filters of a single table
  • We need something like this anyway for simple metadata queries
    • but why limit it to only metadata?
  • Small data providers publish a few simple catalogs
  • Simpler to implement, likely to be more robust implementation
SimpleQuery Operation
• Parameters
  • SELECT     Table fields to be returned (default all)
  • FROM       The table (or view) to be accessed
  • WHERE      A filter to be applied to the table (default none)
  • POS,SIZE   Find data only in this spatial region
  • FORMAT     Output data format
  • MAXREC     Maximum records out
  • RUNID      Pass-through for logging (etc.)
• Provides
  • Simplified SQL-lite query (90/10 rule)
  • Both data and metadata queries
  • Simple cone search capability
SimpleQuery Operation
• Metadata Queries
  • Information Schema concept
    • great concept; definition/implementation imperfect
    • but it is a standard, widely (but not completely) implemented
  • Concept
    • represent database/table metadata as data tables (views)
    • allows use of standard data table interface to query metadata
    • easily extensible without changing service interface
    • views can be used for things such as registry view
  • Examples
    • FROM=SCHEMA.tables
    • FROM=SCHEMA.columns&WHERE=tableName,foo
    • FROM=SCHEMA.columns&WHERE=tableName,foo&FORMAT=xml
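The idea of serving metadata through the ordinary table interface can be tried out with an in-memory SQLite database. This is a rough sketch, not the proposed implementation: the backing table names `schema_tables` and `schema_columns`, the sample rows, and the reading of `WHERE=tableName,foo` as an equality test are assumptions drawn from the examples above.

```python
import sqlite3

# Public metadata table names mapped to (hypothetical) backing tables.
TABLE_MAP = {"SCHEMA.tables": "schema_tables", "SCHEMA.columns": "schema_columns"}
FIELDS = {
    "schema_tables": ("name", "type", "description"),
    "schema_columns": ("name", "tableName", "description", "unit", "ucd"),
}


def make_db():
    """Create an in-memory database holding metadata as plain tables."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE schema_tables (name TEXT, type TEXT, description TEXT)")
    db.execute("CREATE TABLE schema_columns "
               "(name TEXT, tableName TEXT, description TEXT, unit TEXT, ucd TEXT)")
    db.executemany("INSERT INTO schema_tables VALUES (?, ?, ?)", [
        ("foo", "base table", "hypothetical catalog"),
        ("bar", "view", "hypothetical view"),
    ])
    db.executemany("INSERT INTO schema_columns VALUES (?, ?, ?, ?, ?)", [
        ("ra", "foo", "right ascension", "deg", "pos.eq.ra"),
        ("dec", "foo", "declination", "deg", "pos.eq.dec"),
        ("mag", "bar", "magnitude", "mag", None),
    ])
    return db


def simple_query(db, params):
    """Evaluate FROM and an optional 'field,value' WHERE as an equality filter."""
    table = TABLE_MAP[params["FROM"]]       # whitelist lookup, no raw SQL from the client
    sql, args = "SELECT * FROM " + table, []
    if "WHERE" in params:
        field, value = params["WHERE"].split(",", 1)
        if field not in FIELDS[table]:
            raise ValueError("unknown field " + repr(field))
        sql += ' WHERE "' + field + '" = ?'
        args.append(value)
    return db.execute(sql + " ORDER BY rowid", args).fetchall()


db = make_db()
print(simple_query(db, {"FROM": "SCHEMA.tables"}))
print(simple_query(db, {"FROM": "SCHEMA.columns", "WHERE": "tableName,foo"}))
```

The same `simple_query` function would serve data tables as well once they are added to the whitelist, which is the point of using one interface for both.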
Simple Cone Search
• Approach
  • Integrate into SimpleQuery to allow additional constraints
    • would probably be too ambitious in a separate SCS standard
  • Re-use common DAL position syntax (POS, SIZE)
    • extensible in terms of region type and spatial frame
  • UTYPE/UCD field syntax allows data models to be used
  • Table to be queried is specified with FROM
  • ADQL,REGION provides an advanced alternative with common semantics
• Examples
  • REQUEST=SimpleQuery&FROM=foo&POS=180.0,12.5&SIZE=0.2
  • REQUEST=SimpleQuery&FROM=foo&POS=180.0,12.5&SIZE=0.2&WHERE=flux,5/
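On the service side, the two example requests could be decoded into a small structure before any filtering is done. A sketch only, assuming POS is a comma-separated "ra,dec" pair in decimal degrees and SIZE is a positive angle in degrees. The names `ConeRequest` and `parse_simple_cone` are hypothetical, and the WHERE value is kept as an opaque string because its syntax is not defined in these slides.

```python
from dataclasses import dataclass
from typing import Optional
from urllib.parse import parse_qs


@dataclass
class ConeRequest:
    table: str                   # value of FROM
    ra: float                    # first component of POS, degrees (assumption)
    dec: float                   # second component of POS, degrees (assumption)
    size: float                  # value of SIZE, degrees (assumption)
    where: Optional[str] = None  # extra constraint, left unparsed


def parse_simple_cone(query_string: str) -> ConeRequest:
    """Decode a SimpleQuery cone-search query string and validate its ranges."""
    params = {k: v[0] for k, v in parse_qs(query_string).items()}
    if params.get("REQUEST") != "SimpleQuery":
        raise ValueError("not a SimpleQuery request")
    ra, dec = (float(x) for x in params["POS"].split(","))
    size = float(params["SIZE"])
    if not (0.0 <= ra <= 360.0 and -90.0 <= dec <= 90.0 and size > 0.0):
        raise ValueError("POS or SIZE out of range")
    return ConeRequest(params["FROM"], ra, dec, size, params.get("WHERE"))


plain = parse_simple_cone("REQUEST=SimpleQuery&FROM=foo&POS=180.0,12.5&SIZE=0.2")
filtered = parse_simple_cone(
    "REQUEST=SimpleQuery&FROM=foo&POS=180.0,12.5&SIZE=0.2&WHERE=flux,5/")
print(plain)
print(filtered.where)
```

Keeping the positional part and the WHERE part separate mirrors the approach above: the cone constraint is just one more filter applied to the table named by FROM.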
Minimal TAP Service
• Requirements
  • Implements SimpleQuery operation
    • possibly GetCapabilities and GetAvailability as well?
  • Provides basic data query capability
  • Provides basic metadata query capability (tables, columns)
  • No ADQL support required (but may use SQL back end)
  • No UTYPE support required