160 likes | 288 Views
T HE US N ATIONAL V IRTUAL O BSERVATORY. Astronomical Dataset Query Language (ADQL). Ray Plante. Intersections and Goals. Users Will primarily use portal interfaces to query catalogs in pursuit of science Goals: Create original ADQL queries using the OpenSkyPortal
E N D
THE US NATIONAL VIRTUAL OBSERVATORY Astronomical Dataset Query Language (ADQL) Ray Plante NVO Summer School 2006 - Aspen
Intersections and Goals • Users • Will primarily use portal interfaces to query catalogs in pursuit of science Goals: • Create original ADQL queries using the OpenSkyPortal • Cross-correlate two catalogs using the XMatch function • “Scripters” and Data Providers • Will use client libraries to query catalogs • In pursuit of science • requires more queries and/or repetition than is practical to do by hand. • Will use a toolkit to deploy a service that uses ADQL (e.g. SkyNode) Goals: • Understand the role of ADQL/x in supporting query web services • See how ADQL will be used with other services beyond SkyNodes • Developers • Will develop new services and applications that use ADQL • Will extend existing code that uses ADQL Goals: • Use library to convert between ADQL/s & ADQL/x • Understand how ADQL can be adapted to new local query languages NVO Summer School 2006 - Aspen
What is ADQL? • ADQL = subset of SQL92 + astronomy extensions • Originally developed as part of SkyNode interface • Focuses on read-access via SELECT • Later versions allow creating tables to hold intermediate results • astronomy: regions and cross-matching • ADQL has two forms: • ADQL/s – string form • Looks like SQL • Intended as user’s view of ADQL • ADQL/x – XML form • SQL “parse tree” • Format used in messages passed to search services • Easier to parse and transform into local query NVO Summer School 2006 - Aspen
Caution: Construction ahead • ADQL is undergoing rapid change as a standard • OpenSkyPortal and existing SkyNodes use ADQL v0.7.4 • This talk will focus on this version • IVOA is currently considering v1.05 • Discussion is underway to simplify ADQL in the context of a Table Access Protocol • Bringing a core syntax in compliance with SQL92 • De-emphasizing the role of ADQL/x by supporting ADQL/s directly on the wire • Sharpening the definitions of cross-match and region functions NVO Summer School 2006 - Aspen
Why ADQL/x? • If ADQL is standard SQL (plus) and we are querying SQL databases, why do we need an XML form? • Query Transformation is commonly necessary • Few databases are 100% compliant with the SQL standard. • Transform to local SQL dialect • Semantic filtering possible (transforming metadata). • Easier to adapt to non-relational databases • E.g. XML database • A pre-parsed form on the wire makes transformations easier to implement • ADQL/x on the wire • Shifts parsing problem to the client – ACCESS BARRIER! • Experience • Minor transformations can often be handled via SQL string manipulation • More careful adherence to SQL92 would eliminate most common difference between native SQLs (TOP, functions) • The emergence of parser/conversion tools make choice of wire format less important NVO Summer School 2006 - Aspen
Why ADQL/x? If ADQL is standard SQL (plus) and we are querying SQL databases, why do we need an XML form? Query Transformation is commonly necessary • Few databases are 100% compliant with the SQL standard. • Transform to local SQL dialect • Semantic filtering possible (transforming metadata). • Easier to adapt to non-relational databases • E.g. XML database NVO Summer School 2006 - Aspen
What is ADQL? SELECT o.objId, o.ra, o.r, o.type, t.objId FROM SDSS:PhotoPrimary o, TWOMASS:PhotoPrimary t WHERE XMATCH(o,t,3.5,1) AND Region('Circle J2000 181.3 -0.76 6.5') AND o.type=3 NVO Summer School 2006 - Aspen
ADQL for UsersRestrictions on SQL92 • Only SELECT statements allowed • v1.0: SELECT INTO for saving results in remote storage • All tables in FROM must define an alias FROM PhotoPrimary o • JOIN keywords not supported • Traditional inner joins supported • Outer joins not supported • IN keyword not supported p.id IN (23872871, 2309823, 1512342) NVO Summer School 2006 - Aspen
ADQL for UsersAstronomical Extension: Regions • Does a position fall within a region of sky? • Uses Space-time Coordinate metadata schema • Also an emerging IVOA standard • Has string and XML formats • Allows for a several region shapes, coordinate systems • Current SkyNode implementations only support CIRCLE J2000 WHERE REGION('CIRCLE J2000 ra dec radius') • ra dec = right ascension, declination in degrees • radius = angular radius of cone, in degrees NVO Summer School 2006 - Aspen
ADQL for UsersJoining Tables • Implied Inner Joins • WHERE constraint where key in one table is set to key in other table p.objId = s.objId • This is only practical for joining tables from the same catalog/node • ADQL/SkyQuery was motivated by the need to cross-correlate distributed tables by position • XMatch: extension for joining based on matching positions. • Current practice: a statistical analysis of the likelihood that two positions are the same NVO Summer School 2006 - Aspen
ADQL for UsersAstronomical Extension: XMatch • Catalog/Node designation: • Tables in FROM class are preceeded by node name FROM SDSS:PhotoPrimary s, TWOMASS:PhotoPrimary t • XMatch function • Arguments: tables to cross-match • Referred to by the table aliases • Returns: chi-squared confidence that two objects are the same XMATCH(s, t) < 3.5 • given sources in each table are the same with a confidence of better than 3.5 sigmas • Can include negative detection XMATCH(s, t, !o) < 3.5 • Detected in tables s and t but not in o NVO Summer School 2006 - Aspen
ADQL for Users/Data ProvidersThe XMatch Algorithm • ADQL does not set cross-match algorithm • Requires positional information and positional errors • SkyNode recommends chi-square minimization • x, y, z is the unit vector position of the most likely position of an object • xn, yn, zn is the measured position in a survey • αn is a weighting based on the positional error NVO Summer School 2006 - Aspen
ADQL for Users/Data ProvidersUsing XMatch • Limiting the cross-matched candidates • In principle, analysis can be carried out on every possible combination of detections • Techniques for limiting cross-matches • User provides a limited region constraint • User provides other filtering constraints (e.g. color) • Implementation can apply internal X2 cut-off based on given overall limit • Invoking XMatch causes additional info to be returned • Most likely position, the original positions • Χ2, Minimized Chi-squared • SkyNodes: intermediate values in statistical calculations • A single detection’s position can appear in multiple cross-matched records NVO Summer School 2006 - Aspen
ADQL for ScriptersAccessing ADQL Services • OpenSkyPortal web service takes ADQL/s • querying SkyNodes is straight-forward through portal service • Querying an individual SkyNode service • Requires use of ADQL/x • Client-side support is not great but improving • As a user, would like to give ADQL/s • Use convertADQL to convert ADQL/s to ADQL/x on the fly NVO Summer School 2006 - Aspen
ADQL for ScriptersOther uses of ADQL • Searching Registries • Registry Interface v1.0 for this year • Specify only WHERE clause • Use simple XPaths for column names Ex: WHERE content/description LIKE '%quasars%' • Future generations of SIAP, SSAP • Replace simple keyword=value query • More fine-grained control over search criteria NVO Summer School 2006 - Aspen
ADQL for DevelopersADQLlib • Parses and converts between string & XML versions • Can be incorporated into client code to provide support for ADQL/s • Java clients can use Java API directly • Other clients can use the convertADQL command-line tool • ADQL/x -> ADQL/s • Multiple transformation implementations, can be supported • Default implementation uses XSLT • Can be configured to convert from ADQL/x directly to a RDBMS’ local variant of SQL NVO Summer School 2006 - Aspen