THE US NATIONAL VIRTUAL OBSERVATORY
How to Navigate VO Datasets Using VO Protocols
Ray Plante (NCSA/UIUC), Thomas McGlynn and Eric Winter (NASA/GSFC)
ADASS 2003

Summary
• Data Discovery: Using the VO Registry
• Data Recovery:
  • Protocols
  • VOTable
  • UCDs

Example Using the Registry

    use SOAP::Lite;                 # Load the Perl SOAP library (must be installed).

    my $soap = SOAP::Lite           # Locate the SOAP service
        -> uri('http://www.us-vo.org')
        -> on_action( sub { join '/', 'http://www.us-vo.org', $_[1] } )
        -> proxy('http://sdssdbs1.stsci.edu/nvo/registry/Registry.asmx');

    my $method = SOAP::Data->name('QueryRegistry')    # The method to invoke
        ->attr({xmlns => 'http://www.us-vo.org'});
    ...
    # Specify the parameters of the method call
    my @params = (SOAP::Data->name("predicate" =>
        "ServiceType LIKE 'SIAP%' and ContentLevel='Research'") );

    my $result = $soap->call($method => @params);     # Query the remote service.
    ...
    # Loop over the results
    foreach ($result->valueof('//SimpleResource')) {
        ...
        if ($$_{ServiceType} eq "SIAP/Cutout") {
            ...   # Handle a cutout service
        } elsif (($$_{ServiceType} eq "SIAP/Archive") && ($$_{Title} eq $cxc)) {
            ...   # Handle an Archive service
        } else {
            ...   # Default
        }
    }

Registry usage issues
• Straightforward, fast and flexible access.
• Using the Registry as a Web service.
  • SOAP must be installed for the environment to be used.
  • The interface is not yet standardized, so details of the specific implementation are exposed.
  • SQL-style query (i.e., a SQL WHERE clause). The ultimate syntax may use something more like XQuery.
  • Cryptic magic in some calls needs to be done properly (e.g., the on_action argument in the constructor). Users need to copy from working examples.
• Content of the Registry is still in some flux.
  • Detailed and final specification of service metadata.
  • Hierarchical database issues.

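The SQL-style predicate passed to the registry is just a string, so it can be assembled from a list of constraints. The following is a minimal sketch under stated assumptions: `build_predicate` is a hypothetical helper (it is not part of SOAP::Lite or of the registry interface), and the column names are the ones that appear in the example query on the previous slide.

```perl
use strict;
use warnings;

# Hypothetical helper: build a SQL-WHERE-style predicate string from a
# list of [column, operator, value] triples, joined with 'and'.
sub build_predicate {
    my @terms = @_;
    my @clauses;
    foreach my $term (@terms) {
        my ($column, $op, $value) = @$term;
        (my $quoted = $value) =~ s/'/''/g;    # double any embedded single quotes
        push @clauses, "$column $op '$quoted'";
    }
    return join ' and ', @clauses;
}

# Same constraints as the registry example: SIAP services at research level.
my $predicate = build_predicate(
    [ 'ServiceType',  'LIKE', 'SIAP%'    ],
    [ 'ContentLevel', '=',    'Research' ],
);
print "$predicate\n";
```

The resulting string would be passed as the "predicate" parameter of the QueryRegistry call. Since the interface is not yet standardized, the column names and accepted operators may differ between registry implementations.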
By querying the registry, we can easily obtain lists of various kinds of resources and use the associated metadata to organize them however we wish.

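As an illustration of organizing registry results by their metadata, here is a small sketch that groups resources by service type. The records are made-up stand-ins for the hashes returned by the registry query, and only the two keys used in the registry example (ServiceType and Title) are assumed; `group_by_service_type` is a hypothetical helper.

```perl
use strict;
use warnings;

# Made-up records standing in for the hashes returned by
# $result->valueof('//SimpleResource') in the registry example.
my @resources = (
    { ServiceType => 'SIAP/Cutout',  Title => 'Example cutout survey A' },
    { ServiceType => 'SIAP/Archive', Title => 'Example archive B'       },
    { ServiceType => 'SIAP/Cutout',  Title => 'Example cutout survey C' },
);

# Hypothetical helper: collect resource titles under their service type.
sub group_by_service_type {
    my @records = @_;
    my %groups;
    push @{ $groups{ $_->{ServiceType} } }, $_->{Title} for @records;
    return %groups;
}

my %by_type = group_by_service_type(@resources);
foreach my $type (sort keys %by_type) {
    print "$type:\n";
    print "  $_\n" for sort @{ $by_type{$type} };
}
```

Any other metadata field returned by the registry could serve as the grouping key in the same way.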
Protocols
• Cone search provides access to anything that returns a table regarding a position.
  • Object tables: lists of distinct astronomical objects
  • Observation tables: lists of pointed observations
  • No standard link from observation tables to archival data yet, but dataset IDs may provide one.
• SIAP Archives
  • Users get static, often ‘rawish’, data (Chandra, ADIL)
  • May get many images returned from the same dataset (e.g., lots of Chandra images of a given field).
• SIAP Services
  • Users get data customized to their invocation (DPOSS, SkyView)
  • Typically get only one or a few images from a given service, but several different services may be returned by the same SIAP server. E.g., SkyView returns images from many different surveys, but only one of each.
• SIAP retrievals are a two-step process. The SIAP server is in essence a registry service listing the data available at a given location.

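The two-step nature of SIAP retrieval can be sketched as follows: the first query returns a table describing the available images, and the second step fetches the images through the access references in that table. This sketch assumes the query response has already been parsed into one hash per row, keyed by UCD. The rows and example.org URLs are made up, and `access_urls` is a hypothetical helper.

```perl
use strict;
use warnings;

# Made-up rows standing in for a parsed SIAP query response (step 1),
# keyed by the UCDs of the title, format and access-reference columns.
my @rows = (
    { 'VOX:Image_Title'           => 'Field 1',
      'VOX:Image_Format'          => 'image/fits',
      'VOX:Image_AccessReference' => 'http://example.org/get?id=1&fmt=fits' },
    { 'VOX:Image_Title'           => 'Field 1',
      'VOX:Image_Format'          => 'image/jpeg',
      'VOX:Image_AccessReference' => 'http://example.org/get?id=1&fmt=jpeg' },
    { 'VOX:Image_Title'           => 'Field 2',
      'VOX:Image_Format'          => 'image/fits',
      'VOX:Image_AccessReference' => 'http://example.org/get?id=2&fmt=fits' },
);

# Hypothetical helper: pick the access URLs of all rows in a given format.
sub access_urls {
    my ($rows, $format) = @_;
    return map  { $_->{'VOX:Image_AccessReference'} }
           grep { lc($_->{'VOX:Image_Format'} // '') eq lc($format) }
           @$rows;
}

# Step 2 would issue one HTTP GET per URL; here we only list them.
my @fits_urls = access_urls(\@rows, 'image/fits');
print "$_\n" for @fits_urls;
```

Selecting on the format column also deals with a server that lists the same image several times in different formats.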
Cone Search and SIAP Examples

    BEGIN {
        # Force HTTP/1.0 to avoid HTTP/1.1 chunked responses (Perl doesn't like them!)
        $ENV{PERL_LWP_USE_HTTP_10} = 1;
    }
    use LWP::UserAgent;    # Standard Perl libraries from CPAN
    use URI::URL;
    use HTTP::Request;

    ...   # Set up a base URL for the service in $url

    $url .= "POS=$ra,$dec&SIZE=$size";       # SIAP
    # $url .= "RA=$ra&DEC=$dec&SR=$size";    # Cone search

    my $u    = URI::URL->new($url);
    my $req  = HTTP::Request->new("GET", $u);
    my $ua   = LWP::UserAgent->new();
    my $resp = $ua->request($req);

    ...   # Process the response

Protocol Issues
• SIAP and Cone Search are invoked almost identically through the minimal interface.
• Lots of additional capabilities may be available in SIAP, but very few are required to be supported by the server.
• Metadata queries use special forms.
  • SR=0 in a Cone search asks for metadata on the returned table.
  • The FORMAT=METADATA keyword is used to get metadata from SIAP.
• Both return VOTables.
  • In SIAP this describes the available images.
  • SIAP may return multiple entries for the same image in different formats. Links between these are not standardized.

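Because the two protocols differ mainly in their parameter names, the query URLs and the metadata forms can be built by a few small helpers. This is a sketch under stated assumptions: the helper names and the example.org base URLs are hypothetical, and the SIAP metadata query is assumed to need only the FORMAT=METADATA keyword.

```perl
use strict;
use warnings;

# Append key=value pairs to a base URL, adding '?' or '&' as needed.
sub with_params {
    my ($base, @pairs) = @_;
    my $url = $base;
    if    ($url !~ /\?/)     { $url .= '?' }
    elsif ($url !~ /[?&]\z/) { $url .= '&' }
    my @kv;
    while (my ($key, $value) = splice(@pairs, 0, 2)) {
        push @kv, "$key=$value";
    }
    return $url . join('&', @kv);
}

# Cone search: position and search radius in decimal degrees.
sub cone_search_url {
    my ($base, $ra, $dec, $sr) = @_;
    return with_params($base, RA => $ra, DEC => $dec, SR => $sr);
}

# Cone search metadata query: same call with SR=0.
sub cone_metadata_url {
    my ($base, $ra, $dec) = @_;
    return cone_search_url($base, $ra, $dec, 0);
}

# SIAP: position as "ra,dec" plus the size of the region.
sub siap_url {
    my ($base, $ra, $dec, $size) = @_;
    return with_params($base, POS => "$ra,$dec", SIZE => $size);
}

# SIAP metadata query (assumed to need only the FORMAT keyword).
sub siap_metadata_url {
    my ($base) = @_;
    return with_params($base, FORMAT => 'METADATA');
}

print siap_url('http://example.org/siap?', 10.5, -20, 0.25), "\n";
print cone_metadata_url('http://example.org/cone?cat=x', 10.5, -20), "\n";
```

Each URL can then be fetched with the LWP calls shown in the Cone Search and SIAP example.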
Reading VOTables

    use VOTable::Document;    # VOTable library.
    ...
    my $doc    = VOTable::Document->new_from_string($xstring);
    my @votarr = $doc->get_votable();
    my $vot    = $votarr[0];
    my @resarr = $vot->get_resource();

    foreach my $res (@resarr) {         # Loop over the resources in the VOTable
        my @tabarr = $res->get_table();
        foreach my $tab (@tabarr) {     # Loop over the tables within the resource
            my $nRow = 0;
            my $data = $tab->get_data();
            if ($data) {
                $nRow = $data->get_num_rows();
            }
            # Find RA/Dec columns
            my $ra  = $tab->get_field_position_by_ucd("POS_EQ_RA_MAIN");
            my $dec = $tab->get_field_position_by_ucd("POS_EQ_DEC_MAIN");
            my @fields = $tab->get_field();

            for (my $i = 0; $i < $nRow; $i += 1) {             # Loop over the rows within the table
                my @rowdata = $tab->get_row($i);
                for (my $j = 0; $j <= $#rowdata; $j += 1) {    # Loop over the columns within the row
                    my $element = $rowdata[$j];
                    ...   # This is the row $i, column $j element in the table.
                }
            }
        }
    }

VOTable Issues
• VOTables can be complex
  • Most current tables are simple, but the ID attribute may be useful for complex VOTables.
  • Need to handle arrays of resources and tables.
  • Formats of SIAP and Cone search results are better constrained.
• Streaming versus trees
  • Most libraries support one paradigm easily and the other with some difficulty. Trees are easier but run into limits handling > 10^5 rows.
• UCDs versus column names
  • Protocols refer to UCDs, but particular applications may require specific columns.
  • Support for aggregate quantities (e.g., ra,dec->position) is likely in updates.

ClassX Correlation

The ClassX cross-correlator uses small XML files to describe what VOTable-enabled services to query, what fields to extract, and how to combine information from multiple tables. With consistently defined protocols and output formats, only these small control files need to be changed to correlate tables from VizieR, the HEASARC and many other sites.

[Figure: ClassX correlator data flow, with markers A to D. Labels: Remote catalogs; Single query results; Target Data; Correlator; Results (VOTable). Control-file annotations: defines the services to be queried or the UCDs we are interested in; what fields are needed in the results; join criteria and output filter.]

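To make the idea of a join criterion concrete, here is a minimal sketch of a positional cross-match between two tables. It is not the ClassX implementation: the haversine separation, the brute-force pairing, the helper names and the sample rows are all assumptions chosen for illustration, with positions taken to be in decimal degrees.

```perl
use strict;
use warnings;
use constant DEG2RAD => atan2(1, 1) / 45;    # pi / 180

# Angular separation in degrees between two (ra, dec) positions,
# computed with the haversine formula.
sub ang_sep_deg {
    my ($ra1, $dec1, $ra2, $dec2) = map { $_ * DEG2RAD } @_;
    my $sd = sin(($dec2 - $dec1) / 2);
    my $sr = sin(($ra2 - $ra1) / 2);
    my $h  = $sd * $sd + cos($dec1) * cos($dec2) * $sr * $sr;
    $h = 1 if $h > 1;                        # guard against rounding
    return 2 * atan2(sqrt($h), sqrt(1 - $h)) / DEG2RAD;
}

# Brute-force join: every pair of rows closer than $radius_deg.
sub cross_match {
    my ($left, $right, $radius_deg) = @_;
    my @pairs;
    foreach my $l (@$left) {
        foreach my $r (@$right) {
            my $sep = ang_sep_deg($l->{ra}, $l->{dec}, $r->{ra}, $r->{dec});
            push @pairs, [ $l, $r, $sep ] if $sep <= $radius_deg;
        }
    }
    return @pairs;
}

# Made-up sample rows; in practice these would come from two VOTables.
my @left  = ( { name => 'l1', ra => 10.0,    dec => 20.0 } );
my @right = ( { name => 'r1', ra => 10.0001, dec => 20.0 },
              { name => 'r2', ra => 50.0,    dec => 20.0 } );

my @matches = cross_match(\@left, \@right, 0.001);
printf "%s matches %s (%.6f deg)\n", $_->[0]{name}, $_->[1]{name}, $_->[2]
    for @matches;
```

The brute-force loop is quadratic in table size, so it only suits small tables. A real correlator would index one of the tables by position first.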
UCDs
• The SIAP and Cone search protocols require that columns with certain UCDs be present.
  • Position
  • Links to the actual data file and format for SIAP
  • These UCDs are pretty much the only thing you are guaranteed to get in the output.
• UCDs may indicate appropriate candidates for cross-correlation.
• UCD structure is likely to change in the near term.
  • Modifiers like ‘main’, ‘error’
  • UCDs for aggregate quantities
• Use UCDs for column discovery (i.e., when the structure of the returned table is unknown); use column names for column queries.

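Column discovery by UCD can be sketched without any VOTable library by treating the table's field descriptions as a list of name/UCD pairs. The field list below is made up, and `field_index_by_ucd` is a hypothetical helper that mirrors the idea of looking up a column position by UCD, as in the VOTable reading example.

```perl
use strict;
use warnings;

# Made-up field descriptions standing in for the FIELD elements of a table.
my @fields = (
    { name => 'RAJ2000', ucd => 'POS_EQ_RA_MAIN'  },
    { name => 'DEJ2000', ucd => 'POS_EQ_DEC_MAIN' },
    { name => 'Vmag',    ucd => 'PHOT_JHN_V'      },
);

# Hypothetical helper: index of the first field carrying the given UCD,
# or undef if no field has it.
sub field_index_by_ucd {
    my ($fields, $ucd) = @_;
    foreach my $i (0 .. $#$fields) {
        my $field_ucd = $fields->[$i]{ucd};
        return $i if defined $field_ucd && lc($field_ucd) eq lc($ucd);
    }
    return undef;
}

# Discover the position columns by UCD, then refer to them by name.
my $ra_idx  = field_index_by_ucd(\@fields, 'POS_EQ_RA_MAIN');
my $dec_idx = field_index_by_ucd(\@fields, 'POS_EQ_DEC_MAIN');
print "RA column:  $fields[$ra_idx]{name}\n";
print "Dec column: $fields[$dec_idx]{name}\n";
```

Once the column names are known, later queries against the same service can use them directly.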
Summary
• Use registries to find resources
  • Example: http://sdssdbs1.stsci.edu/nvo/registry/Registry.asmx
• Use standard protocols to query resources
  • Cone search: http://us-vo.org/metadata/conesearch/index.html
  • SIAP: http://www.aoc.nrao.edu/~dtody/sim.html
• Descend the hierarchical structure of the VOTable
  • VOTable specification: http://us-vo.org/VOTable/VOTable-1-0.htm
  • Libraries:
    • Perl: http://heasarc.gsfc.nasa.gov/classx/pub/votable/dist/VOTable.tar.gz
    • Java: http://us-vo.org/VOTable/JAVOT/JAVOT.zip
    • C/C++: http://vo.iucaa.ernet.in/~voi/cplusparser_stream.htm
• Use UCDs to find columns of interest.
  • UCD info and tools: http://cdsweb.u-strasbg.fr/UCD/