
Experience Building The World Wide Telescope aka: The Virtual Observatory

Jim Gray, Alex Szalay


Presentation Transcript


  1. Experience Building The World Wide Telescope aka: The Virtual Observatory • Jim Gray • Alex Szalay

  2. The Evolution of Science • Observational Science • Scientist gathers data by direct observation • Scientist analyzes data • Analytical Science • Scientist builds analytical model • Makes predictions • Computational Science • Simulates analytical model • Validates model and makes predictions • Data Exploration Science • Data captured by instruments or generated by simulator • Processed by software • Placed in a database / files • Scientist analyzes database / files

  3. Information Avalanche • In science, industry, government, ... • better observational instruments and • better simulations are producing a data avalanche • Examples • BaBar: grows 1 TB/day (2/3 simulation information, 1/3 observational information) • CERN: LHC will generate 1 GB/s, ~10 PB/y • VLBA (NRAO) generates 1 GB/s today • Pixar: 100 TB/movie • New emphasis on informatics: • Capturing, Organizing, Summarizing, Analyzing, Visualizing • [Images: courtesy C. Meneveau & A. Szalay @ JHU; BaBar, Stanford; P&E Gene Sequencer, from http://www.genome.uci.edu/; Space Telescope]

  4. World Wide Telescope / Virtual Observatory http://www.ivoa.net/ • Premise: Most data is (or could be) online • The Internet is the world’s best telescope: • It has data on every part of the sky • In every measured spectral band: optical, x-ray, radio, ... • As deep as the best instruments (2 years ago). • It is up when you are up. The “seeing” is always great (no working at night, no clouds, no moons, no ...). • It’s a smart telescope: links objects and data to literature on them.

  5. The WWT Components • Data Sources • Literature • Archives • Unified Definitions • Units, • Semantics/Concepts/Metrics, Representations, • Provenance • Object model • Classes and methods • Portals

  6. Data Sources • Literature online and cross indexed • Simbad, ADS, NED: http://simbad.u-strasbg.fr/Simbad, http://adswww.harvard.edu/, http://nedwww.ipac.caltech.edu/ • Many curated archives online • FIRST, DPOSS, 2MASS, USNO, IRAS, SDSS, VizieR, ... • Typically files with English meta-data and some programs • Groups, Researchers, Amateurs Publish • Datasets online in various formats • Documentation varies • Publications are Ephemeral • Unknown provenance

  7. Unified Definitions • Universal Content Definitions http://vizier.u-strasbg.fr/doc/UCD.htx • Collated all table heads from all the literature • 100,000 terms reduced to ~1,500 • Rough consensus that this is the right thing. • Refinement in progress as people use UCDs • Defines • Units: • gram, radian, second, ... • Semantic Concepts / Metrics • Std error, Chi2 fit, magnitude, flux @ passband, velocity, ...
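
As a rough illustration of what collapsing many table heads into a small shared vocabulary buys, the sketch below maps archive-specific column names onto common descriptors and then finds what two tables have in common. The column names and descriptor strings (`pos.ra`, `phot.mag.r`, and so on) are placeholders invented for this example; they are not taken from the actual UCD list.

```python
# Hypothetical mapping from archive-specific column heads to shared descriptors.
# Every name here is illustrative, not an entry from the real UCD vocabulary.
COLUMN_TO_DESCRIPTOR = {
    "RAJ2000": "pos.ra",
    "ra": "pos.ra",
    "DEJ2000": "pos.dec",
    "dec": "pos.dec",
    "Rmag": "phot.mag.r",
    "r_mag": "phot.mag.r",
}


def describe_columns(columns):
    """Return {column head: descriptor, or None when the head is unknown}."""
    return {col: COLUMN_TO_DESCRIPTOR.get(col) for col in columns}


def common_descriptors(columns_a, columns_b):
    """Descriptors present in both tables, ignoring unmapped columns."""
    a = {d for d in describe_columns(columns_a).values() if d is not None}
    b = {d for d in describe_columns(columns_b).values() if d is not None}
    return a & b


# Two tables that name their position columns differently.
shared = common_descriptors(["RAJ2000", "DEJ2000", "Rmag"], ["ra", "dec", "flux_x"])
```

Once both tables are described in the same terms, a tool can tell that they can be joined on position even though no column head matches literally.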

  8. Provenance • Most data will be derived. • To do science, need to trace derived data back to source. • So programs and inputs must be registered. • Must be able to re-run them. • Example: Space Telescope Calibrated Data • Run on demand • Can specify software version (to get old answers) • Scientific Data Provenance and Curation are largely unsolved problems (some ideas but no science).
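
One way to picture "programs and inputs must be registered" and "must be able to re-run them" is the minimal sketch below. It assumes a hypothetical in-memory registry keyed by program name and version; the names `REGISTRY`, `derive`, `rerun`, and the toy `calibrate` programs are invented for illustration and do not describe how the Space Telescope pipeline or any VO service works.

```python
import hashlib
import json
from dataclasses import dataclass

# Hypothetical registry: (program name, version) -> callable.
REGISTRY = {}


def register(name, version):
    """Decorator that registers one version of a program."""
    def decorator(fn):
        REGISTRY[(name, version)] = fn
        return fn
    return decorator


@dataclass(frozen=True)
class Provenance:
    program: str
    version: str
    inputs: tuple
    output_digest: str


def digest(value):
    """Stable fingerprint of a JSON-serializable result."""
    return hashlib.sha256(json.dumps(value, sort_keys=True).encode()).hexdigest()


def derive(name, version, inputs):
    """Run a registered program and record how the output was produced."""
    output = REGISTRY[(name, version)](*inputs)
    return output, Provenance(name, version, tuple(inputs), digest(output))


def rerun(record):
    """Re-execute from the record; report whether the output is reproduced."""
    output = REGISTRY[(record.program, record.version)](*record.inputs)
    return output, digest(output) == record.output_digest


@register("calibrate", "1.0")
def calibrate_v1(counts, gain):
    return [c * gain for c in counts]


@register("calibrate", "2.0")
def calibrate_v2(counts, gain):
    return [c * gain - 1 for c in counts]  # toy bias correction


old_output, old_record = derive("calibrate", "1.0", ([100, 200], 2))
new_output, _ = derive("calibrate", "2.0", ([100, 200], 2))
replayed, reproduced = rerun(old_record)
```

Because the record names the software version, re-running it returns the old answer even after a newer version of the program has been registered.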

  9. Object Model • General acceptance of XML • Recent acceptance of XML Schema (XSD over DTD) • Wait-and-See about SOAP/WSDL/... • “Web Services are just Corba with angle brackets.” • “FTP is good enough for me.” • Personal opinion: • Web Services are much more than “Corba + <>” • Huge focus on interop • Huge focus on integrated tools • But the community says “Show me!” • Many technologists sold, but not the astronomers

  10. Classes and Methods • First Class: VOTable http://www.us-vo.org/VOTable/VOTable-1-0.htm • Represents an answer set in XML • Defined by an XML Schema (XSD) • Metadata (in terms of UCDs) • Data representation (numbers and text) • First method • Cone Search: Get objects in this cone
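
The cone search idea ("get objects in this cone") can be sketched as a filter on angular distance from a sky position. This is a brute-force illustration over an in-memory list, assuming each object is a dict with `ra` and `dec` in degrees; the function names and the catalog are made up for the example, and a real service would use a spatial index and return a VOTable rather than a Python list.

```python
import math


def angular_separation_deg(ra1, dec1, ra2, dec2):
    """Great-circle separation in degrees (haversine formula)."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    d_ra = ra2 - ra1
    d_dec = dec2 - dec1
    a = (math.sin(d_dec / 2) ** 2
         + math.cos(dec1) * math.cos(dec2) * math.sin(d_ra / 2) ** 2)
    return math.degrees(2 * math.asin(min(1.0, math.sqrt(a))))


def cone_search(catalog, ra, dec, radius_deg):
    """Return the objects within radius_deg of the position (ra, dec)."""
    return [
        obj for obj in catalog
        if angular_separation_deg(ra, dec, obj["ra"], obj["dec"]) <= radius_deg
    ]


# Hypothetical three-object catalog.
catalog = [
    {"name": "a", "ra": 10.00, "dec": 20.0},
    {"name": "b", "ra": 10.05, "dec": 20.0},
    {"name": "c", "ra": 50.00, "dec": -30.0},
]
hits = cone_search(catalog, ra=10.0, dec=20.0, radius_deg=0.1)
```

The haversine form is used here because it stays numerically stable for the small separations typical of cone searches.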

  11. Other Classes • Space-Time class • http://hea-www.harvard.edu/~arots/nvometa/STCdoc.pdf • Image Class (returns pixels) • SdssCutout • Simple Image Access Protocol http://bill.cacr.caltech.edu/cfdocs/usvo-pubs/files/ACF8DE.pdf • HyperAtlas http://bill.cacr.caltech.edu/usvo-pubs/files/hyperatlas.pdf • Spectral • Simple Spectral Access Protocol • 500K spectra available at http://voservices.net/wave • Query Services • ADQL and SkyNode http://skyservice.pha.jhu.edu/develop/vo/adql/ • Registry: • see below

  12. The Registry • UDDI seemed inappropriate • Complex • Irrelevant questions • Relevant questions missing • Evolved Dublin Core • Represent Datasets, Services, Portals • Needs to be machine readable • Federation (DNS model) • Push & Pull: register then harvest • http://www.ivoa.net/twiki/bin/view/IVOA/IvoaResReg

  13. SkyQuery: A Prototype WWT • Started with SDSS data and schema • Imported about 9 other datasets into that spine schema. • Unified them with a portal • Implicit spatial join among the datasets. • All built on Web Services • Pure XML • Pure SOAP • Used .NET toolkit

  14. Demo • SkyServer: • navigator showing cutout web service • List: showing many calls and variant use. • SkyQuery: • Show integration of various archives. • Explain spatial join xMatch operator.
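
To give a feel for what a spatial join between archives does, the sketch below pairs each object in one catalog with its nearest neighbour in another, provided the two lie within a tolerance on the sky. It is a naive nearest-neighbour match written for illustration; it is not SkyQuery's xMatch algorithm, which the slides do not describe, and the catalogs, field names, and tolerance are assumptions.

```python
import math


def separation_arcsec(a, b):
    """Great-circle separation between two objects, in arcseconds."""
    ra1, dec1 = math.radians(a["ra"]), math.radians(a["dec"])
    ra2, dec2 = math.radians(b["ra"]), math.radians(b["dec"])
    h = (math.sin((dec2 - dec1) / 2) ** 2
         + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2) ** 2)
    return math.degrees(2 * math.asin(min(1.0, math.sqrt(h)))) * 3600.0


def cross_match(left, right, tolerance_arcsec):
    """For each left object, keep the closest right object within tolerance."""
    matches = []
    for l_obj in left:
        best = None
        for r_obj in right:
            d = separation_arcsec(l_obj, r_obj)
            if d <= tolerance_arcsec and (best is None or d < best[1]):
                best = (r_obj, d)
        if best is not None:
            matches.append((l_obj["id"], best[0]["id"], best[1]))
    return matches


# Hypothetical extracts from two archives.
optical = [{"id": "s1", "ra": 180.0, "dec": 0.0},
           {"id": "s2", "ra": 181.0, "dec": 1.0}]
infrared = [{"id": "t1", "ra": 180.0002, "dec": 0.0},
            {"id": "t2", "ra": 10.0, "dec": 10.0}]
pairs = cross_match(optical, infrared, tolerance_arcsec=2.0)
```

The nested loop is quadratic in catalog size, which is exactly why a portal would push this work to the archives and use spatial indexing instead.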

  15. MyDB • Portal allows federation of data but… • Intermediate results may be large. • Intermediate results feed into next analysis step. • Sending them back-and-forth to client is costly and sometimes infeasible. • Solution: create a working DB for client at Portal: MyDB

  16. MyDB • Anyone can create a personal DB at SkyServer portal. • It is about 100 MB • It is private • Simple queries done immediately • Complex queries done by batch scheduler • All queries can create/read/write MyDB tables • Very popular with “serious” users. • MyDB will be sharable by a group.
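
The MyDB pattern, keeping intermediate results in a working database next to the data and returning only the small final answer, can be sketched with an in-memory SQLite database standing in for the portal. The table names (`photo_obj`, `mydb_bright`), columns, and values are invented for this example and are not the SkyServer schema.

```python
import sqlite3


def make_portal():
    """Stand-in for the portal's archive: one small table of objects."""
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE photo_obj (obj_id INTEGER PRIMARY KEY, ra REAL, dec REAL, mag_r REAL)"
    )
    con.executemany(
        "INSERT INTO photo_obj VALUES (?, ?, ?, ?)",
        [(1, 10.0, 20.0, 15.0), (2, 10.1, 20.1, 17.0),
         (3, 11.0, 21.0, 21.0), (4, 12.0, 22.0, 22.5)],
    )
    return con


def select_bright(con, mag_limit):
    """Step 1: store the intermediate result in a server-side working table."""
    con.execute("CREATE TABLE mydb_bright (obj_id INTEGER, ra REAL, dec REAL, mag_r REAL)")
    con.execute(
        "INSERT INTO mydb_bright SELECT obj_id, ra, dec, mag_r "
        "FROM photo_obj WHERE mag_r < ?",
        (mag_limit,),
    )


def summarize_bright(con):
    """Step 2: analyze the stored table; only this summary goes to the client."""
    return con.execute("SELECT COUNT(*), AVG(mag_r) FROM mydb_bright").fetchone()


portal = make_portal()
select_bright(portal, mag_limit=18.0)
count, mean_mag = summarize_bright(portal)
```

The intermediate table never leaves the server; the client sees two numbers, however large the first step's result was.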

  17. Open SkyQuery • SkyQuery being adopted by AstroGrid as reference implementation for OGSA-DAI (Open Grid Services Architecture, Data Access and Integration). • SkyNode basic archive object http://www.ivoa.net/twiki/bin/view/IVOA/SkyNode • SkyQuery Language (VoQL) is evolving. http://www.ivoa.net/twiki/bin/view/IVOA/IvoaVOQL

  18. What we learned • Astro is a community of 10,000 • Homogeneous & Cooperative • If you can’t do it for Astro, do not bother with 3M bio-info. • Agreement • Takes time • Takes endless meetings • Big problems are non-technical • Legacy is a big problem. • Plumbing and tools are there, but ... • What is the object model? • What do you want to save? • How to document provenance? • Outline (The WWT Components) • Data Sources • Literature • Archives • Unified Definitions • Units, • Semantics/Concepts/Metrics, Representations, • Provenance • Object model • Classes and methods • Portals • WWT is a poster child for the Data Grid.
