
AstroGrid Datacenters

AstroGrid Consortium Review, Dec 2004. Martin Hill (AstroGrid@ROE).


Presentation Transcript


  1. AstroGrid Datacenters AstroGrid Consortium Review Dec 2004 Martin Hill (AstroGrid@ROE)

  2. Outline • Challenge • Approach • Developed: • Storepoints • Describing data • Query Language • Status • Versioning • Software: Publisher’s AstroGrid Library

  3. Challenge • Large datasets (to Petabytes) • So? • Distributed; science comes from combining • Bandwidth rising slower than • No/few established suitable standards • FITS images/‘tables’. Ambiguous headers. Ambiguous subformat, eg spectra. • VOTable introduced. Ambiguous subformat, eg spectra vs catalogue. Verbose. • No/few established common terms • Involves Scientists…

  4. Approach: ‘Publisher’s AstroGrid Library’ • General solution to: • Discover problems faced, accumulate solutions in software • Experimentally publish sets and types (not host). • Many smaller datasets owned by people without web skills (eg solar) so: • Need ‘easy’/‘unskilled’ installation • Able to proxy; 3rd parties can publish data without requiring more work from owner (eg VizieR, Trace) • ‘Free’ website, range of standard interfaces • Danger: too general (any query against any dataset producing any results).

  5. Existing Solutions • Common task: publish RDBMS to web • Accumulated tools & skill-sets • No combined solution offering: • Standard interface (eg query language) • Scientific values (errors, units) • Spatial querying (common) • VO Metadata for query and results

  6. Developing Standards • Resource metadata • Query language (ADQL/s, ADQL/x) • Web interfaces • Working beyond standards • -> Feeding research to IVOA • Parallel development • In the VO: eg Starlink, NVO, VizieR • External: SRB, Taverna, GridPP monitor • Convergence

  7. Protocols & Interfaces • Human – web pages • SOAP • Toolkit incompatibilities • Streaming awkward (via toolkits) • Longer term benefits? • ‘Raw HTTP post’ (eg servlets, CGI) • Simpler • More existing skills amongst astronomers • Mixed (eg SIAP, SkyNode) • -> Don’t Choose – Implement • Mix & Match, Plug & Play:
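The ‘raw HTTP’ style of interface mentioned above can be illustrated with a cone search request. This is a minimal sketch, not the PAL implementation: it assumes the NVO cone search convention of `RA`, `DEC` and `SR` parameters in decimal degrees, it builds a GET URL rather than a POST body, and the function name and the `example.org` address are hypothetical.

```python
from urllib.parse import urlencode


def cone_search_url(base_url: str, ra_deg: float, dec_deg: float, radius_deg: float) -> str:
    """Build a cone search URL; all angles are in decimal degrees."""
    if not -90.0 <= dec_deg <= 90.0:
        raise ValueError("declination must be within [-90, 90] degrees")
    if radius_deg < 0:
        raise ValueError("search radius must be non-negative")
    # Wrap right ascension into [0, 360) so that -10 and 350 mean the same thing.
    params = {"RA": ra_deg % 360.0, "DEC": dec_deg, "SR": radius_deg}
    # Respect a base URL that already carries a query string.
    separator = "&" if "?" in base_url else "?"
    return base_url + separator + urlencode(params)


if __name__ == "__main__":
    # Hypothetical service address, for illustration only.
    print(cone_search_url("http://example.org/cone", 10.5, -20.0, 0.1))
```

A request like this needs no SOAP toolkit on the client side, which is the ‘simpler’ point the slide makes.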

  8. Releasing • Deploy early – if temporarily • Independent & Integrated Access • Versioning: • Servers & clients, ie new clients can still use old servers, and new servers work with old clients. • Add and ‘deprecate’, don’t change • Delete intelligently • (Quickly remove unused interfaces, eg CEA if CEA upgrades, JSPs) • Need hosts… • Hosts need hardware • Publishers need to know their data
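The ‘add and deprecate, don’t change’ rule can be sketched as follows. The class and method names are hypothetical and are not taken from the AstroGrid code; the point is only that the old entry point stays in place and delegates to the new one, so existing clients keep working while new clients get the extended call.

```python
import warnings


class QueryService:
    """Hypothetical service interface after one round of versioning."""

    def submit_query(self, adql: str, result_format: str = "votable") -> dict:
        """Newer entry point, added alongside the original one."""
        return {"query": adql, "format": result_format, "status": "queued"}

    def ask_query(self, adql: str) -> dict:
        """Original entry point: unchanged signature, now marked as deprecated."""
        warnings.warn(
            "ask_query is deprecated; use submit_query",
            DeprecationWarning,
            stacklevel=2,
        )
        # Delegate so that old and new clients see identical behaviour.
        return self.submit_query(adql)
```

Once monitoring shows that nothing calls the deprecated method any more, it can be removed, which is the ‘delete intelligently’ step.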

  9. Describing Data • Registry ‘Resource’ documents • IVO Tabular Sky Service • Units, UCDs • Solar vs Sky vs… • Images vs Catalogues • Concept extended for ‘RdmsMetadata’ • UCD1+ -> Dictionaries & Ontologies • Relationships (simple: errors) • Queryable • Mirrors vs Copies

  10. Query Language • SQL -> ADQL/xml • Defined common functions – CIRCLE & XMATCH (sky not solar) • Working on: • XQL • Units • Investigating: UCDs instead of columns • Cross-dataset querying
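A common function such as CIRCLE comes down to an angular-separation test on the sky. The slide does not give the ADQL definition, so the following is a sketch under the assumption that a circle is a centre (RA, Dec) plus a radius, all in decimal degrees; the function names are illustrative only.

```python
import math


def angular_separation_deg(ra1: float, dec1: float, ra2: float, dec2: float) -> float:
    """Great-circle separation in degrees, using the haversine formula."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    sin_half_ddec = math.sin((dec2 - dec1) / 2.0)
    sin_half_dra = math.sin((ra2 - ra1) / 2.0)
    a = sin_half_ddec ** 2 + math.cos(dec1) * math.cos(dec2) * sin_half_dra ** 2
    # Clamp against rounding so asin never sees a value above 1.
    return math.degrees(2.0 * math.asin(min(1.0, math.sqrt(a))))


def in_circle(ra: float, dec: float, centre_ra: float, centre_dec: float,
              radius_deg: float) -> bool:
    """True when the position lies within radius_deg of the circle centre."""
    return angular_separation_deg(ra, dec, centre_ra, centre_dec) <= radius_deg
```

Because the test works on the sphere, it handles the wrap at RA = 0/360 without special cases. Solar data use different coordinate systems, which is consistent with the slide's ‘sky not solar’ remark.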

  11. Results • Query+Metadata+RawResults = VoResults • FITS vs VOTable vs HDF vs CSV vs HTML vs… • -> All of them • Results -> queryable data -> inputs
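The idea of bundling query, metadata and raw results, then writing them in whichever format the user asks for, might look like the sketch below. The `VoResults` and `Column` classes here are hypothetical stand-ins, not the AstroGrid classes, and only CSV and HTML writers are shown.

```python
import csv
import html
import io
from dataclasses import dataclass


@dataclass
class Column:
    """Per-column metadata: name, unit and UCD."""
    name: str
    unit: str = ""
    ucd: str = ""


@dataclass
class VoResults:
    """Hypothetical bundle of the query, the column metadata and the raw rows."""
    query: str
    columns: list
    rows: list

    def to_csv(self) -> str:
        buffer = io.StringIO()
        writer = csv.writer(buffer, lineterminator="\n")
        writer.writerow([column.name for column in self.columns])
        writer.writerows(self.rows)
        return buffer.getvalue()

    def to_html(self) -> str:
        head = "".join(f"<th>{html.escape(column.name)}</th>" for column in self.columns)
        body = "".join(
            "<tr>" + "".join(f"<td>{html.escape(str(value))}</td>" for value in row) + "</tr>"
            for row in self.rows
        )
        return f"<table><tr>{head}</tr>{body}</table>"


if __name__ == "__main__":
    results = VoResults(
        query="SELECT ra, dec FROM catalogue",  # illustrative query text
        columns=[Column("ra", "deg", "pos.eq.ra"), Column("dec", "deg", "pos.eq.dec")],
        rows=[(10.5, -20.0)],
    )
    print(results.to_csv())
```

Keeping the metadata with the rows is what lets a result set be fed back in as queryable data, as the last bullet suggests.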

  12. Data Analysis (Clive Page) • Faster -> feasible • < 10^6s OK. 10^8 not… • Joins • Polar coordinate matches (+ HTM, HealPix). • Cross-match algorithms • Distributed queries • Breaking down query • Moving the right data • Combining the results
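To show why the choice of cross-match algorithm matters at these sizes, here is a small sketch of a declination-zone match. It is a simpler stand-in for the HTM or HealPix indexing the slide mentions, not the method used in the talk: the second catalogue is bucketed into zones one match radius high, so each source is compared only against three zones instead of the whole catalogue. All names are hypothetical; positions are (RA, Dec) tuples in degrees.

```python
import math
from collections import defaultdict


def separation_deg(ra1, dec1, ra2, dec2):
    """Great-circle separation in degrees (haversine formula)."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    a = (math.sin((dec2 - dec1) / 2.0) ** 2
         + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2.0) ** 2)
    return math.degrees(2.0 * math.asin(min(1.0, math.sqrt(a))))


def zone_of(dec, zone_height_deg):
    """Index of the declination zone that contains dec."""
    return int(math.floor((dec + 90.0) / zone_height_deg))


def cross_match(cat_a, cat_b, radius_deg):
    """Return (index_a, index_b) pairs whose separation is within radius_deg."""
    if radius_deg <= 0:
        raise ValueError("radius must be positive")
    # Bucket catalogue B by declination zone, one radius high.
    zones = defaultdict(list)
    for j, (_, dec) in enumerate(cat_b):
        zones[zone_of(dec, radius_deg)].append(j)
    matches = []
    for i, (ra, dec) in enumerate(cat_a):
        zone = zone_of(dec, radius_deg)
        # A match differs by at most radius in Dec, so it lies in an adjacent zone.
        for neighbour in (zone - 1, zone, zone + 1):
            for j in zones.get(neighbour, ()):
                if separation_deg(ra, dec, *cat_b[j]) <= radius_deg:
                    matches.append((i, j))
    return matches
```

This sketch only prunes in declination; a real index would also cut in right ascension, and a distributed version would still have to decide which catalogue to move to which site.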

  13. Status • Readily available • Debugging; developer • Debugging; astronomer • Inform User

  14. Storepoints • No data persistence at PALs • Web server machines not data storage ones • Large result sets • No workspace, memory models, etc • -> Streaming outputs • SRB, GridFTP not ready.
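Streaming output means rows go to the storepoint as they arrive instead of being collected on the web server first. A minimal sketch, assuming the storepoint can be treated as a writable text stream; `fake_row_source` is a made-up stand-in for a database cursor and `stream_csv` is a hypothetical name.

```python
import csv
import io
from typing import Iterable, Iterator, Sequence, TextIO


def fake_row_source(count: int) -> Iterator[tuple]:
    """Stand-in for a database cursor: yields one row at a time."""
    for i in range(count):
        yield (i, i * 0.5)


def stream_csv(header: Sequence[str], rows: Iterable[Sequence], target: TextIO) -> int:
    """Write rows to target one by one; return the number of rows written."""
    writer = csv.writer(target, lineterminator="\n")
    writer.writerow(header)
    written = 0
    for row in rows:
        # Only the current row is held in memory at any time.
        writer.writerow(row)
        written += 1
    return written


if __name__ == "__main__":
    # An in-memory buffer plays the part of the storepoint here.
    storepoint = io.StringIO()
    print(stream_csv(["id", "value"], fake_row_source(3), storepoint))
```

With a file, FTP or MySpace stream as the target, memory use on the server stays flat however large the result set is.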

  15. Identifying Storepoints • [Diagram: storepoint concepts. Labels: MySpace, Community, HomeSpace, SRB, FTP, VoSpace (Registered), GridFTP, HTTP] • -> FTP, File, MySpace + extend. • 3rd iteration; 2nd in use

  16. Data Service Architecture • [Diagram. Labels: JSP, SIAP, CEA, Axis, AstroGrid, SkyNode, Plugin Manager, Cone, Datacenter Implementation, Slinger; outputs: /XML/CSV, zip/plain, email/file/ftp/myspace]

  17. Publishers’ AstroGrid Library • ‘Easy to publish to the VO’ • Web Application, includes: • SOAP (AstroGrid, CEA, prepped for SkyNode) • CGI (SIAP, NVO-cone search, SSA) • HTML pages (cone search, query builder, status monitor) • Features • Asynchronous (‘stateful’) & Synchronous Queries • Queues • Comprehensive Status (incl historical) • Variety of results • Fully ‘Streamed’ – no curation issues • Server ‘Plugins’, including: • RDBMS (JDBC) • FITS file collection • eXist (XML) • Helper Tools • Metadata Generators • Ready-made website access
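How server plugins, queues and status reporting fit together can be sketched roughly as below. This is an illustration under assumptions, not the PAL design: the class names, the status strings and the substring ‘query’ are all invented, and the in-memory plugin stands in for an RDBMS or FITS back end.

```python
from abc import ABC, abstractmethod
from collections import deque
from itertools import count


class QuerierPlugin(ABC):
    """Hypothetical plugin contract: each back end implements run()."""

    @abstractmethod
    def run(self, query: str) -> list:
        ...


class InMemoryPlugin(QuerierPlugin):
    """Toy back end: case-insensitive substring match over a list of names."""

    def __init__(self, rows):
        self.rows = rows

    def run(self, query: str) -> list:
        return [row for row in self.rows if query.lower() in str(row).lower()]


class QueryQueue:
    """Asynchronous-style queue: submit now, run later, poll status by job id."""

    def __init__(self, plugin: QuerierPlugin):
        self.plugin = plugin
        self.pending = deque()
        self.status = {}   # job id -> latest status; kept after completion
        self.results = {}
        self._ids = count(1)

    def submit(self, query: str) -> int:
        job_id = next(self._ids)
        self.pending.append((job_id, query))
        self.status[job_id] = "QUEUED"
        return job_id

    def run_next(self):
        """Run the oldest queued job; return its id, or None if the queue is empty."""
        if not self.pending:
            return None
        job_id, query = self.pending.popleft()
        self.status[job_id] = "RUNNING"
        try:
            self.results[job_id] = self.plugin.run(query)
            self.status[job_id] = "COMPLETED"
        except Exception as error:
            # Record the failure so that both developer and astronomer can see it.
            self.status[job_id] = f"ERROR: {error}"
        return job_id
```

A synchronous query is then just `submit` followed straight away by `run_next`, and swapping the back end means passing a different plugin to the queue.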

  18. Situation Now • Installed: • SuperCOSMOS Science Archive (RDBMS) • astrogrid.roe.ac.uk:8080/pal-ssa/ • astrogrid.roe.ac.uk:8080/pal-twomass/ • astrogrid.roe.ac.uk:8080/pal-usnob/ • 6dF – Spectra • grendel12.roe.ac.uk:8080/pal-6df/ • Wide Field Survey • TRACE (FITS files, Solar, under test) • Proxy (bespoke special plugins) • All NVO-cone-compatible DBs (test) • VizieR • Evaluated/ing at: • ESO • RAL (solar) • JBO (Merlin) • Reviewing Query Language, metadata documents, etc

  19. Future • Quality… • Metadata ‘wizards’ • Sell to hosts; deploy to Leicester, JBO, ESO, RAL, The World.... • Explicit and Investigative Queries • Distributed queries & combining results (NVO Exec plans) • Full SIA, SSA interface • More user & admin web pages • Local authorisation
