National Virtual Observatory
The National Virtual Observatory
• National
  • distributed in scope across institutions and agencies
  • available to all astronomers and the public
• Virtual
  • not tied to a single “brick-and-mortar” location
  • supports astronomical “observations” and discoveries via remote access to digital representations of the sky
• Observatory
  • general purpose
  • access to large areas of the sky at multiple wavelengths
  • supports a wide range of astronomical explorations
  • enables discovery via new computational tools
Why Now?
• The past decade has witnessed
  • a thousand-fold increase in computer speed
  • a dramatic decrease in the cost of computing and storage
  • a dramatic increase in access to broadly distributed data
    • large archives at multiple sites and high-speed networks
  • significant increases in detector size and performance
• These form the basis for science of a qualitatively different nature
Trends
• Future dominated by detector improvements
  • Moore’s Law growth in CCD capabilities
  • Gigapixel arrays on the horizon
• Improvements in computing and storage will track growth in data volume
• Investment in software is critical, and growing
Figure: Total area of 3m+ telescopes in the world (in m²) and total number of CCD pixels (in megapixels) as a function of time. Growth over 25 years is a factor of 30 in glass and 3000 in pixels.
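The growth factors quoted for glass and pixels can be turned into doubling times. The sketch below assumes steady exponential growth over the 25 years, which the slide does not state; the function name is illustrative only.

```python
import math

def doubling_time_years(growth_factor: float, span_years: float) -> float:
    """Years needed to double, assuming steady exponential growth."""
    return span_years * math.log(2) / math.log(growth_factor)

# Growth factors over 25 years, as quoted on the slide.
glass = doubling_time_years(30, 25)     # total telescope collecting area
pixels = doubling_time_years(3000, 25)  # total number of CCD pixels

print(f"telescope area doubles roughly every {glass:.1f} years")
print(f"CCD pixel count doubles roughly every {pixels:.1f} years")
```

Under that assumption the pixel count doubles about every two years, which is the sense in which the slide speaks of Moore’s Law growth in CCD capabilities.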
The Discovery Process
Past: observations of small, carefully selected samples of objects in a narrow wavelength band
Future: high quality, homogeneous multi-wavelength data on millions of objects, allowing us to
• discover significant patterns
  • from the analysis of statistically rich and unbiased image/catalog databases
• understand complex astrophysical systems
  • via confrontation between data and large numerical simulations
The discovery process will rely heavily on advanced visualization and statistical analysis tools
NVO Science: Discoveries
• Discoveries of rare objects
  • Searches for exotic new sources that are truly rare, at the level of 1 source in 10 million
• Multi-wavelength identification of large statistical samples of previously rare objects
  • brown dwarfs, high-z quasars, ultra-luminous IR galaxies, etc.
• Efficient cross-identification of “unidentified sources” from new surveys
  • Example: use radio, optical, and IR surveys to identify serendipitous Chandra X-ray sources
• Selection of targets for spectroscopic follow-up
NVO Science: Statistical Surveys
• Homogeneous samples of typical objects
  • Mega-surveys: sample size is no longer a problem
  • Statistical accuracy determined entirely by systematics
  • Multi-wavelength data enables accurate sample selection (evolution, rest-frame selection)
• High Precision Astrophysics of Origins
  • Large scale structure of the universe
  • Galactic structure
  • Galaxy evolution
  • Active galaxies, galaxy clusters, ...
  • Stellar populations
• Leading to New Astronomy
New Astronomy – Different!
• Systematic Data Exploration
  • will have a central role in the New Astronomy
• Digital Archives of the Sky
  • will be the main access to data
• Data “Avalanche”
  • the flood of Terabytes of data is already happening, whether we like it or not!
• Transition to the new
  • may be organized or chaotic
Ongoing Mega-Surveys
MACHO, 2MASS, SDSS, DPOSS, GSC-II, COBE, MAP, NVSS, FIRST, GALEX, ROSAT, OGLE, ...
• Large number of new surveys
  • Multi-terabyte in size, 100 million objects or larger
  • Individual archives planned and under way
• Multi-wavelength view of the sky
  • Coverage at more than 13 wavelengths within 5 years
• Impressive early discoveries
  • Finding exotic objects by unusual colors: L and T dwarfs, high-redshift quasars
  • Finding objects by time variability: gravitational micro-lensing
High Redshift Quasars
• Several z>5 QSOs discovered by SDSS in the early test data
Methane/T Dwarf
• Discovery of several new objects by SDSS & 2MASS
Figure: SDSS T-dwarf (June 1999)
DPOSS Discoveries
New Neighbor of the Milky Way
• Finding new galaxies by spatial clustering of red objects
• New galaxy is about 30 million light years away
• Larger than most of the spiral galaxies in the Messier Catalogue
• Clearly visible in the 2MASS infrared image
• Expect to find 1000’s of such galaxies with 2MASS
Figure panels: Optical, Infrared
The Observatories
• several multi-Terabyte databases
• and further extensive catalogs of objects
• NOAO/NRAO
  • 20% of the time on all its telescopes dedicated to major surveys using a wide range of telescope and instrumentation packages
• The NASA Great Observatories
  • new opportunities for surveys
  • combine mission-specific data with those from other missions and from the ground
HST Data Archive
Several Terabytes/year
Already more retrieval than ingest!
Some Proposed Surveys
• Next Decade: new optimized “survey systems”
  • exploring new parameter space
• Dark Matter Telescope
  • map the distribution of matter for z<1.5 from weak lensing, through deep, high quality images of galaxies
  • moving and variable objects through repetitive surveys
• Spectroscopic Wide-Field Telescope
  • evolution of galaxies from z~4 to the present from star formation rates
  • determine chemical abundances and kinematics
The Road to the NVO
• The environment to exploit these huge sky surveys does not exist today!
  • 1 Terabyte at 10 Mbyte/s takes about 1 day
  • Expect 100’s of intensive queries and 1000’s of casual queries per day
  • Data will reside at multiple locations
  • Existing analysis tools do not scale to Terabyte data sets
• Acute need in a few years
  • it will not just happen
A New Initiative is needed!
NVO: A New Initiative
• A new initiative is needed
  • to ensure an evolutionary, cost-effective transition
  • to maximize the impact of large current and future efforts
  • to create the necessary new standards in the community
  • to develop the software tools needed
  • to ensure that the astronomical community has the proper network and hardware infrastructure to carry out its science
• The National Virtual Observatory
  • can be the catalyst of the “New Astronomy”
The Goals of the NVO
• Virtual observations of the sky in multiple wavelengths, by integrating all-sky Mega-surveys
• Query the individual object catalogs and image databases thousands of times per day
• Joint queries of the combined catalogs thousands of times per day
• Enable discovery in these archives via new tools
  • novel visualization techniques, supervised and unsupervised learning, advanced classification techniques
NVO: The Challenges
• Size of the archived data
  • 40,000 square degrees is 2 trillion pixels
  • One band: 4 Terabytes
  • Multi-wavelength: 10-100 Terabytes
  • Time dimension: few Petabytes
• The development of
  • new archival methods
  • new analysis tools
• Hardware requirements
• Training the next generation
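The size figures can be reproduced with a short back-of-the-envelope calculation. The slide gives only the area, the pixel count, and the per-band volume; the 0.5 arcsecond pixel scale and 2 bytes per pixel used below are assumptions chosen because they reproduce those numbers, not values stated in the talk.

```python
ARCSEC_PER_DEG = 3600

def survey_pixels(area_sq_deg: float, pixel_arcsec: float) -> float:
    """Number of pixels needed to tile an area at a given pixel scale."""
    area_sq_arcsec = area_sq_deg * ARCSEC_PER_DEG ** 2
    return area_sq_arcsec / pixel_arcsec ** 2

def survey_bytes(area_sq_deg: float, pixel_arcsec: float, bytes_per_pixel: int) -> float:
    """Raw image volume for a single band."""
    return survey_pixels(area_sq_deg, pixel_arcsec) * bytes_per_pixel

# Assumed values (not from the slides): 0.5" pixels, 16-bit samples.
pixels = survey_pixels(40_000, 0.5)
one_band = survey_bytes(40_000, 0.5, 2)

print(f"pixels:   {pixels:.2e}")
print(f"one band: {one_band / 1e12:.1f} TB")
```

With these assumptions the full sky comes to about 2 trillion pixels and roughly 4 TB per band, in line with the slide.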
Necessary Components
• New archival methods
• New analysis tools
• New hardware requirements
New Archival Methods
• Structure and manage multi-TB (and soon PB) data archives, distributed across the continent
• Rapid and transparent access to image/catalog databases across all wavelengths, via intelligent query agents
• Efficient query and data retrieval by more than 10,000 scientists world-wide, with enhanced search operators (like spatial proximity)
Examples: non-local queries
• Find all objects within 1' which have more than two neighbors with u-g, g-r, i-K colors within 0.05 mag
• Find all star-like objects within dm = 0.2 of the colors of a quasar at 5.5<z<6.5, using all colors in all available catalogs
• Find galaxies that are blended with a star, and output the deblended magnitudes
• Provide a list of moving objects consistent with an asteroid, based on all the surveys, and estimate possible orbit parameters
• Find binary stars where at least one of them has the colors of a white dwarf, within the error boxes of hard X-ray sources
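As an illustration of why such queries are called non-local, here is a brute-force sketch of the first one under one possible reading: return every object that has more than two neighbors within 1 arcminute whose three colors all agree with its own to within 0.05 mag. The record layout (`ra`, `dec`, `colors`) and the function names are hypothetical and are not taken from any NVO or survey interface.

```python
import math

def angular_sep_arcmin(ra1, dec1, ra2, dec2):
    """Great-circle separation in arcminutes (haversine form, inputs in degrees)."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    h = (math.sin((dec2 - dec1) / 2) ** 2
         + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2) ** 2)
    return math.degrees(2 * math.asin(math.sqrt(h))) * 60

def similar_colors(c1, c2, tol=0.05):
    """True if every color index agrees to within tol magnitudes."""
    return all(abs(a - b) <= tol for a, b in zip(c1, c2))

def objects_with_color_twins(objects, radius_arcmin=1.0, tol=0.05, min_neighbors=3):
    """Brute-force O(N^2) scan; 'more than two neighbors' means at least three."""
    hits = []
    for i, obj in enumerate(objects):
        count = 0
        for j, other in enumerate(objects):
            if i == j:
                continue
            sep = angular_sep_arcmin(obj["ra"], obj["dec"], other["ra"], other["dec"])
            if sep <= radius_arcmin and similar_colors(obj["colors"], other["colors"], tol):
                count += 1
        if count >= min_neighbors:
            hits.append(obj)
    return hits

# Tiny made-up catalog: colors are (u-g, g-r, i-K).
catalog = [
    {"ra": 10.000, "dec": 0.000, "colors": (1.0, 0.5, 2.0)},
    {"ra": 10.005, "dec": 0.000, "colors": (1.0, 0.5, 2.0)},
    {"ra": 10.000, "dec": 0.005, "colors": (1.0, 0.5, 2.0)},
    {"ra": 10.005, "dec": 0.005, "colors": (1.0, 0.5, 2.0)},
    {"ra": 11.000, "dec": 0.000, "colors": (1.0, 0.5, 2.0)},  # isolated
    {"ra": 10.002, "dec": 0.002, "colors": (2.0, 1.5, 3.0)},  # different colors
]
print(len(objects_with_color_twins(catalog)))
```

The pairwise scan is quadratic in the number of objects, which is hopeless for catalogs of 100 million sources; a spatial index is what would let the neighbor search touch only nearby objects.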
Examples: Today’s I/O rates
• Reading a 1 TB data set

  Data access                 Speed       Time [days]
  Fast database server        50 MB/s     0.23
  Local SCSI/Fast Ethernet    10 MB/s     1.2
  T1                          0.5 MB/s    23
  Typical ‘good’ www          20 KB/s     580

• Brute force is not enough – we need clever techniques
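The transfer times above follow directly from dividing the data volume by the sustained rate. A minimal sketch, assuming decimal units (1 TB = 10^12 bytes, 1 MB = 10^6 bytes, 1 KB = 10^3 bytes), which the slide does not specify:

```python
SECONDS_PER_DAY = 86_400

def transfer_days(size_bytes: float, rate_bytes_per_s: float) -> float:
    """Days needed to read size_bytes at a sustained rate."""
    return size_bytes / rate_bytes_per_s / SECONDS_PER_DAY

ONE_TB = 1e12

# Access rates as listed on the slide.
rates = {
    "Fast database server": 50e6,
    "Local SCSI/Fast Ethernet": 10e6,
    "T1": 0.5e6,
    "Typical 'good' www": 20e3,
}

for name, rate in rates.items():
    print(f"{name:26s} {transfer_days(ONE_TB, rate):8.2f} days")
```

The computed values agree with the slide to its rounding (the last row comes out slightly under 580 days).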
Geometric Indexing

  Attributes          Number
  Sky Position        3
  Multiband Fluxes    N = 5+
  Other               M = 100+

“Divide and Conquer” Partitioning: 3, N, M
• Sky position (3): Hierarchical Triangular Mesh
• Multiband fluxes (N): split as k-d tree, stored as r-tree of bounding boxes
• Other (M): using regular indexing techniques
Sky coordinates
• Stored as Cartesian coordinates: projected onto a unit sphere
• Longitude and Latitude lines: intersections of planes and the sphere
• Boolean combinations: query polyhedron
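A minimal sketch of the idea, assuming positions are given as right ascension and declination in degrees: each sky position becomes a unit vector, each constraint becomes a half-space (a plane cutting the sphere), and a query region is the set of points satisfying all of them. The function names are illustrative, and only intersections (AND) are shown; unions and complements would be built the same way.

```python
import math

def radec_to_unit_vector(ra_deg, dec_deg):
    """Project a sky position onto the unit sphere as (x, y, z)."""
    ra, dec = math.radians(ra_deg), math.radians(dec_deg)
    return (math.cos(dec) * math.cos(ra),
            math.cos(dec) * math.sin(ra),
            math.sin(dec))

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def dec_at_least(dec_min_deg):
    """Latitude cut: the plane z = sin(dec_min), keeping the northern side."""
    return ((0.0, 0.0, 1.0), math.sin(math.radians(dec_min_deg)))

def cone(ra_deg, dec_deg, radius_deg):
    """Circular region: the plane perpendicular to the center direction."""
    return (radec_to_unit_vector(ra_deg, dec_deg),
            math.cos(math.radians(radius_deg)))

def inside_all(point, halfspaces):
    """A point is in the region if it is on the kept side of every plane."""
    return all(dot(point, normal) >= offset for normal, offset in halfspaces)

region = [dec_at_least(15.0), cone(10.0, 20.0, 1.0)]
print(inside_all(radec_to_unit_vector(10.0, 20.5), region))  # inside both constraints
print(inside_all(radec_to_unit_vector(10.0, 25.0), region))  # outside the cone
```

Each membership test is a handful of multiplications and comparisons, with no trigonometry once the vectors are stored, which is one reason a Cartesian representation is attractive for large catalogs.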
Sky Partitioning
• Hierarchical Triangular Mesh, based on an octahedron
Hierarchical Subdivision
• Hierarchical subdivision of spherical triangles, represented as a quadtree
• In SDSS the tree is 5 levels deep (8192 triangles)
• In 2MASS the tree goes much deeper in the Galactic plane
• One shoe fits all…
• This indexing is now adopted by SDSS, 2MASS, GSC2, POSS2, FIRST, and is being considered by CDS, PLANCK and GAIA
• New standard spatial index for astronomy!
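The subdivision can be sketched in a few lines: start from the 8 faces of an octahedron and split each spherical triangle into 4 by joining the midpoints of its edges, pushed back onto the unit sphere. This is a simplified illustration; the vertex ordering and the absence of triangle naming are choices made here, not the conventions of the actual HTM library.

```python
import math

def normalize(v):
    """Scale a vector to unit length so it lies on the sphere."""
    n = math.sqrt(sum(c * c for c in v))
    return tuple(c / n for c in v)

def midpoint(a, b):
    """Midpoint of an edge, projected back onto the unit sphere."""
    return normalize(tuple(x + y for x, y in zip(a, b)))

def subdivide(tri):
    """Split one spherical triangle into four children."""
    v0, v1, v2 = tri
    w0, w1, w2 = midpoint(v1, v2), midpoint(v0, v2), midpoint(v0, v1)
    return [(v0, w2, w1), (v1, w0, w2), (v2, w1, w0), (w0, w1, w2)]

def octahedron():
    """The 8 root triangles: 4 around each pole."""
    top, bottom = (0.0, 0.0, 1.0), (0.0, 0.0, -1.0)
    equator = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (-1.0, 0.0, 0.0), (0.0, -1.0, 0.0)]
    faces = []
    for i in range(4):
        a, b = equator[i], equator[(i + 1) % 4]
        faces.append((top, a, b))
        faces.append((bottom, b, a))
    return faces

def mesh(depth):
    """All triangles after `depth` rounds of subdivision."""
    tris = octahedron()
    for _ in range(depth):
        tris = [child for t in tris for child in subdivide(t)]
    return tris

for depth in range(6):
    print(depth, len(mesh(depth)))
```

The count at depth d is 8 × 4^d, so five levels give the 8192 triangles quoted for SDSS. Because each triangle has exactly four children, the mesh is naturally stored as a quadtree, and a region that needs finer resolution (such as the Galactic plane in 2MASS) can simply be subdivided further.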
Result of the Query
New Analysis Tools
• Discover new patterns through advanced statistical methods and visualization techniques
• Confront catalogs and image databases with numerical simulations of astrophysical systems
• Collaborative exploration of multi-wavelength databases by multiple groups working at remote sites
New Hardware Requirements
• Large distributed database engines with Gbyte/s aggregate I/O speed
• High speed (>10 Gbits/s) backbones cross-connecting the major archives
• Scalable computing environment with hundreds of CPUs for statistical analysis and discovery
What is the NVO? - Content
• Source Catalogs, Image Data
• Specialized Data: Spectroscopy, Time Series, Polarization
• Information Archives: derived & legacy data (NED, Simbad, ADS, etc.)
• Query Tools
• Analysis/Discovery Tools: Visualization, Statistics
• Standards
What is the NVO? - Components
• Service Providers: query engines, compute engines
• Data Providers: surveys, observatories, archives, SW repositories
• Information Providers: e.g. ADS, NED, ...
Conceptual Architecture
Diagram labels: User, Discovery tools, Analysis tools, Gateway, Data Archives
NVO Layers
Three layers built on top of one another, tied together with standards
• Archives
  • Data content
  • Interconnections
  • Cross identifications
  • Services
• Basic analysis tools
  • Query capabilities
  • Statistical tools
  • Ability to run user code (API)
  • Browsing tools
• Discovery tools
  • Visualization
  • Advanced classification methods
  • Supervised/unsupervised learning
  • Data mining
• Standards
  • Meta-data
  • Interfaces between archives
  • Cross-identification standards
  • Archive-tool interfaces
The Flavor/Role of the NVO
• Highly Distributed and Decentralized
• Multiple Phases, built on top of one another
  • Establish standards, meta-data formats
  • Integrate main catalogs
  • Develop initial querying tools
  • Develop collaboration requirements, establish procedure to import new catalogs
  • Develop distributed analysis environment
  • Develop advanced visualization tools
  • Develop advanced querying tools
NVO Development Functions
• Software development
  • query generation/optimization, software agents, user interfaces, discovery tools, visualization tools
• Standards development
  • meta-data, meta-services, streaming formats, object relationships, object attributes, ...
• Infrastructure development
  • archival storage systems, query engines, compute servers, high speed connections of main centers
• Train the Next Generation
  • train scientists equally at home in astronomy and modern computer science, statistics, visualization
The Mission of the NVO
• The National Virtual Observatory should
  • provide seamless integration of the digitally represented multi-wavelength sky
  • enable efficient simultaneous access to multi-Terabyte to Petabyte databases
  • develop and maintain tools to find patterns and discoveries contained within the large databases
  • develop and maintain tools to confront data with sophisticated numerical simulations
NVO Funding
• The NVO is ideal for multi-agency and IT funding
  • relevant for all areas of astronomy and space science
  • excellent match to goals of the IT2 initiative
  • requires funding from NASA and NSF
  • needs serious involvement of computer scientists
• Scope
  • approximately $25M for the first 5 years, could be larger in the second half
• Requires long term commitment
  • development/deployment (5 + 5 years)
• Needs to start soon
  • data avalanche has already begun
An effort for the whole astronomy and astrophysics community!