Scientific Databases Lecture: Hubble Space Telescope Science Databases
Dr. Kirk Borne, GMU SCS
November 11, 2003
GMU CSI 710
Outline
• Introduction to the Information Age
• Data Mining - a target application area for scientific databases
• Hubble Space Telescope (HST)
• HST Databases
• HST Science Data Archive
• Multi-mission Archive at Space Telescope (MAST)
Hubble Space Telescope Databases
The Information Age
The Information Age is Here!
• "Data doubles about every year, but useful information seems to be decreasing."
  • Margaret Dunham, "Data Mining Techniques & Algorithms", 2002
• "There is a growing gap between the generation of data and our understanding of it."
  • Witten & Frank, "Data Mining: Practical Machine Learning Tools", 1999
• "The trouble with facts is that there are so many of them."
  • Samuel McChord Crothers, "The Gentle Reader", 1903
• "Get your facts first, and then you can distort them as much as you please."
  • Mark Twain
Characteristics of The Information Age:
• Data “Avalanche”
  • the flood of Terabytes of data is already happening, whether we like it or not
  • our present techniques of handling these data do not scale well with data volume
• Distributed Digital Archives
  • will be the main access to data
  • will need to handle hundreds to thousands of queries per day
• Systematic Data Exploration and Data Mining
  • will have a central role
  • statistical analysis of “typical” events
  • automated search for “rare” events
Data Mining Application: Outlier Detection
Figure: The clustering of data clouds (dc#) within a multidimensional parameter space (p#). Such a mapping can be used to search for and identify clusters, voids, outliers, one-of-a-kind objects, relationships, and associations among arbitrary parameters in a database (or among various parameters in geographically distributed databases).
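The search for outliers in a multidimensional parameter space, as described in the figure caption, can be sketched in a few lines. This is only an illustration, not the method of any HST or NVO system: the function name `find_outliers`, the robust z-score rule (median and median absolute deviation), the threshold of 5, and the synthetic catalog are all assumptions made for the example.

```python
import random
import statistics

def find_outliers(rows, threshold=5.0):
    """Return indices of rows that are extreme in at least one parameter.

    rows: list of equal-length lists, one row per object.
    Each parameter is scaled by its median and median absolute deviation
    (MAD), so a single extreme object cannot inflate the spread and hide.
    """
    columns = list(zip(*rows))
    medians = [statistics.median(col) for col in columns]
    scales = []
    for col, med in zip(columns, medians):
        mad = statistics.median(abs(x - med) for x in col)
        # 1.4826 * MAD approximates the standard deviation for Gaussian data.
        scales.append(1.4826 * mad if mad > 0 else 1.0)
    flagged = []
    for i, row in enumerate(rows):
        z = max(abs(x - m) / s for x, m, s in zip(row, medians, scales))
        if z > threshold:
            flagged.append(i)
    return flagged

# Synthetic "data cloud" of 500 objects in a 3-parameter space (assumed data).
random.seed(0)
catalog = [[random.gauss(0.0, 1.0) for _ in range(3)] for _ in range(500)]
catalog.append([0.0, 12.0, 0.0])   # hypothetical one-of-a-kind object
outliers = find_outliers(catalog)
print(outliers)
```

Flagging on the largest single-parameter deviation is the simplest possible rule; a real archive search would more likely use a distance that accounts for correlations between parameters.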
Data Mining = A Target Application Area for Scientific Databases
http://nvo.gsfc.nasa.gov/nvo_datamining.html
What is Data Mining? Here is one idea …
What is Data Mining?
• Data mining is defined as "an information extraction activity whose goal is to discover hidden facts contained in (large) databases."
• Data mining is used to find patterns and relationships in data. (EDA = Exploratory Data Analysis)
• Patterns can be analyzed via 2 types of models:
  • Descriptive: describe patterns and create meaningful subgroups or clusters.
  • Predictive: forecast explicit values, based upon patterns in known results.
• How does this apply to Scientific Research? …
  • through KNOWLEDGE DISCOVERY
Data → Information → Knowledge → Understanding / Wisdom!
Some words of wisdom
• "We have confused information (of which there is too much) with ideas (of which there are too few)."
  • Paul Theroux
• "The great Information Age is really an explosion of non-information; it is an explosion of data ... it is imperative to distinguish between the two; information is that which leads to understanding."
  • R.S. Wurman in his book "Information Anxiety 2"
Scientific data have a purpose …
Data → Information → Knowledge → Understanding / Wisdom!
EXAMPLE:
• Data = 00100100111010100111100 (stored in database)
• Information = ages and heights of children (metadata)
• Knowledge = the older children tend to be taller
• Understanding = children’s bones grow as they get older
Astronomy Example
Data:
  (a) Imaging data (ones & zeroes)
  (b) Spectral data (ones & zeroes)
Information (catalogs / databases):
• Measure brightness of galaxies from image (e.g., 14.2 or 21.7)
• Measure redshift of galaxies from spectrum (e.g., 0.0167 or 0.346)
Knowledge: Hubble Diagram; Redshift-Brightness Correlation; Redshift = Distance
Understanding: the Universe is expanding!!
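The step "Redshift = Distance" can be made concrete with Hubble's law, v = H0 × d, together with the low-redshift approximation v ≈ c × z. The sketch below assumes H0 = 70 km/s/Mpc, an illustrative value that does not come from the slides, and the function name is hypothetical. The approximation is reasonable for the 0.0167 example but not for 0.346, which would need a full cosmological model.

```python
C_KM_S = 299_792.458      # speed of light in km/s

def hubble_distance_mpc(redshift, h0=70.0):
    """Approximate distance in megaparsecs from Hubble's law, d = c*z / H0.

    h0 is the Hubble constant in km/s/Mpc (70 is an assumed value).
    Only valid for redshift much smaller than 1.
    """
    velocity = C_KM_S * redshift      # recession velocity in km/s
    return velocity / h0

distance = hubble_distance_mpc(0.0167)
print(f"z = 0.0167 -> roughly {distance:.1f} Mpc")
```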
What is the Goal of Building and Maintaining Scientific Databases?
• The end goal is not the data themselves, but the new knowledge and understanding that are revealed through the analysis of the data.
• This is why the Data Mining research field is usually referred to as KDD = Knowledge Discovery in Databases.
The Hubble Space Telescope (HST)
http://www.stsci.edu/
HST satellite architecture
HST focal plane layout
HST Scientific Instruments
• 1990: WFPC, FOC, FOS, GHRS, HSP, FGS
• 1993: WFPC2, COSTAR (removed WFPC, HSP)
• 1997: NICMOS, STIS (removed FOS, GHRS)
• 1999: 1 of 3 FGS sensors and all 6 gyros were replaced
• 2002: ACS, NICMOS cryocooler upgrade (removed FOC)
• 2004(?): COS, WF3 (will remove WFPC2, COSTAR)
Instrument types:
• Cameras
• Spectrometers
• Photometer
• Fine Guidance Sensor
• Optical Path Correction Device
More details at: http://www.ess.sunysb.edu/fwalter/AST443/hst.html
The Nature of Astronomical Data
• Imaging
  • 2D map of the sky at multiple wavelengths
• Derived catalogs
  • subsequent processing of images
  • extracting object parameters (400+ per object)
• Spectroscopic follow-up
  • spectra: more detailed object properties
  • clues to physical state and formation history
  • lead to distances: 3D maps
• Numerical simulations
• All inter-related!
Derived data from images: tables of numbers that can be plotted to study correlations
The Electromagnetic Spectrum
• Radiation is the Astronomer’s only source of information about the Universe!
• And it is a remarkably rich & diverse source!
Need Multi-Wavelength Science Instruments to Observe a Multi-Wavelength Universe
NASA Astronomy Mission Data: the tip of the data mountain
NSSDC’s astrophysics data holdings: one of many science data collections for astronomy across the US and the world!
NSSDC = National Space Science Data Center @ NASA/GSFC
http://nssdc.gsfc.nasa.gov/astro/astrolist.html
Why so many Telescopes?
Why so many Telescopes? … Because …
Many great astronomical discoveries have come from inter-comparisons of various wavelengths:
• Quasars
• Gamma-ray bursts
• Ultraluminous IR galaxies
• X-ray black-hole binaries
• Radio galaxies
• . . .
Therefore, our science data archive systems should enable multi-wavelength interdisciplinary distributed database access, discovery, mining, and analysis.
So what wavelengths does HST observe?
Figure labels: HST spans a range of about 10^1 in λ; the full electromagnetic spectrum spans a range of >10^16 in λ.
Where has HST looked?
HST’s cameras have a very small field-of-view
Figure labels: 3° scale; HST
Edwin Hubble measured distances to galaxies, and thereby discovered the expansion of the Universe. The #1 goal of HST: to measure the expansion rate of the Universe to within 10% uncertainty. Previously, it was known only to within a factor of 2, which is typical astronomical accuracy but definitely not good enough.
Henrietta Leavitt measured brightness variations of thousands of stars: the basis for the distance scale of the Universe.
The Cepheus Constellation: “The King”
Variable Star Data Examples
• Periodic, sinusoidal
• Periodic, smooth non-sine
• Periodic, spiked events
• Aperiodic events
• Single spiked events
• Single long-duration events (Chirp)
Real Cepheid variable star data. Note the characteristic light curve shape: a rapid rise, and then a slow decline …
Cepheid Variables = Cosmic Yardsticks
The Period-Luminosity Relation shows 2 types of Cepheid Variables: notice the 2 bands in this correlation plot. We need to know which Cepheid type to assign to a given star in order to get the star’s distance right! The most famous example of a Cepheid is Polaris = The North Star.
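Why the Cepheid type matters can be shown with a short calculation. The period gives the absolute magnitude M through the period-luminosity relation, and the distance then follows from the distance modulus, m − M = 5 log10(d / 10 pc). The coefficients in `PL_RELATIONS` below are placeholder numbers chosen for illustration, not calibrated values, and the function name is hypothetical. The point of the sketch is only that picking the wrong band changes the inferred distance substantially.

```python
import math

# Illustrative period-luminosity coefficients (assumed, NOT calibrated):
#   M = slope * (log10(period_days) - 1) + zero_point
PL_RELATIONS = {
    "type_I": (-2.4, -4.0),
    "type_II": (-2.4, -2.5),   # assumed to be 1.5 mag fainter at a given period
}

def cepheid_distance_pc(period_days, apparent_mag, cepheid_type):
    """Distance in parsecs from period, apparent magnitude, and Cepheid type."""
    slope, zero_point = PL_RELATIONS[cepheid_type]
    absolute_mag = slope * (math.log10(period_days) - 1.0) + zero_point
    # Distance modulus: m - M = 5 * log10(d / 10 pc)
    return 10.0 ** ((apparent_mag - absolute_mag + 5.0) / 5.0)

# Same observed star, interpreted as each type in turn.
d_type_i = cepheid_distance_pc(10.0, 20.0, "type_I")
d_type_ii = cepheid_distance_pc(10.0, 20.0, "type_II")
print(f"Type I: {d_type_i:.3g} pc, Type II: {d_type_ii:.3g} pc")
```

With these placeholder numbers the two interpretations differ by about a factor of two in distance, which is the size of error the slide warns about.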
Cepheids: just one step in the Cosmic Distance Scale Ladder
Figure label: PNLF
HST reaches its goal! Determines expansion rate to within 10%, and age of Universe = 14 billion yrs
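The link between the expansion rate and an age of about 14 billion years can be checked with the Hubble time, 1/H0, which is a rough age scale rather than an exact age (the true age depends on the cosmological model). The value H0 = 70 km/s/Mpc used here is an assumption for illustration; the slides do not quote a number.

```python
KM_PER_MPC = 3.0857e19         # kilometres in one megaparsec
SECONDS_PER_YEAR = 3.15576e7   # one Julian year

def hubble_time_gyr(h0):
    """Return 1/H0 in billions of years, for H0 given in km/s/Mpc."""
    seconds = KM_PER_MPC / h0
    return seconds / SECONDS_PER_YEAR / 1e9

age = hubble_time_gyr(70.0)            # assumed central value
age_low_h0 = hubble_time_gyr(63.0)     # H0 lower by 10%
age_high_h0 = hubble_time_gyr(77.0)    # H0 higher by 10%
print(f"{age_high_h0:.1f} to {age_low_h0:.1f} Gyr, centred near {age:.1f} Gyr")
```

A 10% uncertainty in H0 maps directly onto a comparable uncertainty in this age scale, which is why a factor-of-2 uncertainty was not good enough.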
But HST almost didn’t get it right at all! Why? … Well … something about a mirror problem. Bad news early in 1990. This is HST’s first-light image: not too impressive. This should have told us that performance was below expectations. Note that the left and right images are not particularly different in image resolution quality.
HST should have much better image resolution. Resolution is measured in arcseconds. 1 degree = 60 arcminutes = 3600 arcseconds. Note that the Moon is ½ degree (30 arcmin) on the sky.
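The unit conversion is simple arithmetic; as a quick check, the Moon's ½ degree corresponds to 1800 arcseconds. The helper name below is made up for the example.

```python
ARCMIN_PER_DEGREE = 60
ARCSEC_PER_ARCMIN = 60

def degrees_to_arcsec(degrees):
    """Convert an angle in degrees to arcseconds."""
    return degrees * ARCMIN_PER_DEGREE * ARCSEC_PER_ARCMIN

moon_arcsec = degrees_to_arcsec(0.5)   # the Moon spans about half a degree
print(moon_arcsec)
```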
HST image is better, but not dramatically … and not even particularly scientifically new.
Figure labels: ground telescope image; HST image
COSTAR installed in December 1993. So let us compare before and after images.
Pluto and its moon, BEFORE REPAIR (1990)
AFTER Optical Repair (1994). Can you notice any difference from the previous slide?
Figure labels: Pluto; Pluto’s moon Charon
Here is the real comparison test: Before and After images of a single star!
Software fixes
• Before COSTAR was installed in 1993:
  • Image restoration (deconvolution) was needed.
  • One of the image restoration algorithms was later used on a regular basis for the analysis of medical images in potential cancer patients (mammograms).
• To design and build COSTAR, an exact mapping of the image distortion characteristics had to be derived from long and numerous HST images of star fields … for each science instrument (S.I.) and each mode of that S.I. … the design of the telescope architecture then became important for the design of the science database and data analysis systems.
• All new science instruments now include this optical correction within their design.
• Users of the Science Data Archive need database info to track the condition of each image, and need image processing tools to correct pre-COSTAR images.
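Image restoration by deconvolution can be illustrated with the Richardson-Lucy iteration, one widely used deconvolution scheme. The slides do not say which algorithms were applied to HST images, so the choice of algorithm, the one-dimensional "image", and the five-point blur kernel below are all assumptions made for this sketch.

```python
def convolve_same(signal, kernel):
    """Convolve with a centred, odd-length kernel; values outside are zero."""
    half = len(kernel) // 2
    out = []
    for i in range(len(signal)):
        total = 0.0
        for k, weight in enumerate(kernel):
            j = i + half - k
            if 0 <= j < len(signal):
                total += signal[j] * weight
        out.append(total)
    return out

def richardson_lucy(observed, psf, iterations=50):
    """Iteratively sharpen `observed`, given the blur kernel `psf`."""
    psf_flipped = psf[::-1]
    # Start from a flat estimate carrying the same total flux.
    estimate = [sum(observed) / len(observed)] * len(observed)
    for _ in range(iterations):
        blurred = convolve_same(estimate, psf)
        ratio = [o / b if b > 0 else 0.0 for o, b in zip(observed, blurred)]
        correction = convolve_same(ratio, psf_flipped)
        estimate = [e * c for e, c in zip(estimate, correction)]
    return estimate

psf = [0.05, 0.25, 0.40, 0.25, 0.05]   # hypothetical blur kernel, sums to 1
truth = [0.0] * 21
truth[8] = 100.0                       # two point sources ("stars")
truth[11] = 60.0
observed = convolve_same(truth, psf)   # the smeared data
restored = richardson_lucy(observed, psf)
print(round(observed[8], 1), round(restored[8], 1))
```

In this noise-free toy case the iteration concentrates the smeared flux back toward the two source positions while keeping the total flux fixed. Real images contain noise, so the number of iterations has to be limited and an accurate model of the blur is needed, which is why the archive must track the instrument and mode of each image.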
“Before Repair” images of a Globular Cluster. (Note how the smeared images of single stars overlap and therefore ruin any chance of studying individual stars in this massive pile of 100,000 stars.)
The AFTER IMAGE of a globular cluster. Need I say more??
Okay, I will say more … Individual White Dwarf Stars were identified for the first time ever in Globular Clusters, as predicted by stellar evolution theories since the 1930s.
Therefore, the mirror flaw is what could have prevented HST from fulfilling its #1 goal.
But there is so much more. Here are a few of the “big impact” HST science results!
• Hubble expansion rate & age of Universe
• Super star clusters in merging galaxies
• Massive black holes in every(?) galaxy
• Quasar host galaxies revealed
• Protoplanetary disks found and studied
• Starbirth unveiled and mapped in exquisite detail
• Supernovae and novae shells resolved
• Hierarchical evolution of galaxies proven
• Most distant galaxies ever seen
• Storms on planets
• Kuiper belt comets found
• Outflows from young stars
• Gamma-Ray Burst (GRB) sources solved, at last!
HST: 1990-2010, and beyond? Already a rich legacy of spectacular images & discoveries.
What comes next? ... The JWST
• The Next-Generation Space Telescope is now named the James Webb Space Telescope (JWST): launch in 2011?
• If HST has shown the first galaxies, then JWST will see the first stars (“first light in the Universe”)
• JWST will include some on-board processing (controversial)
HST Databases