Big Data Steven Gollmer Cedarville University
Working with Large Data • Accessing data • Collection and calibration assumptions • Selecting appropriate parameters • Formatting • Calculation • Testing hypothesis
Hipparcos Space Astrometry • Main Page • http://www.rssd.esa.int/index.php?project=HIPPARCOS • Data Catalogues • http://www.rssd.esa.int/index.php?project=HIPPARCOS&page=Overview • http://cdsweb.u-strasbg.fr/ • Software • Desktop - http://www.rssd.esa.int/index.php?project=HIPPARCOS&page=Celestia2000 • Search tool - http://www.rssd.esa.int/index.php?project=HIPPARCOS&page=multisearch2 • Data Format • Flexible Image Transport System (FITS) - http://fits.gsfc.nasa.gov/
Sloan Digital Sky Survey • Main Page • http://www.sdss.org/ • Data • 9th Data Release - http://www.sdss3.org/dr9/ • Archive Server - http://dr9.sdss3.org/ • Software • IDL - http://www.sdss3.org/dr9/software/
Weather Data • NOAA National Climatic Data Center • http://www.ncdc.noaa.gov/ • Popular Data - http://www.ncdc.noaa.gov/most-popular-data • Environmental Modeling Center • http://www.emc.ncep.noaa.gov/
TERRA/AQUA • http://terra.nasa.gov • http://aqua.nasa.gov • Data • LARC DAAC - http://eosweb.larc.nasa.gov/ • LAADS Web - http://ladsweb.nascom.nasa.gov/index.html • Format • NetCDF - http://www.unidata.ucar.edu/software/netcdf/ • HDF - http://www.hdfgroup.org/
Other Topics of Interest • Topics of Interest • Extra-Solar Planets • Asteroid Mapping and Near Earth Detection • Earthquakes • Agencies and Products • NASA - http://www.nasa.gov/home/index.html • ESA - http://www.esa.int/ESA • USGS - http://www.usgs.gov/ • GOES - http://www.goes.noaa.gov/ • Paleoclimatology - http://www.ncdc.noaa.gov/paleo/pubs/pcn/pcn-proxy.html
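Hypothesis Testing • P-value • Probability of obtaining a result at least as extreme as the one observed, assuming the null hypothesis is true. • Usually reject the null hypothesis if p < 0.05 or 0.01 (5% or 1%) • May require more stringent criteria for rejection. • T-test • Assumes a normal distribution • One-sample test • Two-sample test • Check significance using a t-distribution table • If the number of samples is large, a z-test can be used for the one-sample test • $\operatorname{erf}(x)=\frac{2}{\sqrt{\pi}}\int_0^x e^{-t^2}\,dt$ • One Tail: $z=\frac{1}{2}\left[1+\operatorname{erf}\left(x/\sqrt{2}\right)\right]$ • Two Tail: $z=\operatorname{erf}\left(x/\sqrt{2}\right)$ • Here x is the standardized test statistic and z is the enclosed probability, so the p-value is 1 − z.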
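The erf expressions on this slide can be tried out with a few lines of Python. What follows is a minimal sketch rather than the presenter's own code: the function names and the sample values are hypothetical, and it assumes the sample is large enough that the sample standard deviation can stand in for the population value (the large-sample z-test case).

```python
import math
from statistics import mean, stdev


def normal_cdf(x):
    """One-tail enclosed probability: 1/2 * (1 + erf(x / sqrt(2)))."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def one_sample_z(sample, mu0):
    """Standardized distance of the sample mean from the hypothesized mean mu0.

    Assumption: the sample standard deviation approximates the population value.
    """
    n = len(sample)
    return (mean(sample) - mu0) / (stdev(sample) / math.sqrt(n))


def p_one_tail(z):
    """Upper-tail p-value: probability of a score at least as large as z."""
    return 1.0 - normal_cdf(z)


def p_two_tail(z):
    """Two-tail p-value: probability of a score at least as extreme as |z|."""
    return 1.0 - math.erf(abs(z) / math.sqrt(2.0))


if __name__ == "__main__":
    # Hypothetical measurements, made up purely for illustration.
    sample = [10.0 + 0.1 * (i % 7) for i in range(40)]
    z = one_sample_z(sample, mu0=10.2)
    print("z statistic:", z)
    print("one-tail p: ", p_one_tail(z))
    print("two-tail p: ", p_two_tail(z))
```

With a threshold of 0.05, the null hypothesis would be rejected when the printed p-value falls below that level. For small samples the t distribution should be used instead of these normal-curve formulas.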