Figure 1 – Population Distribution of hot DB white dwarfs described by Eisenstein et al. 2006

AMERICAN ASTRONOMICAL SOCIETY
Continuous Probability Distribution as an Alternative to Binning of Survey Data
January 6, 2010
David J. Corliss

A Typical Example of Binned Data: Population of Hot DB White Dwarfs in the Sloan Digital Sky Survey
[Bar chart: number of stars (axis 0 to 16) in five temperature bins: < 30,000 K, 30 - 35,000 K, 35 - 40,000 K, 40 - 45,000 K, > 45,000 K]
Figure 1 – Population Distribution of hot DB white dwarfs described by Eisenstein et al. 2006

Some Amount of Information Is Lost as All Points in a Given Bin Are Treated the Same
There Is Also Some Uncertainty as to Which Bin a Given Point Belongs In
[Plot regions labeled: LOWER DB GAP, MIDDLE DB GAP, UPPER DB GAP]
Figure 2A – Population Distribution of hot DB white dwarfs described by Eisenstein et al. 2006b
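The bin-membership uncertainty on this slide can be made concrete with a minimal sketch. It assumes each measurement error is Gaussian and reuses the bin boundaries from Figure 1; the function names and the example star (35,000 K with a 1,000 K error) are hypothetical and are not taken from the Eisenstein et al. data.

```python
import math

def normal_cdf(x, mu, sigma):
    """Cumulative distribution function of a normal N(mu, sigma)."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

def bin_probabilities(mu, sigma, edges):
    """Probability that a measurement N(mu, sigma) belongs to each bin.

    `edges` holds the interior bin boundaries in ascending order, so the
    bins are (-inf, e0), [e0, e1), ..., [e_last, +inf).
    """
    cdf = [0.0] + [normal_cdf(e, mu, sigma) for e in edges] + [1.0]
    return [cdf[i + 1] - cdf[i] for i in range(len(cdf) - 1)]

# Bin boundaries in kelvin, as labeled in Figure 1.
EDGES_K = [30000.0, 35000.0, 40000.0, 45000.0]

# Hypothetical star (assumption): measured exactly on a bin boundary.
probs = bin_probabilities(mu=35000.0, sigma=1000.0, edges=EDGES_K)
for label, p in zip(["< 30", "30 - 35", "35 - 40", "40 - 45", "> 45"], probs):
    print(f"{label:>8} kK: {p:.4f}")
```

A star measured on a boundary is split almost evenly between the two adjacent bins, yet a histogram has to place it entirely in one of them.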

Kernel Density Estimate (KDE) Process: Represent Each Point as a Normal Distribution and Sum
Figure 2B – Population Distribution of hot DB white dwarfs described by Eisenstein et al. 2006b

Summary and Conclusions: Kernel Density Estimation
• Creates a Continuous Probability Density Distribution by Summing over Gaussian Distributions for Each Data Point, Where μ is the Observed Value and σ is the Standard Deviation of the Individual Measurement
• Prevents Loss of Information From Relatively Accurate Measurements Being Placed into Larger Bins
• Incorporates the Uncertainty Associated with Measured Values into Population Distributions
• Provides a Viable Alternative to Binning in Developing Population Distributions for Survey and Other Data
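The summation described above can be sketched as follows. This is an illustration under stated assumptions rather than the procedure used to produce Figure 2B: the temperatures and errors are hypothetical placeholder values, the function names are invented for the example, and the sum is divided by the number of points so that the curve integrates to one.

```python
import math

def gaussian_pdf(x, mu, sigma):
    """Normal density with mean mu and standard deviation sigma."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def measurement_kde(values, sigmas, grid):
    """Sum one Gaussian per data point, each with its own sigma.

    mu is the observed value and sigma is that measurement's own error.
    Dividing by the number of points (an assumption of this sketch)
    turns the sum into a probability density.
    """
    n = len(values)
    return [
        sum(gaussian_pdf(x, mu, s) for mu, s in zip(values, sigmas)) / n
        for x in grid
    ]

# Hypothetical effective temperatures and 1-sigma errors in kelvin.
teff = [28500.0, 31200.0, 33800.0, 36400.0, 41000.0]
teff_err = [400.0, 600.0, 900.0, 1200.0, 2000.0]

# Evaluation grid from 15,000 K to 60,000 K in 10 K steps.
grid = [15000.0 + 10.0 * i for i in range(4501)]
density = measurement_kde(teff, teff_err, grid)

# Trapezoid-rule check that the curve integrates to about one.
area = sum(
    0.5 * (density[i] + density[i + 1]) * (grid[i + 1] - grid[i])
    for i in range(len(grid) - 1)
)
print(f"integrated density = {area:.4f}")
```

Because each kernel width comes from the measurement itself, precise measurements contribute narrow peaks and uncertain ones contribute broad ones, which is how the method carries measurement uncertainty into the population distribution.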

References
Babu, G. Jogesh, Summer School in Statistics for Astronomers V Lecture Notes, Pennsylvania State University, 2009
Barnes, George R., Cerrito, Patricia B., The Visualization of Continuous Data Using PROC KDE and PROC CAPABILITY, SUGI, 26, 2001
Corliss, David J., MS Thesis, Wayne State University, 2008
Eisenstein, D.J., et al., 2006, ApJS, 167, 40 (Eisenstein et al. 2006a)
Eisenstein, D.J., et al., 2006, AJ, 132, 676 (Eisenstein et al. 2006b)
Sall, John, Personal Communication re. the SAS KDE Procedure

A Final Thought
“Essentially, all models are wrong, but some are useful.”
George E. P. Box (Box, George E. P. and Draper, Norman R., 1987, Empirical Model-Building and Response Surfaces, p. 424, Wiley)

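libname project 'C:\SAS\Conferences';

/* Rows listed on the slide: month day year volume.                  */
/* Named WORK.CRYER here because the next step reads WORK.CRYER;     */
/* the slide called this data set work.kde.                          */
data work.cryer;
  input month day year volume;
  cards;
1 1 1962 589
2 1 1962 561
3 1 1962 640
4 1 1962 656
5 1 1962 727
6 1 1962 697
7 1 1962 640
8 1 1962 599
;
run;

/* Keep the January observations; T is the time axis, Y the response. */
DATA WORK.TSERIES;
  SET WORK.CRYER;
  IF MONTH = 1;
  DUMMY = 1;
  ATTRIB T INFORMAT=8.0 FORMAT=8.0;
  T = YEAR;
  ATTRIB Y INFORMAT=8.0 FORMAT=8.1;
  Y = VOLUME;
RUN;

/* With no statistics named, OUTPUT writes N, MIN, MAX, MEAN and STD, */
/* one row each, identified by _STAT_.                                */
PROC MEANS DATA=WORK.TSERIES NOPRINT;
  VAR VOLUME;
  OUTPUT OUT=WORK.RANDOM_TERM;
RUN;

%GLOBAL LAMBDA SIGMA;

/* CALL SYMPUTX copies the data value into the macro variable.        */
/* %LET inside a DATA step would store the literal text VOLUME.       */
%MACRO ASSIGNMENT;
  DATA _NULL_;
    SET WORK.RANDOM_TERM;
    IF _STAT_ = 'MEAN' THEN CALL SYMPUTX('LAMBDA', VOLUME);
  RUN;
  DATA _NULL_;
    SET WORK.RANDOM_TERM;
    IF _STAT_ = 'STD' THEN CALL SYMPUTX('SIGMA', VOLUME);
  RUN;
%MEND ASSIGNMENT;
%ASSIGNMENT;
%PUT LAMBDA = &LAMBDA.;

DATA WORK.TEST;
  SET WORK.TSERIES;
  LAMBDA = &LAMBDA.;
  SIGMA = &SIGMA.;
RUN;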

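/* Extend WORK.TSERIES by one point, using a linear trend fitted to   */
/* the &N. most recent observations.                                  */
%MACRO AC(N);
  PROC SORT DATA=WORK.TSERIES;
    BY DUMMY;
  RUN;

  /* Row number at which the most recent &N. observations begin. */
  DATA WORK.LAST;
    SET WORK.TSERIES;
    BY DUMMY;
    IF LAST.DUMMY;
    RECENT = _N_ - &N. + 1;
    KEEP DUMMY RECENT;
  RUN;

  DATA WORK.RECENT;
    MERGE WORK.TSERIES WORK.LAST;
    BY DUMMY;
    IF _N_ GE RECENT;
    DROP RECENT;
  RUN;

  PROC REG DATA=WORK.RECENT NOPRINT;
    MODEL Y=T;
    OUTPUT OUT=WORK.TREND PREDICTED=FORECAST RESIDUAL=RESIDUAL;
  RUN;
  QUIT;

  /* Each row carries the previous row's T and perturbed forecast.     */
  /* Assumption: the slide's RAND(SIGMA,LAMBDA) is not a valid call;   */
  /* a normal draw with mean &LAMBDA. and standard deviation &SIGMA.   */
  /* is used here, and the intended distribution may differ.           */
  DATA WORK.TREND;
    SET WORK.TREND;
    OUTPUT;
    T_PREVIOUS = T;
    Y_PREVIOUS = FORECAST + RAND('NORMAL', &LAMBDA., &SIGMA.);
    RETAIN T_PREVIOUS Y_PREVIOUS;
  RUN;

  /* Step the last observation forward by one interval. */
  DATA WORK.NEW;
    SET WORK.TREND;
    BY DUMMY;
    IF LAST.DUMMY;
    DELTA_T = T - T_PREVIOUS;
    T = T + DELTA_T;
    DELTA_Y = Y - Y_PREVIOUS + 1;
    Y = Y + DELTA_Y;
    KEEP T Y DUMMY;
  RUN;

  DATA WORK.TSERIES;
    SET WORK.TSERIES WORK.NEW;
  RUN;
%MEND AC;
%AC(5);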
