
New Measures for Different Transients

You think you are a young star? New Measures for Different Transients. Ashish Mahabal, Caltech (aam at astro.caltech.edu). 4th Gaia Science Alerts Workshop, IAP, Paris, 19-21 June 2013. Collaborators: Gaia collaboration (Lukasz Wyrzykowski, Sergey Koposov, Gerry Gilmore, Simon Hodgkin).


Presentation Transcript


  1. You think you are a young star? New Measures for Different Transients. Ashish Mahabal, Caltech (aam at astro.caltech.edu). 4th Gaia Science Alerts Workshop, IAP, Paris, 19-21 June 2013

  2. Collaborators
     • Gaia collaboration: Lukasz Wyrzykowski, Sergey Koposov, Gerry Gilmore, Simon Hodgkin
     • SAMSI: Julian Faraway, Jiayang Sun, Jogesh Babu, Lingsong Zhang, Grace Wang, Xiaofeng Wang
     • Caltech: Alex Ball, George Djorgovski, Ciro Donalek, Andrew Drake, Matthew Graham, Kartik Saxena, Roy Williams
     • JPL: Thomas Fuchs, Mike Turmon
     • India: Ajit Kembhavi, Sajeeth Philip
     • South Africa: Sudhanshu Barway
     • Plus others at various institutes all over, but especially in the US, India and Italy
     • LSST

  3. Overview
     • Objectives
       • Classification, in real time, using minimal data
       • Applicable to archives and follow-up prioritization
     • Challenges
       • Heterogeneity of data sources
         • CRTS (today ~10), LSST (soon ~10^6), minimal overlap (e.g. DASCH), part of a larger set of parameters
       • Massive numbers of light curves
       • Missing data, measurement errors and irregularly sampled data, ...
     • Data sets
       • Lightcurves+, necessary bits from archives, images

  4. Data
     • Transients from CRTS
     • Mostly non-variables: objects at random locations
     • Brighter samples of CVs and RR Lyrae: important for connecting datasets (e.g. many brighter CRTS objects will saturate LSST, just as almost all DASCH sources are saturated in CRTS)
     • Soon: Gaia and LSST simulations

  5. Data Characteristics
     Classifying (all) transients (in real time) is hard
     • Too many ‘ordinary’ transients
       • Finding needles in a haystack
     • Too many possible ‘parameters’
       • e.g. colors, positions, flux
     CRTS --> LSST

  6. Digital snapshots -> digital movies of the sky
     • CRTS (PIs George/Andrew): 500 M lightcurves (time series) available for analysis
     • Available soon also from IUCAA, India

  7. Challenge 1: Characterize/classify as much as possible with as little data as possible. We concentrate here on lightcurves (time series).

  8. Challenge 2: Only a small fraction are rare*
     CRTS statistics as of Jun 2013: http://nesssi.cacr.caltech.edu/catalina/Stats.html
     • Current status:
       • About 1 strong (but mostly ‘ordinary’) transient per 10^6 sources, found by machine
       • High threshold to pick the most dramatic transients (identification by humans)
     • Future:
       • With LSST, a million transients will be found per night, which is a good reason why we need automatic classification algorithms
     [Figure labels: Ast/Flr, SNe]

  9. Challenge 3: A Variety of Parameters
     Not all parameters are always present, leading to Swiss-cheese-like data sets
     • Discovery: magnitudes, delta-magnitudes
     • Contextual:
       • Distance to nearest star
       • Magnitude of the star
       • Color of that star
       • Normalized distance to nearest galaxy
       • Distance to nearest radio source
       • Flux of nearest radio source
       • Galactic latitude
     • Follow-up:
       • Colors (g-r, r-i, i-z etc.)
     • Prior classifications (event type)
     • Characteristics from light-curve:
       • Amplitude
       • Median buffer range percentage
       • Standard deviation
       • Stetson K
       • Flux percentile ratio mid80
       • Prior outburst statistic
     http://ki-media.blogspot.com/
     • New lightcurve-based parameters:
       • Whole curve measures
       • Fitted curve measures
       • Residual from fit measures
       • Cluster measures
       • Other
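As an illustration of the light-curve characteristics listed on this slide, here is a minimal Python sketch of four of them. The exact definitions used in the talk are not given, so these are assumptions: amplitude is taken as half the peak-to-peak range, the median buffer range is the fraction of points within `buffer_frac` (a hypothetical parameter) of the amplitude around the median, Stetson K uses the usual single-band form, and mid80 is the ratio of the 10-90 to the 5-95 flux percentile ranges.

```python
import math
import statistics

def percentile(sorted_vals, q):
    """Percentile q (0-100) of a pre-sorted list, by linear interpolation."""
    pos = (len(sorted_vals) - 1) * q / 100.0
    lo, hi = math.floor(pos), math.ceil(pos)
    return sorted_vals[lo] + (sorted_vals[hi] - sorted_vals[lo]) * (pos - lo)

def lightcurve_features(mags, errs, buffer_frac=0.2):
    """Assumed definitions; needs at least two distinct magnitudes."""
    n = len(mags)
    amplitude = 0.5 * (max(mags) - min(mags))
    median = statistics.median(mags)
    # Fraction of points lying close to the median magnitude.
    in_buffer = sum(1 for m in mags if abs(m - median) < buffer_frac * amplitude)
    # Stetson K: ratio of mean absolute to rms scaled residual.
    weights = [1.0 / e ** 2 for e in errs]
    wmean = sum(w * m for w, m in zip(weights, mags)) / sum(weights)
    delta = [math.sqrt(n / (n - 1)) * (m - wmean) / e for m, e in zip(mags, errs)]
    stetson_k = sum(abs(d) for d in delta) / math.sqrt(n * sum(d * d for d in delta))
    # Flux percentile ratio: width of the middle 80% over the middle 90%.
    flux = sorted(10 ** (-0.4 * m) for m in mags)
    mid80 = ((percentile(flux, 90) - percentile(flux, 10))
             / (percentile(flux, 95) - percentile(flux, 5)))
    return {
        "amplitude": amplitude,
        "std": statistics.stdev(mags),
        "median_buffer_range": in_buffer / n,
        "stetson_k": stetson_k,
        "flux_percentile_ratio_mid80": mid80,
    }
```

A call such as `lightcurve_features([15.0, 15.2, 14.8, 15.1, 14.9], [0.1] * 5)` returns a dictionary that can be joined with the contextual parameters; missing entries would still have to be handled separately.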

  10. Challenge 4: Lightcurve demonstrating upper limits

  11. Our Approaches
      • Methods (recall our objective: classification):
        • 1. Modern EDA before classification on stats, lightcurves in 1-d and high-d (graphical computation)
        • Improvement from 4 directions:
          • 1. Better with new derived statistics
          • 2. Better classification procedure (single, ensemble)
          • 3. Better with previously ignored information
            • ‘semi-supervised’ learning
          • 4. Better in terms of using less data or an incremental approach
        • Notes: classification based on derived statistics or entire curve (2-4)
        • 3. Methodology development

  12. EDA on Non-Variables

  13. EDA on a transient (change is sudden)

  14. EDA on a transient with changes over longer periods

  15. EDA on a sub-group: active galactic nuclei, which include blazars

  16. SED boxplots for CRTS flares (CRTS, SDSS, 2MASS, WISE mags)

  17. SED for CRTS flares in parallel coordinates

  18. SED for CRTS flares in parallel coordinates

  19. Derive new statistics
      • How?
        • Fit curves (by FDA, NP, Gaussian process modeling)
          • Functional Data Analysis: registration
          • Non-Parametric (Regression): incorporate known variances
          • GPM: use the known variances to build the prior
        • Residuals:
          • Variability, outliers/signals, ...
        • Others
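One way to "incorporate known variances" in a non-parametric fit can be sketched as follows. This is only an illustration, not the method used in the talk: it swaps in a Gaussian-kernel (Nadaraya-Watson style) smoother whose weights are divided by the measurement variance, and the function names, the `bandwidth` parameter and the grid size are all assumptions.

```python
import math

def weighted_kernel_fit(times, mags, errs, t_eval, bandwidth):
    """Kernel-smoothed magnitude at t_eval, down-weighting noisy points."""
    num = den = 0.0
    for t, m, e in zip(times, mags, errs):
        # Gaussian kernel in time, scaled by inverse measurement variance.
        w = math.exp(-0.5 * ((t_eval - t) / bandwidth) ** 2) / e ** 2
        num += w * m
        den += w
    return num / den

def fitted_curve_measures(times, mags, errs, bandwidth, n_grid=50):
    """Range, largest slope and time-scaled total variation of the fit."""
    t0, t1 = min(times), max(times)
    step = (t1 - t0) / (n_grid - 1)
    grid = [t0 + i * step for i in range(n_grid)]
    fit = [weighted_kernel_fit(times, mags, errs, t, bandwidth) for t in grid]
    jumps = [abs(b - a) for a, b in zip(fit, fit[1:])]
    return {
        "range": max(fit) - min(fit),
        "max_abs_derivative": max(jumps) / step,
        "total_variation_per_day": sum(jumps) / (t1 - t0),
    }
```

A point with a large quoted error barely moves the fitted curve, which is the behaviour one wants before computing residual-based statistics.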

  20. Generation of new summary measures
      • Modeled curve (fitted): summary measures
      • Residuals: summary measures
      • Clusters of observations (in 30-minute groups of 4): summary measures

  21. (Old/)New Summary Statistics
      • Whole curve measures: median magnitude (mag); mean of absolute differences of successive observed magnitudes; the maximum magnitude difference
      • Fitted curve measures: total variation scaled by number of days of observation; range of fitted curve; maximum derivative in the fitted curve
      • Residual from fit measures: the maximum studentized residual; SD of residuals; skewness of residuals; Shapiro-Wilk statistic of residuals
      • Cluster measures: fit the means within the groups (up to 4 measurements), then take the logged SD of the residuals from this fit; the max absolute residual from this fit; total variation of curve based on group means, scaled by range of observation
      • Other
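The whole-curve and cluster measures can be sketched in a few lines of Python. This is a rough reading of the slide, not the authors' code: observations are assumed to be time-sorted with times in days, a group is taken to be up to 4 points within 30 minutes of the group's first point, the maximum difference is taken between successive points, and "range of observation" is interpreted as the time span. All function names are hypothetical.

```python
import math
import statistics

def whole_curve_measures(mags):
    steps = [abs(b - a) for a, b in zip(mags, mags[1:])]
    return {
        "median_mag": statistics.median(mags),
        "mean_abs_step": sum(steps) / len(steps),
        "max_abs_step": max(steps),
    }

def group_observations(times, mags, window_days=30.0 / 1440.0, max_size=4):
    """Split a time-sorted light curve into short groups of observations."""
    groups = []
    for t, m in zip(times, mags):
        last = groups[-1] if groups else None
        if last and len(last) < max_size and t - last[0][0] <= window_days:
            last.append((t, m))
        else:
            groups.append([(t, m)])
    return groups

def cluster_measures(times, mags):
    """Assumes at least two groups and some scatter within the groups."""
    groups = group_observations(times, mags)
    means = [statistics.mean([m for _, m in g]) for g in groups]
    # Residuals of each point from the mean of its own group.
    residuals = [m - mu for g, mu in zip(groups, means) for _, m in g]
    steps = [abs(b - a) for a, b in zip(means, means[1:])]
    return {
        "n_groups": len(groups),
        "log_sd_resid": math.log(statistics.stdev(residuals)),
        "max_abs_resid": max(abs(r) for r in residuals),
        "scaled_total_variation": sum(steps) / (times[-1] - times[0]),
    }
```

Within-group scatter then acts as an empirical noise estimate, while the variation between group means tracks real night-to-night change.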

  22. Relative significance of parameters
      • Linear trend:
        • sign(linear trend) × log(|linear trend| + 1e−06)
        • sign(linear trend) × √|linear trend|
      • med_buf_range_per: −log(1 − med_buf_range_per)
      • Kurtosis: log(3 + kurtosis)
      Parameters from Richards et al.
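The transformations on this slide are simple enough to write out directly. The Python below is a sketch with hypothetical function names; it assumes `kurtosis` means excess kurtosis (so that 3 + kurtosis is positive) and that `med_buf_range_per` is a fraction strictly below 1.

```python
import math

def sign(x):
    return (x > 0) - (x < 0)

def signed_log(trend, eps=1e-6):
    # As written on the slide; note log(...) is negative when |trend| + eps < 1.
    return sign(trend) * math.log(abs(trend) + eps)

def signed_sqrt(trend):
    return sign(trend) * math.sqrt(abs(trend))

def transform_med_buf_range_per(p):
    # Stretches values close to 1; undefined at p == 1.
    return -math.log(1.0 - p)

def transform_kurtosis(excess_kurtosis):
    return math.log(3.0 + excess_kurtosis)
```

Both linear-trend variants keep the sign of the trend while compressing its dynamic range, which tends to help classifiers that are sensitive to heavy-tailed inputs.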

  23. High dimensional views via modern graphics and PP

  24. Available data with non-variables and 7 transient types
      • Random split: training set N=2480, test set N=1240
      • Percent correctly classified in the test set
      • Others: multinomial logistic, DA + new ensembles

  25. Whole Curve Comparisons
      • PfClust -> PfClassification
      • Functional Centroid Method (FCC)
        • Model m(x) and of the whole curves for each class
        • Develop simultaneous confidence bands for each m(x)
        • Define a functional distance measure between curves
        • Classify a new curve to one of the existing classes or a new class of curves based on the distances
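The centroid idea can be sketched as below, under simplifying assumptions that are not from the talk: every curve is already registered and sampled on a common grid, the class model m(x) is the pointwise mean, the functional distance is the RMS difference, and a fixed `new_class_threshold` (hypothetical) stands in for the simultaneous confidence bands.

```python
import math

def mean_curve(curves):
    """Pointwise mean m(x) of curves sampled on the same grid."""
    n = len(curves)
    return [sum(c[i] for c in curves) / n for i in range(len(curves[0]))]

def rms_distance(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))

def fit_centroids(training):
    """training maps a class label to a list of curves."""
    return {label: mean_curve(curves) for label, curves in training.items()}

def classify(curve, centroids, new_class_threshold):
    """Nearest centroid, or 'new' if no class is close enough."""
    label, dist = min(
        ((lab, rms_distance(curve, c)) for lab, c in centroids.items()),
        key=lambda pair: pair[1],
    )
    return ("new", dist) if dist > new_class_threshold else (label, dist)
```

A curve far from every centroid is flagged as a candidate new class rather than being forced into an existing one, which matches the last bullet of the slide.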

  26. Development of Functional Method Exploration Step: are they different and separable? • Directly estimate the (pair-wise) mean difference between classes • Bootstrap method to estimate the (point-wise) confidence intervals.
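The exploration step can be illustrated with a percentile bootstrap. This is a minimal sketch, assuming curves on a common grid and resampling whole curves with replacement within each class; the function names, `n_boot`, `alpha` and `seed` are illustrative choices rather than details from the talk.

```python
import random

def mean_curve(curves):
    n = len(curves)
    return [sum(c[i] for c in curves) / n for i in range(len(curves[0]))]

def bootstrap_mean_difference(class_a, class_b, n_boot=1000, alpha=0.05, seed=0):
    """Pointwise percentile intervals for mean(class_a) - mean(class_b)."""
    rng = random.Random(seed)
    diffs = []
    for _ in range(n_boot):
        # Resample whole curves so within-curve correlation is preserved.
        mean_a = mean_curve([rng.choice(class_a) for _ in class_a])
        mean_b = mean_curve([rng.choice(class_b) for _ in class_b])
        diffs.append([x - y for x, y in zip(mean_a, mean_b)])
    lo_idx = int(alpha / 2 * n_boot)
    hi_idx = min(n_boot - 1, int((1 - alpha / 2) * n_boot))
    lower, upper = [], []
    for i in range(len(class_a[0])):
        column = sorted(d[i] for d in diffs)
        lower.append(column[lo_idx])
        upper.append(column[hi_idx])
    return lower, upper
```

Grid points where the interval excludes zero are where the two classes differ, and so where a distance measure has something to work with.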

  27. Selective comparison

  28. Selective comparison

  29. Selective comparison

  30. Using domain knowledge effectively
      As part of the general theme of classifying transients and variables, it is important to separate SNe from non-SNe. Simple parameters that can be used:
      • Distance to nearest star (and perhaps color)
      • Galaxy proximity (normalized)
      • Archival lightcurve (including upper limits)

  31. Proximity to a galaxy is useful in marking SNe
      • Often there is confusion when more than one galaxy is present nearby
        • As a result, distance has to be normalized
        • Which radius to use is also an important consideration
      • Misclassification of S/G is possible, so look at nearest stars
        • Coincident stars will be a direct NO
      • Dwarf galaxies (low SNR, not catalogued) have to be considered
      • Only a small fraction have radio counterparts, so that too is a NO (but if yes, it's likely to be very interesting)
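To make the normalization concrete, here is a small Python sketch. It assumes, purely for illustration, that the normalized distance is the angular separation divided by a catalogued galaxy radius; the slide leaves open which radius to use, and the dictionary keys (`name`, `ra`, `dec`, `radius_arcsec`) are hypothetical.

```python
import math

def angular_separation_arcsec(ra1, dec1, ra2, dec2):
    """Great-circle separation (haversine formula); inputs in degrees."""
    r1, d1, r2, d2 = (math.radians(v) for v in (ra1, dec1, ra2, dec2))
    h = (math.sin((d2 - d1) / 2) ** 2
         + math.cos(d1) * math.cos(d2) * math.sin((r2 - r1) / 2) ** 2)
    return math.degrees(2 * math.asin(math.sqrt(h))) * 3600.0

def most_likely_host(ra, dec, galaxies):
    """Return (name, normalized distance) of the galaxy with the smallest
    separation measured in units of its own radius."""
    best = None
    for g in galaxies:
        sep = angular_separation_arcsec(ra, dec, g["ra"], g["dec"])
        norm = sep / g["radius_arcsec"]
        if best is None or norm < best[1]:
            best = (g["name"], norm)
    return best
```

With this choice, a transient 20 arcsec from a galaxy of 30 arcsec radius is associated with that galaxy rather than with a compact one of 2 arcsec radius lying 5 arcsec away.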

  32. Definite SN, but in which galaxy? (The transient, not seen, is at the center.)

  33. Separation of SNe and non-SNe (normalized) • Based on peaks • 80-90% completeness

  34. R Cor Bor: has two distinct states • Carbon rich, with dust formation episodes (Sumin Tang)

  35. RR Lyrae and the Blazhko effect

  36. Period-changing RRL (A Drake) • Neither Blazhko nor RRd.

  37. Period-changing RR Lyrae (A Drake) • Split data in two parts. • Two separate, discrete periods separated by a 100-day gap. • RRab -> RRc?
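The "split data in two parts" check can be sketched with a conditional-entropy period search, one of the periodicity measures listed elsewhere in this talk. The code is an illustration only: the bin counts, the trial-period grid and the split at the midpoint of the sample are assumptions, and a real analysis would split at the observing gap.

```python
import math

def conditional_entropy(times, mags, period, n_phase=10, n_mag=5):
    """H(mag | phase) of the folded light curve; lower means more ordered.
    Assumes the magnitudes are not all identical."""
    lo, hi = min(mags), max(mags)
    counts = [[0] * n_mag for _ in range(n_phase)]
    for t, m in zip(times, mags):
        i = min(int(((t / period) % 1.0) * n_phase), n_phase - 1)
        j = min(int((m - lo) / (hi - lo) * n_mag), n_mag - 1)
        counts[i][j] += 1
    n = len(times)
    h = 0.0
    for row in counts:
        p_phase = sum(row) / n
        for c in row:
            if c:
                h += (c / n) * math.log(p_phase / (c / n))
    return h

def best_period(times, mags, trial_periods):
    return min(trial_periods, key=lambda p: conditional_entropy(times, mags, p))

def periods_by_half(times, mags, trial_periods):
    """Best period for the first and second half of a time-sorted curve."""
    mid = len(times) // 2
    return (best_period(times[:mid], mags[:mid], trial_periods),
            best_period(times[mid:], mags[mid:], trial_periods))
```

If the two halves return clearly different periods, the object is a candidate period changer rather than a multi-mode pulsator observed over a single interval.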

  38. Characterizing measures - I (M Graham)
      • Variability:
        • Abbe, von Neumann, Stetson J, K, L, reduced chi-square, Kendall
      • Morphology:
        • Wozniak (2000) consecutive statistic
        • Cumulative sum range
        • Moments: mean, variance, skew, kurtosis
        • Median absolute deviation
        • Theil-Sen estimator of median slope
      • Periodicity:
        • Period by conditional entropy
        • Optimal Fourier decomposition using CE (F-test)
        • Weighted wavelet z-transform around CE period
      • Autocorrelation:
        • Kendall τ statistic
        • Durbin-Watson statistic
        • Detrended fluctuation analysis (long memory or 1/f processes)
        • Hurst exponent (long-term memory)
        • ZCDF: ACF(τ0) = 0, KACF
        • Granger causality analysis (temporal dependency)
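A few of the simpler measures on this slide can be written out as follows. This is a sketch with assumed conventions, since normalizations differ between papers: the von Neumann ratio is the mean squared successive difference over the sample variance, the Durbin-Watson statistic is applied to residuals about the mean (a constant model), and the reduced chi-square is taken about the inverse-variance weighted mean.

```python
import statistics

def von_neumann_ratio(mags):
    """About 2 for uncorrelated noise, smaller for smooth trends."""
    n = len(mags)
    delta_sq = sum((b - a) ** 2 for a, b in zip(mags, mags[1:])) / (n - 1)
    return delta_sq / statistics.variance(mags)

def durbin_watson(mags):
    """Durbin-Watson statistic of residuals about the mean."""
    mean = statistics.mean(mags)
    res = [m - mean for m in mags]
    num = sum((b - a) ** 2 for a, b in zip(res, res[1:]))
    return num / sum(r * r for r in res)

def reduced_chi_square(mags, errs):
    """Chi-square about the weighted mean, per degree of freedom."""
    weights = [1.0 / e ** 2 for e in errs]
    wmean = sum(w * m for w, m in zip(weights, mags)) / sum(weights)
    chi2 = sum(((m - wmean) / e) ** 2 for m, e in zip(mags, errs))
    return chi2 / (len(mags) - 1)

def median_absolute_deviation(mags):
    med = statistics.median(mags)
    return statistics.median([abs(m - med) for m in mags])
```

For example, the steadily fading series `[1, 2, 3, 4, 5, 6]` gives a von Neumann ratio well below 2, while an alternating series gives a value well above 2.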

  39. Characterizing measures - II (M Graham)
      • Processes:
        • Teraesvirta neural network test (nonlinearity)
        • Lyapunov exponent (chaos)
        • Slepian wavelet (characteristic timescale)
      • HMM:
        • Inhomogeneous mixture (AIC/BIC)
        • Continuous (AIC/BIC)
      • CAR(0) / CAR(p) (AIC)

  40. Summary
      • Developing new derived statistics
      • Early characterization needed for selecting rare objects
      • Allowing for incremental classification
      • Characterizing based on domain knowledge
      • Public datasets like CRTS/skydot are important
      • Gaia follow-up and LSST simulations to be linked soon
