
Toward an Intelligent Event Broker: Automated Transient Classification


Presentation Transcript


  1. Toward an Intelligent Event Broker: Automated Transient Classification. Przemek Wozniak, Nov 15, 2013

  2. Transient Classification and Follow-up Optimization work at LANL
  • RAPTOR/Thinking Telescopes network
  • Active machine learning team
  • LANL is a member of the iPTF consortium; collaborations with JPL and UC Berkeley
  • LSST Transient Science Working Group

  3. Event Broker Architecture (diagram). Components: VOEvent and VOAgent clients with their client registries; VOEvent submission, VOProfile submission, alert distribution, and bid distribution; event translation and event DBs; a classification chain (transient classification, transient DB, context builder, context DB, external DBs); and a follow-up chain (follow-up optimization, coalition manager, coalition DB).

  4. Real/Bogus Classification for iPTF
  Computer vision approach: feed pixels directly into classifiers (figure: example REAL and BOGUS candidate cutouts)

  5. Sparse representations with learned dictionaries
  L = sparsity factor. Iterate C times over all training data until dictionary convergence:
  • Sparse coding for training vector x: w = \arg\min_w \|x - Dw\|_2^2 subject to \|w\|_0 \le L
  • Batch dictionary update, given the data and sparse weights: D = \arg\min_D \|X - DW\|_F^2
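
A minimal NumPy/sklearn sketch of this two-step loop (the function name, the OMP coder, and the MOD-style closed-form update are illustrative assumptions; the slide does not show the original code):

```python
import numpy as np
from sklearn.linear_model import orthogonal_mp

def learn_dictionary(X, K, L, n_epochs=10, seed=0):
    """Two-step dictionary learning: sparse coding + batch update.
    X: (n_features, n_samples) training vectors; K: dictionary size;
    L: sparsity factor; n_epochs: passes over the training data."""
    rng = np.random.default_rng(seed)
    # Dictionary imprinting: initialize with a random selection of inputs.
    D = X[:, rng.choice(X.shape[1], K, replace=False)].copy()
    D /= np.linalg.norm(D, axis=0) + 1e-12
    for _ in range(n_epochs):
        # Sparse coding: w = argmin_w ||x - Dw||^2  s.t.  ||w||_0 <= L
        W = orthogonal_mp(D, X, n_nonzero_coefs=L)
        # Batch update: D = argmin_D ||X - DW||_F^2 = X W^T (W W^T)^+
        D = X @ W.T @ np.linalg.pinv(W @ W.T)
        D /= np.linalg.norm(D, axis=0) + 1e-12
    return D
```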

  6. SparseReps algorithm
  • Input data consists of 21 x 21 pix postage stamps from image triples (new, ref, sub): 3 x 441 numbers per candidate
  • Learn 6 dictionaries: R/B x (new, ref, sub)
  • K = 900 dictionary elements (2x overcomplete), L = 8 sparsity
  • Present the same data 10 times in random order
  • Dictionary imprinting (initialize with a random selection of inputs)
  • For each candidate, calculate the nearest subspace distance to each dictionary and use these as 6 new features for a higher-level classifier (see the sketch below)
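
Reading "nearest subspace distance" as the residual of the best L-sparse reconstruction in each dictionary, the feature extraction might look like this sketch (the helper names are illustrative; the nsd_* feature names follow slide 8):

```python
import numpy as np
from sklearn.linear_model import orthogonal_mp

def nearest_subspace_distance(x, D, L=8):
    """Residual distance from vector x to its best L-sparse
    reconstruction in dictionary D (columns = atoms)."""
    w = orthogonal_mp(D, x, n_nonzero_coefs=L)
    return np.linalg.norm(x - D @ w)

def sparse_reps_features(stamps, dictionaries, L=8):
    """stamps: {'new'|'ref'|'sub': 21x21 array} for one candidate.
    dictionaries: {(label, image): (441, K) array} learned separately
    for R/B x (new, ref, sub). Returns the 6 nsd_* features."""
    feats = {}
    for img in ('new', 'ref', 'sub'):
        x = stamps[img].ravel()
        for label in ('real', 'bogus'):
            feats[f'nsd_{img}_{label}'] = nearest_subspace_distance(
                x, dictionaries[(label, img)], L)
    return feats
```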

  7. Imprinted dictionaries: even learning, normalized data (figure: learned REAL and BOGUS dictionary atoms for the reference, new image, and difference stamps)

  8. RB3 features and training
  • Training data based on RB2 (Brink et al. 2012): 77811 candidates (14681 real + 63130 bogus)
  • Features from RB2: ccid, seeingnew, extracted, l1, b_image, flag, mag_ref, mag_ref_err, a_ref, b_ref, flux_ratio, ellipticity_ref, nn_dist_renorm, maglim, normalized_fwhm_ref, good_cand_density, min_distance_to_edge_in_new, gauss, amp
  • Append new SparseReps features: nsd_new_real, nsd_new_bogus, nsd_ref_real, nsd_ref_bogus, nsd_sub_real, nsd_sub_bogus

  9. Real/Bogus version 3.x
  • Two codes: SparseReps + RB3
  • Python: NumPy/SciPy + sklearn
  • Use SparseReps to learn new features
  • Append ML features to "expert" features
  • Use a random forest for high-level classification (see the sketch below)
  • Running on the database at NERSC (no coadds)
  • Cron job during daytime with an is_processing_flag veto
  • Requires a four-way SQL table join: candidate, subtraction, image, features
  • Introduces two new tables: sparse_reps_features and realbogus
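
A sketch of the high-level classification step, assuming the joined feature table has already been pulled from the database (the hyperparameters and function names are illustrative, not taken from the RB3 code):

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

def train_rb3(expert_feats, nsd_feats, labels):
    """expert_feats: (n, 19) expert features from RB2; nsd_feats: (n, 6)
    SparseReps nearest-subspace distances; labels: 1 = real, 0 = bogus."""
    # Append the learned ML features to the expert features.
    X = np.hstack([expert_feats, nsd_feats])
    clf = RandomForestClassifier(n_estimators=500, n_jobs=-1)
    clf.fit(X, labels)
    return clf

def rb3_score(clf, expert_feats, nsd_feats):
    """Real/Bogus score = random-forest probability of the 'real' class."""
    return clf.predict_proba(np.hstack([expert_feats, nsd_feats]))[:, 1]
```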

  10. RB3 deployment
  • Running at NERSC since July 1, 2013
  • 55 million candidates processed as of Nov 14
  • Throughput: SparseReps 6000 cand/min + RandomForest 1e5 cand/min
  • Scores available through a web interface

  11. RB3 interface

  12. RB3 performance: ROC curve

  13. Examples of RB3 transients

  14. Classification of Supernova Spectra from the SED Machine
  • Train and test on simulated data sets based on the SNID database (Blondin & Tonry 2007)
  • Randomly draw objects from SN types, interpolate age, degrade resolution to R=100 between 380 and 920 nm, inject noise
  • Take flux at each wavelength as a feature (100-dimensional feature space); use Support Vector Machines to classify (see the sketch below)
  • Track accuracy and ability to detect young objects as a function of S/N ratio and number of classes
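
A compact sketch of this experiment: flux on ~100 wavelength bins as the feature vector, Gaussian noise injected at a chosen per-pixel S/N, and an SVM scored by cross-validation (the RBF kernel and feature standardization are assumptions; the slides do not specify them):

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline
from sklearn.model_selection import cross_val_score

N_BINS = 100  # flux sampled on ~100 bins between 380 and 920 nm (R = 100)

def inject_noise(flux, snr, rng):
    """Add Gaussian noise at a fixed per-pixel signal-to-noise ratio."""
    return flux + rng.normal(0.0, np.abs(flux) / snr)

def cv_accuracy(spectra, sn_types, folds=5):
    """spectra: (n, N_BINS) noisy fluxes; sn_types: class labels such as
    'Ia', 'Ib', 'Ic', 'II'. Returns mean cross-validation accuracy."""
    clf = make_pipeline(StandardScaler(), SVC(kernel='rbf'))
    return cross_val_score(clf, spectra, sn_types, cv=folds).mean()
```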

  15. Cross-validation accuracy (plots for two class sets: Ia, Ib, Ic, II and Ia, Ib, Ic, IIP, IIL, IIn, IIb)

  16. Classifying noisy spectra: cross-validation results
  • Need ~100 spectra per class with 10 < SNR < 100 to train a classifier capable of 90% accuracy
  • Need SNR > 15 per pixel to classify a spectrum with 90% or better accuracy
  • Preprocessing (redshift correction, continuum normalization) will be less reliable at low S/N
  • Bin in age and treat young SNe as a separate class

  17. Confusion matrix: 4 classes (four panels)
  • N=300, SNR=3: acc = 59.6%
  • N=30, SNR=3: acc = 64.5%
  • N=300, SNR=50: acc = 97.1%
  • N=30, SNR=50: acc = 85.5%

  18. Confusion matrix: 7 classes (four panels)
  • N=30, SNR=3: acc = 54.9%
  • N=300, SNR=3: acc = 50.1%
  • N=30, SNR=50: acc = 87.6%
  • N=300, SNR=50: acc = 98.0%
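
The per-panel accuracies above can be tallied from a confusion matrix as in this sketch (sklearn naming; the evaluate function itself is illustrative):

```python
from sklearn.metrics import accuracy_score, confusion_matrix

def evaluate(clf, test_spectra, test_types, classes):
    """Confusion matrix and accuracy on held-out noisy spectra."""
    pred = clf.predict(test_spectra)
    return (confusion_matrix(test_types, pred, labels=classes),
            accuracy_score(test_types, pred))
```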

  19. Estimating the age of SN Ia

  20. Select young SNe: <10 days before the peak
  • N=30, SNR=5: acc = 70.1%
  • N=300, SNR=5: acc = 68.1%
  • N=30, SNR=50: acc = 86.8%
  • N=300, SNR=50: acc = 93.9%

  21. Classifying SN Ia subtypes
  • SN 1991T-like: overluminous, no Si II, Ca II early on
  • SN 1991bg-like: very underluminous, no secondary max in I band, Ti II early on, Na I D after max, early nebular phase
  • CSM: circumstellar medium signature?
  • PEC: peculiar

  22. Summary
  • First iteration of event broker architecture and prototyping
  • Computer vision approach to Real/Bogus classification; successful application of sparse representations with learned dictionaries gains ~2x better performance
  • RB3.x deployed at the NERSC supercomputing center
  • SVMs can deliver accurate spectral classification of supernovae observed with the SED Machine and support selection of young events
  • Promising results in recognizing SN Ia anomalies

  23. Event Broker: distributed processing (diagram: each Data Processing Engine stage receives input from the previous stage through a TCP server, runs a main processing loop between incoming and outgoing job queues, and sends output to the next stage through a TCP client; see the sketch below)
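
A minimal sketch of one such stage, assuming newline-delimited JSON jobs over TCP (the wire format, function names, and threading layout are illustrative; the slide only names the components):

```python
import json
import queue
import socket
import socketserver
import threading

incoming = queue.Queue()  # jobs received from the previous stage
outgoing = queue.Queue()  # results bound for the next stage

class JobReceiver(socketserver.StreamRequestHandler):
    """TCP server side: accept one JSON job per line from the previous stage."""
    def handle(self):
        for line in self.rfile:
            incoming.put(json.loads(line))

def main_processing_loop(process):
    """Pop jobs off the incoming queue, apply this stage's processing
    step (e.g. classification), and queue the result for the next stage."""
    while True:
        outgoing.put(process(incoming.get()))

def send_to_next_stage(next_host, next_port):
    """TCP client side: forward finished jobs to the next stage."""
    with socket.create_connection((next_host, next_port)) as sock:
        while True:
            sock.sendall((json.dumps(outgoing.get()) + '\n').encode())

def run_stage(listen_port, next_host, next_port, process):
    """Wire the pieces together and serve until interrupted."""
    threading.Thread(target=main_processing_loop, args=(process,),
                     daemon=True).start()
    threading.Thread(target=send_to_next_stage, args=(next_host, next_port),
                     daemon=True).start()
    with socketserver.ThreadingTCPServer(('', listen_port), JobReceiver) as srv:
        srv.serve_forever()
```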
