GSC2.2 Classification
GSC II Annual Meeting, October 2001
Single Plate Classification
Decision tree classifier:
• Use ranks to handle plate-to-plate variation
• 5000+ objects in training set
• OC1 oblique decision tree (Murthy et al.)
• Build several decision trees & let them vote (see the sketch below)
• Classification categories: star / nonstar / defect
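A minimal sketch of this scheme, assuming scikit-learn's axis-parallel DecisionTreeClassifier as a stand-in for the OC1 oblique trees; the rank transform, bootstrap resampling, tree depth, and all function names are illustrative assumptions, not the actual GSC2.2 pipeline.

```python
# Sketch: rank-transformed features plus an ensemble of decision trees that vote.
# scikit-learn trees stand in for OC1 oblique trees (illustrative only).
import numpy as np
from scipy.stats import rankdata
from sklearn.tree import DecisionTreeClassifier

CLASSES = ["star", "nonstar", "defect"]

def rank_features(X):
    """Replace each feature column by its rank within the plate, which absorbs
    plate-to-plate zero-point and scale differences."""
    return np.column_stack([rankdata(col) for col in X.T])

def train_voters(X_train, y_train, n_trees=5, seed=0):
    """Train several trees on bootstrap resamples of the ~5000-object training set."""
    rng = np.random.default_rng(seed)
    voters = []
    for _ in range(n_trees):
        idx = rng.integers(0, len(X_train), len(X_train))
        Xb, yb = X_train[idx], np.asarray(y_train)[idx]
        voters.append(DecisionTreeClassifier(max_depth=8).fit(Xb, yb))
    return voters

def classify(voters, X):
    """Let the trees vote; return the majority label for each object."""
    votes = np.array([t.predict(X) for t in voters])   # shape (n_trees, n_objects)
    return [max(CLASSES, key=list(col).count) for col in votes.T]
```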
GSC2.2 Classification
Unlike astrometry and photometry, where one best value was selected per object (per bandpass), GSC2.2 classification can combine multi-plate information to improve the final classifications and counter some known weaknesses.
Multi-Plate Voting
For each object (see the sketch below):
• Collect all single-plate measurements
  • Even from plates not being exported, e.g. IV-N
• Override defect -> nonstar if N(obs) > 1
  • Matched objects are likely to be real objects
• Eliminate 25um scan data if 15um data exist
  • Classifier poorly tuned for these scans
• Majority vote of remaining measurements
  • Voting among classifiers is known to improve results
• Break ties in favor of nonstars
  • Compensates for a known bias
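A sketch of these voting rules; the Measurement record, its field names, and the 15um/25um encoding are hypothetical stand-ins for the actual GSC2.2 data model.

```python
# Sketch of the multi-plate voting rules listed above (illustrative data model).
from collections import Counter
from dataclasses import dataclass

@dataclass
class Measurement:
    label: str        # single-plate classification: "star", "nonstar", or "defect"
    scan_um: int      # scan resolution in microns: 15 or 25

def vote(measurements):
    """Combine the single-plate classifications for one matched object."""
    # Eliminate 25um scan data if 15um data exist (classifier poorly tuned for 25um).
    if any(m.scan_um == 15 for m in measurements):
        measurements = [m for m in measurements if m.scan_um != 25]

    labels = [m.label for m in measurements]

    # Override defect -> nonstar if N(obs) > 1: matched objects are likely real.
    if len(labels) > 1:
        labels = ["nonstar" if lbl == "defect" else lbl for lbl in labels]

    # Majority vote of the remaining measurements; break ties in favor of nonstar.
    counts = Counter(labels)
    top = max(counts.values())
    tied = [lbl for lbl, n in counts.items() if n == top]
    return "nonstar" if "nonstar" in tied else tied[0]
```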
Auxiliary Information: the Source Status Flag
• GSC2.2 provides a wealth of additional information about each object via the source status flag.
• Much of this information is pertinent to the quality of the final classification.
• Informed users can further optimize their results (e.g. guide star selection) with this auxiliary data.
Status Flag Details: 0987654321
10-digit decimal mask with relevant info (see the parsing sketch below)
Columns:
• 0: blend status
• 9: incomplete processing
• 8: classification voters
• 7: classification unanimity
• 654: photometric details (V,J,F)
• 3: centroider details
• 21: number of plate observations
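A minimal sketch of unpacking the 10-digit flag into its fields, assuming (as the header 0987654321 and the Tycho-star flag 9999999900 quoted later suggest) that the leftmost digit is column 0 and the rightmost pair is columns 2,1; the function name and dictionary keys are illustrative.

```python
# Sketch: unpack the 10-digit decimal status flag into named digit fields.
# Column layout follows the slide above; names and ordering are assumptions.
def parse_status_flag(flag):
    s = str(flag).zfill(10)          # pad to 10 digits: columns 0,9,8,7,6,5,4,3,2,1
    return {
        "blend":       int(s[0]),    # column 0: blend status
        "incomplete":  int(s[1]),    # column 9: incomplete processing
        "voters":      int(s[2]),    # column 8: classification voters
        "unanimity":   int(s[3]),    # column 7: classification unanimity
        "photometry":  s[4:7],       # columns 6,5,4: photometric details
        "centroider":  int(s[7]),    # column 3: centroider details
        "n_plate_obs": int(s[8:]),   # columns 2,1: number of plate observations
    }

# Example with the Tycho-star flag quoted later in the talk:
# parse_status_flag(9999999900) -> {'blend': 9, ..., 'n_plate_obs': 0}
```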
Classification and the Status Flag
0: blend status
• Poorly tuned for blends => lower confidence
9: incomplete processing
• No features computed => lower confidence
8: classification voters
• Multiple voters => higher confidence
• 25um voters => lower confidence
Classification and the Status Flag
7: classification unanimity
• Unanimous vote => higher confidence
654: photometric details (V,F,J)
3: centroider details
21: number of plate observations
• More voters => higher confidence (heuristics sketched below)
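A rough sketch of how a user might fold these digits into a qualitative confidence estimate, building on the hypothetical parse_status_flag() above; the slides do not specify what individual digit values mean, so treating any nonzero digit as "condition applies" and the scoring scheme itself are purely illustrative assumptions.

```python
# Sketch: qualitative confidence adjustment from status-flag digits (illustrative).
def classification_confidence(flag):
    f = parse_status_flag(flag)
    score = 0
    if f["blend"]:            # blend status set: classifier poorly tuned for blends
        score -= 1
    if f["incomplete"]:       # incomplete processing: no features computed
        score -= 1
    if f["unanimity"]:        # unanimous vote across plates
        score += 1
    if f["n_plate_obs"] > 1:  # more voters
        score += 1
    return "higher" if score > 0 else ("lower" if score < 0 else "nominal")
```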
Bright Objects
• Tycho stars are included in the GSC2.2
  • Classification was set to star for these objects
  • Status flag = 9999999900 for Tycho stars
• GSC1 data were omitted from the GSC2.2
  • GSC1 classifications were excluded from voting
  • GSC1 classifier superior for m < 14
  • Include GSC1 classification in next export
Evaluating Performance: Not a simple problem
• What to measure? (see the sketch below)
  • Correctness; completeness; contamination
  • Magnitude and latitude variations
• What to compare against?
  • GSC II was constructed because there is nothing comparable to it!
  • Nonstar ≠ galaxy
  • Automatically classified samples are less reliable
  • Visually classified samples are few and small
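A small sketch of the completeness and contamination measures named above, evaluated against a hypothetical comparison sample with trusted labels; binning the same calculation by magnitude or Galactic latitude would capture the variations mentioned.

```python
# Sketch: completeness and contamination of one class versus a trusted
# comparison sample (hypothetical inputs; illustrative only).
def completeness_contamination(true_labels, catalog_labels, cls="nonstar"):
    n_true    = sum(t == cls for t in true_labels)
    n_called  = sum(c == cls for c in catalog_labels)
    n_correct = sum(t == cls and c == cls
                    for t, c in zip(true_labels, catalog_labels))
    completeness  = n_correct / n_true if n_true else float("nan")
    contamination = 1 - n_correct / n_called if n_called else float("nan")
    return completeness, contamination
```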
Accuracy vs the real questions
• How complete is my sample of nonstars?
• How pure is my sample of stars?
• What is the probability that the GSC2.2 classification of this object is correct?
The answers depend on your sample as well as on the properties of the catalog. A single quoted accuracy does not suffice.
Accuracy vs the real questions
P(Ts|S) = P(S|Ts) * P(Ts) / P(S)
i.e. the probability that an object classified as a star (S) is truly a star (Ts).
This formulation is:
• Responsive to magnitude and latitude variations
• Adaptable to a priori effects of sampling
• Adaptable to your favorite galaxy model
• Computable (we think! work in progress)
• Answers the real questions (worked example below)
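A worked sketch of the relation above, with entirely made-up numbers standing in for the prior star fraction (which would come from a star/galaxy count model) and the classifier's magnitude- and latitude-dependent conditional rates.

```python
# Sketch: P(Ts|S) via Bayes' theorem, with illustrative (made-up) numbers.
p_ts            = 0.80   # P(Ts): prior probability that an object is truly a star
p_s_given_ts    = 0.95   # P(S|Ts): classified "star" given it is a true star
p_s_given_notts = 0.10   # P(S|~Ts): classified "star" given it is not a true star

# Total probability of being classified as a star.
p_s = p_s_given_ts * p_ts + p_s_given_notts * (1 - p_ts)

# Probability that an object classified as a star really is one.
p_ts_given_s = p_s_given_ts * p_ts / p_s
print(f"P(Ts|S) = {p_ts_given_s:.3f}")   # about 0.974 with these numbers
```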