GSC-II Classifications
Oct 2000 Annual Meeting
V. Laidler, G. Hawkins, R. White, R. Smart, A. Rosenberg, A. Spagna
Preliminary Classification
Goal: Classify as well as possible to the plate limit
Metric: Minimize the overall number of errors
Procedure:
• Use ranks to handle plate-to-plate variation
• Match the training population to the sky population
• OC1 oblique decision tree (Murthy et al.)
• Build several decision trees and let them vote
• Classification categories: star / nonstar / defect
Classification / Laidler
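The "build several trees and let them vote" step can be sketched as a simple majority vote over per-tree labels. This is an illustration only, not the OC1 implementation; the per-tree labels below are hypothetical, and the tie-break rule here is a placeholder (the actual pipeline breaks ties with the multi-plate weights described later).

```python
from collections import Counter

def tree_vote(labels):
    """Majority vote among the class labels produced by several decision trees.

    Ties are broken deterministically by taking the alphabetically first
    label among the leaders (placeholder rule; the real pipeline uses
    plate weights to break ties).
    """
    counts = Counter(labels)
    top = max(counts.values())
    return min(label for label, n in counts.items() if n == top)

# three trees vote on one object
print(tree_vote(["star", "star", "nonstar"]))  # star
```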
Next Step Classification
Goal: Provide reliable guide stars to V ~ 19(?)
Metric: Minimize contamination of "stars" to Vlim while maintaining sufficient completeness for adequate coverage
Contamination: we called it a star, but it is really nonstellar
Completeness: everything that is really a star is called a star
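The two metrics defined above reduce to confusion-matrix ratios. A minimal sketch (the counts are made-up illustrative numbers, not survey results):

```python
def contamination(stars_called_star, nonstars_called_star):
    """Fraction of objects we called 'star' that are really nonstellar."""
    return nonstars_called_star / (stars_called_star + nonstars_called_star)

def completeness(stars_called_star, stars_missed):
    """Fraction of real stars that we called 'star'."""
    return stars_called_star / (stars_called_star + stars_missed)

# illustrative counts: 90 real stars called star, 10 nonstellar objects
# called star, 30 real stars classified as something else
print(contamination(90, 10))  # 0.1
print(completeness(90, 30))   # 0.75
```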
Development Areas
• Multi-plate weighted voting
• Training set magnitude distribution
• Training set sources
• Classification categories
• Classification features
• Object selection
(Each area is marked on the slide as available, in progress, or future.)
Multi-plate Weighted Voting
• Weights calculated empirically from percentages of misclassifications (NED, NPM, ~4 plates per survey)
• Compensates for observed bias in the classifier and breaks ties
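A sketch of how such a weighted vote might combine single-plate classifications. The weight table, its keys, and all numbers below are hypothetical; the slide says only that the weights come from empirical misclassification percentages.

```python
def weighted_vote(plate_votes, weights):
    """Combine per-plate classifications with empirical reliability weights.

    plate_votes: list of (plate_id, class_label) pairs for one object.
    weights:     weights[plate_id][class_label] = empirical reliability,
                 e.g. 1 - misclassification rate for that class on that plate.
    """
    totals = {}
    for plate, label in plate_votes:
        totals[label] = totals.get(label, 0.0) + weights[plate][label]
    # highest weighted total wins; sorted() makes exact ties deterministic
    return max(sorted(totals), key=totals.get)

# hypothetical reliability weights for two plates
weights = {"XP330": {"star": 0.95, "nonstar": 0.90},
           "XP853": {"star": 0.80, "nonstar": 0.85}}
print(weighted_vote([("XP330", "star"), ("XP853", "nonstar")], weights))  # star
```

Because each plate's weight reflects how often that plate's classifier is right for that class, a reliable plate can outvote a less reliable one even one-against-one, which is how the scheme breaks ties.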
Multi-plate Weighted Voting Compared to the Mendez Galaxy Model
• Current classification comes from a single plate
• Multi-plate weighted voting is a straightforward database operation
• Conservative star selection further reduces contamination; coverage remains adequate
Training Set Magnitude Distribution: What Happens to V < Vlim Objects?
Preliminary approach: V < Vlim objects
• occupy 20% of the ranked hyperspace
• are outnumbered when counting errors
• contain the same classification bias as the sky
Optimized approach: V < Vlim objects
• have more dynamic range
• contribute all the weight when counting errors
• are free of classification bias
Training Set Sources / Classification Categories
• Decision trees can be improved by using training sets with smaller dispersion in parameter space
• Catalog objects will likely provide cleaner, better separated populations
• Galaxies and blends are different => they reside in different areas of parameter space => individually they constitute better defined populations than when combined
• Galaxy / blend classifications are value added to the catalog
New Training Set
• Magnitude-balanced to F = 17: bright only
• Star / galaxy / blend classifications
• Stars and galaxies from catalogs: NED, NPM, CAMC, LCRS
• Blends from deblender "parent" objects
• 1200 objects; XP330, XP853, XP005; b = {48, 41, 28}
New Training Set: Compare to the Production Classifier
• "Above all, do no harm"
• Visually examine objects that changed classifications
New Training Set: Compare to External Catalogs
• Significant improvement in the magnitude range of the training set
• Extend the training set: can we extend this performance to Vlim?
• Possibly use star / galaxy / blend to Vlim, star / nonstar / defect below
Future Work: Classification Features
• The "curse of dimensionality" tells us that tree performance can be improved by reducing the number of features
• The Edinburgh group has used two features specifically to separate blends from galaxies
Current classification features:
• Maximum density
• Integrated density
• Semimajor axis
• Semiminor axis
• Ellipticity
• Unweighted semimajor axis
• Unweighted semiminor axis
• Unweighted ellipticity
• 4 texture features
• 2 spike features
• 16 areas
Future Work: Object Selection
• Object selection can be considered an additional classification step
• Select based on: blend status, multi-plate information, probability
• Select for functional or science goals: minimize contamination, maximize completeness
• Probability comes from the leaf population; the final probability comes from averaging the probabilities from each tree
• Can we use probabilities to further optimize guide star selection?
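The probability bullets above can be sketched directly: the per-tree probability is the class fraction among the training objects in the leaf the object reaches, and the final probability averages those over trees. The leaf counts below are made-up illustrative numbers.

```python
def leaf_probability(leaf_counts, label):
    """P(label) at one leaf = fraction of the leaf's training objects with that label."""
    return leaf_counts.get(label, 0) / sum(leaf_counts.values())

def ensemble_probability(leaves, label):
    """Final probability = average of the per-tree leaf probabilities."""
    return sum(leaf_probability(counts, label) for counts in leaves) / len(leaves)

# one object reaches these leaves in two different trees
leaves = [{"star": 9, "nonstar": 1}, {"star": 3, "nonstar": 1}]
print(ensemble_probability(leaves, "star"))  # (0.9 + 0.75) / 2 = 0.825
```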
What Do the Probabilities Mean?
• Do the probabilities measure the observed population? No. This is not unexpected: decision trees are optimized to produce correct answers, not accurate models of the probability function.
• Do the probabilities indicate reliability? Yes.
• Conclusion: we can use the probabilities to construct a "class quality" field, but should not take them at face value.
How to Improve a Classifier
Using Ranks
• Sort the objects in order by the raw feature
• Assign a ranked feature based on position in the list
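The two steps above amount to a rank transform of each feature. A minimal sketch (the function name is illustrative; ties are broken by input order here, and the slide does not specify a tie rule):

```python
def ranked_feature(raw_values):
    """Replace each raw feature value by its position in the sorted list.

    Ranks are invariant under any monotonic plate-to-plate change in the
    raw feature's scale, which is why ranking helps handle plate variation.
    """
    order = sorted(range(len(raw_values)), key=raw_values.__getitem__)
    ranks = [0] * len(raw_values)
    for rank, index in enumerate(order):
        ranks[index] = rank
    return ranks

print(ranked_feature([10.2, 3.1, 7.7]))  # [2, 0, 1]
```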