ACTIVE LEARNING USING CONFORMAL PREDICTORS: APPLICATION TO IMAGE CLASSIFICATION

L. Makili (1), J. Vega (2), S. Dormido-Canto (3)
(1) Universidade Katyavala Bwila, Benguela (Angola)
(2) Asociación EURATOM/CIEMAT para Fusión, Madrid (Spain)
(3) Universidad Nacional de Educación a Distancia (UNED), Madrid (Spain)
7th Workshop on Fusion Data Processing, Validation and Analysis

Outline
• Introduction
• Concepts overview
• Experimental results
• Conclusions

Motivation
• 5-class classification problem
• Classification of TJ-II Thomson Scattering images
• Classifier based on conformal predictors, using SVM as the underlying algorithm
• Computationally intensive task
Figure: Patterns of TSD images: (a) BKGND, (b) COFF, (c) ECRH, (d) NBI and (e) STRAY

Goal
• To find a minimal training dataset that is still good enough for classification purposes

Active learning
• The learning algorithm must have some control over the data from which it learns
• It must be able to query an oracle, requesting labels for the data samples that seem most informative for the learning process
• Proper selection of samples yields better performance with fewer data
Settles, B., “Active Learning Literature Survey”, Computer Sciences Technical Report 1648, University of Wisconsin – Madison, 2009. Available at http://research.cs.wisc.edu/tech reports/2009/TR1648.pdf

Uncertainty sampling
• The learning algorithm selects new examples when their class membership is unclear
• Suitable for classifiers that, besides making classification decisions, estimate the certainty of those decisions
Lewis, D. and Gale, W., “A Sequential Algorithm for Training Text Classifiers”. In Proceedings of the ACM-SIGIR Conference on Research and Development in Information Retrieval, Croft, W. B. and van Rijsbergen, C. J. (eds). New York: Springer-Verlag, 1994, pp. 3–12

Conformal prediction
• Allows predictions made by machine learning algorithms to be complemented with measures of reliability
• Besides the label predicted for a new object, it outputs two additional values
  • Confidence
  • Credibility
Vovk, V., Gammerman, A. and Shafer, G., Algorithmic Learning in a Random World, New York: Springer Science + Business Media, Inc., 2005

Conformal prediction (continued)
• The Lagrange multipliers computed during SVM training are used as nonconformity scores
• Extended to a multiclass framework with a one-vs-rest approach
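As an illustration of the two reliability values named on the conformal prediction slides, the following minimal sketch uses the standard definitions from Vovk et al.: credibility is the largest p-value and confidence is one minus the second largest p-value. The function names, the use of a held-out list of calibration scores, and the example numbers are assumptions made for this sketch; the slides do not show how the authors computed these quantities, and the nonconformity scores are treated here as plain numbers rather than SVM Lagrange multipliers.

```python
def p_value(calibration_scores, test_score):
    """Fraction of scores (calibration scores plus the test score itself)
    that are at least as nonconforming as the test score."""
    n_at_least = sum(1 for s in calibration_scores if s >= test_score)
    return (n_at_least + 1) / (len(calibration_scores) + 1)


def predict_with_reliability(p_values):
    """p_values maps each candidate label to its p-value.

    Returns (predicted label, confidence, credibility), where
    credibility = largest p-value and
    confidence  = 1 - second largest p-value.
    """
    ranked = sorted(p_values.items(), key=lambda kv: kv[1], reverse=True)
    label, largest = ranked[0]
    second = ranked[1][1] if len(ranked) > 1 else 0.0
    return label, 1.0 - second, largest


# Hypothetical p-values for the five image classes (illustrative numbers only).
example = {"BKGND": 0.02, "COFF": 0.05, "ECRH": 0.80, "NBI": 0.10, "STRAY": 0.01}
label, confidence, credibility = predict_with_reliability(example)
```

A prediction with high confidence but low credibility signals that no label fits the new object well, which is one reason both values are reported.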

Active learning algorithm
• Inputs
  • Initial training set T, calibration set C, pool of candidate samples U
  • Selection threshold τ, batch size β
• Train an initial classifier on T
• While a stopping criterion is not reached
  • Apply the current classifier to the pool of samples
  • Rank the samples in the pool using the uncertainty criterion
  • Select the top β examples whose certainty level falls under the selection threshold τ
  • Ask the teacher to label the selected samples and add them to the training set
  • Train a new classifier on the expanded training set
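The loop described on this slide can be sketched as follows. This is a minimal sketch, not the authors' implementation: the callables `fit`, `certainty` and `oracle`, the fixed round budget used as stopping criterion, and the one-dimensional toy problem are all assumptions introduced for illustration. The calibration set C is left out because it would only be used inside `fit` and `certainty`.

```python
def active_learning(train, pool, fit, certainty, oracle, tau, beta, max_rounds):
    """Pool-based active learning with a certainty threshold.

    train: list of (x, y) pairs; pool: list of unlabelled x.
    fit(train) -> model, certainty(model, x) -> float, oracle(x) -> label.
    Stops after max_rounds (assumed stopping criterion) or when no pool
    sample has certainty below tau.
    """
    train, pool = list(train), list(pool)
    model = fit(train)
    for _ in range(max_rounds):
        # Rank pool samples from least to most certain.
        scored = sorted((certainty(model, x), i) for i, x in enumerate(pool))
        # Keep at most beta samples whose certainty is under the threshold.
        chosen = [i for c, i in scored if c < tau][:beta]
        if not chosen:
            break
        # Ask the teacher (oracle) for labels and grow the training set.
        train.extend((pool[i], oracle(pool[i])) for i in chosen)
        picked = set(chosen)
        pool = [x for i, x in enumerate(pool) if i not in picked]
        model = fit(train)
    return model, train


# Toy 1-D problem (hypothetical): the true class is 1 when x > 0.5.
def fit(train):
    neg = [x for x, y in train if y == 0]
    pos = [x for x, y in train if y == 1]
    return (max(neg) + min(pos)) / 2  # decision boundary


def certainty(boundary, x):
    return abs(x - boundary)  # far from the boundary = more certain


def oracle(x):
    return int(x > 0.5)


initial = [(0.0, 0), (1.0, 1)]
candidates = [i / 20 for i in range(1, 20)]
boundary, labelled = active_learning(
    initial, candidates, fit, certainty, oracle, tau=0.4, beta=3, max_rounds=2
)
```

In the toy run, each round queries the three samples closest to the current boundary, so the labelled set concentrates around the region where the classes meet.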

(Un)Certainty criteria
• Credibility:
• Confidence:
• Query-by-transduction:
• Multicriteria:
  • Combination 1:
  • Combination 2:
Ho, S. and Wechsler, H., “Query by Transduction”. IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 30 (9), 2008, pp. 1557–1571
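For orientation, the query-by-transduction criterion can be sketched under the assumption that it is the difference between the two largest p-values of a sample, as commonly described for the method of Ho and Wechsler; a value close to zero means the two most likely classes are nearly indistinguishable. This is an assumption about the criterion, not a reconstruction of the formulas on the slide, and the two combined criteria are not reproduced here. The function names and the example p-values are hypothetical.

```python
def qbt_score(p_values):
    """Difference between the largest and second largest p-value.

    Small values indicate an uncertain (informative) sample.
    """
    top = sorted(p_values, reverse=True)
    return top[0] - top[1]


def select_queries(pool_p_values, tau, beta):
    """Return indices of at most beta pool samples whose score is below
    tau, ordered from most to least uncertain."""
    scored = sorted((qbt_score(p), i) for i, p in enumerate(pool_p_values))
    return [i for score, i in scored if score < tau][:beta]


# Hypothetical p-values for four pool samples over three classes.
pool_p_values = [
    [0.80, 0.10, 0.05],
    [0.40, 0.35, 0.10],
    [0.50, 0.30, 0.20],
    [0.30, 0.29, 0.20],
]
queries = select_queries(pool_p_values, tau=0.4, beta=2)
```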

Setup
• 1149 samples divided into
  • Initial training: 5 samples
  • Pool: 794 samples
  • Calibration set: 150 samples
  • Test set: 200 samples
• Batch size: 25
• Selection threshold: 0.4
  • Used with Qbt, Comb1 and Comb2
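The partition sizes above add up to the full dataset (5 + 794 + 150 + 200 = 1149). A minimal sketch of such a split is given below; the uniform random shuffle, the seed and the function name are assumptions, since the slides do not say how samples were assigned to each subset (for example, whether the 5 initial samples cover one image per class).

```python
import random


def split_indices(n_total, sizes, seed=0):
    """Randomly partition range(n_total) into named, disjoint subsets.

    sizes maps subset name -> number of samples; the sizes must add up
    to n_total.
    """
    if sum(sizes.values()) != n_total:
        raise ValueError("subset sizes must add up to n_total")
    indices = list(range(n_total))
    random.Random(seed).shuffle(indices)
    parts, start = {}, 0
    for name, size in sizes.items():
        parts[name] = indices[start:start + size]
        start += size
    return parts


sizes = {"initial_training": 5, "pool": 794, "calibration": 150, "test": 200}
parts = split_indices(1149, sizes)
```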

Setup (continued)
• For each criterion in (Qbt, Comb1, Comb2)
  • For experiment = 1 : 10
    • Select test set
    • Run active learning algorithm selecting 700 samples hybridly
    • For NumTrn = 50 : 50 : 700
      • Train CP classifier on the first NumTrn samples
      • Apply classifier to the test set
    • End for NumTrn
  • End for experiment
• End for each
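The inner loop of this protocol, training on growing prefixes of the selected samples and scoring each classifier on the test set, can be sketched as follows. The names `learning_curve`, `fit` and `success_rate` are hypothetical, and the stand-in functions in the usage lines exist only to make the sketch runnable; they do not model a real classifier.

```python
def learning_curve(ordered_samples, fit, success_rate, sizes=range(50, 701, 50)):
    """Train on the first n samples (in selection order) for each n in
    sizes and record the test-set success rate of each classifier."""
    curve = []
    for n in sizes:
        model = fit(ordered_samples[:n])
        curve.append((n, success_rate(model)))
    return curve


# Stand-ins (assumptions): the "model" is just the training-set size and
# the "success rate" grows with it.
selected = list(range(700))
curve = learning_curve(selected, fit=len, success_rate=lambda m: m / 700)
```

With the step of 50 used on the slide, each experiment produces 14 points per criterion, from 50 to 700 training samples.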

Results

Conclusions
• Active learning was applied to the selection of a minimal training dataset that is still good enough for classification purposes
• It reaches higher success rates and higher confidence in predictions with fewer data points than random selection of the training set
• By combining multiple criteria, we can balance the trade-off between improving the success rate and improving the confidence of predictions
