Anti-Learning

Anti-Learning. Adam Kowalczyk, Statistical Machine Learning, NICTA, Canberra (Adam.Kowalczyk@nicta.com.au). National ICT Australia Limited is funded and supported by: [sponsor logos not preserved]. Overview: anti-learning (elevated XOR); natural data (predicting Chemo-Radio-Therapy (CRT) response for Oesophageal Cancer).


Presentation Transcript


  1. Anti-Learning. Adam Kowalczyk, Statistical Machine Learning, NICTA, Canberra (Adam.Kowalczyk@nicta.com.au). National ICT Australia Limited is funded and supported by: [sponsor logos not preserved]

  2. Overview. Anti-learning: elevated XOR. Natural data: predicting Chemo-Radio-Therapy (CRT) response for Oesophageal Cancer; classifying Aryl Hydrocarbon Receptor genes. Synthetic data: high dimensional mimicry. Conclusions. Appendix, A Theory of Anti-learning: perfect anti-learning; class-symmetric kernels.

  3. Definition of anti-learning. Systematically: off-training accuracy < random guessing accuracy < training accuracy.

  4. Anti-learning in Low Dimensions. [Figure: points labelled +1 and -1 plotted against x, y and z axes.]
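The low-dimensional effect can be reproduced with a small sketch. The coordinates below are an assumption (the slide's figure did not survive extraction): an XOR pattern in the (x, y) plane, with each class lifted or lowered by a small hypothetical elevation `H` along z. A nearest-centroid rule stands in for the classifier; it is a simple illustration, not necessarily the method used in the talk.

```python
# Illustrative "elevated XOR" points. The coordinates are an assumption,
# not read off the slide: XOR in (x, y), class-dependent elevation in z.
H = 0.1  # hypothetical small elevation

POINTS = [
    ((+1.0, +1.0, +H), +1),
    ((-1.0, -1.0, +H), +1),
    ((+1.0, -1.0, -H), -1),
    ((-1.0, +1.0, -H), -1),
]


def mean(vectors):
    """Coordinate-wise mean of a list of equal-length tuples."""
    n = len(vectors)
    return tuple(sum(v[i] for v in vectors) / n for i in range(len(vectors[0])))


def fit_centroid(train):
    """Nearest-centroid linear rule: predict sign(w . (x - m))."""
    pos = mean([x for x, y in train if y > 0])
    neg = mean([x for x, y in train if y < 0])
    w = tuple(p - q for p, q in zip(pos, neg))        # direction between centroids
    m = tuple((p + q) / 2 for p, q in zip(pos, neg))  # midpoint of the centroids

    def predict(x):
        score = sum(wi * (xi - mi) for wi, xi, mi in zip(w, x, m))
        return 1 if score > 0 else -1

    return predict


def accuracy(predict, data):
    return sum(predict(x) == y for x, y in data) / len(data)


def leave_one_out(points):
    """Return (mean training accuracy, mean held-out accuracy) over all folds."""
    train_acc, test_acc = [], []
    for i in range(len(points)):
        train = points[:i] + points[i + 1:]
        predict = fit_centroid(train)
        train_acc.append(accuracy(predict, train))
        test_acc.append(accuracy(predict, [points[i]]))
    return sum(train_acc) / len(train_acc), sum(test_acc) / len(test_acc)


train_accuracy, off_training_accuracy = leave_one_out(POINTS)
print("training accuracy:", train_accuracy)
print("off-training accuracy:", off_training_accuracy)
```

With these assumed points every fold fits its three training points perfectly and misclassifies the held-out one, so the off-training accuracy falls below the 0.5 of random guessing while the training accuracy stays at 1.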

  5. [Figure: two panels titled Anti-Learning and Learning.]

  6. Evaluation Measure: Area under Receiver Operating Characteristic (AROC). [Figure: ROC curve of a classifier f with threshold θ; axes False Positive and True Positive, each from 0 to 1; AROC(f) is the area under the curve.]
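As a minimal sketch of this measure, the area under the ROC curve can be computed as the fraction of (positive, negative) pairs that the score ranks correctly, counting ties as half. The function name `aroc` and the +1/-1 label convention are choices made for this illustration.

```python
def aroc(scores, labels):
    """Area under the ROC curve via pairwise ranking (ties count 0.5).

    scores: real-valued classifier outputs f(x)
    labels: +1 for positive examples, anything <= 0 for negative examples
    """
    pos = [s for s, y in zip(scores, labels) if y > 0]
    neg = [s for s, y in zip(scores, labels) if y <= 0]
    if not pos or not neg:
        raise ValueError("need at least one example of each class")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


labels = [+1, +1, -1, -1]
print(aroc([0.9, 0.8, 0.2, 0.1], labels))  # positives ranked above negatives
print(aroc([0.1, 0.2, 0.8, 0.9], labels))  # ranking fully inverted
print(aroc([0.5, 0.5, 0.5, 0.5], labels))  # uninformative constant score
```

The three calls correspond to perfect ranking (AROC = 1), perfectly inverted ranking (AROC = 0) and an uninformative score (AROC = 0.5).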

  7. Learning and anti-learning mode of supervised classification. [Figure: ROC plots (axes labelled TP and FN, each from 0 to 1) for the training and test sets in the learning and anti-learning modes; random guessing corresponds to AROC = 0.5.]

  8. Anti-learning in Cancer Genomics

  9. From Oesophageal Cancer to machine learning challenge

  10. Learning and anti-learning mode of supervised classification. [Figure: ROC plots (axes labelled TP and FN, each from 0 to 1) for the training and test sets in the learning and anti-learning modes; random guessing corresponds to AROC = 0.5.]

  11. Anti-learning in Classification of Genes in Yeast

  12. KDD’02 task: identification of Aryl Hydrocarbon Receptor genes (AHR data)

  13. Anti-learning in the AHR data set from KDD Cup 2002. Average of 100 trials; random splits with training : test = 66% : 34%.

  14. KDD Cup 2002 Yeast Gene Regulation Prediction Task (http://www.biostat.wisc.edu/~craven/kddcup/task2.ppt). Single class SVM: 38/84 training examples, 1.3/2.8% of the data, used in ~14,000 dimensions. Slide annotations: change or control (Vogel, AI Insight); change.

  15. Anti-learning in High Dimensional Approximation (Mimicry)

  16. Paradox of High Dimensional Mimicry (high dimensional features). If detection is based on a large number of features, and the imposters are samples from a distribution whose marginals perfectly match the distribution of the individual features for a finite genuine sample, then the imposters are perfectly detectable by ML-filters in the anti-learning mode.
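One way to generate imposters whose per-feature marginals match a finite genuine sample is to draw each feature independently from the values observed for that feature. This is only a sketch of the setting: the slide does not state how the imposters were sampled, so the product-of-empirical-marginals construction and the name `make_imposters` are assumptions.

```python
import random


def make_imposters(genuine, n_imposters, seed=0):
    """Sample imposters feature by feature from the genuine sample.

    Assumption: each feature is drawn independently from the empirical
    marginal of that feature, so individual features look genuine while
    the joint structure across features is discarded.
    """
    rng = random.Random(seed)
    n_features = len(genuine[0])
    columns = [[row[j] for row in genuine] for j in range(n_features)]
    return [
        [rng.choice(columns[j]) for j in range(n_features)]
        for _ in range(n_imposters)
    ]


# Tiny hypothetical genuine sample: 3 examples, 4 features.
genuine_sample = [
    [0.1, 1.0, 5.0, -2.0],
    [0.3, 2.0, 6.0, -1.0],
    [0.2, 3.0, 7.0, -3.0],
]
imposters = make_imposters(genuine_sample, n_imposters=5)
for row in imposters:
    print(row)
```

Every value of an imposter appears in the corresponding column of the genuine sample, so no single feature distinguishes imposters from genuine examples; any detectable difference has to come from the joint, high dimensional structure.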

  17. Mimicry in High Dimensional Spaces

  18. Quality of mimicry. [Figure: panels for d = 1000 and d = 5000, with the ratio |nE| / |nX| as labelled on the slide; average over independent tests for 50 repeats.]

  19. Formal result

  20. Proof idea 1: Geometry of the mimicry data. Key Lemma:

  21. Proof idea 1: Geometry of the mimicry data

  22. Proof idea 2:

  23. Proof idea 2:

  24. Proof idea 2:

  25. Proof idea 3: kernel matrix

  26. Proof idea 4

  27. Theory of anti-learning

  28. Hadamard Matrix
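For reference, a Hadamard matrix is a square matrix with entries +1 and -1 whose rows are mutually orthogonal. The sketch below uses the Sylvester doubling construction, which yields sizes that are powers of two; whether the talk used this particular construction is an assumption.

```python
def sylvester_hadamard(k):
    """Return the 2**k x 2**k Hadamard matrix from Sylvester's doubling:
    H_{2n} = [[H_n, H_n], [H_n, -H_n]], starting from H_1 = [[1]]."""
    h = [[1]]
    for _ in range(k):
        top = [row + row for row in h]
        bottom = [row + [-v for v in row] for row in h]
        h = top + bottom
    return h


def gram(matrix):
    """Matrix of inner products between all pairs of rows."""
    return [
        [sum(a * b for a, b in zip(r1, r2)) for r2 in matrix]
        for r1 in matrix
    ]


H8 = sylvester_hadamard(3)
for row in H8:
    print(" ".join(f"{v:+d}" for v in row))

# Orthogonality check: the Gram matrix should be n times the identity.
n = len(H8)
print(gram(H8) == [[n if i == j else 0 for j in range(n)] for i in range(n)])
```

The final check confirms the defining property: distinct rows have inner product 0 and each row has squared norm n.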

  29. CS-kernels

  30. Perfect learning/anti-learning for CS-kernels (Kowalczyk & Chapelle, ALT’05). [Figure: ROC plot, True positive against False positive; curves for Train (ROC_T) and Test (ROC_{S-T}).]

  31. Perfect learning/anti-learning for CS-kernels (Kowalczyk & Chapelle, ALT’05)

  32. Perfect learning/anti-learning for CS-kernels

  33. Perfect learning/anti-learning for CS-kernels

  34. Perfect anti-learning theorem (Kowalczyk & Smola, Conditions for Anti-Learning)

  35. Anti-learning in classification of Hadamard dataset (Kowalczyk & Smola, Conditions for Anti-Learning)
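A small experiment in the spirit of this slide can be sketched as follows. The dataset construction is an assumption, since the slide does not spell it out: the rows of a Sylvester Hadamard matrix are the examples, the last column is the label, and the remaining columns are the features. With that choice, two distinct examples have inner product -1 when they share a label and +1 otherwise. A nearest-centroid rule is used as a simple stand-in classifier.

```python
def sylvester_hadamard(k):
    """2**k x 2**k Hadamard matrix via Sylvester's doubling construction."""
    h = [[1]]
    for _ in range(k):
        h = [row + row for row in h] + [row + [-v for v in row] for row in h]
    return h


def hadamard_dataset(k):
    """Assumed construction: last column is the label, the rest are features."""
    return [(tuple(row[:-1]), row[-1]) for row in sylvester_hadamard(k)]


def mean(vectors):
    n = len(vectors)
    return tuple(sum(v[i] for v in vectors) / n for i in range(len(vectors[0])))


def fit_centroid(train):
    """Nearest-centroid linear rule: predict sign(w . (x - m))."""
    pos = mean([x for x, y in train if y > 0])
    neg = mean([x for x, y in train if y < 0])
    w = tuple(p - q for p, q in zip(pos, neg))
    m = tuple((p + q) / 2 for p, q in zip(pos, neg))

    def predict(x):
        score = sum(wi * (xi - mi) for wi, xi, mi in zip(w, x, m))
        return 1 if score > 0 else -1

    return predict


def leave_one_out(data):
    """Return (mean training accuracy, held-out accuracy) over all folds."""
    train_hits, train_total, test_hits = 0, 0, 0
    for i, (x_test, y_test) in enumerate(data):
        train = data[:i] + data[i + 1:]
        predict = fit_centroid(train)
        train_hits += sum(predict(x) == y for x, y in train)
        train_total += len(train)
        test_hits += predict(x_test) == y_test
    return train_hits / train_total, test_hits / len(data)


data = hadamard_dataset(3)  # 8 examples, 7 features
train_accuracy, held_out_accuracy = leave_one_out(data)
print("training accuracy:", train_accuracy)
print("held-out accuracy:", held_out_accuracy)
```

Under these assumptions each leave-one-out fold classifies its training examples correctly and gets the held-out example wrong, which is the pattern the slide title refers to as anti-learning.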

  36. AHR data set from KDD Cup’02 (Kowalczyk & Smola, Conditions for Anti-Learning; Kowalczyk, Smola, submitted)

  37. From anti-learning to learning: the class-symmetric (CS) kernel case (Kowalczyk & Chapelle, ALT’05)

  38. More is not necessarily better! Perfect anti-learning: i.i.d. learning curve, n = 100, nRand = 1000. [Figure: AROC (mean ± std) against the number of i.i.d. samples drawn from the perfect anti-learning set S (axis ticks 0 to 5), with the random baseline marked.]

  39. Conclusions. Statistics and machine learning are indispensable components of the forthcoming revolution in medical diagnostics based on genomic profiling. High dimensionality of the data poses new challenges, pushing statistical techniques into uncharted waters. Challenges of biological data can stimulate novel directions of machine learning research.

  40. Acknowledgements. Telstra: Bhavani Raskutti. Peter MacCallum Cancer Centre: David Bowtell, Coung Duong, Wayne Phillips. MPI: Cheng Soon Ong, Olivier Chapelle. NICTA: Alex Smola.
