An Introduction to Support Vector Machine Classification

1. An Introduction to Support Vector Machine Classification

2. Outline What do we mean by classification, and why is it useful? Machine learning: basic concepts. Support Vector Machines (SVM). Linear SVM: basic terminology and some formulas. Non-linear SVM: the kernel trick. An example: predicting protein subcellular location with SVM. Performance measurements.

3. Classification Every day, all the time, we classify things. E.g. crossing the street: Is there a car coming? At what speed? How far is it to the other side? Classification: safe to walk or not!

5. Classification tasks in Bioinformatics

6. Problems in classifying biological data The data are often high-dimensional. It is hard to set up simple rules. The amount of data is large, so automated ways of dealing with it are needed. Use computers: data processing, statistical analysis, and trying to learn patterns from the data (Machine Learning).

8. Black box view of Machine Learning

9. Tennis example 2

10. Linear Support Vector Machines

11. Linear SVM 2

12. Definitions

13. Maximizing the margin

14. The Lagrangian trick

15. Problems with linear SVM

16. Non-linear SVM 1

17. Non-linear SVM 2

18. Solving the optimization problem In many cases any general-purpose optimization package that solves linearly constrained problems will do. Examples: Newton's method; conjugate gradient descent. Other methods involve nonlinear programming techniques.
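
To make the idea of "handing the problem to an optimizer" concrete, here is a minimal sketch that trains a linear SVM on a toy data set. It is an assumption-laden illustration, not the method from the talk: it uses plain full-batch subgradient descent on the soft-margin primal objective (lam/2 * ||w||^2 + mean hinge loss) rather than Newton's method or conjugate gradient, and the function names, the toy points, and the values of lam, lr and epochs are all made up for the example.

```python
def train_linear_svm(points, labels, lam=0.01, lr=0.01, epochs=5000):
    """Full-batch subgradient descent on lam/2*||w||^2 + mean hinge loss.

    Hypothetical helper for illustration; labels must be +1 or -1.
    """
    dim = len(points[0])
    n = len(points)
    w = [0.0] * dim
    b = 0.0
    for _ in range(epochs):
        # Gradient of the regularization term.
        grad_w = [lam * wi for wi in w]
        grad_b = 0.0
        for x, y in zip(points, labels):
            margin = y * (sum(wi * xi for wi, xi in zip(w, x)) + b)
            if margin < 1:  # hinge term is active for this point
                for j in range(dim):
                    grad_w[j] -= y * x[j] / n
                grad_b -= y / n
        w = [wi - lr * gi for wi, gi in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


def predict(w, b, x):
    """Classify x by the side of the hyperplane w.x + b = 0 it falls on."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b >= 0 else -1


# Toy, linearly separable data (invented for this sketch).
toy_points = [(2, 2), (3, 3), (2, 3), (0, 0), (1, 0), (0, 1)]
toy_labels = [1, 1, 1, -1, -1, -1]
w, b = train_linear_svm(toy_points, toy_labels)
predictions = [predict(w, b, x) for x in toy_points]
```

In practice one would normally rely on an established SVM or quadratic programming package; the loop above only shows what such a solver is minimizing.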

19. Overtraining/overfitting

20. Overtraining/overfitting 2 Example with a gardener.

21. A practical example, protein localization Proteins are synthesized in the cytosol and then transported to the different subcellular locations where they carry out their functions. Aim: to predict in which location a given protein will end up.

22. Subcellular Locations

23. Method Hypothesis: the amino acid composition of proteins from different compartments should differ. Extract proteins with known subcellular location from SWISSPROT. Calculate the amino acid composition of the proteins. Try to differentiate between cytosolic, extracellular, mitochondrial and nuclear proteins using SVM.
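
The composition step can be sketched as follows. This is a minimal illustration, assuming sequences are given in one-letter code and that non-standard characters are simply ignored; the function name, the ordering of the 20 amino acids, and the example sequence are choices made here, not details from the talk.

```python
from collections import Counter

# The 20 standard amino acids in one-letter code (alphabetical order).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def amino_acid_composition(sequence):
    """Return a 20-element list with the fraction of each amino acid.

    Characters outside the 20 standard letters are skipped (an assumption).
    """
    counts = Counter(ch for ch in sequence.upper() if ch in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    return [counts[aa] / total for aa in AMINO_ACIDS]


# Made-up example sequence; each protein becomes one 20-dimensional vector.
features = amino_acid_composition("MKTAYIAKQR")
```

Each protein, whatever its length, is thereby mapped to a fixed-length vector that can be fed to the SVM.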

24. Input encoding

25. Cross-validation
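
A minimal sketch of generic k-fold cross-validation, for orientation. The number of folds (k = 5 here) and the helper name are assumptions; the data should normally be shuffled before splitting, which this sketch leaves out for brevity.

```python
def k_fold_indices(n_samples, k=5):
    """Yield (train_indices, test_indices) pairs, one per fold.

    Hypothetical helper: folds are contiguous and sizes differ by at most one.
    """
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    indices = list(range(n_samples))
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0)
                  for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, test
        start += size


folds = list(k_fold_indices(10, k=5))
```

Each sample is used for testing exactly once and for training in the remaining k - 1 rounds, so the averaged test score estimates how well the model generalizes to unseen data.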

26. Performance measurements
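
As a sketch of measures commonly reported for a two-class predictor, the function below computes accuracy, sensitivity, specificity and the Matthews correlation coefficient from confusion-matrix counts. Which measures the talk actually used is not stated here, and the counts in the example are invented for illustration, not results from the study.

```python
import math


def binary_metrics(tp, tn, fp, fn):
    """Compute common performance measures from confusion-matrix counts.

    Assumes every class and every prediction occurs at least once,
    so that none of the denominators is zero.
    """
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    sensitivity = tp / (tp + fn)   # fraction of true positives found
    specificity = tn / (tn + fp)   # fraction of true negatives found
    mcc = (tp * tn - fp * fn) / math.sqrt(
        (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {"accuracy": accuracy, "sensitivity": sensitivity,
            "specificity": specificity, "mcc": mcc}


# Invented counts, for illustration only.
metrics = binary_metrics(tp=40, tn=45, fp=5, fn=10)
```

The Matthews correlation coefficient is often preferred when class sizes are unbalanced, since accuracy alone can look good for a predictor that always answers with the majority class.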

27. Results We definitely get some predictive power out of our models. There seems to be a difference in composition between proteins from different subcellular locations. Another question: what about nuclear proteins? Is there a difference between DNA-binding proteins and others?

28. Conclusions We have (hopefully) learned some basic concepts and terminology of SVM. We know about the risk of overtraining and how to put a measure on the risk of bad generalization. SVMs can be useful for example in predicting subcellular location of proteins.

29. You can't input anything into a learning machine!!!

30. References
