Intro to Machine Learning

Introduction to Machine Learning • Mark Stamp

What is Machine Learning?
• Definition of machine learning (ML)?
• Our working definition is…
• Statistical discrimination, where the “machine” does the hard work (“learns”)
• So, we humans don’t have to think too much
• Often associated with AI
• But actually much more widely applicable
• Usually based on a binary classifier
• Often said to generate “data driven” models

What Can ML Do for Me?
• Machine learning is very powerful
• Practical and useful
• Successfully applied to problems in…
• Speech recognition and NLP
• Bioinformatics
• Stock market analysis
• AI (robotics, computer vision, etc.)
• More and more uses all the time

Black Box Approach
• Machine learning (ML) algorithm is often treated as a black box
• This is one of ML’s main selling points!
• ML black box often works well
• You can get good results even if you know nothing about the underlying algorithms
• But this can be limiting
• Especially in new and novel applications

Analogy to a Doctor
• A nurse practitioner (NP) is a nurse with advanced training
• A physician has much more education
• Both diagnose, treat, and manage patients’ problems
• Studies show that NPs can do about 80% to 90% of what physicians do
• But for the most challenging 10% to 20% of cases, a physician is required

Interesting, But What Does This Have to Do with ML?
• ML version of NP would have knowledge beyond black box, but not too much
• ML version of physician would really understand how/why things work
• Goal is for you to become an ML physician
• For doctors, the most challenging 10% to 20% of cases are the most interesting…
• …and the most lucrative!

Auto Mechanic Analogy
• The majority of diagnosis work done by auto mechanics is routine
• Often easy to see what the problem is (not necessarily easy to fix the problem)
• But there are some difficult cases
• Where no “cookbook” diagnosis will work
• Skill is needed to analyze such problems
• Requires deep understanding of the inner workings of the engine and related systems

ML from 10,000 Feet
• Usually focus on binary classification
• First, we train a model on a set of samples of, say, “type A”
• Then, given a sample of unknown type
• Score the sample against the model
• If it scores high, classify it as “type A”
• Otherwise, classify it as “not type A”
• Key ideas are training and scoring
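
The train/score/classify steps above can be sketched in a few lines of Python. This is only a toy illustration, not a technique from the course: the "model" here is assumed to be the per-feature mean of the type A samples, the score is the negative squared distance to that mean, and the function names, data, and threshold are all hypothetical.

```python
from statistics import mean

def train(samples):
    """Toy 'training': the model is the per-feature mean of the type A samples."""
    return [mean(column) for column in zip(*samples)]

def score(model, sample):
    """Higher score means the sample looks more like the training data."""
    dist_sq = sum((x - m) ** 2 for x, m in zip(sample, model))
    return -dist_sq

def classify(model, sample, threshold):
    """Binary decision: compare the score to a threshold."""
    return "type A" if score(model, sample) >= threshold else "not type A"

# Hypothetical training data, all of "type A"
type_a_samples = [[1.0, 2.0], [1.2, 1.8], [0.8, 2.2]]
model = train(type_a_samples)

print(classify(model, [1.1, 2.0], threshold=-0.5))
print(classify(model, [5.0, 5.0], threshold=-0.5))
```

In practice the threshold would be chosen from held-out data rather than fixed by hand.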

Topics Covered in Detail
• Hidden Markov Model (HMM)
• Profile Hidden Markov Model (PHMM)
• Principal Component Analysis (PCA)
• And Singular Value Decomposition (SVD)
• Support Vector Machine (SVM)
• Clustering (emphasis on K-means and EM)
• Neural networks (ANN)
• Backpropagation and a lot more
• Data analysis

Many Mini-Topics
• k-nearest neighbor (k-NN)
• Boosting (AdaBoost)
• Random Forests (RF)
• Linear Discriminant Analysis (LDA)
• Vector quantization (VQ)
• Naïve Bayes and regression analysis
• Conditional Random Fields (CRF)

HMM
• We cover HMMs in greatest detail
• More detail than other ML techniques
• You must implement HMM from scratch
• You must understand it well
• Compare/contrast everything to HMM
• HMMs are useful in many applications
• And the models themselves tell us something
• Not always true of other algorithms

High Level View of HMM
• Markov process, where the states are “hidden”
• Observations are related to the hidden states
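
To make the hidden-state picture concrete, here is a minimal sketch of scoring an observation sequence against an HMM with the standard forward algorithm. The notation (`pi` for the initial state distribution, `A` for state transitions, `B` for observation probabilities) and all numeric values are assumptions chosen for illustration; they do not come from these slides.

```python
def forward_score(pi, A, B, observations):
    """Return P(observations | model) using the forward algorithm.

    pi[i]   : probability of starting in hidden state i
    A[i][j] : probability of moving from hidden state i to state j
    B[i][k] : probability of observing symbol k while in state i
    """
    n = len(pi)
    # alpha[i] = P(observations so far, current hidden state is i)
    alpha = [pi[i] * B[i][observations[0]] for i in range(n)]
    for obs in observations[1:]:
        alpha = [
            sum(alpha[i] * A[i][j] for i in range(n)) * B[j][obs]
            for j in range(n)
        ]
    return sum(alpha)

# Illustrative two-state model with three observation symbols (0, 1, 2)
pi = [0.6, 0.4]
A = [[0.7, 0.3],
     [0.4, 0.6]]
B = [[0.1, 0.4, 0.5],
     [0.7, 0.2, 0.1]]

print(forward_score(pi, A, B, [0, 1, 0, 2]))
```

The hidden states never appear in the input: only the observations do, and the algorithm sums over every possible hidden state sequence.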

HMM as Hill Climb
• Hill climb on parameter space
• What is a hill climb?
• Only go “up”, never “down”
• In contrast to heuristic search, such as genetic algorithm or simulated annealing
• Advantage(s) of hill climb algorithm?
• Disadvantages/limitations of hill climb?
• Can we overcome disadvantages?
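
As a rough illustration of "only go up, never down", here is a generic one-dimensional hill climb. It is not the HMM training algorithm itself, just the same idea on a toy function; `hill_climb`, `two_peaks`, and the step size are hypothetical choices.

```python
def hill_climb(f, x, step=0.1, max_iters=1000):
    """Move to the better neighbor until no neighbor is higher."""
    for _ in range(max_iters):
        best = max((x - step, x + step), key=f)
        if f(best) <= f(x):
            break  # no uphill move left: a (possibly only local) maximum
        x = best
    return x

def two_peaks(x):
    """Toy function with a lower peak near x = -0.93 and a higher one near x = 1.06."""
    return -(x * x - 1) ** 2 + 0.5 * x

# Starting on the left, the climb stops at the lower peak
print(hill_climb(two_peaks, -2.0))
# Starting on the right, it reaches the higher peak
print(hill_climb(two_peaks, 2.0))
```

The two runs show the main limitation: the result depends on the starting point. One common workaround is to restart from several initial points and keep the best result.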

PHMM
• Like HMM with positional information
• Conceptually appealing
• Details tend to be very problem specific
• Widely used in bioinformatics
• And other applications where position within a sequence is critical information
• Has been used successfully in security research (IDS, malware detection)

High Level View of PHMM
• Like defining a (particularly simple) HMM at each position in the sequence
• Easier to understand once we study HMM

PCA
• PCA serves to reduce dimensionality
• Training is complex (lots of math)
• But scoring is fast and efficient
• So, when the dust settles, PCA is actually easy to apply and very efficient
• Singular Value Decomposition (SVD) is (almost) synonymous with PCA
• SVD is one way to train a model in PCA

High Level View of PCA
• Find directions with highest variance
• Reveals useful info…
• …dimensionality can be reduced
• Why high variance?
• Need lots of linear algebra here…
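
The idea of finding the direction with highest variance can be sketched without any libraries. The example below builds the covariance matrix of the centered data and finds its dominant eigenvector by power iteration. Note that power iteration is swapped in here for simplicity; it is not the SVD approach mentioned in these slides, and the data points are made up.

```python
def top_principal_component(data, iters=100):
    """Return a unit vector along the direction of highest variance."""
    n, d = len(data), len(data[0])
    means = [sum(row[j] for row in data) / n for j in range(d)]
    centered = [[row[j] - means[j] for j in range(d)] for row in data]
    # Sample covariance matrix of the centered data
    cov = [
        [sum(r[a] * r[b] for r in centered) / (n - 1) for b in range(d)]
        for a in range(d)
    ]
    # Power iteration: repeatedly multiply by cov and renormalize
    v = [1.0] * d
    for _ in range(iters):
        w = [sum(cov[a][b] * v[b] for b in range(d)) for a in range(d)]
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    return v

# Hypothetical 2-D data lying close to the line y = x
points = [[0.0, 0.0], [1.0, 1.1], [2.0, 1.9], [3.0, 3.0], [4.0, 4.1]]
print(top_principal_component(points))
```

Because the points lie close to a line, projecting each one onto this single direction loses very little information, which is the sense in which dimensionality can be reduced.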

SVM
• SVM has a nice geometric interpretation
• We can draw some pretty pictures!
• In SVM, we increase the dimension
• May be counterintuitive (compare to PCA)
• SVM is often used like other ML techniques
• But also ideal as a “meta-score” to combine multiple other scores
• Great combination of theory and practice

High Level View of SVM
• Labeled training data
• Separate the sets…
• …and maximize margin
• Easy to picture
• We use lots of calculus to make sense of SVM
• Most challenging derivation we consider?
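
To illustrate what "maximize margin" means, the sketch below computes the geometric margin of a given separating hyperplane w·x + b = 0 and compares two candidate hyperplanes. It does not train an SVM; the points, labels, and both hyperplanes are hypothetical.

```python
def margin(w, b, points, labels):
    """Smallest signed distance from any labeled point to the hyperplane.

    Labels are +1 or -1. A negative result means some point is misclassified.
    """
    norm = sum(wi * wi for wi in w) ** 0.5
    return min(
        y * (sum(wi * xi for wi, xi in zip(w, x)) + b) / norm
        for x, y in zip(points, labels)
    )

# Hypothetical labeled training data
points = [[2.0, 2.0], [3.0, 3.0], [0.0, 0.0], [-1.0, 0.0]]
labels = [1, 1, -1, -1]

# Two hyperplanes that both separate the data
margin_a = margin([1.0, 1.0], -2.0, points, labels)
margin_b = margin([1.0, 0.0], -1.0, points, labels)
print(margin_a, margin_b)
```

Both hyperplanes separate the two sets, but the first leaves more room on each side. SVM training searches for the separating hyperplane with the largest such margin.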

Clustering
• Usually used for “data exploration”
• I.e., cluster hoping to discern structure in data that we know little about
• Observed structure may or may not be meaningful (can cluster anything)
• We consider 2 clustering techniques
• K-means
• EM (expectation maximization)

Clustering Example
• Unsupervised
• Data exploration
• K-means is easy and intuitive
• EM is more challenging
• Some statistics…
• Multivariate Gaussian distributions!
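
K-means is simple enough to sketch directly: alternate between assigning each point to its nearest centroid and moving each centroid to the mean of its assigned points. The data, the initial centroids, and the fixed iteration count below are assumptions made for this illustration.

```python
def dist_sq(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def assign(points, centroids):
    """Index of the nearest centroid for each point."""
    return [
        min(range(len(centroids)), key=lambda k: dist_sq(p, centroids[k]))
        for p in points
    ]

def k_means(points, centroids, iters=10):
    """Alternate assignment and centroid update for a fixed number of rounds."""
    for _ in range(iters):
        labels = assign(points, centroids)
        updated = []
        for k, old in enumerate(centroids):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                updated.append([sum(c) / len(members) for c in zip(*members)])
            else:
                updated.append(old)  # keep a centroid that attracted no points
        centroids = updated
    return centroids

# Hypothetical unlabeled data with two visible groups
points = [[1.0, 1.0], [1.5, 2.0], [1.0, 0.5], [8.0, 8.0], [9.0, 9.5], [8.5, 9.0]]
centroids = k_means(points, [[0.0, 0.0], [10.0, 10.0]])
print(centroids)
print(assign(points, centroids))
```

No labels are used at any point, which is what makes the method unsupervised. The result can depend on the initial centroids.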

Neural Networks
• Artificial neural network (ANN)
• A “mini topic” in the book
• Now a major topic of the course
• Focus is on backpropagation
• Technique used to train neural networks
• Essentially, a big calculus problem
• And we cover some of the “alphabet soup”
• RNN, CNN, LSTM, GAN, etc.
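
The "big calculus problem" is the chain rule applied layer by layer, from the output back to the input. Here is a minimal sketch for a hypothetical network with one input, one hidden sigmoid unit, and one output sigmoid unit, with no bias terms. The squared-error loss, the weights, and the learning rate are all assumptions chosen to keep the example small.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def forward_backward(w1, w2, x, y):
    """Return (loss, dloss/dw1, dloss/dw2) for a tiny two-layer network."""
    # Forward pass
    h = sigmoid(w1 * x)            # hidden activation
    out = sigmoid(w2 * h)          # network output
    loss = 0.5 * (out - y) ** 2
    # Backward pass: chain rule, starting from the output
    d_out = (out - y) * out * (1 - out)   # dloss/d(w2 * h)
    grad_w2 = d_out * h
    d_hidden = d_out * w2 * h * (1 - h)   # dloss/d(w1 * x)
    grad_w1 = d_hidden * x
    return loss, grad_w1, grad_w2

def train(w1, w2, x, y, lr=0.5, steps=100):
    """Plain gradient descent on a single training example."""
    for _ in range(steps):
        _, g1, g2 = forward_backward(w1, w2, x, y)
        w1 -= lr * g1
        w2 -= lr * g2
    return w1, w2

initial_loss = forward_backward(0.5, -0.3, 1.5, 1.0)[0]
w1, w2 = train(0.5, -0.3, 1.5, 1.0)
final_loss = forward_backward(w1, w2, 1.5, 1.0)[0]
print(initial_loss, final_loss)
```

A useful sanity check for any hand-derived gradient is to compare it against a finite-difference estimate of the loss.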

Mini Topics++
• Boosting
• Make an arbitrarily strong classifier from many (weak) classifiers
• Focus on AdaBoost (Adaptive Boosting)
• Linear Discriminant Analysis (LDA)
• Discuss this in some detail
• Interesting connections to PCA and SVM

Mini Topics
• Random forest (RF)
• Very popular and useful
• Based on decision trees
• We only cover the basics
• K-nearest neighbor (k-NN)
• Simplest “machine learning” imaginable
• Vector quantization (VQ)
• A generalization of K-means clustering
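
To show how little machinery k-NN needs, here is a minimal sketch: there is no training step at all, and a query is classified by majority vote among its k closest labeled points. The labeled data and the choice of k = 3 are hypothetical.

```python
from collections import Counter

def dist_sq(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def knn_classify(labeled_points, query, k=3):
    """Majority vote among the k labeled points closest to the query."""
    nearest = sorted(labeled_points, key=lambda item: dist_sq(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Hypothetical labeled data: (point, label) pairs
labeled_points = [
    ([1.0, 1.0], "A"), ([1.0, 2.0], "A"), ([2.0, 1.0], "A"),
    ([8.0, 8.0], "B"), ([9.0, 8.0], "B"), ([8.0, 9.0], "B"),
]

print(knn_classify(labeled_points, [2.0, 2.0]))
print(knn_classify(labeled_points, [7.0, 7.0]))
```

All of the work happens at scoring time, since every query is compared against the whole labeled set.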

Mini Topics--
• Conditional Random Fields (CRF)
• Generalization of HMM
• Interesting, but not so practical
• Probably won’t have time for this
• Naïve Bayes and regression analysis
• And related statistical techniques
• Interesting and practical
• But probably not enough time

The Dreaded Math…
• Strive to keep math to a minimum
• Course is (mostly) self-contained with respect to math
• First semester calculus is assumed
• To understand ML, cannot avoid math...
• HMM/PHMM → discrete probability
• PCA → fancy linear algebra (eigenvectors)
• SVM → calculus (Lagrange multipliers)
• Clustering → statistics/probability
• Backpropagation → calculus (computational)

Data Analysis
• Critical to analyze data carefully
• Especially true in research mode, as we must compare to previous work
• Often a major weakness in research!
• We’ll discuss…
• Experimental design, cross validation, accuracy, ROC curves, PR curves, imbalance problem, and so on
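
Two of the measures listed above can be sketched briefly. The example computes accuracy at a threshold, and the area under the ROC curve (AUC) as the fraction of positive/negative pairs in which the positive sample scores higher, with ties counted as half. The scores and labels are made up, and deliberately imbalanced.

```python
def accuracy(scores, labels, threshold):
    """Fraction of samples classified correctly at the given threshold."""
    predictions = [1 if s >= threshold else 0 for s in scores]
    return sum(p == y for p, y in zip(predictions, labels)) / len(labels)

def roc_auc(scores, labels):
    """Area under the ROC curve, computed by comparing every pos/neg pair."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos for n in neg
    )
    return wins / (len(pos) * len(neg))

# Hypothetical imbalanced data: 2 positives and 8 negatives
scores = [0.9, 0.4, 0.5, 0.3, 0.2, 0.2, 0.1, 0.1, 0.1, 0.0]
labels = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]

# A threshold of 1.0 labels everything negative, yet accuracy still looks good
print(accuracy(scores, labels, threshold=1.0))
print(roc_auc(scores, labels))
```

The first number shows the imbalance problem: a classifier that never detects a positive sample can still score well on accuracy, which is one reason threshold-independent measures such as ROC curves are useful.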

Applications
• Applications mostly from security
• Malware detection or analysis → HMM, PHMM, PCA, SVM, and clustering
• Masquerade detection → PHMM
• Image spam → PCA and SVM
• Classic cryptanalysis → HMM
• Facial recognition → PCA
• Text analysis → HMM
• Old Faithful geyser → clustering

3 Stages of ML Enlightenment
• First stage → elementary-school level
• “Big picture” from 10k feet
• See the descriptions in this intro
• Second stage → drill down on big picture
• More detailed and nuanced than first stage
• Understand the pictures used in stage 1
• Third stage → real understanding
• Learn derivations and (mostly) understand them

ML Enlightenment
• In this class, we aim for the highest stage of ML enlightenment
• But to pass the class, at a minimum, you must have stage 2++ knowledge…
• …and the ability to effectively use ML…
• …and understand its strengths/weaknesses
• Key to success
• Work hard on homework and project!

Bottom Line
• We cover selected machine learning techniques in considerable detail
• We discuss many applications, mostly related to information security
• Goal is for students to gain a deep understanding of the techniques
• And be able to apply ML…
• …especially in new and novel situations

How to Succeed in ML Class
• Ask questions
• Good questions are good for everybody
• Treat the math as your friend
• Math is needed to make sense of ML
• Do not fear hard work!
• Machine learning is not a spectator sport
• Learning is a 3-step process
• Read book, attend lecture, do homework
