This paper explores the major security threats in machine learning and proposes solutions to mitigate them. It also discusses the challenges of maintaining confidentiality in machine learning models and user data.
SoK: Security and Privacy in Machine Learning Nicolas Papernot∗, Patrick McDaniel∗, Arunesh Sinha†, and Michael P. Wellman† ∗ Pennsylvania State University † University of Michigan {ngp5056,mcdaniel}@cse.psu.edu, {arunesh,wellman}@umich.edu Presented by Jonny Dowdall CSE 914 2 Feb 2019
Objectives • Machine Learning: What is machine learning? Relevant terms. A few math proofs. • Threats: What are the major security threats in machine learning? • Solutions: What can we do about these threats?
Machine Learning An Over-Simplification
Machine Learning A computer’s ability to infer something based on experience rather than being explicitly programmed. An Over-Simplification
Machine Learning Given some input, produce some output. Supervised Learning
Machine Learning Train by providing explicit input and output pairs. Supervised Learning
Machine Learning Identify patterns in unlabeled data. Unsupervised Learning
Machine Learning No training in the conventional sense. Unsupervised Learning
Machine Learning Given a state, perform an action that leads you to an eventual desired state. Reinforcement Learning
Machine Learning No explicitly correct output for any input. Reinforcement Learning
Machine Learning Train by having an agent interact with an environment. Reinforcement Learning
Machine Learning Reward/penalize agent for reaching desired states. Reinforcement Learning
Machine Learning Typically, input data is represented as a vector (or matrix/tensor) of values. Call these input values features. Features
Machine Learning Some function transforms the input features into target output(s). Parameters
Machine Learning Function is typically a weighted sum of input features. Parameters
Machine Learning Call the weights parameters. Parameters
Machine Learning Start with random weights. Loss function
Machine Learning Grab a sample from the training data and compute some (bogus) output value(s). Loss function
Machine Learning The loss function tells us how far predicted output is from ground-truth output. Loss function
Machine Learning The gradient of the loss function tells us how to change our weights. Gradient
Machine Learning Next time we see that training sample, the loss should be smaller. Training
Machine Learning Rinse and repeat with all training samples until loss is low and weights stop changing. Training
Machine Learning Now, use these weights to predict values for new, unobserved inputs. This is our model. Inference
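To make the loop above concrete, here is a minimal sketch of training a linear model with gradient descent. The toy data, squared-error loss, and learning rate are assumptions for illustration, not taken from the slides.

```python
import numpy as np

# Hypothetical toy data: 100 samples, 3 features, linear ground truth plus noise.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))                # input features
y = X @ np.array([2.0, -1.0, 0.5]) + 0.1 * rng.normal(size=100)

w = rng.normal(size=3)                       # start with random weights (parameters)
lr = 0.1                                     # learning rate (assumed)

for epoch in range(200):
    y_hat = X @ w                            # weighted sum of input features
    loss = np.mean((y_hat - y) ** 2)         # how far predictions are from ground truth
    grad = 2 * X.T @ (y_hat - y) / len(y)    # gradient of the loss w.r.t. the weights
    w -= lr * grad                           # move the weights against the gradient

# The trained weights are the "model"; use them for inference on new, unseen inputs.
x_new = np.array([1.0, 0.0, -1.0])
print(w, x_new @ w)
```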
Confidentiality/Privacy Uncovering the ML model itself, which may be confidential intellectual property. Uncovering users’ data (model input/output), which may be private. Integrity Exploiting the model to produce outputs that don’t correspond to patterns in the training data. Availability Preventing access to an output or action induced by a model output. Threats (CIA)
White-box vs. Black-box Adversaries • White-box: access to the machine-learning model internals; can see parameters/architecture. • Black-box: only able to interact with the model; can provide input and view output.
Model extraction • Recover model parameters by querying the model on many inputs and observing its outputs (Tramer et al.). • Requires access to class probabilities.
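A minimal sketch of the equation-solving flavor of this attack against a logistic-regression victim: query the model, invert the sigmoid to recover logits, and solve a linear system for the parameters. The victim's hidden weights below are hypothetical stand-ins for a black-box API.

```python
import numpy as np

def victim_predict_proba(x, w_true=np.array([1.5, -2.0, 0.3]), b_true=0.7):
    # Stand-in for the black-box API; the attack only needs query access
    # to class probabilities, never these hidden parameters directly.
    return 1.0 / (1.0 + np.exp(-(x @ w_true + b_true)))

d = 3
rng = np.random.default_rng(1)
queries = rng.normal(size=(d + 1, d))          # d + 1 probing inputs
probs = np.array([victim_predict_proba(q) for q in queries])
logits = np.log(probs / (1 - probs))           # invert the sigmoid

# Solve the linear system [X | 1] [w; b] = logits for the parameters.
A = np.hstack([queries, np.ones((d + 1, 1))])
recovered = np.linalg.solve(A, logits)
print("recovered w, b:", recovered)            # matches w_true, b_true up to float error
```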
Membership Attack • Test whether or not a data point is in the training set. • Exploit differences in the model’s confidence to identify points it was trained on (Shokri). • Generate synthetic data until the model produces an output with very high confidence.
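A simplified sketch of the idea, using a plain confidence threshold rather than the full shadow-model attack of Shokri et al.; the confidences and the cutoff are assumptions.

```python
import numpy as np

def membership_guess(model_confidences: np.ndarray, threshold: float = 0.95) -> np.ndarray:
    """Guess 'member' when the model is unusually confident on a point.

    model_confidences holds the model's probability for its predicted class on
    each candidate record. The threshold is an assumption; Shokri et al.
    instead train an attack classifier on shadow models rather than picking a
    fixed cutoff.
    """
    return model_confidences >= threshold

# Hypothetical confidences: training members tend to score higher.
confidences = np.array([0.99, 0.62, 0.97, 0.71])
print(membership_guess(confidences))   # [ True False  True False]
```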
Model Inversion • Fredrikson et al. present the model inversion attack. • Given the output of a model, predict the input. • For a medicine dosage prediction task, they show that given access to the model and auxiliary information about the patient’s stable medicine dosage, they can recover genomic information about the patient.
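A stripped-down, white-box sketch of the general inversion idea: start from a blank input and follow the confidence gradient until the model assigns high confidence to a target output. The linear model and step size are hypothetical; the actual attack on the slide additionally uses auxiliary information about the patient.

```python
import numpy as np

# White-box logistic-regression "model" (hypothetical parameters).
w = np.array([1.0, -2.0, 0.5])

def confidence(x):
    return 1.0 / (1.0 + np.exp(-(x @ w)))

# Invert: ascend the confidence gradient to find an input the model maps
# to a high-confidence positive output.
x = np.zeros(3)
for _ in range(100):
    p = confidence(x)
    grad = p * (1 - p) * w          # d(confidence)/dx for a sigmoid of a linear score
    x += 0.5 * grad                 # gradient ascent step

print(x, confidence(x))             # a reconstructed input the model scores near 1.0
```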
Analyzes the privacy guarantees of algorithms. Differential Privacy Framework
An algorithm’s output distribution should not change in a statistically significant way between two versions of the data that differ in only one record (Dwork et al.). Differential Privacy Framework
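Formally, for ε-differential privacy (the (ε, δ) variant adds an additive slack term δ on the right-hand side):

```latex
% A randomized algorithm \mathcal{A} is \varepsilon-differentially private if,
% for all datasets D and D' differing in a single record and all output sets S,
\Pr[\mathcal{A}(D) \in S] \;\le\; e^{\varepsilon}\, \Pr[\mathcal{A}(D') \in S]
```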
Add randomization to the machine learning pipeline. Solutions?
Randomly add noise to every input while training. Solutions? Local Privacy
E.g., Google Chrome collects user data such that with probability q the reported value is real and with probability 1−q it is random; this still yields meaningful, privacy-preserving statistics (Erlingsson). Solutions? Local Privacy
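A minimal randomized-response sketch of this local-privacy idea; the parameter q = 0.75 and the single-bit setting are assumptions, and Erlingsson et al.'s RAPPOR mechanism is considerably more elaborate.

```python
import random

def randomized_response(true_bit: int, q: float = 0.75) -> int:
    """Report the real bit with probability q, a fair coin flip otherwise.

    The server never learns any individual's true value with certainty,
    yet aggregate statistics remain estimable.
    """
    if random.random() < q:
        return true_bit
    return random.randint(0, 1)

# Aggregate estimate: if p_hat is the fraction of 1s reported, the true rate
# is approximately (p_hat - (1 - q) / 2) / q.
reports = [randomized_response(1) for _ in range(10000)]
p_hat = sum(reports) / len(reports)
print((p_hat - (1 - 0.75) / 2) / 0.75)
```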
It was shown that adding random noise to the loss function during training provides differential privacy (Chaudhuri). Solutions? Perturbing loss
Randomly perturbing gradients before applying parameter updates guarantees even stronger differential privacy (Abadi). Solutions? Perturbing gradients
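A sketch of one noisy update in the style of Abadi et al.'s DP-SGD: clip each example's gradient, add calibrated Gaussian noise, then average. The clip norm, noise multiplier, and learning rate are assumed hyperparameters.

```python
import numpy as np

def dp_sgd_step(per_example_grads, clip_norm=1.0, noise_multiplier=1.1, lr=0.05, rng=None):
    """One differentially private gradient step on a batch.

    per_example_grads has shape (batch, n_params).
    """
    if rng is None:
        rng = np.random.default_rng()
    # Clip each example's gradient so no single record can dominate the update.
    norms = np.linalg.norm(per_example_grads, axis=1, keepdims=True)
    clipped = per_example_grads * np.minimum(1.0, clip_norm / np.maximum(norms, 1e-12))
    # Add Gaussian noise calibrated to the clip norm, then average over the batch.
    noise = rng.normal(scale=noise_multiplier * clip_norm, size=clipped.shape[1])
    noisy_mean = (clipped.sum(axis=0) + noise) / len(clipped)
    return -lr * noisy_mean    # parameter update to apply

grads = np.random.default_rng(0).normal(size=(32, 10))
print(dp_sgd_step(grads))
```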
Randomly perturb output values. Solutions? Noisy output
This method degrades performance. Solutions? Noisy output
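For output perturbation, the standard tool is the Laplace mechanism: add noise scaled to the output's sensitivity divided by ε. The sensitivity and ε values below are assumptions; as the slide notes, smaller ε means more noise and worse utility.

```python
import numpy as np

def noisy_output(true_value: float, sensitivity: float, epsilon: float) -> float:
    """Laplace mechanism: release the output plus noise of scale sensitivity/epsilon.

    sensitivity bounds how much one record can change the output; smaller
    epsilon gives stronger privacy at the cost of a noisier answer.
    """
    return true_value + np.random.laplace(scale=sensitivity / epsilon)

print(noisy_output(0.83, sensitivity=0.01, epsilon=0.5))
```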
“Precisely quantifying the learning algorithm’s sensitivity to training points is necessary to establish differential privacy guarantees. For nonconvex models (e.g., neural nets), current loose bounds on sensitivity require that learning be heavily randomized to protect data—often at the expense of utility.”
Training attack • Poisoning: input/label manipulation • The poisoned training distribution no longer matches the test distribution • Model learns to behave in an unintended way • Example: intentionally label genuine emails as spam
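A minimal sketch of the label-flipping variant of poisoning, mirroring the spam example above; the class names and poisoned fraction are illustrative.

```python
import numpy as np

def flip_labels(y: np.ndarray, target_class: int, poison_class: int, fraction: float, rng=None):
    """Label-flipping poisoning: relabel a fraction of one class as another.

    Corresponds to marking genuine emails as spam so the trained filter
    learns the wrong decision boundary.
    """
    if rng is None:
        rng = np.random.default_rng()
    y_poisoned = y.copy()
    idx = np.flatnonzero(y == target_class)
    chosen = rng.choice(idx, size=int(fraction * len(idx)), replace=False)
    y_poisoned[chosen] = poison_class
    return y_poisoned

y = np.array([0, 0, 0, 0, 1, 1])          # 0 = genuine, 1 = spam
print(flip_labels(y, target_class=0, poison_class=1, fraction=0.5))
```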
Assumption: Poisoned data points are typically outside the expected input distribution. Solution: Reduce the influence of outliers in training. Solutions?
Use a modified PCA algorithm: • Maximize the variance captured from legitimate training samples while reducing the influence of outliers (Rubinstein et al.). Solutions? Reduce outlier sensitivity
Add a regularization term to the loss function, which reduces the model’s sensitivity to outliers (Stempfel and Ralaivola). Solutions? Reduce outlier sensitivity
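A minimal sketch of the mechanism: the L2 penalty below is a generic stand-in for the regularizer, not Stempfel and Ralaivola's specific construction, and the regularization strength lam is an assumed hyperparameter.

```python
import numpy as np

def regularized_loss(w, X, y, lam=0.1):
    """Squared-error loss plus an L2 penalty on the weights.

    The penalty discourages large weights, so single outlying training
    points pull the learned model around less.
    """
    residual = X @ w - y
    return np.mean(residual ** 2) + lam * np.dot(w, w)

w = np.array([0.5, -0.2])
X = np.array([[1.0, 2.0], [3.0, -1.0]])
y = np.array([0.1, 1.7])
print(regularized_loss(w, X, y))
```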
Inference attack • Adversarial example • Exploit weakness in model • Perturb a sample until it is misclassified • E.g. How can we get past malware detection by changing our malware as little as possible?
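One widely used way to craft such perturbations is the fast gradient sign method (Goodfellow et al.); the sketch below assumes white-box access to the loss gradient and an assumed budget epsilon = 0.1.

```python
import numpy as np

def fgsm_perturb(x, grad_loss_wrt_x, epsilon=0.1):
    """Fast gradient sign method: nudge every feature a small step epsilon
    in the direction that increases the model's loss on the true label.

    grad_loss_wrt_x is the loss gradient at x; for malware, the analogous
    goal is changing as few features as possible while keeping the
    malicious behavior intact.
    """
    return x + epsilon * np.sign(grad_loss_wrt_x)

x = np.array([0.2, -0.4, 1.0])
grad = np.array([0.7, -0.1, 0.0])
print(fgsm_perturb(x, grad))     # [ 0.3 -0.5  1. ]
```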
Inference attack • The more information provided in the output, the easier this is to exploit. • Especially easy when the system is an oracle. • Adversary can issue queries for any chosen input and observe the model output.
Inference attack • Lowd and Meek introduce ACRE learnability: finding the least-cost modification that causes a malicious input to be classified as benign, using a polynomial number of queries to the ML oracle.