Physical Layer Attacks on Unlinkability in Wireless LANs

Kevin Bauer*, Damon McCoy*, Ben Greenstein+, Dirk Grunwald*, Douglas Sicker*
* University of Colorado, + Intel Research Seattle

Our Wireless World
[Figure: tcpdump capture in which every frame carries a link layer header followed by a private payload: PrivatePhoto1.jpg; Home location=(47.28,…; Buddy list: Alice, Bob, …; PrivateVideo1.avi; Blood pressure: high]
Our wireless devices reveal lots of information about us

Best Security Practices for 802.11
• Bootstrap: out-of-band (e.g., password, WiFi Protected Setup). SSID: Bob’s Network, Key: 0x2384949…; Username: Alice, Key: 0x348190…
• Discover: 802.11 probe (“Is Bob’s Network here?”), 802.11 beacon (“Bob’s Network is here”)
• Authenticate and Bind: 802.11 auth (“Proof that I’m Alice”), 802.11 auth (“Proof that I’m Bob”)
• Send Data (frames with 802.11 headers): Confidentiality, Authentication, Integrity
[Figure: an eavesdropper running tcpdump observes the exchange]

Problem: Short-Term Linking
[Figure: tcpdump capture of interleaved frames from two devices. Frames from 12:34:56:78:90:ab (Alice -> AP) carry seqno 1, 2, 3, 4, …; frames from 00:00:99:99:11:11 carry seqno 102, 103, 104, …]
Easy to isolate packet streams using addresses and sequence numbers
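The linking step on this slide can be sketched in a few lines. This is only an illustration: the function name `isolate_streams` and the `(source address, sequence number)` tuple layout are assumptions, and the frames in the usage note are the ones shown on the slide.

```python
from collections import defaultdict


def isolate_streams(frames):
    """Group captured frames by their cleartext source address.

    frames: iterable of (src_mac, seqno) tuples in capture order
    (a hypothetical layout chosen for this sketch).
    Returns a dict mapping each source address to its sequence numbers.
    """
    streams = defaultdict(list)
    for src_mac, seqno in frames:
        streams[src_mac].append(seqno)
    return dict(streams)


# Interleaved capture, as on the slide.
capture = [
    ("12:34:56:78:90:ab", 1),
    ("12:34:56:78:90:ab", 2),
    ("00:00:99:99:11:11", 102),
    ("12:34:56:78:90:ab", 3),
    ("00:00:99:99:11:11", 103),
    ("12:34:56:78:90:ab", 4),
    ("00:00:99:99:11:11", 104),
]
streams = isolate_streams(capture)
```

Each value in `streams` is one device's packet stream, which is exactly what the side-channel attacks on the next slide need as input.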

Problem: Short-Term Linking
• Isolated data streams are susceptible to side-channel analysis using packet size and timing information
• Exposes keystrokes, VoIP calls, webpages, movies, …
• [Liberatore, CCS ’06; Pang, MobiCom ’07; Saponas, Usenix Security ’07; Song, Usenix Security ’01; Wright, IEEE S&P ’08; Wright, Usenix Security ’07]
[Figure: sequences of transmission sizes (100, 250, 500, 300, 200, 120) and a DFT; panel labels: Device fingerprints, Video compression signatures, Keystroke timings]

Solution: Encrypt Entire Frames
“SlyFi”, MobiSys ’08
Which packets are transmitted by which devices?
3-9 data streams overlap each 100 ms, on average
Unlinkability is achieved

Our Goal: Short-Term Linking Using Physical Layer Information
• State-of-the-art methods require specialized and expensive hardware [Brik, MobiCom ’08; Danev, Usenix Security ’09]
• We want to perform short-term transmitter packet linking using low-cost commodity hardware
[Figure: tcpdump capture of frames whose senders are unknown (??? -> AP), to be attributed to transmitters such as Charlie and Alice]

Talk Outline
✓ Motivation and Goals
Physical Layer Packet Linking
Experimental Evaluation
Solution: Introduce Noise

Signal Strength Background
• Received signal strength indication (RSSI) fades as transmissions travel further
• RSSI values can be obtained using commodity 802.11 radios and drivers
[Figure: RSSI seen by an eavesdropper running tcpdump decreases with increasing distance (-50 dB, -65 dB, -85 dB) toward the noise floor]
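To make the fading intuition concrete, here is a minimal sketch using the log-distance path-loss model. The model itself, the function name `rssi_at_distance`, and every parameter value (reference RSSI, reference distance, path-loss exponent) are assumptions for illustration; the talk does not state which propagation model, if any, the authors had in mind.

```python
import math


def rssi_at_distance(d_m, rssi_ref_db=-50.0, d_ref_m=1.0, path_loss_exp=3.0):
    """Estimate RSSI (dB) at distance d_m with a log-distance model.

    All defaults are illustrative assumptions, not measured values.
    """
    if d_m <= 0 or d_ref_m <= 0:
        raise ValueError("distances must be positive")
    return rssi_ref_db - 10.0 * path_loss_exp * math.log10(d_m / d_ref_m)


# RSSI falls as the transmitter moves away from the eavesdropper.
for d in (1.0, 10.0, 100.0):
    print(f"{d:6.1f} m -> {rssi_at_distance(d):6.1f} dB")
```

Real indoor measurements deviate from any such smooth curve because of walls and multipath, which is why the talk relies on measured RSSI rather than a model.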

Real World Signal Strength Behavior
[Figure: signal strength (dB) versus physical location]
Received signal strength is influenced by the transmitting device’s physical location

Packet Linking with Device Localization
• We first try to link packets by location
• RSSI values fluctuate due to environmental noise
• Supervised learning algorithms: RSSI → location mapping
• We use k-nearest neighbors [Bahl, Infocom ’00]
But localization requires training data, which is expensive and time-consuming to collect
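The k-nearest-neighbors idea can be sketched as follows. This is a simplified illustration, not the authors' implementation: the function name `knn_locate`, the training-data layout, the choice to average the coordinates of the k closest training points, and all sample values are assumptions.

```python
import math


def knn_locate(training, rssi, k=3):
    """Estimate a transmitter position from one RSSI vector.

    training: list of (rssi_vector, (x, y)) pairs collected at known
              positions (hypothetical layout for this sketch).
    rssi:     RSSI vector of the packet to localize, one value per sensor.
    Returns the mean position of the k training points whose RSSI
    vectors are closest in Euclidean distance.
    """
    nearest = sorted(training, key=lambda sample: math.dist(sample[0], rssi))[:k]
    xs = [pos[0] for _, pos in nearest]
    ys = [pos[1] for _, pos in nearest]
    return (sum(xs) / len(nearest), sum(ys) / len(nearest))


# Made-up training data for two sensors.
training = [
    ((-50, -70), (0, 0)),
    ((-52, -68), (2, 0)),
    ((-80, -45), (40, 20)),
    ((-78, -47), (38, 20)),
]
estimate = knn_locate(training, (-51, -69), k=2)
```

The `training` list is the expensive part: every entry requires someone to transmit from a known position, which motivates the unsupervised approach that follows.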

An Unsupervised Approach
We’re not interested in mapping packets to locations, just linking packets to transmitters
Use a clustering algorithm to handle noise
[Figure: tcpdump capture]

More Details
• Use k-means to group packets by transmitter
• n listening sensors
• Feature vector: (RSSI1, RSSI2, … , RSSIn)
• k-means is probabilistic and may not find a globally optimal solution
• Heuristic: Run 100 times to get a stable solution
• Meets our goal: Requires only commodity 802.11 hardware, stock drivers, and no training
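The approach above can be sketched as k-means with random restarts over per-packet RSSI vectors. This is a minimal illustration under assumptions: the function names, the rule of keeping the run with the lowest residual sum of squares (the slide only says to run 100 times), the handling of empty clusters, and the sample RSSI values are all hypothetical.

```python
import math
import random


def kmeans_once(points, k, rng, max_iter=100):
    """One k-means run from random initial centroids.

    points: list of equal-length tuples (RSSI1, ..., RSSIn).
    Returns (labels, rss), where rss is the residual sum of squares.
    """
    centroids = rng.sample(points, k)
    labels = None
    for _ in range(max_iter):
        # Assign every packet to its closest centroid.
        new_labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:  # membership is stable
            break
        labels = new_labels
        # Recompute each centroid as the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # an empty cluster keeps its old centroid
                centroids[c] = tuple(sum(dim) / len(members) for dim in zip(*members))
    rss = sum(math.dist(p, centroids[lab]) ** 2 for p, lab in zip(points, labels))
    return labels, rss


def kmeans_restarts(points, k, restarts=100, seed=0):
    """Run k-means several times and keep the lowest-RSS result."""
    rng = random.Random(seed)
    runs = (kmeans_once(points, k, rng) for _ in range(restarts))
    return min(runs, key=lambda run: run[1])


# Six packets seen by three sensors, from two transmitters (made-up values).
packets = [
    (-50, -70, -60), (-51, -69, -61), (-49, -71, -59),
    (-80, -45, -75), (-79, -46, -74), (-81, -44, -76),
]
labels, rss = kmeans_restarts(packets, k=2, restarts=20)
```

Packets that share a label are attributed to the same transmitter; the label numbers themselves carry no meaning.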

Talk Outline
✓ Motivation and Goals
✓ Physical Layer Packet Linking
Experimental Evaluation
Solution: Introduce Noise

Experimental Evaluation
• Collect real signal strength data in a 75 m × 50 m office building
• 5 passive monitors and 58 different measurement positions
• Our dataset is available in the CRAWDAD wireless trace repository: http://crawdad.cs.dartmouth.edu/cu/rssi

Packet Clustering Accuracy
• Adversary uses 5 sensors to record packets’ RSSI values
• Vary the number of transmitters from 5 to 25
• Generate 100 random device configurations
• Clustering accuracy > 75% for all experiments
• The localization-based approach is less accurate (see paper for details)
• k-means is very accurate at clustering packets using RSSI
[Figure: clustering accuracy plot; higher = better]
But is this good enough to enable interesting traffic analysis?

Website Fingerprinting Accuracy
• Attack: Encrypted website fingerprinting using [Liberatore and Levine, CCS ’06]
• Naïve Bayes classifier to identify websites after clustering packets
• Simple traffic analysis task performs well
[Figure: website fingerprinting accuracy plot; higher = better]
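As a rough illustration of the classification step, the sketch below trains a multinomial naive Bayes model on packet sizes with add-one smoothing. It is a deliberate simplification and not the classifier from the cited paper: the feature choice, the smoothing, the function names `train_nb` and `classify_nb`, the site labels, and the traces are all assumptions.

```python
import math
from collections import Counter, defaultdict


def train_nb(traces):
    """traces: list of (site_label, list_of_packet_sizes)."""
    size_counts = defaultdict(Counter)  # per-site packet-size counts
    trace_counts = Counter()            # number of training traces per site
    vocab = set()                       # all packet sizes seen in training
    for site, sizes in traces:
        size_counts[site].update(sizes)
        trace_counts[site] += 1
        vocab.update(sizes)
    return size_counts, trace_counts, vocab


def classify_nb(model, sizes):
    """Return the site with the highest log-posterior for one packet stream."""
    size_counts, trace_counts, vocab = model
    total_traces = sum(trace_counts.values())

    def log_posterior(site):
        total = sum(size_counts[site].values())
        score = math.log(trace_counts[site] / total_traces)
        for s in sizes:
            # Add-one smoothing; the extra +1 reserves mass for unseen sizes.
            score += math.log((size_counts[site][s] + 1) / (total + len(vocab) + 1))
        return score

    return max(trace_counts, key=log_posterior)


# Made-up training traces for two hypothetical sites.
model = train_nb([
    ("news", [1500, 1500, 1500, 300]),
    ("mail", [120, 120, 300, 120]),
])
guess = classify_nb(model, [1500, 1500, 300])
```

In the attack described here, the input to `classify_nb` would be the packet sizes of one cluster produced by the RSSI clustering step, so clustering errors feed directly into classification errors.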

Talk Outline
✓ Motivation and Goals
✓ Physical Layer Packet Linking
✓ Experimental Evaluation
Solution: Introduce Noise

Solution: Vary Transmit Power
• Intuition: We expect tight, separable clusters
• Goal: Make the clusters overlap
• Varying transmit power introduces more noise in RSSI
• The cluster is now larger and more likely to overlap with other clusters, which introduces more clustering errors
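A small simulation can show why this defense enlarges a cluster. The sketch assumes that changing transmit power by some number of dB shifts the RSSI at every sensor by the same amount and that measurement noise is Gaussian; those assumptions, the function names, and all numeric values are illustrative and not taken from the talk.

```python
import random
import statistics


def simulate_rssi(base_rssi, n_packets, power_offsets_db, noise_sd=1.0, seed=0):
    """Generate RSSI vectors for one device.

    base_rssi:        per-sensor RSSI at the nominal transmit power.
    power_offsets_db: offsets the device picks from at random per packet.
    """
    rng = random.Random(seed)
    packets = []
    for _ in range(n_packets):
        offset = rng.choice(power_offsets_db)
        packets.append(tuple(b + offset + rng.gauss(0.0, noise_sd) for b in base_rssi))
    return packets


def spread(packets):
    """Total variance across sensors: a rough measure of cluster size."""
    return sum(statistics.pvariance(dim) for dim in zip(*packets))


base = (-50.0, -70.0, -60.0)
fixed = simulate_rssi(base, 200, [0.0])
varied = simulate_rssi(base, 200, [0.0, -5.0, -10.0, -15.0])
print(f"fixed power spread:  {spread(fixed):.1f}")
print(f"varied power spread: {spread(varied):.1f}")
```

Under these assumptions the varied-power cluster is stretched along one direction in RSSI space, which makes it more likely to run into a neighboring device's cluster.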

Solution: Directional Antenna
• Intuition: Focus the signal in different directions to create “phantom” clusters
• Using a directional antenna causes fluctuation in RSSI
• Inexpensive “cantenna”
[Figure: 1 device, 4 distinct clusters]

Combined: Clustering Accuracy
• 15 transmitters total
• Vary the number of devices that add noise
• Both solutions decrease clustering accuracy
• Clustering accuracy decreases from 80% to 50%
• Traffic analysis accuracy decreases from 40% to 26% for devices that add noise
[Figure: clustering accuracy plot; lower = better]

Other Potential Solutions
• Anonymity (still) loves company
  • The more devices, the better
  • Devices close together have similar clusters
• Wireless cover traffic
  • Devices transmit “dummy traffic” to frustrate side-channel attacks
  • Wireless shared medium → degrades performance
• Physical security, jamming, frequency hopping
  • Performance implications, may not be effective
• Physical layer info is hard to control

Conclusion
• Wireless devices are becoming personal and pervasive
• Information present at the physical layer can lead to privacy leaks
  • Short-term linking: Side-channel attacks
• Defenses to mitigate attacks
  • Introducing additional noise reduces clustering accuracy
• More research is needed to help address privacy risks exposed by the physical layer

Backup Slides

How many sensors are enough?
Almost no gain after three sensors

Empirical stream interleaving
• Many streams interleaved at short timescales

Why use k-means?
• k-means performs well with spherical patterns
• It’s simple, yet it outperformed other clustering methods on our task

How does distance affect accuracy?
Two transmitters at different distances
Measured accuracy of k-means

What if the attacker doesn’t know k?
Even if the attacker can only approximate k, the website fingerprinting attack can still perform well

Related Work
• Device Distinction
  • Detect MAC spoofing [Faria, WISE ’06]
    • Doesn’t generalize to k devices
  • Uses multipathing to detect spoofing [Patwari ’07]
    • Uses non-commodity hardware
• RF Fingerprinting
  • Uses electromagnetic signature [Hall ’05]
    • Uses expensive non-commodity hardware
  • Uses modulation fingerprinting [Brik ’08, Danev ’09]
    • Relies on signal analyzer hardware

Clustering accuracy: F-measure
Weighted harmonic mean of precision and recall
1. In terms of information retrieval:
   tp: true positive
   fp: false positive
   fn: false negative
2. In terms of classification:
   Homogeneity of each cluster
   Extent to which packets are clustered together
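The slide's formula did not survive in this transcript, so the sketch below uses the standard F-measure definition with precision tp/(tp+fp) and recall tp/(tp+fn). Counting tp, fp, and fn over pairs of packets is one common convention for scoring clusterings and is an assumption here, as are the function name `pairwise_f_measure` and the `beta` weighting parameter (beta = 1 gives the balanced F1).

```python
from itertools import combinations


def pairwise_f_measure(true_devices, cluster_labels, beta=1.0):
    """Score a clustering against ground truth using packet pairs.

    tp: pair from the same device placed in the same cluster
    fp: pair from different devices placed in the same cluster
    fn: pair from the same device split across clusters
    """
    tp = fp = fn = 0
    for i, j in combinations(range(len(true_devices)), 2):
        same_device = true_devices[i] == true_devices[j]
        same_cluster = cluster_labels[i] == cluster_labels[j]
        if same_cluster and same_device:
            tp += 1
        elif same_cluster:
            fp += 1
        elif same_device:
            fn += 1
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)  # homogeneity of the clusters
    recall = tp / (tp + fn)     # how well each device's packets stay together
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


devices = ["a", "a", "b", "b"]
perfect = pairwise_f_measure(devices, [0, 0, 1, 1])
mixed = pairwise_f_measure(devices, [0, 0, 0, 1])
```

A pairwise score has the practical advantage that cluster label numbers never need to be matched to device identities.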

k-Means Clustering Algorithm
• Input: Data set and number of clusters k
• Initialization: Select initial cluster centroids by choosing k data points at random
• Repeat until cluster membership is stable:
  • Compute the distance from each data point to each of the k centroids
  • Group the data points by their closest centroid
  • Compute the new cluster centroids
• k-means minimizes the residual sum of squares

Why does clustering perform better than localization for linking?
• Surprising result
  • Training means it should be better, right?
• But localized packets have error (3.5 meters at the median), so we need to cluster the localized packets by their location predictions
• Errors from localization and clustering steps are additive

Estimating k from data
• k-means tries to minimize the within-cluster residual sum of squares, $\mathrm{RSS} = \sum_{i=1}^{k} \sum_{x \in S_i} \lVert x - \mu_i \rVert^2$, where $\mu_i$ is the centroid of cluster $S_i$
• Choose k such that the within-cluster sum of squares is minimized using cross-validation
• Works best when clusters are separable
