Googling, the Instrumented Web, Privacy and You. Greg Conti gregory.conti@usma.edu United States Military Academy West Point, New York.
The views expressed in this presentation are those of the author and do not reflect the official policy or position of the United States Military Academy, the Department of the Army, the Department of Defense or the U.S. Government.
http://www.whitehouse.gov/omb/budget/fy2005/images/justice-7.jpg
The AOL Dataset Debacle
SIGIR – IR List (August 2006)
Subject: research.aol.com
AOL is embarking on a new direction for its business making its content and products freely available to all consumers. To support those goals, AOL is also embracing the vision of an open research community. To get started, we invite you to visit us at http://research.aol.com, where you will find:
• 20,000 hand labeled, classified queries
• 3.5 million web question/answer queries (who, what, where, when, etc.)
• Query streams for 500,000 users over 3 months (20 million queries)
• 2 million queries against US Government domains
Also, please feel free to provide feedback on the site, datasets you'd like to see in the future, and any other comments about our vision.
AOL Stalker
AOL Psycho
Knowledge of the AOL Dataspill
Question: Are you familiar with the AOL data disclosure of August 2006?
• no: 84%
• vaguely: 7%
• somewhat: 7%
• very: 2%
Definitions
googling: the full spectrum of free online tools and services (such as search, mapping, email, Web-based word processing, and calendaring)
web-based information disclosure: the information we disclose as we surf the web
“Free” web tools and services aren’t free; we pay for them with micropayments of personal information.
“Never talk when you can nod, and never nod when you can wink, and never write an e-mail because it's death. You're giving prosecutors all the evidence we need.”
- Eliot Spitzer, two years before his resignation
Eliot Spitzer, former Governor of New York
http://abcnews.go.com/Blotter/story?id=4424507&page=1
Maf54 (7:43:27 PM): well dont ruin my mental picture
Xxxxxxxxx (7:43:32 PM): oh lol...sorry
Maf54 (7:43:54 PM): nice
Maf54 (7:43:54 PM): youll be way hot then
Xxxxxxxxx (7:44:01 PM): haha...hopefully
Mark Foley, former US Congressman
http://abcnews.go.com/WNT/BrianRoss/Story?id=2509586&page=2
Can anyone help me please! This stalking thing is not funny at all. When I type my name in keyword it gives a list of places that show where I have been on aol on the net. This is nobodys business. I have not done anything wrong at all and I have contacted aol about this matter and they keep saying they will do something about it but never do.
- Debbie
How do I get stuff removed from aol stalker? Can anyone tell me? Aol won't respond even though they claim willingness to remove data when requested. Someone, anyone, please help!
- Sally
http://blogs.ittoolbox.com/security/investigator/archives/aol-stalker-website-unleashed-11133
I could not resist asking [Colin Powell] about where he was when he realized the world had gone flat. He answered with one word: “Google.” Powell said that when he took over as secretary of state in 2001, and needed some bit of information, he would call an aide. “Now I just type into Google ‘UNSC Resolution 242’ and up comes the text.”
- Thomas Friedman, The World Is Flat
In the news…
• Administration Demands Search Data; Google Says No; AOL, MSN & Yahoo Said Yes
  http://blog.searchenginewatch.com/blog/060119-060352
• Hit Pause On The Evil Button: Google Assists In Arrest Of Indian Man
  http://www.washingtonpost.com/wp-dyn/content/article/2008/05/18/AR2008051800657.html
• Moroccan Man Jailed For Fake Facebook Profile
  http://www.techcrunch.com/2008/02/07/moroccan-man-jailed-for-fake-facebook-profile/
• Group: Yahoo Assisted China With Torture
  http://origin.foxnews.com/wires/2007Apr19/0,4670,YahooChina,00.html
• Google ordered to give YouTube user data to Viacom
  http://afp.google.com/article/ALeqM5hty1hXgakr7zoviTVNKalsStgSOw
Global Computing Statistics
• World Population ~6.6 Billion
• Cell Phones ~3.3 Billion
• Personal Computers ~1.2 Billion
• MP3 Players ~220 Million
• Digital Cameras ~120 Million
• Webcams ~100 Million
• PDAs ~85 Million
• DVRs ~44 Million
• Servers ~27 Million
Kevin Kelly, “The Planetary Computer.” Wired, 16.07, July 2008, pp. 52-55
Data Retention/Anonymization
• Ask “hours”
• Google 9 months
• Microsoft 18 months
• Yahoo 3 months
• Other logs…
• Other companies…
• The cookie fallacy.
• ISPs?
http://www.webmonkey.com/blog/Yahoo_Trumps_Google_With_New_Data_Retention_Policy
http://www.computerworld.com/action/article.do?command=viewArticleBasic&articleId=9027924&source=rss_news50
http://googlewatch.eweek.com/content/data_retention_policy/yahoo_not_google_moves_search_data_match_closer_to_an_endgame.html
Profiling
• Google hackers
• Security researchers
• Political activists
• Company XXX employee
• Corporate leaders
• Law enforcement officer
• Government official
“Career Watcher”
“Active Gamer”
Tacoda, The Home of Behavioral Targeting, http://www.tacoda.com/
ISPs vs. Large Online Companies
Online Company
• Sees global traffic from many customers
  • domain specific
• Advertising and embedded content brings in additional information
• Limited knowledge of user identity
• Extensive datamining
ISP
• Sees all traffic from its set of customers
  • except encrypted traffic
  • traffic analysis
• Limited to no visibility on non-customers
• Knows identity and location of accounts
• Ability to manipulate network flows
  • DNS
  • blocking P2P
Rogers ISP http://lauren.vortex.com/rogers-google.jpg
Myriad Disclosure Vectors
• Search
• Communications
  • Email / IM / SMS…
• Advertising Networks / Purchasing
• Other Web 2.0 innovations
  • Web office suites
  • Mashups
  • Location based services
  • Social networking
  • Cloud computing
MapQuest
Mapping sites reveal locations of interest, allowing diverse groups of users to be linked.
LinkedIn
Social networking sites know your contacts and your contacts’ contacts. Old friends will find you and let the site know of the relationship.
ROT13
Even the most innocent-appearing services should be assumed to collect your data.
Embedded Content
• advertising
• AdSense
• eBay advertising
• affiliate networks
• images
• videos
• embedded maps
• + the usual suspects (DoubleClick)
• actively encouraged
Embedded Advertising Amazon MP3 Clips Widget
Embedded Images (globally recognized avatars)
• http://www.gravatar.com/avatar/cf1e61a4330e75d5d1d7a744c5ef38c4?s=48&d=identicon&r=G
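The long hexadecimal string in an avatar URL like the one above is, in Gravatar's scheme, an MD5 hash of the user's trimmed, lowercased email address. A minimal sketch of how such a URL is built follows; the email address is hypothetical, and the `s`, `d`, and `r` values simply mirror the parameters visible in the slide's URL.

```python
import hashlib


def gravatar_url(email: str, size: int = 48,
                 default: str = "identicon", rating: str = "G") -> str:
    """Build an avatar URL in the style of the one shown on the slide."""
    # Gravatar normalizes the address (trim whitespace, lowercase) before hashing.
    normalized = email.strip().lower()
    digest = hashlib.md5(normalized.encode("utf-8")).hexdigest()
    return f"http://www.gravatar.com/avatar/{digest}?s={size}&d={default}&r={rating}"


# Hypothetical address, used only for illustration.
url_a = gravatar_url("  Alice@Example.com ")
url_b = gravatar_url("alice@example.com")

# The same person yields the same identifier on every site that embeds the avatar.
print(url_a == url_b)
```

Because the hash depends only on the email address, every page that embeds the avatar sends the same stable identifier to the avatar host, which is what makes the requests linkable across sites.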
Embedded Video
<object width="425" height="350">
  <param name="movie" value="http://www.youtube.com/v/3PuHGKnboNY&hl=en&fs=1&autoplay=1"></param>
  <param name="wmode" value="transparent"></param>
  <param name="allowFullScreen" value="true"></param>
  <param name="allowscriptaccess" value="always"></param>
  <embed src="http://www.youtube.com/v/3PuHGKnboNY&hl=en&fs=1&autoplay=1" type="application/x-shockwave-flash" allowfullscreen="true" allowscriptaccess="always" wmode="transparent" width="425" height="350"></embed>
</object>
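An embed snippet like this causes the visitor's browser to contact the video host whenever the page loads. One way to see which hosts a piece of markup pulls in is to collect every absolute URL in its attributes. The sketch below uses only the Python standard library; the class name `ExternalHostCollector` is made up for this example, and the snippet is a shortened copy of the slide's markup.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse


class ExternalHostCollector(HTMLParser):
    """Collect the hostnames of all absolute http(s) URLs found in tag attributes."""

    def __init__(self):
        super().__init__()
        self.hosts = set()

    def handle_starttag(self, tag, attrs):
        for _name, value in attrs:
            if value and value.startswith(("http://", "https://")):
                self.hosts.add(urlparse(value).hostname)


# Shortened version of the embed code from the slide.
EMBED_SNIPPET = (
    '<object width="425" height="350">'
    '<param name="movie" value="http://www.youtube.com/v/3PuHGKnboNY&hl=en&fs=1&autoplay=1"></param>'
    '<embed src="http://www.youtube.com/v/3PuHGKnboNY&hl=en&fs=1&autoplay=1" '
    'type="application/x-shockwave-flash" width="425" height="350"></embed>'
    '</object>'
)

collector = ExternalHostCollector()
collector.feed(EMBED_SNIPPET)
collector.close()
embedded_hosts = sorted(collector.hosts)
print(embedded_hosts)  # ['www.youtube.com']
```

This only inspects static markup; content that scripts load at run time would not appear, so the result is a lower bound on the hosts actually contacted.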
A Visit to MSNBC 255.255.255.255 0.0.0.0
• a365.ms.akamai.net
• a509.cd.akamai.net
• ad.3ad.doubleclick.net
• amch.questionmarket.com
• c.live.com.nsatc.net
• c.msn.com.nsatc.net
• rad.msn.com.nsatc.net
• context3.kanoodle.com
• global.msads.net.c.footprint.net
• hm.sc.msn.com.c.footprint.net
• msnbcom.112.2o7.net
• prpx.service.mirror-image.net
• wrpx.service.mirror-image.net
• switch.atdmt.com
• view.atdmt.com
• www-google-analytics.l.google.com
• 16 third-party sites
• 10 separate companies
http://www.msnbc.msn.com/
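The totals above can be roughly reproduced by grouping the listed hostnames by their last two DNS labels. This is only a sketch: the slide does not say how companies were counted, and a base domain is a crude stand-in for a company (it would also mishandle suffixes such as .co.uk).

```python
from collections import defaultdict

# Hostnames exactly as listed on the slide.
MSNBC_THIRD_PARTY_HOSTS = [
    "a365.ms.akamai.net",
    "a509.cd.akamai.net",
    "ad.3ad.doubleclick.net",
    "amch.questionmarket.com",
    "c.live.com.nsatc.net",
    "c.msn.com.nsatc.net",
    "rad.msn.com.nsatc.net",
    "context3.kanoodle.com",
    "global.msads.net.c.footprint.net",
    "hm.sc.msn.com.c.footprint.net",
    "msnbcom.112.2o7.net",
    "prpx.service.mirror-image.net",
    "wrpx.service.mirror-image.net",
    "switch.atdmt.com",
    "view.atdmt.com",
    "www-google-analytics.l.google.com",
]


def base_domain(host: str) -> str:
    """Naive heuristic (an assumption): keep only the last two DNS labels."""
    return ".".join(host.split(".")[-2:])


hosts_by_domain = defaultdict(list)
for host in MSNBC_THIRD_PARTY_HOSTS:
    hosts_by_domain[base_domain(host)].append(host)

print(len(MSNBC_THIRD_PARTY_HOSTS), "hosts in", len(hosts_by_domain), "base domains")
for domain, hosts in sorted(hosts_by_domain.items()):
    print(f"{domain}: {len(hosts)}")
```

For this particular list the heuristic yields 16 hosts in 10 groups, which happens to agree with the slide's totals.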
Analysis Methodology
• Visit top 25 Alexa sites
• Using AdBlock Plus construct dataset of external objects
• Create GraphViz graphs using dot
• Dataset available
• Automated, comprehensive tool development ongoing
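The graphing step might look something like the sketch below, which turns a mapping from visited site to observed third-party hosts into DOT text. The data structure, the function name `to_dot`, and the graph layout are assumptions made for illustration; the slides do not describe the author's actual tooling. The sample hosts are taken from the MSNBC slide.

```python
from typing import Dict, List


def to_dot(observations: Dict[str, List[str]]) -> str:
    """Render site -> third-party host edges as a GraphViz DOT digraph."""
    lines = ["digraph third_parties {", "  rankdir=LR;"]
    for site, third_parties in sorted(observations.items()):
        # De-duplicate hosts so each edge is emitted once.
        for host in sorted(set(third_parties)):
            lines.append(f'  "{site}" -> "{host}";')
    lines.append("}")
    return "\n".join(lines)


# Small illustrative dataset: one visited site and three hosts it loaded from.
observations = {
    "www.msnbc.msn.com": [
        "switch.atdmt.com",
        "view.atdmt.com",
        "ad.3ad.doubleclick.net",
    ],
}

dot_text = to_dot(observations)
print(dot_text)
```

Saving the output to a file such as graph.dot and running `dot -Tpng graph.dot -o graph.png` would render it as an image.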
Alexa Rankings (Top 25 in the United States)
• Google
• Yahoo
• MySpace
• YouTube
• Facebook
• Windows Live
• MSN
• eBay
• Wikipedia
• AOL
• Amazon
• Craigslist
• Blogger
• Go
• CNN
• Photobucket
• Microsoft
• ESPN
• Comcast
• Flickr
• Ask
• Weather.com
• Internet Movie DB
• WordPress
• New York Times