Josh Schmoldt The Data Mining Experts
Introduction My project is an investigation of data mining and Google. Hal Niedzviecki’s book “The Peep Diaries: How We’re Learning to Love Watching Ourselves and Our Neighbors” is the main inspiration behind this project. His argument that the information we share can and will be used against us in the future seemed like an interesting topic to research.
Definition of Data Mining • Data mining is the process of extracting patterns from data. • On the internet, data mining draws on what you search for and what you do on the World Wide Web. • Websites collect all of this information, and if it falls into the wrong hands, major problems could follow. • http://en.wikipedia.org/wiki/Data_mining
History of Data Mining From there I started looking into data mining and its history. Surprisingly, the ideas behind “data mining” have been around for centuries. Two early examples are Bayes’ Theorem and regression analysis. Bayes’ Theorem (1700s) shows how one conditional probability depends on its inverse. Regression analysis (1800s), as we have learned, fits a line or curve to a set of data in order to describe the pattern in it.
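To make the two historical ideas concrete, here is a minimal Python sketch of each. The function names and the sample numbers are made up for illustration and do not come from any of the sources; the regression shown is ordinary least squares for a straight line, which is only one simple form of regression analysis.

```python
def bayes_posterior(prior, likelihood, false_positive_rate):
    """Bayes' Theorem: return P(A|B) from P(A), P(B|A), and P(B|not A)."""
    # Total probability of observing B, whether or not A is true
    evidence = likelihood * prior + false_positive_rate * (1.0 - prior)
    return likelihood * prior / evidence


def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance-like and variance-like sums around the means
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Illustrative numbers only: a 1% prior, 90% hit rate, 5% false alarms
posterior = bayes_posterior(prior=0.01, likelihood=0.9, false_positive_rate=0.05)

# Illustrative points that lie exactly on y = 2x
slope, intercept = fit_line([1, 2, 3], [2, 4, 6])
```

With these made-up inputs the posterior comes out to roughly 15 percent, which shows how a rare event stays fairly unlikely even after a seemingly strong signal.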
The Empire The reason I decided to focus on Google instead of other search engines or websites is that it handles 75 percent of all searches made on the internet. Because of that simple statistic, it also holds more data than any other website.
How Does Google Collect the Data? • For every search you make on its website, Google records the cookie ID, your IP address, the time and date, your search terms, and your browser configuration. Increasingly, Google is customizing results based on your IP address. This is referred to in the industry as "IP delivery based on geolocation."
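As a rough illustration of what one such record might look like, here is a Python sketch. It is not Google's actual log format: the class name, field names, and the tiny region table are all hypothetical, and the IP ranges are reserved documentation addresses rather than real ones. The lookup function shows the basic idea behind picking results by the region an IP address falls in.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
import ipaddress


@dataclass(frozen=True)
class SearchLogEntry:
    """Hypothetical record holding the five fields listed on the slide."""
    cookie_id: str
    ip_address: str
    timestamp: datetime
    search_terms: str
    browser_config: str


# Hypothetical lookup table mapping address ranges to regions
REGION_BY_NETWORK = {
    ipaddress.ip_network("192.0.2.0/24"): "Region A",
    ipaddress.ip_network("198.51.100.0/24"): "Region B",
}


def region_for(ip):
    """Return the region whose range contains the address, else 'unknown'."""
    addr = ipaddress.ip_address(ip)
    for network, region in REGION_BY_NETWORK.items():
        if addr in network:
            return region
    return "unknown"


entry = SearchLogEntry(
    cookie_id="abc123",
    ip_address="192.0.2.17",
    timestamp=datetime(2009, 11, 1, 12, 0, tzinfo=timezone.utc),
    search_terms="data mining",
    browser_config="ExampleBrowser/1.0",
)
entry_region = region_for(entry.ip_address)
```

A real geolocation service would rely on a much larger database of address ranges, but the lookup step works on the same principle.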
Google’s Data Collection cont. • The main question I had after reading about Google’s data collection was: why? • Why does Google need to record all this information? • Should we be scared or worried about all of this data being collected? • Isn’t this breaking privacy laws?
Why Do They Record Everything? • The main reason Google records all of the searches is that they represent a huge money-making opportunity. • As everyone in this class probably already knows, most ads you see on the internet are targeted at you based on what you search for and which websites you visit.
$ Money Making $ • Since Google controls most of the search engine market, companies are pretty much forced to go to Google and pay for advertising that is targeted using the information it has on all of us. • This must be very enticing for Google, since it makes a ridiculous amount of money from this information. • For the record, Google made $1.65 billion in the third quarter of this year.
Should We Be Worried? • After my research I have concluded that we shouldn’t be worried for now… • I believe that, for the most part, the information isn’t being used negatively; it is mostly used for advertisements, as mentioned before. • The real worry is that the government could get hold of all of the information. • Even if this does happen, it might not necessarily be a bad thing.
Worried cont. • The worry behind the government having access to the information is that it might misinterpret the data. • It could possibly see curiosity as probable cause. • For example, if you were to search “how to make a bomb” just because you were curious, the government could see it as a warning that you are trying to kill people.
Privacy Laws • Google collecting your IP address isn’t illegal, because an IP address is just a computer address and doesn’t include your name or other personal details. • But if you have ever been curious and searched for your name on Google, then Google has a link between your name, your searches, and your IP address.
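The linking described above can be sketched in a few lines of Python. This is only a toy model under stated assumptions: the log is a made-up list of (IP address, search terms) pairs, the function names are hypothetical, and nothing here reflects how Google actually stores or joins its data.

```python
from collections import defaultdict


def group_searches_by_ip(log):
    """Collect every search made from each IP address.

    log is an iterable of (ip_address, search_terms) pairs.
    """
    by_ip = defaultdict(list)
    for ip, terms in log:
        by_ip[ip].append(terms)
    return dict(by_ip)


def searches_linked_to_name(log, name):
    """Return the full search history of every IP that searched for name."""
    linked = {}
    for ip, queries in group_searches_by_ip(log).items():
        # One search for the name ties the whole history to that person
        if any(name.lower() in q.lower() for q in queries):
            linked[ip] = queries
    return linked


# Made-up log using reserved documentation addresses
sample_log = [
    ("192.0.2.17", "jane doe"),
    ("192.0.2.17", "cheap flights"),
    ("198.51.100.4", "weather"),
]
linked = searches_linked_to_name(sample_log, "Jane Doe")
```

In this toy log, a single search for "jane doe" is enough to attach the "cheap flights" search from the same address to that name.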
Sources • http://www.spiegel.de/international/germany/0,1518,587546,00.html • http://www.wired.com/epicenter/2009/10/google-profits-up-3q-200/ • http://www.aquick.org/blog/2006/01/29/whats-the-big-fuss-about-ip-addresses/ • http://www.readwriteweb.com/archives/do_you_trust_google_to_resist_data_mining_across_services.php • http://www.google-watch.org/bigbro.html