Explore the problems and challenges faced by a Big Data Society, including the exponential growth of data, issues of privacy and ownership, and the complexity of unstructured data. Discover implications and potential solutions.
Big Data Society What are the problems and challenges facing a ‘Big Data Society’? Lauren Fitzgerald & Jacqui Campbell
Big Data Society • http://www.youtube.com//watch?v=EuC0uOcT2C
Big Data Society The rise of social media has meant an associated explosion in the data that we produce. This data exceeds virtually all the information we have produced from the beginning of literacy to the turn of the millennium. Furthermore, this data becomes motile the minute we press send: that is, it moves, aggregates, and combines autonomously, wholly free from our control. We are ending this unit by examining the idea of ‘big data’ and its implications. What are the problems and challenges facing a ‘big data society’?
What is Big Data? • According to Wikipedia, ‘Big data is the term for a collection of data sets so large and complex that it becomes difficult to process using on-hand database management tools or traditional data processing applications.’ • The information that millions of us are generating about ourselves amounts to a data set of unimaginable size and growing complexity. • ‘Big data’ changes the research landscape for the humanities and social sciences.
Big Data Society • According to Wikipedia, the term ‘Big Data Society’ was coined to describe the vast amounts of data we produce as a result of going about our everyday lives. • Examples: surfing the internet, paying a bill, picking up groceries.
Our Data, Our Selves (L. Neyfakh) • What if privacy is keeping us from reaping the real benefits of the infosphere? • L. Neyfakh talks about the vast amount of data we are creating on a daily basis without even realising it. • What if this data were used for a significant purpose?
Our Data, Our Selves (L. Neyfakh) ‘Even if this thought terrifies you, there’s not much you can do: As most of us know by now, we’re all leaving a trail of data behind us’ (Neyfakh, L 2011)
Our Data, Our Selves (L. Neyfakh) ‘Taken together, the information that millions of us are generating about ourselves amounts to a data set of unimaginable size and growing complexity: a vast, swirling cloud of information about all of us and none of us at once, covering everything from the kind of car we drive to the movies we’ve rented on Netflix to the prescription drugs we take.’ (Neyfakh, L 2011)
Our Data, Our Selves (L. Neyfakh) ‘Up to now, the public conversation on this kind of data has taken the form of an argument about privacy rights, with legal scholars, computer scientists, and others arguing for tighter restrictions on how our data is used by companies and the government, and consumer advocates instructing us on how to prevent our information from being collected and misused.’ (Neyfakh, L 2011)
Our Data, Our Selves (L. Neyfakh) “But a small group of thinkers is suggesting an entirely new way of understanding our relationship with the data we generate. Instead of arguing about ownership and the right to privacy, they say, we should be imagining data as a public resource: a bountiful trove of information about our society which, if properly managed and cared for, can help us set better policy, more effectively run our institutions, promote public health, and generally give us a more accurate understanding of who we are. This growing pool of data should be public and anonymous, they say — and each of us should feel a civic responsibility to contribute to it.” (Neyfakh, L 2011)
Data Commons • There should be a ‘public garden’ of anonymous information made available for researchers working in the public interest (Jane Yakowitz). • “There are patterns and trends that none of us can discern by looking at our own individual experiences,” Yakowitz said. “But if we pooled our information, then these patterns can emerge very quickly and irrefutably. So, we should want that sort of knowledge to be made publicly available.”
Who Owns this Data? • Data does not just belong to one person or company: - Google - Visa - Governmental agencies - The Department of Education - The Census Bureau
Big Data Society • “A full 90% of all the data in the world has been generated over the last two years.” • SINTEF (2013)
What are the Implications of Big Data? • Data can be misused • No privacy • Leaving a trail
The Challenges of Big Data Big data presents a number of challenges relating to its complexity. One challenge is how we can understand and use big data when it comes in an unstructured format, such as text or video. Another challenge is how we can capture the most important data as it happens and deliver that to the right people in real-time. A third challenge is how we can store the data, and how we can analyze and understand it given its size and our computational capacity. And there are numerous other challenges, from privacy and security to access and deployment.
The Opportunities of Big Data But even greater than the challenges are the opportunities that big data presents. McKinsey calls big data “the next frontier for innovation, competition and productivity.” We can answer questions with big data that were beyond reach in the past. We can extract insight and knowledge, identify trends and use the data to improve productivity, gain competitive advantage and create substantial value for the world economy. The challenges with big data are small compared to the potential benefits, which are limited only by our creativity and ability to make connections among the trillions of bytes of data we have access to. Big data provides an opportunity to find insight in new and emerging types of data. Big data shows how people use your product in ways you hadn’t expected. It provides terrific insight into how you might improve the product’s software, communicate with customers and build loyalty.
Growth of Big Data • This web content and data is vastly larger than all the cultural heritage digitized so far, and, in contrast to the fixed number of historical artifacts, it grows constantly. • According to IBM, we create 2.5 quintillion bytes of data every day. 90% of the data in the world today was created in the past two years, and the amount of data is expected to keep increasing exponentially. The data we create is expanding rapidly as enterprises capture more data in greater detail, as multimedia becomes more common, as social media conversations explode and as we use the Internet to get things done. This is “big data,” and it’s getting even bigger.
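To put the IBM figure in more familiar units, the short sketch below converts 2.5 quintillion bytes per day into exabytes and projects it over a year. It assumes decimal (SI) units and a constant daily rate, which is a simplification: the slide itself says the rate is growing, so the yearly total is only a rough illustration.

```python
# Figure quoted on the slide (attributed to IBM): 2.5 quintillion bytes per day.
BYTES_PER_DAY = 25 * 10**17   # 2.5 quintillion = 2.5 x 10^18 bytes

EXABYTE = 10**18              # decimal (SI) exabyte
ZETTABYTE = 10**21            # decimal (SI) zettabyte


def per_year(bytes_per_day: int, days: int = 365) -> int:
    """Total bytes over `days` days, ASSUMING a constant daily rate."""
    return bytes_per_day * days


yearly = per_year(BYTES_PER_DAY)

print(f"{BYTES_PER_DAY / EXABYTE:.1f} EB per day")
print(f"{yearly / EXABYTE:.1f} EB per year")
print(f"{yearly / ZETTABYTE:.4f} ZB per year")
```

At a constant rate this comes to 2.5 exabytes a day, or a little under one zettabyte a year.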
Big Data as a Public Resource “But a small group of thinkers is suggesting an entirely new way of understanding our relationship with the data we generate. Instead of arguing about ownership and the right to privacy, they say, we should be imagining data as a public resource: a bountiful trove of information about our society which, if properly managed and cared for, can help us set better policy, more effectively run our institutions, promote public health, and generally give us a more accurate understanding of who we are. This growing pool of data should be public and anonymous, they say — and each of us should feel a civic responsibility to contribute to it.”
Application Programming Interface • A researcher can obtain some data through APIs provided by most social media services and the largest media and online retailers (YouTube, Flickr, Amazon, etc.). • An API is a set of commands that a user's program can use to retrieve the data stored in a company’s databases. • For example, the Flickr API can be used to download all photos in a particular group, and also to retrieve information about each photo: its size, available comments, geo location, the list of people who liked the photo, etc.
Data Classes People and organisations can now be divided into three different categories: • Those who create data (both consciously and by leaving digital footprints) • Those who have the means to collect it • Those who have the expertise to analyze it.
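As a rough illustration of what "a set of commands" looks like in practice, the sketch below builds request URLs for the Flickr REST API, one per piece of information mentioned above. It is a minimal sketch, not a complete client: the API key, group ID and photo ID are placeholders, the helper `build_flickr_request` is a hypothetical name, and the method names and parameters should be checked against Flickr's current API documentation before use. No network request is made.

```python
from urllib.parse import urlencode, urlparse, parse_qs

# Flickr's REST endpoint (verify against the current API documentation).
FLICKR_REST_ENDPOINT = "https://api.flickr.com/services/rest/"


def build_flickr_request(method: str, api_key: str, **params: str) -> str:
    """Return the URL for one API call. Hypothetical helper, for illustration."""
    query = {
        "method": method,        # the API "command" to run
        "api_key": api_key,      # identifies the calling program
        "format": "json",        # ask for a JSON response
        "nojsoncallback": "1",   # plain JSON rather than a JSONP wrapper
        **params,
    }
    return FLICKR_REST_ENDPOINT + "?" + urlencode(query)


# Placeholder values: replace with a real key and real IDs.
API_KEY = "YOUR_API_KEY"
GROUP_ID = "00000000@N00"   # hypothetical group ID
PHOTO_ID = "1234567890"     # hypothetical photo ID

# All photos in a group, with geo location and tags included for each photo.
group_photos_url = build_flickr_request(
    "flickr.groups.pools.getPhotos", API_KEY,
    group_id=GROUP_ID, extras="geo,tags", per_page="100",
)
# Available sizes, comments, and the people who favourited a single photo.
sizes_url = build_flickr_request("flickr.photos.getSizes", API_KEY, photo_id=PHOTO_ID)
comments_url = build_flickr_request("flickr.photos.comments.getList", API_KEY, photo_id=PHOTO_ID)
favorites_url = build_flickr_request("flickr.photos.getFavorites", API_KEY, photo_id=PHOTO_ID)

print(group_photos_url)
# Show the command name carried by each of the other requests.
for url in (sizes_url, comments_url, favorites_url):
    print(parse_qs(urlparse(url).query)["method"][0])
```

Fetching any of these URLs (for example with `urllib.request.urlopen`) requires a valid API key and is subject to the service's terms of use and rate limits.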
Getting access to transactional data The detailed knowledge and insights that could previously be reached about only a few people can now be reached about many more people.
Data analysis divide between data experts and researchers without computer science training. The detailed knowledge and insights that could previously be reached about only a few people can now be reached about many more people.
Authenticity of Data We need to be careful of reading communications over social networks and digital footprints as “authentic.”
Questions • Should there be a public garden of knowledge available for researchers? • Would you make your anonymous information available for researchers to use in the public interest? • Do you worry about your privacy in this day and age? Why?
Readings Manovich, L 2012, ‘Trending: The Promises and the Challenges of Big Social Data’, in M Gold (ed), Debates in the Digital Humanities, The University of Minnesota Press, Minneapolis, viewed 3 July 2013, <http://lab.softwarestudies.com/2011/04/new-article-by-lev-manovich-trending.html>
Readings ‘Data, data everywhere’, The Economist, 25 February 2010, viewed 1 July 2013, <http://www.economist.com/node/15557443>
Readings Neyfakh, L 2011, ‘Our Data, Ourselves’, Boston Globe, 22 May, viewed 10 July 2013, <http://www.boston.com/bostonglobe/ideas/articles/2011/05/22/our_data_ourselves/?page=full>