SVC04
Petabytes for Peanuts! Making sense of “Ambient Data”
Dave Campbell & Friends
Microsoft Corporation
Key Takeaways…
• Massive shift in how we process data
• Incredible data volumes
• Remaking how we discover
• Changing the Scientific Method
• Reducing latency & impedance
• Extreme Scale Data Processing
• Stream Processing (Several Views)
• From “programs” to “queries”
• What’s up with this “anti-SQL” stuff anyhow?
“Free” Storage Power
• 1982: Storage Cost: ~$2000, Transfer Time: 1 day
• 1997: Storage Cost: ~$1.00, Transfer Time: ½ hour
• 2009: Storage Cost: ~0.1¢, Transfer Time: 8 sec.
Ambient Data?
Over 84 percent of Americans have cell phones, according to Steve Largent, president and CEO of CTIA. Two trillion minutes were used in 2007, an 18 percent increase over 2006 talk times. More than 48 billion text messages were sent in the month of December 2007, an average of 1.6 billion messages per day. The rate of text messaging represented a 157 percent increase over December 2006 texting. http://www.clickz.com/3628985
• Text message traffic in US: 160GB / day, 58TB / year
• Voice traffic in US (GSM encoding): 200PB / year
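The traffic figures above can be roughly reproduced with a back-of-envelope calculation. The sketch below is only a plausibility check: the slide does not state a message size or a codec bit rate, so the values of about 100 bytes per text message and 13 kbit/s for GSM full-rate speech are assumptions.

```python
# Back-of-envelope check of the slide's text and voice traffic estimates.
# ASSUMPTIONS (not stated on the slide): ~100 bytes per text message,
# and GSM full-rate speech at 13 kbit/s.
TEXTS_PER_DAY = 1.6e9            # from the slide
VOICE_MINUTES_PER_YEAR = 2e12    # from the slide (2007 talk time)
BYTES_PER_TEXT = 100             # assumed
GSM_BITS_PER_SECOND = 13_000     # assumed

GB, TB, PB = 1e9, 1e12, 1e15     # decimal units


def text_bytes_per_day():
    return TEXTS_PER_DAY * BYTES_PER_TEXT


def text_bytes_per_year():
    return text_bytes_per_day() * 365


def voice_bytes_per_year():
    seconds = VOICE_MINUTES_PER_YEAR * 60
    return seconds * GSM_BITS_PER_SECOND / 8


print(f"text:  {text_bytes_per_day() / GB:.0f} GB/day, "
      f"{text_bytes_per_year() / TB:.1f} TB/year")
print(f"voice: {voice_bytes_per_year() / PB:.0f} PB/year")
```

Under these assumptions the results land close to the slide's 160GB / day, 58TB / year and roughly 200PB / year.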
The Old World
• Data volumes constrained by human typing speed
• App & Data formed closed system
[Diagram: App, DB]
Assume 200M people in US typing 8 hr / day @ 10K keystrokes / hour: 2TB / hr or ~6PB / year
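The typing-speed ceiling can be checked the same way. This is a minimal sketch that assumes one byte per keystroke and typing on all 365 days of the year; neither assumption is stated on the slide.

```python
# Upper bound on human-typed data, using the slide's inputs.
# ASSUMPTIONS (not stated on the slide): 1 byte per keystroke, 365 days/year.
PEOPLE = 200e6                 # from the slide
HOURS_PER_DAY = 8              # from the slide
KEYSTROKES_PER_HOUR = 10_000   # from the slide
BYTES_PER_KEYSTROKE = 1        # assumed
DAYS_PER_YEAR = 365            # assumed

TB, PB = 1e12, 1e15            # decimal units


def typed_bytes_per_hour():
    return PEOPLE * KEYSTROKES_PER_HOUR * BYTES_PER_KEYSTROKE


def typed_bytes_per_year():
    return typed_bytes_per_hour() * HOURS_PER_DAY * DAYS_PER_YEAR


print(f"{typed_bytes_per_hour() / TB:.0f} TB/hr, "
      f"{typed_bytes_per_year() / PB:.2f} PB/year")
```

The yearly figure comes out a little under 6 PB, which matches the slide's rounded "~6PB / year".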
The Old New World
Available data exploded
[Diagram labels, in slide order:]
• Available Data
• Questions to Answer
• What data should we throw out?
• Design Schema
• Design ETL
• What if we have a new question?
• DW
• Nirvana!
The New World of Abundant Data
[Diagram labels, in slide order:]
• Save All Available Data
• Hypothesize
• Theorize
• Test
• New Question to Answer
• Algorithmic Processing
• Run “query” over data…
• Exploit Correlation…
• Correlation is Enough!
• Analyze reduced data
The CMS front end of the Large Hadron Collider records 1TB/sec! http://blogs.discovermagazine.com/cosmicvariance/2006/09/27/lhc-factoids/
Interesting Read: The Petabyte Age: Because More Isn't Just More — More Is Different http://www.wired.com/science/discoveries/magazine/16-07/pb_intro
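The contrast between the two worlds can be sketched in a few lines. The example below is purely illustrative: the event records, field names, and both questions are hypothetical. It shows an up-front schema that throws fields away and therefore cannot answer a later question, while a query over the retained raw data can.

```python
from collections import Counter

# Hypothetical raw events; every field name here is illustrative only.
RAW_EVENTS = [
    {"user": "a", "action": "search", "latency_ms": 120, "region": "us"},
    {"user": "b", "action": "click", "latency_ms": 45, "region": "eu"},
    {"user": "a", "action": "click", "latency_ms": 60, "region": "us"},
    {"user": "c", "action": "search", "latency_ms": 300, "region": "eu"},
]


def etl_load(events):
    """Old approach: keep only the fields the original question needed."""
    return [{"user": e["user"], "action": e["action"]} for e in events]


def actions_per_user(rows):
    """The original question the schema was designed for."""
    return Counter(r["user"] for r in rows)


def mean_latency_by_region(rows):
    """A new question that needs fields the schema discarded."""
    totals, counts = Counter(), Counter()
    for r in rows:
        totals[r["region"]] += r["latency_ms"]
        counts[r["region"]] += 1
    return {region: totals[region] / counts[region] for region in totals}


warehouse = etl_load(RAW_EVENTS)
print(actions_per_user(warehouse))         # the warehouse answers the old question
print(mean_latency_by_region(RAW_EVENTS))  # raw data answers the new one
try:
    mean_latency_by_region(warehouse)
except KeyError as missing:
    print("warehouse cannot answer the new question; missing field:", missing)
```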
Analyze Model Monitor
1. Event Stream both stored and processed
2. Analysis produces event correlation models
3. Models installed in event processing engine
4. Produce real time alerts and action
[Diagram labels: Event Stream, Event Processing Engine, Alerts & Action, Correlation Model, Analysis]
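The four steps above can be sketched as a small loop. This is not the StreamInsight API; the `EventEngine` class and `analyze` function are hypothetical names, and a simple mean-plus-three-sigma threshold stands in for a real event correlation model.

```python
from statistics import mean, pstdev


class EventEngine:
    """Hypothetical event processing engine (not a real product API)."""

    def __init__(self):
        self.store = []    # step 1: every event is stored...
        self.model = None  # step 3: ...and a model can be installed later

    def install(self, model):
        self.model = model

    def process(self, value):
        self.store.append(value)  # step 1: stored and processed
        if self.model is not None and self.model(value):
            return f"ALERT: {value}"  # step 4: real-time alert
        return None


def analyze(history, sigmas=3.0):
    """Step 2: derive a model from stored events.

    A plain outlier threshold is used here as a stand-in for a
    correlation model.
    """
    limit = mean(history) + sigmas * pstdev(history)
    return lambda value: value > limit


engine = EventEngine()
for reading in [10, 11, 9, 10, 12, 10]:  # illustrative history
    engine.process(reading)

engine.install(analyze(engine.store))    # steps 2 and 3
alerts = [engine.process(v) for v in [11, 40]]
print(alerts)
```

Because every event is stored as well as processed, the analysis step can be rerun later over a longer history and a refreshed model installed without stopping the stream.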
StreamInsight demo Roman Schindlauer Program Manager SQL Data Stream Engine
Extreme Scale Data Processing
Traditional Data Warehouse
[Diagram labels: Source (several), ETL, DW, Analysis / Reporting]
Extreme Scale Data Processing
[Diagram labels: Non-traditional Sources, Source (several), Extreme Scale Data Processing, DW, Analysis / Reporting, Analysis]
Numbered callouts (1 and 2 on the slide):
• All data retained and reprocessed
• Majority of data filtered or discarded
LINQ to “whatever”… demo Erik Meijer Architect (& more…) BPD Cloud Programmability Team
YOUR FEEDBACK IS IMPORTANT TO US! Please fill out session evaluation forms online at MicrosoftPDC.com
Learn More On Channel 9 • Expand your PDC experience through Channel 9 • Explore videos, hands-on labs, sample code and demos through the new Channel 9 training courses channel9.msdn.com/learn Built by Developers for Developers….