DAT301: Introduction to Complex Event Processing with SQL Server 2008 R2 StreamInsight. Torsten Grabs, Lead Program Manager, Microsoft Corporation
Understanding Streaming Data (1)
• Question: “How many red cars are in the parking lot?”
• Answering with a relational database:
  • Walk out to the parking lot.
  • Count vehicles that are
    • Red
    • Cars
    SELECT COUNT(*) FROM ParkingLot WHERE type = 'AUTO' AND color = 'RED'
Understanding Streaming Data (2)
• What about: “How many red cars have passed the 40th street exit on the 520 in the last hour?”
• Answering with a relational database:
  • Pull over and park all vehicles in a lot, keeping them there for an hour.
  • Count vehicles that are in the lot.
Doesn’t seem like a great solution…
Understanding Streaming Data (3)
• Different kinds of questions require different ways of answering them.
• Answering the question with a streaming data processing engine:
  • Stand by the freeway and count red cars as they pass by.
  • Write down the answer and deliver it.
This is the streaming data paradigm in a nutshell: ask questions about data in flight.
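The "stand by the freeway and count" approach can be sketched in a few lines of plain Python. This is an illustrative simulation only, not StreamInsight code: the `SlidingCounter` class, its method names, and the sample timestamps are all made up for this example, and it assumes vehicles are observed in timestamp order.

```python
from collections import deque
from datetime import datetime, timedelta


class SlidingCounter:
    """Counts red cars seen during the trailing `window` (hypothetical helper)."""

    def __init__(self, window: timedelta):
        self.window = window
        self._times = deque()  # timestamps of matching vehicles, oldest first

    def observe(self, timestamp: datetime, vehicle_type: str, color: str) -> int:
        """Record one passing vehicle and return the current red-car count."""
        if vehicle_type == "AUTO" and color == "RED":
            self._times.append(timestamp)
        # Forget vehicles that passed more than one window ago.
        cutoff = timestamp - self.window
        while self._times and self._times[0] <= cutoff:
            self._times.popleft()
        return len(self._times)


counter = SlidingCounter(timedelta(hours=1))
start = datetime(2010, 1, 1, 8, 0)  # arbitrary sample time
counts = [
    counter.observe(start, "AUTO", "RED"),
    counter.observe(start + timedelta(minutes=20), "TRUCK", "RED"),
    counter.observe(start + timedelta(minutes=40), "AUTO", "RED"),
    counter.observe(start + timedelta(minutes=70), "AUTO", "BLUE"),
]
print(counts)
```

Nothing is ever "parked": each vehicle is inspected once as it passes, and only the timestamps still inside the last hour are kept in memory.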
What is StreamInsight?
• Microsoft’s platform to build applications for streaming data
• Continuous and incremental processing
• High throughput, low latency
• Event-driven computation
• Declarative query language (LINQ)
• Adapter model
• Diagnostic interface
• Extensibility model
• Requires a SQL Server 2008 R2 license
  • Datacenter
  • Standard, Enterprise
The Value of Timely Analytics
[Figure. Axes: $ value of analytics; time of interest (up to the present). Labels: Historical Trend Analysis; Forecasting in Enterprises; Web Analytics (ad placement), Financial Services, Smart Grids, Monitoring (systems mgmt), Health Care, Manufacturing, etc.]
Current Products for Analytics
• The load barrier is dictated by the design choices of current solutions, e.g., loading into databases or persisting into files.
• The barrier is intrinsic because, in current approaches, no processing can be done until the data is loaded.
[Figure. Axes: facts/sec.; time of interest (up to the present). Labels: Traditional DW analytics; Active DW analytics; custom-built solutions that carry huge development and customization costs; ET time in ETL; load time in ETL.]
StreamInsight and the Microsoft Platform
[Diagram. Layers: Visualization, Distribution, Processing, Caching, Data Bus, Sources. Labels: devices and sensors; web servers; stock tickers and news feeds; reference data; message bus; Service Broker; ETL; Microsoft StreamInsight; in-memory database; operational analytics cache; intra-day cubes; historic cubes; operational dashboard (ticking, snapshot); reporting dashboard (refreshed); static reports; automated decisions; mining, validation, and “what-if” scenarios; refresh (push); re-compute (pull).]
Event-Driven Applications
• Analytical results need to reflect important changes in business reality immediately and enable responses to them with minimal latency.
[Diagram labels: event; request, response; input stream, output stream.]
Latency Scenarios for Event-Driven Applications
[Chart. Axis: aggregate data rate (events/sec.). Regions: relational database applications; CEP target scenarios. Applications plotted: operational analytics (e.g., logistics), data warehousing, web analytics, manufacturing, financial trading, monitoring.]
StreamInsight Platform
[Diagram. StreamInsight application development; StreamInsight application at runtime. Event sources (devices and sensors; web servers; event stores and databases; stock ticker and news feeds), input adapters, StreamInsight engine (standing queries, query logic), output adapters, event targets (pagers and monitoring devices; KPI dashboards and SharePoint UI; trading stations; event stores and databases).]
Example Scenarios
• Manufacturing:
  • Sensor on plant floor
  • React through device controllers
  • Aggregated data
  • 10,000 events/sec
• Power, Utilities:
  • Energy consumption
  • Outages
  • Smart grids
  • 100,000 events/sec
• Web Analytics:
  • Click-stream data
  • Online customer behavior
  • Page layout
  • 100,000 events/sec
• Financial Services:
  • Stock & news feeds
  • Algorithmic trading
  • Patterns over time
  • Super-low latency
  • 100,000 events/sec
• Event Processing Engine:
  • Threshold queries
  • Event correlation from multiple sources
  • Pattern queries
[Diagram labels: asset instrumentation for data acquisition, subscriptions to data feeds; data streams; stream data store and archive; asset specs and parameters (lookup); visual trend-line and KPI monitoring; batch and product management; automated anomaly detection; real-time customer segmentation; algorithmic trading; proactive condition-based maintenance.]
Financial Services
• Scenario: Real-time Risk. Continuous insight into market conditions and exposure
  • Continuous low-latency market monitoring
  • Manage risks across traders and per desk with aggregate and individual thresholds
• StreamInsight advantage:
  • Implement risk monitoring declaratively in LINQ
  • Detect and notify in near real time on risk
  • No change to models or LINQ code necessary for back-testing over historical data
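The idea of aggregate and individual thresholds can be illustrated with a small Python simulation. This is only a sketch under stated assumptions, not the LINQ risk model from the session: the `monitor` function, the trade tuple layout `(desk, trader, exposure)`, and the limit values are hypothetical.

```python
from collections import defaultdict


def monitor(trades, trader_limit, desk_limit):
    """Return alerts raised as running exposure exceeds a limit.

    `trades` is an iterable of (desk, trader, exposure) tuples (assumed layout).
    """
    per_trader = defaultdict(float)  # individual running exposure
    per_desk = defaultdict(float)    # aggregate running exposure per desk
    alerts = []
    for desk, trader, exposure in trades:
        per_trader[(desk, trader)] += exposure
        per_desk[desk] += exposure
        if per_trader[(desk, trader)] > trader_limit:
            alerts.append(("trader", trader, per_trader[(desk, trader)]))
        if per_desk[desk] > desk_limit:
            alerts.append(("desk", desk, per_desk[desk]))
    return alerts


trades = [
    ("rates", "alice", 40.0),
    ("rates", "bob", 50.0),
    ("rates", "alice", 30.0),
]
alerts = monitor(trades, trader_limit=60.0, desk_limit=100.0)
print(alerts)
```

Because the function only consumes an iterable of trades, the same logic runs unchanged over a live feed or over a replayed historical file, which mirrors the back-testing point on the slide.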
Market Monitoring demo
Demo Scenario: Market Monitor
[Diagram labels: quotes (MSFT, IBM); input adapters; StreamInsight (grouping, aggregation); output adapters; dashboard; monitoring; ad-hoc queries; OLAP; push and pull connections.]
Web Analytics
• Scenario: Real-time Behavioral Targeting
  • Continuously analyze online behavior per user
  • Identify relevant content before the next click
  • Define content behind next click based on detected online behavior
• StreamInsight advantage:
  • Scale to millions of concurrent online users
  • Immediate insight: real-time analytics
  • Web logs no longer processed offline in batches
  • Correlate across your web farms and applications
Demo Scenario: Web Activity Monitoring
• Acquire click-stream feed from web servers
• Provide dashboards for web activity
  • US states heat map
  • Top 5 sites
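A "top sites" dashboard tile boils down to counting clicks per site within a time window and keeping the k largest counts. The Python below is a rough sketch of that idea, not the demo's actual query; the `top_sites` function, the `(timestamp, site)` tuple layout, and the sample host names are assumptions made for illustration.

```python
from collections import Counter


def top_sites(clicks, window_start, window_end, k=5):
    """Return the k most-clicked sites with window_start <= timestamp < window_end.

    `clicks` is an iterable of (timestamp, site) pairs (assumed layout).
    """
    counts = Counter(
        site for timestamp, site in clicks if window_start <= timestamp < window_end
    )
    return counts.most_common(k)


clicks = [
    (1, "a.example"), (2, "b.example"), (3, "a.example"),
    (12, "c.example"),  # falls outside the window below
    (4, "a.example"), (5, "b.example"), (6, "d.example"),
]
top = top_sites(clicks, window_start=0, window_end=10, k=2)
print(top)
```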
Logistics
• Scenario: Fleet Management
  • Track current position of your vehicles
  • Continuously optimize routes
  • Optimize vehicle utilization
  • Schedule maintenance based on vehicle conditions
• StreamInsight advantage:
  • Gain immediate insight from sensor and event data
  • Expressive built-in algebra for analytics
  • Easy to extend with domain-specific libraries
  • Include static reference data in your calculations
Demo Scenario: Microsoft Shuttle Tracker
• Plot current position for Redmond campus shuttles
• Track specific shuttles
• Identify when shuttles approach specific destinations
• Proximity queries with SQL Spatial Libraries
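A proximity query asks which shuttles are currently within some radius of a destination. The sketch below swaps in the haversine great-circle formula in plain Python in place of the SQL Spatial libraries used in the demo; the function names, the coordinates, and the 200 m radius are made up for illustration.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius in meters


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def shuttles_near(stop, positions, radius_m):
    """Return sorted ids of shuttles whose latest position is within radius_m of stop."""
    lat, lon = stop
    return sorted(
        shuttle_id
        for shuttle_id, (s_lat, s_lon) in positions.items()
        if haversine_m(lat, lon, s_lat, s_lon) <= radius_m
    )


stop = (47.64, -122.13)             # hypothetical destination
positions = {                       # hypothetical latest position per shuttle
    "A": (47.6405, -122.13),        # roughly 56 m north of the stop
    "B": (47.70, -122.13),          # several kilometers away
}
near = shuttles_near(stop, positions, radius_m=200.0)
print(near)
```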
Power Utilities
• Scenario: Smart grid
  • Instrument households with smart power meters
  • Continuous, up-to-date insight into your grid, including generation, distribution, and demand
• StreamInsight advantage:
  • Scales to smart-grid requirements
    • Scale to millions of meters
    • Hundreds of thousands of meter readings per second
  • Write validation, editing, estimation (VEE) rules declaratively in LINQ
  • Scale to the high data volumes expected in smart grids
  • React in almost real time to changing grid conditions to avoid power outages
Retail (Online and Traditional)
• Scenario: Real-Time Coupon
  • Provide most relevant/appealing coupon
  • Maximize expected individual customer revenue
  • Correlate current sales transaction with customer purchase history
• StreamInsight advantage:
  • Track current market basket as a real-time stream
  • Use StreamInsight lookup pattern to correlate current market basket with purchase history
  • Easily scale to internet retail with millions of concurrent sessions
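The lookup pattern joins each in-flight event against static reference data. A minimal Python sketch of the idea follows; it is not the StreamInsight lookup implementation, and the coupon rule (offer the customer's most-purchased product that is not already in the basket), the `pick_coupon` function, and the sample data are all hypothetical.

```python
def pick_coupon(customer_id, basket, history):
    """Correlate the current basket with purchase history and pick a coupon.

    `history` maps customer id -> {product: past purchase count} (assumed layout).
    Returns a product name, or None if there is nothing suitable to offer.
    """
    past = history.get(customer_id, {})  # the lookup against reference data
    candidates = {product: n for product, n in past.items() if product not in basket}
    if not candidates:
        return None
    return max(candidates, key=candidates.get)


purchase_history = {
    "c1": {"coffee": 12, "tea": 1},
    "c2": {"tea": 7},
}

print(pick_coupon("c1", {"tea"}, purchase_history))
print(pick_coupon("c2", {"tea"}, purchase_history))
```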
Event Types
• StreamInsight events use the .NET type system
  • Events are structured and can have multiple fields
  • Fields are typed using the .NET Framework types
• Timestamp fields provisioned by the StreamInsight engine capture all the different temporal event characteristics
  • Event sources populate timestamp fields
  • All calculations are done based on “business time”
Event Streams & Adapters
• A stream is a sequence of events
  • Defined over a .NET type
  • Possibly infinite
• Stream characteristics:
  • Event/data arrival patterns (steady, bursty)
  • Out-of-order events: order of arrival of events does not match the order of their application timestamps
• Adapters
  • Receive/get events from the data source
  • Enqueue events for processing in the engine
    • Insertions of new events
    • Changes to event durations
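To make out-of-order arrival concrete, here is one common way to restore application-time order: buffer incoming events and release them only up to a point in application time that the source declares complete. This Python sketch illustrates that general technique and is not a description of how the StreamInsight engine works internally; the `ReorderBuffer` class and the term "watermark" are assumptions introduced for this example.

```python
import heapq


class ReorderBuffer:
    """Buffers events and releases them in application-timestamp order."""

    def __init__(self):
        self._heap = []
        self._seq = 0  # tie-breaker so payloads are never compared

    def enqueue(self, app_time, payload):
        """Accept an event in arrival order, whatever its application timestamp."""
        heapq.heappush(self._heap, (app_time, self._seq, payload))
        self._seq += 1

    def advance(self, watermark):
        """Release, in timestamp order, every event with app_time <= watermark."""
        released = []
        while self._heap and self._heap[0][0] <= watermark:
            app_time, _, payload = heapq.heappop(self._heap)
            released.append((app_time, payload))
        return released


buf = ReorderBuffer()
for app_time, payload in [(3, "c"), (1, "a"), (2, "b"), (5, "e")]:  # arrival order
    buf.enqueue(app_time, payload)

first = buf.advance(3)    # the source promises nothing older than time 3 will follow
second = buf.advance(10)
print(first, second)
```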
StreamInsight Query Features
• Operators over streams
  • Calculations (PROJECT)
  • Correlation of streams from different data sources (JOIN)
  • Check for absence of activity with a data source (EXISTS)
  • Selection of events from streams (FILTER)
  • Stream partitioning (GROUP & APPLY)
  • Aggregation (SUM, COUNT, …)
  • Ranking and heavy hitters (TOP-K)
  • Temporal operations: hopping window, sliding window
• Extensibility: add new domain-specific operators
LINQ Query Examples
LINQ example (JOIN, PROJECT, FILTER):
    from e1 in MyStream1
    join e2 in MyStream2 on e1.ID equals e2.ID
    where e1.f2 == "foo"
    select new { e1.f1, e2.f4 };
LINQ example (GROUP&APPLY, WINDOW):
    from e3 in MyStream3
    group e3 by e3.i into SubStream
    from win in SubStream.HoppingWindow(FiveMinutes, ThreeSeconds)
    select new { i = SubStream.Key, a = win.Avg(e => e.f) };
[Diagram labels: Join, Filter, Project; Grouping, Window, Project & Aggregate.]
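What the GROUP&APPLY query computes can be approximated with a small Python simulation: partition events by key, assign each event to every hopping window that contains its timestamp, and average per (key, window). This is a sketch of the semantics under simplifying assumptions (point events, integer timestamps, windows aligned to multiples of the hop), not the StreamInsight API; `hopping_avg` and the sample data are hypothetical.

```python
from collections import defaultdict


def hopping_avg(events, size, hop):
    """Average `value` per (key, window_start) over hopping windows.

    `events` is an iterable of (t, key, value); a window covers [start, start + size)
    and window starts are multiples of `hop`.
    """
    sums = defaultdict(lambda: [0.0, 0])  # (key, window_start) -> [sum, count]
    for t, key, value in events:
        start = (t // hop) * hop          # latest window start that is <= t
        while start > t - size:           # this window still contains t
            acc = sums[(key, start)]
            acc[0] += value
            acc[1] += 1
            start -= hop                  # step back to the previous overlapping window
    return {k: total / n for k, (total, n) in sums.items()}


events = [(4, "a", 10), (5, "a", 20), (7, "a", 30), (5, "b", 8)]
result = hopping_avg(events, size=4, hop=2)
print(result)
```

With `size=4` and `hop=2` every event lands in two overlapping windows, which is why the event at `t=5` contributes to both the window starting at 2 and the one starting at 4.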
StreamInsight Deployment Alternatives
• Event processing engines are deployed at multiple places and at different scales:
  • At the edge, close to the data source
  • In the mid-tier, consolidating related data sources
  • In the data center: historical archive, mining, large-scale correlation
• StreamInsight CEP for lightweight processing and filtering
• StreamInsight CEP for aggregation and correlation of in-flight events
• StreamInsight CEP for complex analytics including historical data
[Diagram labels: web servers, sensors, feeds, devices; multiple StreamInsight instances; complex analytics and mining.]
SQL Server 2008 R2 Capabilities by Edition
[Figure labels: Standard, Enterprise, Datacenter, Parallel Data Warehouse; workload.]
StreamInsight Platform: Recap
• CEP platform from Microsoft to build event-driven applications
• Development experience with .NET, C#, LINQ, and Visual Studio 2008 and 2010
• Event-driven applications are fundamentally different from traditional database applications: queries are continuous, consume and produce streams, and compute results incrementally
• Flexible adapter SDK with high performance to connect to different event sources and sinks
• The CEP platform does the heavy lifting for you to deal with temporal characteristics of event stream data
[Diagram. StreamInsight application development; StreamInsight application at runtime. Event sources (devices and sensors; web servers; event stores and databases; stock ticker and news feeds), input adapters, StreamInsight engine (standing queries, query logic), output adapters, event targets (pagers and monitoring devices; KPI dashboards and SharePoint UI; trading stations; event stores and databases).]
For More Information
• StreamInsight product main page: http://www.microsoft.com/sqlserver/2008/en/us/R2-complex-event.aspx
• StreamInsight blog: http://blogs.msdn.com/streaminsight/
• Latest version, StreamInsight 1.1 with support for .NET 4.0 collections: http://blogs.msdn.com/b/streaminsight/archive/2010/10/25/releasing-streaminsight-v1-1.aspx
• StreamInsight MSDN documentation: http://msdn.microsoft.com/en-us/library/ee362541(SQL.105).aspx
• StreamInsight samples: http://streaminsight.codeplex.com/
Related Sessions:
• DAT301-LNC: Operational Intelligence with StreamInsight
  • Speakers: Craig Bauhaus (Logica), Torsten Grabs
  • Wednesday 1:20 – 2:05pm
• DAT 302: Advanced StreamInsight Querying Techniques
  • Speaker: Roman Schindlauer
  • Thursday 6:00 – 7:00pm
Session Evaluations
• Tell us what you think, and you could win! All evaluations submitted are automatically entered into a daily prize draw.*
• Sign in to the Schedule Builder at http://europe.msteched.com/topic/list/
* Details of prize draw rules can be obtained from the Information Desk.
© 2010 Microsoft Corporation. All rights reserved. Microsoft, Windows, Windows Vista and other product names are or may be registered trademarks and/or trademarks in the U.S. and/or other countries. The information herein is for informational purposes only and represents the current view of Microsoft Corporation as of the date of this presentation. Because Microsoft must respond to changing market conditions, it should not be interpreted to be a commitment on the part of Microsoft, and Microsoft cannot guarantee the accuracy of any information provided after the date of this presentation. MICROSOFT MAKES NO WARRANTIES, EXPRESS, IMPLIED OR STATUTORY, AS TO THE INFORMATION IN THIS PRESENTATION.