Cross-System Process Monitoring with Event Stream Processing
Thomas Dücker, SAP Schweiz, June 2013
Agenda
• Challenges and trends
• Overview of Operational Process Intelligence
• Overview
• Examples
• Scenario
• Value for you
• Further information
Event Stream Processing
Diagram: an ESP engine receives master data, market data, web data and sensor data, and drives alarms, adjustments and decision support.
Traditional Approaches to Event Stream Processing
Diagram: high-speed data flows either into a database application that is queried by an application or dashboard, or into a custom "black-box" application that raises alerts and actions.
• DB application (OLTP)
• Pull, not push (i.e. not event-driven), or requires triggers, causing performance degradation and a maintenance nightmare
• Latency in seconds
• Custom application (C/C++/Java)
• Specialised, with high development and maintenance cost
• Slow to change, unresponsive to the business
The SAP Approach – Sybase ESP
• Combines the advantages of a customised application with the ease of visual dataflow diagramming or SQL
• Incoming data is processed as it arrives, according to the model's rules and operations
• Publish streaming results to other models, apps, a message bus, dashboards, etc.
Diagram: orders, trades, market data and reference data are published to input streams via a message bus or console input; dashboards, reporting tools, trading systems and SAP HANA / Sybase IQ subscribe to the output streams; models are authored visually or in SQL.
CEP Product Convergence
Timeline: Coral8 v5.6.5, Aleri 3.2 and Sybase CEP R4 (now in maintenance only) converged into Sybase ESP 5.0 (Sept 2011), followed by Sybase ESP 5.1 (Sept 2012) and Sybase ESP 5.1 SP1 (Dec 2012).
Advantages of the Sybase ESP Approach
• Analyze events as they occur
• Continuous insight
• Respond immediately
• Rapid application development
• Reduce or eliminate dependence on specialist programming skills for process logic
• Cut implementation and deployment time
• Broad out-of-the-box connectivity
• Non-intrusive deployment
• Event-driven integration with existing systems
• Unify existing disparate data models
Event Stream Processing Scenarios
• Capital Markets
• React to incoming ticks to determine trading strategies; monitor the success of those strategies by calculating P&L in real time; monitor risk in real time
• Retail
• Provide clickstream analysis for online customer assistance and alternatives; use web scraping to react to competitor price changes
Event Stream Processing Scenarios • Energy • Monitor and optimize energy usage using Smart Metering to ensure QoS • Utilities / Manufacturing • Service Status Monitoring and Process Control using SCADA / OSISoft
Event Stream Processing Scenarios • Telecoms • Monitor events for QoS, use DPI to detect fraudulent use • Police / Homeland Security • ANPR event correlation for traffic monitoring, security
ESP Project Architecture
Diagram: authoring tools (Studio) and the operational console talk to the stream processor over its command-and-control interface (XML-RPC); data streams are published and subscribed through secured I/O gateways and adapters; state is held in memory stores and log stores; security options include PAM, SSL, RSA and Kerberos.
• Multi-threaded, 64-bit: SUSE 11, Red Hat 6, Solaris 10 (SPARC/i86), Windows x64
• Low latency
• Optional persistence
ESP Cluster Architecture I
Diagram: manager nodes share a cache and oversee controllers, each of which hosts one or more project servers.
ESP Cluster Architecture II
• Manager node
• Deploys project(s) to server(s)
• Maintains a heartbeat with each project to detect failure
• Manages failover
• In a multi-node cluster, managers share a cache, so one manager can fail over and manage another's controller nodes
• Controller node
• Essentially a launch daemon used by a manager to start project(s) in a container
ESP Cluster Architecture III
• Single-node cluster
• A single manager/controller node runs multiple projects on a single machine
• Detects when a project fails and tries to restart it
• This is the default for project development
• Multi-node cluster
• Multiple equal-peer managers and controllers run multiple projects on multiple machines
• Used to scale projects out across multiple machines
• Tolerates server failure, supporting failover recovery and data redundancy
• Allows highly available "active:active" deployment of projects within a data centre
ESP Installed Clusters
Three clusters are created by the install script:
• Studio cluster (single node) on port 9786
• Used for Studio development
• Can run multiple projects
• Located in $ESP_HOME/studio/clustercfg
• User-defined cluster (single node) on port 19011
• Located in $ESP_HOME/cluster/nodes/node1
• Example multi-node cluster on ports 19011 - 19014
• Located in $ESP_HOME/cluster/examples
Authoring Options – Eclipse Plugin
• Visual dataflow authoring
• Analyst-level skills (Excel, VBA)
• Easy to understand complex models
• No need to learn language syntax
• CCL authoring
• Rapid programming
• Easy-to-use language (CCL, derived from SQL)
• Modular, project-based approach
ESP Modularity

// CreateCompleteChannel.ccl – defines a reusable module
CREATE MODULE CreateCompleteChannel
IN MessageIn, CacheIn
OUT MessageOut
BEGIN
  // Module Code
END;

// Main project – imports the module and loads two instances of it
IMPORT 'CreateCompleteChannel.ccl';

LOAD MODULE CreateCompleteChannel AS Channel_A
IN MessageIn = MessageA, CacheIn = CacheA
OUT MessageOut = MessageOutA;

LOAD MODULE CreateCompleteChannel AS Channel_B
IN MessageIn = MessageB, CacheIn = CacheB
OUT MessageOut = MessageOutB;
Project Code, Compiled Code & Project Resources
• Project source code: .ccl (Continuous Computation Language)
• Used for schema, stream, window and method definitions
• Compiled into the project executable: .ccx (Continuous Computation eXecutable)
• Project resource file: .ccr (Continuous Computation Resource)
• An XML file used for in-process adapter, cluster, binding, failover and HA definitions; allows easy migration between DEV, TEST and PROD
Constructing ESP Projects
• Build the model
• Connect to data feeds using publish adapters
• Connect to output devices via subscribe adapters
Diagram: an ESP project container holding Trades and Positions elements, fed by a tick data feed and replicated database data, and publishing to a console.
Stateful Events
• Events are retained as records in memory with a primary key, so that later events can modify the state of a record: either update or delete it.
• A stateful event must have its primary key attribute(s) set.
• A stateful event MUST have an associated operation code (opcode): one of INSERT, UPDATE, UPSERT or DELETE. If no opcode is provided, ESP defaults it to UPSERT.
• As an event flows through the dataflow model, the opcode is applied at each step: an insert in a source stream may become an update to an aggregate or join stream.
• The full set of unique (insert) events to be retained MUST fit in memory.
• Records in memory can be removed either by setting a retention window or by publishing a DELETE event for that key.
• In ESP, stateful events are implemented using a WINDOW (see the sketch below).
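A minimal CCL sketch of a stateful element; element, field and type names are illustrative and not taken from the deck.

CREATE INPUT WINDOW Trades
  SCHEMA (TradeId long, Symbol string, Price float, Shares integer)
  PRIMARY KEY (TradeId)   // later events carrying the same TradeId update or delete this record
  KEEP 10 MINUTES;        // optional retention: records older than 10 minutes are removed

Events published to this window without an explicit opcode are applied as UPSERTs against the TradeId key.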
Stateless Events
• An event is retained only while it is being processed. When an event moves from one stream to another it is removed from the first stream, so it cannot be updated later.
• All events are treated as inserts.
• A stateless event ignores any primary key attribute(s) and always outputs an insert.
• In ESP, stateless events are implemented with a STREAM (see the sketch below).
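A matching sketch of a stateless element, again with illustrative names: an input stream has no primary key and simply forwards each event as an insert.

CREATE INPUT STREAM Ticks
  SCHEMA (Symbol string, Price float, Shares integer);   // no primary key: every event is an insert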
ESP Elements
ESP implements stateful and stateless events using three element types:
• WINDOW: used for stateful events. A WINDOW maintains state for a user-defined period.
• STREAM: used for stateless events. A STREAM does not maintain or react to state.
• DELTA: used to process stateful events, but without maintaining state.
All three can be INPUT, LOCAL or OUTPUT:
• INPUT elements subscribe to input adapters, i.e. events are published to an INPUT. They are the entry points for events into an ESP project.
• LOCAL elements perform processing logic. They can NOT be published or subscribed to.
• OUTPUT elements are derived elements which perform processing logic and CAN be subscribed to.
The sketch below illustrates the three visibility levels.
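A compact CCL sketch of the three visibility levels, using hypothetical element and column names: events enter through an input window, are filtered in a local window that clients cannot see, and are exposed through an output window.

CREATE INPUT WINDOW Orders
  SCHEMA (OrderId long, Symbol string, Qty integer)
  PRIMARY KEY (OrderId);

// LOCAL: intermediate logic, not visible to publishers or subscribers
CREATE LOCAL WINDOW LargeOrders
  PRIMARY KEY DEDUCED
AS SELECT o.OrderId, o.Symbol, o.Qty
   FROM Orders o
   WHERE o.Qty > 10000;

// OUTPUT: derived element that dashboards and adapters can subscribe to
CREATE OUTPUT WINDOW LargeOrdersOut
  PRIMARY KEY DEDUCED
AS SELECT lo.OrderId, lo.Symbol, lo.Qty
   FROM LargeOrders lo;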
Windows (stateful)
• A WINDOW must have a primary key set
• Can have an optional retention period set
• Maintains state in memory or persisted to disk, i.e. state can be recovered in the event of a system failure
• Can publish to a STREAM, WINDOW or DELTA
• Only a WINDOW can be used for aggregation (see the sketch below)
• At least one side of a join MUST be a WINDOW (unless using SPLASH); which side maintains state depends on the type of the join
• Use WINDOWs when developing models, as they help debugging
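A sketch of window-based aggregation with illustrative names: because the input window keeps only five minutes of trades, the GROUP BY behaves as a moving five-minute aggregate per symbol.

CREATE INPUT WINDOW Trades
  SCHEMA (TradeId long, Symbol string, Price float, Shares integer)
  PRIMARY KEY (TradeId)
  KEEP 5 MINUTES;              // rows leaving the retention window also leave the aggregates

CREATE OUTPUT WINDOW TradeStats
  PRIMARY KEY DEDUCED          // key deduced from the GROUP BY column
AS SELECT t.Symbol,
          avg(t.Price)  AS AvgPrice,
          sum(t.Shares) AS TotalShares,
          count(*)      AS TradeCount
   FROM Trades t
   GROUP BY t.Symbol;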
Streams and Deltas (stateless)
• Stream
• Cannot be persisted, so any data is lost in the event of a system failure
• Can NOT publish to a DELTA stream
• Can publish to a WINDOW, but the WINDOW must aggregate the input
• Delta
• Can only be LOCAL or OUTPUT
• A stateless element (like a STREAM) that understands and processes state (like a WINDOW)
• Can only use a WINDOW as a source
• Must have a primary key set
• As it does not store events, it can NOT be used for aggregation
• Cannot be persisted, so any data is lost in the event of a system failure
A DELTA sketch follows below.
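A minimal sketch of a DELTA element, assuming illustrative names: it passes on the inserts, updates and deletes of its source window without keeping its own copy of the rows.

CREATE INPUT WINDOW Quotes
  SCHEMA (Symbol string, Bid float, Ask float)
  PRIMARY KEY (Symbol);

// Forwards the window's inserts/updates/deletes but stores no state itself
CREATE OUTPUT DELTA STREAM QuoteChanges
  PRIMARY KEY DEDUCED
AS SELECT q.Symbol, q.Bid, q.Ask
   FROM Quotes q;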
Example: Simple Position Maintenance
Diagram: a Trades stream (Symbol*, Time*, Price, Shares) is aggregated into an AveragePrices window (Symbol*, Last Price, Weighted Avg Price, Last Time). A Position window (BookId*, Symbol*, Shares Held) is joined with it to produce individual positions per book and symbol (BookId*, Symbol*, Current Position, Average Position), which are then aggregated into a Book Positions window (BookId*, Current Position, Average Position). A CCL sketch of this dataflow follows below.
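A minimal CCL sketch of that dataflow. Element and field names follow the diagram; the data types and aggregate expressions are assumptions (last() is assumed to be available), and the intermediate result is modelled here as a window rather than the stream shown, so that position updates roll up correctly into the book totals.

CREATE INPUT STREAM Trades
  SCHEMA (Symbol string, TradeTime long, Price float, Shares integer);

CREATE INPUT WINDOW Positions
  SCHEMA (BookId string, Symbol string, SharesHeld integer)
  PRIMARY KEY (BookId, Symbol);

// Aggregate the trade stream into one row per symbol
CREATE OUTPUT WINDOW AveragePrices
  PRIMARY KEY DEDUCED
AS SELECT t.Symbol,
          last(t.Price)                           AS LastPrice,
          sum(t.Price * t.Shares) / sum(t.Shares) AS WeightedAvgPrice,
          max(t.TradeTime)                        AS LastTime
   FROM Trades t
   GROUP BY t.Symbol;

// Value each position using the latest prices
CREATE OUTPUT WINDOW IndividualPositions
  PRIMARY KEY (BookId, Symbol)
AS SELECT p.BookId, p.Symbol,
          p.SharesHeld * a.LastPrice        AS CurrentPosition,
          p.SharesHeld * a.WeightedAvgPrice AS AveragePosition
   FROM Positions p INNER JOIN AveragePrices a ON p.Symbol = a.Symbol;

// Roll individual positions up to book level
CREATE OUTPUT WINDOW BookPositions
  PRIMARY KEY DEDUCED
AS SELECT i.BookId,
          sum(i.CurrentPosition) AS CurrentPosition,
          sum(i.AveragePosition) AS AveragePosition
   FROM IndividualPositions i
   GROUP BY i.BookId;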
Event Processing Solution Architecture
• Events don't happen in isolation
• Most event stream scenarios need access to historical data
• Pattern detection
• Risk calculation
• Optimisation algorithms
• To ensure sub-millisecond response times, keep the data in memory, e.g. in HANA
• Many event stream scenarios need access to master data
• Data written directly to the OLTP system
• Updates to reference and semi-static data
• For sub-second updates without polling, use a log trawler, e.g. Replication Server
Real-time ANPR Alerting Framework
Diagram: OCR radar feeds enter the Event Stream Processor (CEP engine) through an input adapter; an output adapter writes via SQL to databases and to HANA, which provides in-memory monitoring, in-database algorithms, and analytical, historic and reference data; BO Predictive Analysis supplies data mining and predictive models (ad-hoc analysis); a Business Intelligence presentation layer sits on top; BusinessObjects Data Services handles ETL and data handling from external databases.
Real-time Retailing Framework
• Phase 1: Event detection and validation (ESP): detect and validate the event, checking whether the competitor price is below 95% of our price
• Phase 2: Processing (HANA): run the optimisation algorithm
• Phase 3: Validation and exception: check the results; if the optimum price is below 95% of our current price, manage the exception in a GUI
• Phase 4: Output: straight-through processing of the result
A CCL sketch of the phase 1 check follows below.
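A sketch of the phase 1 detection in CCL, with hypothetical feed and column names: competitor price observations stream in and are compared against our current prices held in a window.

CREATE INPUT STREAM CompetitorPrices
  SCHEMA (Sku string, CompetitorPrice float);

CREATE INPUT WINDOW OurPrices
  SCHEMA (Sku string, OurPrice float)
  PRIMARY KEY (Sku);

// Emit an event whenever a competitor undercuts us by more than 5%
CREATE OUTPUT STREAM UndercutEvents
AS SELECT c.Sku, c.CompetitorPrice, o.OurPrice
   FROM CompetitorPrices c INNER JOIN OurPrices o ON c.Sku = o.Sku
   WHERE c.CompetitorPrice < 0.95 * o.OurPrice;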
Integration with HANA I
• Generic input/output ODBC adapter
• Data can enter via a source stream (see the sketch below)
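A hedged sketch of attaching a database adapter to an input window. The adapter type name (db_in), the property names and the HanaDSN service are assumptions to be checked against the ESP adapter guide; the data service itself would be defined in the cluster's service configuration.

CREATE INPUT WINDOW ReferencePrices
  SCHEMA (Sku string, OurPrice float)
  PRIMARY KEY (Sku);

// Adapter type and property names are illustrative; verify against the adapter guide
ATTACH INPUT ADAPTER HanaLoader TYPE db_in TO ReferencePrices
  PROPERTIES service = 'HanaDSN', table = 'PRICE_MASTER';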
Integration with HANA II
• HANA can be called from a Flex stream using SPLASH for an event-driven query:

ON TRADE {
  if (TRADE.Price <> optimalPrice) {
    [ string s; | ] rec;                              // record type for the result rows
    vector(typeof(rec)) v := new vector(typeof(rec));
    getData(v, 'HanaDSN', 'CALL DPTL.sp_price_opt (?, ?, ?, ?)',
            TRADE.Id, TRADE.Datetime, TRADE.Sector, TRADE.Price);
    if (not (isnull(v))) {
      typeof(rec) newrec := v[0];                     // first result row
    }
    // rest of method
  }
}
Real-time Risk Framework
Diagram: trading systems and market data feed a message bus and Replication Server; the CEP engine cleanses, normalises and enriches the granular data and performs continual mark-to-market, then aggregates risk and P&L in real time by instrument, industry, country, debt rating, counterparty, VaR time horizon, etc.; HANA holds risk sensitivities, scenario shifts and VaR calculations; ASE holds reference data; analytics (heatmaps, drill-downs, reports) show concentration limits, clustering, outliers and more.
Oil & Gas: Energy Efficiency Reporting Framework
Diagram: SCADA event flows reach SOI2 Oracle servers and OKO, a non-SAP top-level MES; Replication Server (RS) carries replication flows from the Oracle replication agents (RA#1, RA#2) and ECC (IS-U) into the Event Stream Processor (ESP) and on into HANA via ODBC; BO 4.0 and Panopticon provide reporting on HANA.
Many thanks!
Contact information:
Thomas Dücker
SAP Schweiz
thomas.duecker@sap.com