600 likes | 1.18k Views
SAP University Alliances Version 3.0 Authors Klaus Freyburger Tobias Hagen. SAP Netweaver Business Intelligence – ETL Process. Abstract: This chapter covers the data extraction, transformation and loading process of SAP Netweaver Business Intelligence. . Product
E N D
SAP University Alliances Version 3.0 Authors Klaus Freyburger Tobias Hagen SAP Netweaver Business Intelligence – ETL Process Abstract: This chapter covers the data extraction, transformation and loading process of SAP Netweaver Business Intelligence. Product SAP NetWeaver 7.3 BI Level Undergraduate Beginner
Agenda Architecture Overview SAP NetWeaver BI Data Flow Overview Source Systems Data Sources Transformations Data Flow Control Real-time Summary
Architecture of a Data Warehouse Access to Information OperationalData Store (Architected) Data Marts Data Warehouse (Persistent) Staging Area Any Source • Data extraction • Data tranformation • Data loading and data flows
Agenda Architecture Overview SAP Netweaver BI Data Flow Overview Source Systems DataSources Transformations Data Flow Control Real-time Summary
Data Flow in SAP Netweaver BI Info Spoke SAP Netweaver Business Intelligence InfoProvider Open Hub Desti- nation * Transformation Data Transfer Process InfoSource Transformation any target Data Source InfoPackage SAP SAP NetWeaver BI NON SAP any source *optional
Data Flow Concept – Simple Example Infoprovider (DTP) TR • E:- Extraction • T:- Transformation [TR] • L:- Loading • PSA:- Persistent stagingarea (IP)
Data Flow Concept – Simplified Infoprovider TR (DTP)
Data Transfer Process – More Sophisticated Infoprovider
InfoSource Scenario: InfoSource as a uniform source for several targets and as target from different sources Example: unit conversion / currency conversion
Agenda Architecture Overview SAP Netweaver BI Data Flow Overview Source Systems DataSources Transformations Data Flow Control Real-time Summary
Different Types of Source Systems SAP Netweaver Business Intelligence DataSource File Interface BI Service BAPI DB Connect UD Connect Web Service Staging BAPIs ETL Tool File SAP Source Relational Source Multi-Dimensional Source XML Legacy Applications *
File Interface • File Interface • File name, number of header rows, data format, number format, field list, preview, ….
SAP Source • SAP Business Content DataSources reside in SAP ERP, SAP CRM, … • SAP Business Content DataSources exist in SAP BW after replication Data Source Replication Service API SAP ERP OLTP System Transfer structure User Exit Extraction structure Customizing Application specific extration
SAP Source • SAP BW system collects in SAP BW all objects required to activate an InfoCube
Database Connectivity (DB Connect) • Any relational database • Access to table / view • Field selection and data type conversation
Universal Data Connect DataSources (UDC) • Industry standard: J2EE connection architecture >200 3rd party database drivers (JDBC) • Works with SAP Netweaver J2EE connection framework (used for application integration in SAP‘s portal)
Web Service • Web Service push data to PSA • Web Service is SOAP compliant • The incoming XML file will be transformed into SAP BW format • Used for business activity monitoring or real-time data delivery • Note: only for small data volume (due to XML structure overhead) Data Example with SAP BW see http://www.dbai.tuwien.ac.at/proj/lixto/WebBI.pdf
3rd Party ETL Tool Connector • SAP BW provides open interfaces - staging BAPIs • These interfaces enable connection between various third-party tools and SAP BW • For example, data from an Oracle application can be transferred to SAP BW and evaluated there • Data transfer can take place via a data request from SAP BW or be triggered by the third-party tool via BAPIs • See vendor’s documentation for more details BAPI :- Business Application Programming Interface
Agenda Architecture Overview SAP Netweaver BI Data Flow Overview Source Systems DataSources Transformations Data Flow Control Real-time Summary
Types of DataSources • A DataSource is the SAP BW system view to data in the source system SAP Netweaver Business Intelligence DataSource File Interface BI Service BAPI DB Connect UD Connect Web Service Staging BAPIs ETL Tool File SAP Source Relational Source Multi-Dimensional Source XML Legacy Applications *
DataSource • How to extract? (Source system dependent) • Full upload vs. Delta upload • Data location • Data format specifications • Etc.
DataSource • DataSource describes the fields of external data source • Fields are in internal SAP BW format • Error handling is often required! • What to extract? • Field selection • Data types and length • …
Examples of conversions Source numberformat 1.234,56 hastobespecified in DataSource In alphanumeric field numbers should be stored with leading zeros in database, e.g., `0000001000` instead of `1000` Conversion Routine ALPHA hastobespecified In the GBI Curriculum, there is no need to specify conversion routines!
InfoPackages Move Data from Source Systems DataSource • When to extract? • Time handling • Data target definition • Exception handling • Selection criteria • Preview and Monitoring • … InfoPackage „Successfully loaded to PSA“ SourceSystem
Agenda Architecture Overview SAP Netweaver BI Data Flow Overview Source Systems DataSources Transformations Data Flow Control Real-time Summary
Transformations in SAP NW BI – Overview • Transformation types • Move, aggregate, constant, master data look up, … • Business rules, e.g. unit & currency translation • Formula builder with rich predefined functions library • ABAP routines incl. regular expressions • Transformation: Record by record processing • Start / End Routine: Data set processing SAP Netweaver Business Intelligence Data Target End Routine Transformation - Field 1 Transformation - Field n Start Routine Data Source
Transformation – User Interface Transformation Rules Target Fields Source Fields
Transformation Rules • Rule Types: • Constant assignment • Direct assignment • Reading master data (eg. product group from Infoobject) • ABAP Routine • Formula editor • Time conversion, unit conversion • Aggregation • Sum, min, max, first, last, avg Directassignment
Transformation Rules (Formula) Formulaassignment
Transformation Rules (Formula editor) Formula Source Fields Functions
Transformation Rules (Time conversion) FromCalendar Day … … toYear/Monthonly
Agenda Architecture Overview SAP Netweaver BI Data Flow Overview Source Systems DataSources Transformations Data Flow Control Real-time Summary
Full vs. Delta Data Upload • Use Data Transfer Process to set the extraction mode • Full: Extract all data from Data Source • Delta: Extract data only from Data Source which have changed since last upload InfoPackage attached to DataSource Set extraction mode
Monitor for Data Transfer Process • Integrated in InfoProvider Management • Integrated in DataTransferProcess Management • Error Stack is displayed in Monitor Error Stack
Error Handling for DataTransfer Process • Error Handling Options: • Valid Records Update, No Reporting at all (status FAIL) • No Update, No Reporting (status FAIL) • Valid Records Update, No Reporting (status FAIL) • Valid Records Update, Reporting possible (status OK) SAP Netweaver Business Intelligence Data Target Error DTP Transformation – Error Handling DTP Data Source
Master Data Upload from Flat Files An InfoObject represents a type of business entity Attributes describe the characteristics of a business entity Texts may have multilingual descriptions of an attribute and must therefore be loaded as a single process Hierarchies represent groups of business entities. There may be multiple hierarchies. Characteristics InfoObjects Attributes Hierarchies Multi lang. texts
Master Data Upload from Flat Files without InfoPackage 4. Create and start Data Transfer Process 3. Create Transformation 2. Create DataSource 1. Select Source System
Master Data Upload from Flat Files with InfoPackage 5. Create and start Data Transfer Process 4. Create Transformation 3. Create InfoPackage and load data into PSA 2. Create DataSource 1. Select Source System
Master Data Flow - Global Bike 7. Create andstart Data Transfer Process 2 6. Create Transformation for Texts 5. Create andstart Data Transfer Process 6 5 4. Create Transformation for Attributes 4 7 3 3. Create newDataSource 2. Add existingInfoObject Material 1. Create Data Flow
Transactional Data Upload 5. Create and start Data Transfer Process 4. Create Transformation 3. Create InfoPackage and load data into PSA 2. Create DataSource 1. Select Source System
Transactional Data Flow - Global Bike 10. Create andstart Data Transfer Processto Cube NetSales:= Revenue - Discount CostofGoodsManufactured := SalesQuantity * TransferPrice 9. Create Transformation 8 Calmonth:= Caldayas YYYYMM 9 10 8. AssignexisitingInfoCube 7. Activate Data in DataStoreObject 7 4 6. Create andstart Data Transfer Processto DSO 5 6 5. Create Transformation Calday:= Year & Month & Day 4. Create DataStoreObject 2 3. Create InfoPackageandload PSA 3 2. Create newDataSource 1. Create Data Flow from Template
Data Transfer Process - Process Chain A process chain is a sequence of processes that are scheduled to wait in the background for an event. Each process can trigger one or more events that trigger other processes in turn. In the administration function area of the Data Warehousing Workbench, choose Process Chains from the navigation pane to call the process chain maintenance. A process chain is a very complex area. It Automates the complex schedules in BW with the help of the event-controlled processing Visualizes the processes using network graphics Centrally controls and monitors the processes To reduce complexity, we‘ll skip this now, but we need to cover this in the future
Agenda Architecture Overview SAP Netweaver BI Data Flow Overview Source Systems Data Sources Transformations Data Flow Control Real-time Summary
Real-time Data Acquisition Demon Realtime Update External System Realtime Update Application SAP Source System
Agenda Architecture Overview SAP Netweaver BI Data Flow Overview Source Systems Data Sources Transformations Data Flow Control Real-time Summary
Summary Loading Master Data Loading Transactional Data