390 likes | 569 Views
Right-Time Data Warehousing with OWB. Objectives. After completing this lesson, you should be able to: Identify business case for right-time data warehousing Design near real-time trickle feed mappings Use Streams queue operators Use Change Data Capture code templates. Lesson Agenda.
E N D
Objectives • After completing this lesson, you should be able to: • Identify business case for right-time data warehousing • Design near real-time trickle feed mappings • Use Streams queue operators • Use Change Data Capture code templates
Lesson Agenda • Does OWB support real-time or near-real-time data warehousing? • Real-time and near-real-time concepts • Batch versus trickle feed • OWB support for advanced queues • Building a trickle feed mapping • Using CDC code templates for Change Data Capture
What Is Meant by Real-Time Data Warehousing • Definitions of real-time data warehouse (RTDW) are broad. • One of the defining characteristics of a RTDW is that the data freshness is “as real-time as it needs to be.”* * from The Data Warehouse Institute • Two main categories of data freshness: • Pure real time • Near real time Live or nearly live updates
What Refresh Frequency Does OWB Support? • OWB supports near real-time data warehousing. • We will refer to this as right time data warehousing. • OWB supplies “building blocks”; users construct solutionsthat fit their needs.
OWB Supports Multiple Loading Approaches • Three loading approaches: • Full data refresh into empty tables • Incremental data refresh • Continuous trickle feed approach Similar to batch load approach, though intervals can be very short New to OWB 11g R2,trickle feed mappingsbased on Oracle Streamsfeature of the databaseserver, and the AdvancedQueue operator of OWB;this approach canconsume changed data in near real time. New to OWB 11g R2,Change Data Capture (CDC) code template frameworkcaptures incremental changes from heterogeneous sources“out of the box.”
Trickle start stop … map logic map logic Trickle Feed: Start/Stop
Mapping Can Consume or Produce Data to And from Queues • Mappings can consume data from queues. • Near real-time consumers (trickle feed mappings) • Batch style consumers (including Change Data Capture) • Mappings can produce data for queues. Trickle feed mapping CDC code template mapping
OWB Supports Two Types of Advanced Queues • Typed AQ • Batch, only • For stronglytyped payloads • ANYDATA AQ • Near real-timeor batch
Streams Queues Can Stage Logical Change Records (LCR) or Non-LCR Messages • Oracle Streams queues can stage either LCRs or non-LCR messages. • However, OWB requires a user to declare whether an advanced queue will stage either LCRs or messages. • In other words, OWB does not support advanced queues staging both LCRs and messages. Streams queues Advanced queue LCR or Message
Consume from a Queue Batch or real-time Consumefrom aqueue
Produce Information into a Queue Produceto aqueue
Queue-to-Queue Propagation Streams queues Propagate Streams tags • Permitted • Not permitted Non-Streams queues Propagate • Transformations • Rules
Lesson Agenda • Does OWB support real-time or near-real-time data warehousing? • Real-time and near-real-time concepts • Batch versus trickle feed • OWB support for advanced queues • Building a trickle feed mapping • Using CDC Code Templates for Change Data Capture
Building a Trickle Feed Mapping • We examine the following steps for building a very simple • trickle feed mapping for near real-time data warehousing. 1. Create a receiving advanced queue. 2. Create a queue table in the target schema. 3. Deploy the advanced queue and the queue table. 4. Create a mapping the uses the advanced queue. • Choose the queue to serve as a real-time source. • Choose the primary data type. • Map the payload from the queue to a new target table. 5. Deploy the table and the mapping. 6. Drop messages into this queue and watch them load immediately.
Add Object Type Expander for Payload Expander lets you “see” inside the message, in this case SUBJECT and TEXT.
Define the Streams Administrator User • Define Streams administrator user in OWB. • Add to module.
Configure the Mapping Real-Time Parameters • Define the degree of parallelism. • Generate table instantiation (not used). • Define the Streams administrator location. • Define the apply process name.
Practice 5-1 Overview: Building a Trickle Feed Mapping • This practice covers all of the steps for building a trickle feed mapping for near real-time data warehousing. • Creating a receiving advanced queue • Creating a queue table in the target schema • Deploying the advanced queue and the queue table • Creating a mapping that uses the advanced queue • Choosing the queue to serve as a real-time source • Choosing the primary data type • Mapping the payload from the queue to a new target table • Deploying the table and the mapping • Dropping messages into this queue and watching them load immediately
Lesson Agenda • Does OWB support real-time or near-real-time data warehousing? • Real-time and near-real-time concepts • Batch versus trickle feed • OWB support for advanced queues • Building a trickle feed mapping • Using CDC code templates for Change Data Capture
Change Data Capture Framework • Orchestrates process to capture changes in near real-time • Open framework supports multiple platforms. • CDC templates seeded for Oracle, DB2 UDB, SQL Server • Typical mechanisms • Trigger based (Oracle, IBM, Microsoft) • Log based (Oracle and IBM)
Simplified CDC Mapping Steps 1. Choose how to perform CDC (trigger or log mechanism). 2. Select tables upon which to perform CDC. 3. Start the capture process. 4. Define the subscribers to receive the changed data. 5. Define mappings to consume the changes.
Choose How to Perform CDC • Decide whether you want to perform CDC via a mechanism that employs triggers or logs. • Select the CDC code template that matches your need.
Select Tables Upon Which to Perform CDC From which tables do youwant to capture changed data?
Define Subscribers Subscriber consumes the changes.
Define Execution Units and Deploy Define execution unit;assign code template to it. Deploy mapping, targettable, location.
Using Web Services to Administer CDC: Select Run-Time Services Web services to administer CDC
Quiz • Are all of the following statements true? • Support for right-time data warehousing in OWB 11g Release 2 refers to near real-time, not pure real-time. • Use trickle feed mappings to achieve near real-time data warehousing. • OWB 11g Release 2 supports both “Typed” and “ANYDATA” advanced queues in the Oracle database, with Typed AQs used in batch mode only, and ANYDATA AQs used in either batch or near real-time processing. • The OWB CDC framework provides “out of the box” CDC templates seeded for Oracle, DB2 UDB, and SQL Server
Summary • In this lesson, you should have learned how to: • Identify business case for right-time data warehousing • Design near real-time trickle feed mappings • Use Streams queue operators • Use Change Data Capture code templates
Practice 5-2 Overview: Steps for Using CDC Code Templates for CDC 1. Create a module. • Edit the module to select a CDC code template. • Select JOB_HISTORY as the table whose changed data will be captured. 2. Create the mapping that uses the CDC code template to consume changes. • Set CDC properties on the JOB_HISTORY table operator. • Create an execution unit; assign CDC code template to it. 3. Deploy the mapping’s target table and the mapping. 4. Start the Change Data Capture process. 5. Add a subscriber. 6. Add a new record to the JOB_HISTORY table. 7. Start the mapping to observe only the new record inserted.