1. DB2 9.7 Technology Update Salvatore Vella, VP, Development
2. Sports Examples – More Wins Per Dollar
3. The Information Challenge
5. Information On Demand: Unlocking the Business Value of Information for Competitive Advantage
6. Information On Demand: End-to-End Capabilities
6. IBM DB2: Lower Cost Made Easy Additional accolades:
Software & Information Industry Association—2007 CODiE Award Winner, Best Database Management Solution
Rapid development – also includes easier application porting (more of this is coming in 2009)
Optimized for SOA:
pureXML™ for seamless and efficient integration of XML and relational data
Improved performance for dynamic warehousing and OLTP requirements
Industry’s Highest Level of Scalability & Availability:
Unmatched availability for superior information access
Near-linear scalability for mission critical SAP, ERP and CRM requirements
Manage Risk and Streamline Compliance:
Unparalleled security capabilities for greater control and business flexibility
Increased auditing capability and efficiency
7. Themes Lowest Cost
Best performance with most efficient utilization of available resources
Ongoing Flexibility
Allow for continuous and flexible change management
Service Level Confidence
Expand your critical workloads confidently and cost effectively
XML Insight
Harness the business value of XML
Break Free with DB2
Use the database server that gives you the freedom to choose
Balanced Warehouse
A complete, ready-to-run warehouse appliance with proven high performance
8. Lower Storage Costs with Deep Compression Best in industry
Minimize storage costs
Improve performance
Easy to implement DB2 9.5 has industry leading data compression technology that is transparent to DBAs, users, and applications. We call it “Deep Compression”. DB2 automatically builds a compression dictionary, and uses it so that the most frequently repeated values in data rows are compressed. These repeating values can span columns, which makes the compression even more effective. The way it works is that these repeating patterns within the table are replaced by a smaller symbol, and the real values are then stored in the compression dictionary. In addition to reducing disk storage space, it can also lead to disk I/O savings since more rows are accessed per I/O operation which in turn leads to faster query performance. Deep Compression can also enable more data to be cached in the buffer pool, thereby leading to increased bufferpool hit ratios which again can improve performance.
Customers such as SunTrust have reported compression rates over 80% which as you can imagine resulted in significant storage cost savings! In the SunTrust case the initial savings were more than $2 million which is a great example of how DB2 can lower your total cost of ownership!
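As a minimal illustration (not from the original deck; the table name is hypothetical), enabling Deep Compression on an existing table and building the dictionary with a classic reorg might look like this:
-- Mark the table as compression eligible
ALTER TABLE sales.daily_sales COMPRESS YES;
-- Build the compression dictionary and compress existing rows (offline reorg)
REORG TABLE sales.daily_sales RESETDICTIONARY;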
9. Automated Compression for All Forms of Ingest Not only does DB2 compression save you storage space and deliver better performance, with DB2 9.5 you also get a completely automated solution.
All that is required from a database administrator is to indicate to DB2 which tables in the database should be considered “compression eligible” and DB2 will do the rest. As data is added to these tables (either through insert, update, load, or import operations), DB2 will monitor the size of the table. If and when a table exceeds a given threshold size (as determined by DB2) a compression dictionary will be created automatically on the fly and any new data added to the table will be automatically compressed.
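A minimal sketch of what this looks like in practice (hypothetical table; assumes the storage optimization feature is licensed): once the table is created as compression eligible, no further action is needed and DB2 builds the dictionary automatically when the size threshold is reached.
CREATE TABLE sales.orders (
  order_id   INTEGER NOT NULL,
  order_date DATE,
  details    VARCHAR(200)
) COMPRESS YES;
-- Subsequent INSERT, IMPORT, or LOAD operations are compressed automatically
-- once DB2 has created the dictionary on the fly.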
10. Features currently in development:
Multiple algorithms for automatic index compression
Automatic compression for temporary tables
Intelligent compression of large objects and XML Of course DB2 is not standing still in the area of storage savings either. Storage costs continue to be a concern for customers and so the upcoming release of DB2 codenamed Cobra will deliver even more storage savings.
Two examples of this are in the area of index compression and TEMP table compression. In DB2 Cobra, you will not only be able to compress tables but DB2 will also provide automatic compression for indexes as well. In fact there are a number of ways to compress indexes based on the type of index and the data distribution within that index. DB2 will be able to examine the index data and automatically choose from a set of compression algorithms that provides optimal compression for the index.
Similarly DB2 will automatically compress temporary data. This can be a huge savings for data warehouse applications where large sorts and large intermediate result sets can consume a significant amount of storage. In DB2 Cobra, temporary tables used for sorting and intermediate query results will automatically be compressed if they consume a significant amount of storage, thus saving you even more on your storage costs.
11. Index Compression Algorithms implemented by the Database Engine (under-the-covers):
RID List Compression, Prefix Compression, Variable Slot Directory
Activated:
When row compression is activated on a table
CREATE INDEX with the new “COMPRESS YES” option
ALTER INDEX COMPRESS [YES|NO] statement, followed by an index reorg
Savings
ADMIN_GET_INDEX_COMPRESS_INFO to estimate compression savings for uncompressed index
COMPRESS and PCTPAGESSAVED in the SYSINDEXES catalog
show if an index is defined as compressed and the percentage saved respectively
Typical Savings:
35% - 55%
Plans for a compression estimator to be used on pre-DB2 9.7 databases. DB2 will use common prefix compression and RID list compression for index leaf pages, which represent the majority of the index space. A major benefit of compressing leaf pages is that page splits are less frequent and there are fewer index levels. Because the number of index levels is reduced, an index search starting from the root page requires fewer I/Os, corresponding to the number of index levels. Similarly, an index scan requires fewer I/Os, corresponding to the total number of leaf pages. Furthermore, more index keys can be cached in the buffer pool. The buffer pool hit ratios increase and the index page physical reads and writes decrease. On the other hand, since the index keys are now compressed, extra CPU is needed to uncompress the index keys back to the uncompressed format before returning results to the user, or to compress the index key into the compressed format before updating an index page.
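A hedged sketch of the statements described above (object names are hypothetical; the argument list of ADMIN_GET_INDEX_COMPRESS_INFO should be verified against the DB2 9.7 Information Center):
-- New index, explicitly compressed (compression is also inherited when the table uses row compression)
CREATE INDEX ix_orders_date ON sales.orders (order_date) COMPRESS YES;
-- Enable compression on an existing index, then rebuild it
ALTER INDEX sales.ix_orders_date COMPRESS YES;
REORG INDEXES ALL FOR TABLE sales.orders;
-- Estimate the savings for indexes that are not yet compressed
SELECT * FROM TABLE(SYSPROC.ADMIN_GET_INDEX_COMPRESS_INFO('T', 'SALES', 'ORDERS', NULL, NULL)) AS t;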
12. Index Compression: Measurements
13. Compression Savings from SAP supplied tables
14. Temp Table Compression Compression of temporary tables aims to:
Reduce the amount of temporary disk space required
Have no performance penalty as a result of the extra processing required for row compression. If there is an increase in query performance, that is an added side benefit
Typically up to 40% reduction
Applicable to User temporary tables and System temps (DGTT/CGTT)
Sorts, HSJN, MGJN, NLJN, utilities, …
If Deep Compression is licensed, then temporary tables will be compressed by default
There is no additional action required by the user in order to use it. DB2 will evaluate the query and apply compression where appropriate.
db2pd will report on temp tablespace usage
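For example (hypothetical database name), temporary tablespace consumption can be checked from the command line with:
db2pd -db mydb -tablespaces
The exact columns reported vary by release; the used and free page counts for the temporary tablespaces show the effect of compression.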
15. Temp Compression: Measurements Test Details
Analysis was done on the 11 of 99 TPC-DS queries affected by temp compression
TPC-DS is the complex query workload with the most temp activity
16. XML (XDA) Compression In addition to XML Inlining
Approach will be very similar to that of Table Compression
Static dictionary approach
Classic/’Offline’ reorg table based
ADC applicable
No online dictionary support via DB2 INSPECT
Enablement is via the table COMPRESS attribute i.e. piggyback data table setting – no specific compress setting for XML alone
17. Inlining If a table possesses LOB (or XML) data that can be inlined, there are considerable benefits with respect to performance and storage
Performance
Whenever LOBs are inserted/retrieved, a disk I/O cost is incurred each time since this data is not buffered (unlike with base table data)
When inlined, this I/O cost is reduced since this data get buffered along with the base table data they are inlined with
Storage
Storage allocated to the storage object is reduced by inlining XML/LOB data in the base table (though base table storage increases)
Inlining small XML/LOBs can result in a noticeable decrease in net total storage since the decrease in storage size is greater than the increase in base table storage size
XML/LOBs inlined within the base table can be compressed Why inline?
There are considerable benefits with respect to performance and storage
(LOB==XML)
Unlike base table data, LOB data is not buffered
This translates to going to disk each time the LOB object is accessed
Insert LOB data, write to disk
Query for LOB data, read from disk
For inlined LOBs, the LOBs essentially become a part of the base table data and therefore become subject to buffering, reducing the I/O costs
Storage-wise, the amount of space allocated for the LOB object decreases, though the storage for the base table increases by a similar amount
However, depending on the size of your LOBs, the amount of storage decrease in the LOB object may be greater than the increase in base table storage size
This would result in a net total storage savings
Likely to happen for a table with small LOBs
The likelihood of this behaviour diminishes with larger and larger LOBs
For inlined LOBs, the data essentially becomes base table data and can therefore be compressed for storage savings (potentially – depends on the nature of the LOB data)
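A minimal sketch (hypothetical table and inline sizes) of inlining small XML and LOB values so they are buffered with the base row and become eligible for row compression:
CREATE TABLE app.customer_docs (
  id      INTEGER NOT NULL,
  profile XML      INLINE LENGTH 1000,   -- XML documents that fit in ~1000 bytes stay in the base row
  photo   BLOB(1M) INLINE LENGTH 500     -- small photos are inlined; larger ones go to the LOB storage object
) COMPRESS YES;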
18. Index/TEMP Compression vs. Oracle and Microsoft Oracle and Microsoft support only prefix compression
With Oracle you have to specify the number of columns to prefix
CREATE INDEX ix1 ON t1(c1,c2,c3) COMPRESS 2;
DB2 is the only vendor to do RID list compression
DB2 is the only vendor to offer Temp Table compression
DB2 is the only vendor to deliver compression on inlined XML and LOB data
20. Schema Evolution: Example – updating a view definition Scenario:
A table is decomposed into two tables
A view (V1) must be updated accordingly:
DROP VIEW V1 RESTRICT
CREATE VIEW V1 AS (SELECT … FROM T1, T2 …)
Another view (V2) references V1
Problem:
The database will prevent the drop of V1 because it would make V2 invalid, even though the second statement would make V2 valid again.
21. Schema Evolution: The solution Relax the object dependency model
Allow changes that affect dependent objects to succeed
Automatically revalidate dependent objects
ALTER TABLE, ALTER COLUMN, DROP COLUMN, RENAME COLUMN
CREATE OR REPLACE ALIAS, FUNCTION, NICKNAME, PROCEDURE, TRIGGER, VARIABLE, VIEW
DROP FUNCTION, NICKNAME, PROCEDURE, SEQUENCE, TABLE, TRIGGER, TYPE, VARIABLE, VIEW
Extend to support
RENAME COLUMN
Support CREATE OR REPLACE syntax for views, functions, triggers, etc.
Allow additional data type changes via ALTER COLUMN
Between any types SQL runtime can cast
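Applied to the scenario on slide 20, the view can now be replaced in place and the dependent view V2 is revalidated automatically; a minimal sketch with hypothetical column names:
CREATE OR REPLACE VIEW V1 AS
  SELECT t1.id, t1.name, t2.amount
    FROM T1, T2
   WHERE t1.id = t2.id;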
22. Online Table Move ADMIN_MOVE_TABLE
Move data in an existing table to a new table object
Source table remains online: both read (select) and write (IUD) operations
Final phase renames the tables: target table will have the original table name
Use Cases
Online table compression
Online REORG or Redistribute
Online conversion to LARGE tablespaces
Move data/index/long data to new/different tablespaces
Support for limited schema evolution:
Add or remove columns, change column datatypes
Add/change MDC dimensions, range partitioning or partitioning key
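A hedged example of a single-call move (hypothetical names; the parameter order and valid option strings should be verified against the DB2 9.7 documentation):
CALL SYSPROC.ADMIN_MOVE_TABLE(
  'SALES', 'DAILY_SALES',              -- schema and table to move
  'TBSP_DATA', 'TBSP_IX', 'TBSP_LOB',  -- target data, index, and long/LOB tablespaces
  '', '', '', '',                      -- MDC dimensions, partitioning key, range partitioning, column changes (unchanged)
  '',                                  -- options
  'MOVE');                             -- run all phases (INIT, COPY, REPLAY, SWAP) in one call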
23. Local (aka partitioned) indexes Can have both RID based local index as well as local MDC block index
24. Maximize availability during maintenance DPn = Data Partition n
25. Maximize availability during maintenance DPn = Data Partition n
26. Transportable Schema – In Development Efficient schema movement between databases
Transport schema from a backup image
Performance objective – 100 GB in under 20 minutes
Restore will now do multiple operations
Restore the syscatspace and specified table spaces from the backup image
Roll them forward to a consistency point
Validate the schemas specified
Transfer ownership of the specified table spaces to the target DB
Recreate the schema in the target DB
A small staging database consisting of the system catalog table space from the backup image will be created in the DFTDBPATH and dropped once the restore utility completes. This staging database is required to extract the DDL, to regenerate the objects that reside in the table space being transported. If an online backup image is used then the staging database will be rolled forward to the end of the backup. If an offline backup image is used then no rollforward processing will be performed
The table space / schema set being transported must consist of self contained transport sets or the restore will fail. Once the schemas have been validated, the relevant system catalog entries will be created in the target DB which describes the objects in the table spaces that were transported. Upon completion of the schema recreation, the target DB will take ownership of the physical table space containers.
STAGE IN
Specifies the name of a temporary database to be used for the rollforward recovery of the backup image being transported from. If this parameter is specified the temporary staging database will not be automatically dropped, but is no longer required after the transport has completed and can be dropped by the DBA.
If this parameter is not specified:
DB2 will automatically generate a unique database name of the form 'SYSTGxxx' where 'xxx' is an integer value. A database with this name will be created and used during the transport operation.
The temporary staging database will be dropped upon completion of the transport operation.
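Since this feature was still in development when the deck was written, the final syntax may differ; a hypothetical sketch of the intended usage is:
RESTORE DATABASE proddb
  TABLESPACE (sales_ts) SCHEMA (sales)
  FROM /backups TAKEN AT 20091201120000
  TRANSPORT INTO targetdb STAGE IN stagedb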
27. Lightning Fast for Transactions www.tpc.org as of 12/12/08
DB2 is currently 50% faster than Oracle on the TPC-C transactional benchmark
http://www.tpc.org/tpcc/results/tpcc_perf_results.asp
28. Intel x86 World Record – 1.2M TPM-C
29. Lightning Fast for Business Intelligence www.tpc.org as of 12/12/08
DB2 is currently 50% faster than Oracle on the TPC-C transactional benchmark
http://www.tpc.org/tpcc/results/tpcc_perf_results.asp
30. XML Transaction Processing DB2 9.7 on Intel Xeon 5570 delivers
Outstanding out-of-the-box performance TPoX benchmark results
Default db2 registry settings; no db or dbm configuration changes
Excellent performance scalability of 1.78x from Xeon 5400 to Xeon 5500 series (2-socket, quad-core)
Performance per Watt improves by 1.52x Hardware configuration:
Harpertown (Intel Xeon 5400, 3.16GHz, 2 sockets, eight cores, 32GB RAM)
Nehalem (Intel Xeon 5500, 2.93 GHz, 2 socket, eight cores, 48GB RAM, hyperthreading enabled, turbo-mode enabled)
Both used the following storage configuration:
1 Internal Drive +
120 Disk RAID0 Setup contains
the DB2 database,
15 Disk RAID0 for logging
30 Disk RAID0 for raw XML data
The software configuration was:
DB2 9.7, s090430
SLES 10 SP2 (2.6.16.60-0.21-smp)
TPoX 1.2 MX
10000 SF (1TB)
Default DB2 environment variables
Open Source Benchmark: TPoX (Transaction Processing over XML Data)
http://tpox.sourceforge.net/
1TB raw data, 360M XML Documents
70% Queries, 30% Insert/Update/Delete
Intel® Xeon® Processor 5400 Series and Intel® Xeon® Processor 5500 Series are trademarks of Intel Corporation in the U.S. and other countries.
Performance tests and ratings are measured using specific computer systems and/or components and reflect the approximate performance of Intel products as measured by those tests. Any difference in system hardware or software design or configuration may affect actual performance. Buyers should consult other sources of information to evaluate the performance of systems or components they are considering purchasing. For more information on performance tests and on the performance of Intel products, visit Intel Performance Benchmark Limitations at http://www.intel.com/performance/resources/limits.htm.
31. DB2 9.7 Delivers Even Faster OLTP with Statement Concentrator
DB2 9.7
Optionally replace literals with parameter markers
Increases section sharing and reduces compilation
Reduces number of statements to be compiled
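The statement concentrator is enabled through a database configuration parameter (hypothetical database name shown):
db2 UPDATE DB CFG FOR mydb USING STMT_CONC LITERALS
Setting STMT_CONC back to OFF disables literal replacement.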
32. Multiple Scanners pre DB2 9.7
33. Multiple Scanners with DB2 9.7
34. Scan Sharing Performance Test TPCH Q1 : CPU Intensive, Slow Query On Lineitem Table Using A Table Scan
TPCH Q6 : IO Intensive, Fast Query On Lineitem Table Using A Table Scan
Test Scenario : Queries executed in parallel in the following sequence
Results : 34% Improvement In End to End Timing
35. Lower Storage Costs with Deep Compression Best in industry
Minimize storage costs
Improve performance
Easy to implement DB2 9.5 has industry leading data compression technology that is transparent to DBAs, users, and applications. We call it “Deep Compression”. DB2 automatically builds a compression dictionary, and uses it so that the most frequently repeated values in data rows are compressed. These repeating values can span columns, which makes the compression even more effective. The way it works is that these repeating patterns within the table are replaced by a smaller symbol, and the real values are then stored in the compression dictionary. In addition to reducing disk storage space, it can also lead to disk I/O savings since more rows are accessed per I/O operation which in turn leads to faster query performance. Deep Compression can also enable more data to be cached in the buffer pool, thereby leading to increased bufferpool hit ratios which again can improve performance.
Customers such as SunTrust have reported compression rates over 80% which as you can imagine resulted in significant storage cost savings! In the SunTrust case the initial savings were more than $2 million which is a great example of how DB2 can lower your total cost of ownership!
36. Automated Compression for All Forms of Ingest Not only does DB2 compression save you storage space and deliver better performance, with DB2 9.5 you also get a completely automated solution.
All that is required from a database administrator is to indicate to DB2 which tables in the database should be considered “compression eligible” and DB2 will do the rest. As data is added to these tables (either through insert, update, load, or import operations), DB2 will monitor the size of the table. If and when a table exceeds a given threshold size (as determined by DB2) a compression dictionary will be created automatically on the fly and any new data added to the table will be automatically compressed.
37. IBM InfoSphere™ Balanced Warehouse Pre-built, Pre-packaged Warehouse Appliance Offering Reduces Risk
Standard Pre-tested Configuration
Ongoing Supported Package
Predictable performance
38. InfoSphere Warehouse is fully certified and ready for production use of Solid State Drives (SSD)
Solid State Drives enable breakthrough I/O performance for transactional and data warehousing environments
DB2’s Deep Compression further enhances the applicability of SSD flash
DB2 optimizes for SSD devices with its industry-leading cost-based optimizer for improved query performance
InfoSphere Balanced Warehouse exploiting flash in the near future
39. IBM InfoSphere WarehouseTM - Powered by DB2 Addressing the needs for Dynamic Warehousing InfoSphere Warehouse Design Studio – Development environment for warehousing solutions
Physical data modeling, DB2 SQL-based warehouse construction, OLAP cube modeling, and Data mining modeling
Reporting and Data Mining on the single version of the truth
Modeling, Visualization, and Scoring
Support for Discovery and Predictive mining, and Multiple mining methods
New analytical algorithms and wizard based data preparation for modeling
Cubing Services for OLAP
High Performance 64-bit ROLAP solution with unlimited scalability
Enhanced cubing model and cube security
Integrated MDX/XMLA provider
Supports mainstream MDX client tools, e.g. Excel
Cognos 8 integration
SQL Warehousing Tool
Integrated DB2-based environment for life cycle of warehouse building and maintenance activities
Rapid deployment of data movement processes
Unstructured Data Analysis
Extract structured data from unstructured content and correlate it with structured data to gain business insight
- Reporting and data mining on a single version of the truth
- Design Studio Mining Editor - From Design to embedded applications -
- Seamless integration of data modeling, SQL, mining in a common Data Warehouse project
- Data exploration components give you an overview on the content and quality of your data.
- Interactive flow editor: Stepwise execution. All SQL operators can be reused in the mining context
- Two types of mining in InfoSphere Warehouse
- Discovery Mining - Finding patterns in data that may be used to guide decisions.
- Predictive Mining- Using known results to create models that can predict value
InfoSphere Warehouse mining methods - Associations, sequences, Clustering, Classification, regression, ...
Modeling. The data mining process starts with historical data being gathered and put through a series of mathematical functions to derive business rules such as: "Customers buying Gouda cheese are likely to also buy mixed nuts 18% of the time." The business rules are collected together into a Model. A model can have a few rules or tens of thousands of rules.
Visualization. The business rules must be analyzed and verified by a business analyst to ensure the rules are accurate. IBM offers a unique Visualization tool to assist analysts in evaluating the business rules.
Scoring. The verified business rules can be applied to new data to determine the appropriate predicted outcome. For example, a new bank transaction enters the system and the fraud detection rules are applied against the data. The rules will predict the probability that the record is fraudulent. This process of applying the business rules is called Scoring. Scoring in real time allows businesses to catch fraudulent records faster, segment new customers and offer them better service, and detect defects quicker.
InfoSphere Warehouse Cubing Services - ROLAP Accelerator
- Relational data represented dimensionally with planned query optimization
- Provides highest data volumes, dimensions & user communities
- Ideal for IT enterprise rollouts founded on InfoSphere Warehouse where volumes & scalability are more critical than query performance consistency
- High Performance ROLAP solution - Near real time data access with high speed performance - 64bit
- Unlimited scalability - Support many users with OLAP queries from many data sources
- Integrated IT Tooling - Lower cost of ownership bringing design, query, and performance together
- Cognos 8 integration with cubing services
- XMLA Provider Support
MDX access to cubing services - ie: Excel, Cognos, Alphablox, etc..
The Design Studio tooling enables you to create, edit, import, export, and deploy OLAP models over the relational warehouse schema.
Easy to use wizards provide optimization recommendations to dramatically improve the performance of OLAP applications
Integrated Data Management Environment with IBM Data Studio
- Develop - logical data modeling, physical data modeling, Java Development, Routine Development, Query tuning, etc..
- Protect and Govern - Data masking, data access management, data encryption, etc..
- Deploy - Change Management
- Archive - Data archiving, disposal
Administer - Database Administration, Data movement and change, Object management, performance management, etc..
An example of the new analytical algorithms is mining time series analysis
40. Lightning Fast & Optimized for SAP Better Performance with DB2
Average 40% better performance
World record SAP benchmark, outperforming Oracle by 18% and needing only half the processing power!
Lower TCO with DB2
Average 40% Storage Savings
Average 25% DBA Labor Savings
Deep sales and development partnership leading to great results
33% Growth in SAP clients choosing DB2 since 2006
>100 Customers migrated from Oracle to DB2 in past 12 months <Remove slide if customer does not use SAP><Remove slide if customer does not use SAP>
41. Embrace XML with Industry-Leading XML Storage Easy development and integration
No data transformations for XML
No complex schemas
No parsing upon retrieval
Efficient storage
Requires only 440GB of raw storage for 1TB XML benchmark
Untouchable performance
6,763 XML transactions per second for 1TB XML benchmark Many relational database vendors in today’s competitive marketplace claim to provide “native” XML storage engines, yet in most cases, the XML data is either pushed into a large object (LOB) or shredded into a table. DB2 9.5 is different, it truly does store XML data in its native format and for that reason we call it pureXML. By combining pureXML together with the familiar relational model, DB2 9.5 can be considered a hybrid database and supports a new emerging model for developers. The XML developer will see a sophisticated XML repository that also supports SQL, while the SQL developer will see a sophisticated relational database that also supports XML. The best of both worlds! Furthermore, when compared to standalone XML storage solutions, DB2 provides advantages in terms of scalability and robustness with mature database services such as backup, recovery, and indexing.
DB2 also added XQuery and extensions to SQL capabilities to efficiently search and update documents; and XML specific indexing to ensure high performance for those queries. Applications can leverage these facilities to manage and access their XML data efficiently along with any existing relational data using either SQL or XQuery as the query interface depending on their application needs and preferences.
DB2 is clearly the leader in this space, enabling rapid development of new types of applications that can combine XML and relational data.
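A small illustration (hypothetical table) of the hybrid model described above: a native XML column, an XML value index, and a query combining SQL with an XQuery predicate. The index syntax should be checked against the pureXML documentation:
CREATE TABLE app.orders (id INTEGER NOT NULL, doc XML);
CREATE INDEX ix_orders_custid ON app.orders (doc)
  GENERATE KEY USING XMLPATTERN '/order/customer/@id' AS SQL DOUBLE;
SELECT id
  FROM app.orders
 WHERE XMLEXISTS('$DOC/order/customer[@id = 1042]' PASSING doc AS "DOC");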
42. XML on DPF: Scalability Each query run in 2 or 3 equivalent variants:
Completely relational (“rel”)
Completely XML (“xml”)
XML extraction/predicates with relational joins (“xmlrel”) (join queries only)
XML SCALES AS WELL AS RELATIONAL
Test Environment: DPF Star-schema database with XML, 33M row fact table, 4 and 8 partitions on 12-way AIX machine.
Two fact tables (one 33M rows, one smaller), 5 dimensions. Largest dimension 1M rows. All tables with XML.
Test database is the same size in both cases -> expect a maximum of 2x speedup from 4 to 8 partitions; possibly more/less due to plan changes
Two query sets:
Simple: Single-table queries and updates, 2-table joins (left chart)
Complex: Multi-table data warehouse queries (right chart)
Three “flavors” of each query: (two if nonjoin)
Relational
XML, including XML joins
XML, but with relational joins (recommended for DPF to take advantage of co-location)
Speedup Results (Optimally, queries run 2x faster from 4 to 8 partitions)
Simple query speedup: Relational avg = 1.77x, XMLrel avg = 1.70x
Complex query speedup: XML avg= 1.67, XMLrel avg = 1.9. [Speedup influenced by differing execution plans]
Examples of the simple queries:
-- count on indexed column/element
-- rel
select count(*) from senglert.daily_sales where prodkey < 2000;
-- xml
select count(*) from senglert.daily_sales where xmlexists ('$SALDOC/fact/keys/prodkey[.< 2000]');
-- grouped aggregation
-- rel
select distinct(tax), count(*) from daily_sales where perkey = 1996001 group by tax;
-- xml
select distinct(sx.tax), count(*) from daily_sales s, xmltable('$SALDOC/fact/measures/tax' columns tax integer path '.') as sx
where perkey = 1996001 group by sx.tax;
-- colocated join (colocation does not matter in XML)
-- rel
SELECT p.category_desc, SUM(s.quantity_sold), SUM(s.shelf_cost), SUM(s.bin_cost_pct_of_sale) FROM
senglert.PRODUCT p, senglert.DAILY_SALES s where p.category < 30 and s.perkey = 1996001 and
p.PRODKEY=s.PRODKEY GROUP BY p.category_desc;
-- xml
SELECT px.category_desc, SUM(sx.quantity_sold), SUM(sx.shelf_cost), SUM(sx.bin_cost_pct_of_sale)
FROM senglert.PRODUCT p, senglert.DAILY_SALES s,
xmltable('$SALDOC/fact[keys/perkey = 1996001]/measures' columns quantity_sold integer path
'quantity_sold', shelf_cost decimal(9,2) path 'shelf_cost',
bin_cost_pct_of_sale decimal(9,2) path 'bin_cost_pct_of_sale') as sx,
xmltable('$PRDDOC/product[category < 30]/category_desc' columns category_desc char(70) path '.') as px
where xmlexists('$SALDOC/fact[keys/prodkey/xs:double(.)=$PRDDOC/product/prodkey/xs:double(.)]')
GROUP BY px.category_desc;
-- xmlrel
SELECT px.category_desc, SUM(sx.quantity_sold), SUM(sx.shelf_cost), SUM(sx.bin_cost_pct_of_sale) FROM
senglert.PRODUCT p, senglert.DAILY_SALES s,
xmltable('$SALDOC/fact[keys/perkey=1996001]/measures'
columns
quantity_sold integer path 'quantity_sold',
shelf_cost decimal(9,2) path 'shelf_cost',
bin_cost_pct_of_sale decimal(9,2) path 'bin_cost_pct_of_sale'
) as sx,
xmltable('$PRDDOC/product[category < 30]/category_desc'
columns
category_desc char(70) path '.'
) as px
where
p.PRODKEY=s.PRODKEY
GROUP BY px.category_desc;
Examples of the complex queries:
-- rel
SELECT ITEM_DESC, SUM(QUANTITY_SOLD),
SUM(SHELF_COST),
SUM(BIN_COST_PCT_OF_SALE)
FROM PERIOD per, DAILY_SALES s, PRODUCT prd, STORE st, PROMOTION pro
WHERE
per.PERKEY=s.PERKEY AND
prd.PRODKEY=s.PRODKEY AND
st.STOREKEY=s.STOREKEY AND
pro.PROMOKEY=s.PROMOKEY AND
CALENDAR_DATE BETWEEN '01/01/1996' AND '01/04/1996' AND
STORE_NUMBER IN ('02', '03') AND
PROMODESC IN ('Advertisement', 'Coupon', 'Manager''s Special', 'Overstocked Items') AND
CATEGORY=42
GROUP BY ITEM_DESC;
-- xml
SELECT px.item_desc, SUM(sx.quantity_sold), SUM(sx.shelf_cost), SUM(sx.bin_cost_pct_of_sale)
FROM
PERIOD per, DAILY_SALES s, PRODUCT prd, STORE st, PROMOTION pro,
xmltable('$SALDOC/fact/measures'
columns quantity_sold integer path 'quantity_sold',
shelf_cost decimal(9,2) path 'shelf_cost',
bin_cost_pct_of_sale decimal(9,2) path 'bin_cost_pct_of_sale'
) as sx,
xmltable('$PRDDOC/product[category=42]/item_desc'
columns item_desc char(30) path '.') as px
where
xmlexists('$PERDOC/period/calendar_date[. >= xs:date("1996-01-01") and . <= xs:date("1996-01-04")]') and
xmlexists('$STODOC/store[store_number = "02" or store_number = "03"]') and
xmlexists('$PRODOC/promotion[promodesc = "Advertisement" or promodesc = "Coupon" or
promodesc = "Manager''s Special" or promodesc = "Overstocked Items"]' ) and
xmlexists('$SALDOC/fact/keys[perkey/xs:double(.) = $PERDOC/period/perkey/xs:double(.)]') and
xmlexists('$SALDOC/fact/keys[prodkey/xs:double(.) = $PRDDOC/product/prodkey/xs:double(.)]') and
xmlexists('$SALDOC/fact/keys[storekey/xs:double(.) = $STODOC/store/storekey/xs:double(.)]') and
xmlexists('$SALDOC/fact/keys[promokey/xs:double(.) = $PRODOC/promotion/promokey/xs:double(.)]')
GROUP BY px.ITEM_DESC;
--xmlrel
SELECT px.item_desc, SUM(sx.quantity_sold), SUM(sx.shelf_cost), SUM(sx.bin_cost_pct_of_sale) FROM
PERIOD per,
DAILY_SALES s,
PRODUCT prd,
STORE st,
PROMOTION pro,
xmltable('$SALDOC/fact/measures'
columns
quantity_sold integer path 'quantity_sold',
shelf_cost decimal(9,2) path 'shelf_cost',
bin_cost_pct_of_sale decimal(9,2) path 'bin_cost_pct_of_sale'
) as sx,
xmltable('$PRDDOC/product[category=42]/item_desc'
columns
item_desc char(30) path '.'
) as px
where
xmlexists('$PERDOC/period/calendar_date[. >= xs:date("1996-01-01") and . <= xs:date("1996-01-04")]')
and
xmlexists('$STODOC/store[store_number = "02" or store_number = "03" ]')
and
xmlexists('$PRODOC/promotion[promodesc = "Advertisement" or promodesc = "Coupon" or promodesc = "Manager''s Special"
or promodesc = "Overstocked Items"]' )
and
per.PERKEY=s.PERKEY AND
prd.PRODKEY=s.PRODKEY AND
st.STOREKEY=s.STOREKEY AND
pro.PROMOKEY=s.PROMOKEY
GROUP BY px.ITEM_DESC;
Data collected by Susanne Englert
43. Real-time data access challenge for businesses Businesses ranging from telecom providers to financial services to web retailers are differentiating themselves by delivering new applications to offer their customers new advanced services
To ensure the highest quality of service, these businesses must have extremely fast data access and continuous data availability
In-memory database software can be a vital technology for any application that must instantly and reliably supply data while ensuring data consistency
44. Announcing IBM solidDB Universal Cache Our Universal Cache is now a leading product in the industry, since it is the first to provide in-memory caching technology that accelerates access to data in most industry-leading on-disk databases, increasing their performance by up to ten times.
We have the extreme speed of our in memory technology, and we are able to achieve extremely high throughput and extremely low response time.
In terms of adaptability, solidDB cache uses the familiar SQL language, it has many options for schema mapping and data transformation and the performance scales both vertically and horizontally.
solidDB is ACID compliant, and so it preserves data durability. In addition it has a powerful HA implementation that provides subsecond failover.
We’ll discuss all these things in more detail in the next slides.
45. DB2 Virtual Appliances simplify a complex world
46. DB2 9.7: Use Industry SQL Skills with DB2 DB2 now supports other vendors’ SQL statements, including PL/SQL
Easy for your developers to use DB2
Easy for your applications to use DB2
DB2 supports:
SQL PL
SQL/PSM
PL/SQL
Packages
Built-in packages
JDBC
SQL*Plus scripts
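The PL/SQL, built-in package, and SQL*Plus script support is tied to the Oracle compatibility mode, which is enabled through a registry variable before the database is created (a sketch; check the Information Center for the exact vector values):
db2set DB2_COMPATIBILITY_VECTOR=ORA
db2stop
db2start
-- databases created after this point accept PL/SQL objects, Oracle data types, and related syntax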
47. New Features… Support other vendors’ SQL
→ Easy for developers to query DB2
Native execution of all SQL statements
→ Fast performance for queries
Easily import other vendors’ schemas
→ Easy for developers to set up DB2
Support other vendors’ concurrency models
→ Easy for developers to use DB2
Support flexible data typing
→ Easy for developers to work with DB2
And more…
48. Babylonian Confusion (aka Lock-In)
50. PL/SQL in DB2 9.7 Built-in PL/SQL native compiler
Source-level debugging and profiling Both compilers are part of the core DB2 engine and produce equivalent-quality runtime code.
No preference.
Tooling and monitoring hook in with the runtime code, so there are no problems.
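A minimal PL/SQL procedure that DB2 9.7 compiles natively (hypothetical EMP table; assumes a database created with the Oracle compatibility vector enabled):
CREATE OR REPLACE PROCEDURE raise_salary(p_empno IN NUMBER, p_pct IN NUMBER)
IS
BEGIN
  UPDATE emp
     SET salary = salary * (1 + p_pct / 100)
   WHERE empno = p_empno;
END;
/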
51. SQL Procedure Language (SQL PL) enhancements Advancements in DB2 PL/SQL
New SQL, stored procedures and triggers Objective: Explain how it is possible for DB2 to actually support PL/SQL!
First we summarized the main problems: (1)semantics and (2) syntax differences between PL/SQL and SQL PL languages.
The first element in the solution is that DB2 introduced NATIVE support to most of the Oracle features that used to be a pain during the porting process. This is to show that DB2 has added many new features to Cobra that close the gap between Oracle and DB2 capabilities. Thus, showing that we didn’t just embedded our MTK and emulate the Oracle features like before.
The second element and final ingredient solves the syntax problem… Even though you could convert your PL/SQL to the SQL PL syntax… why bother? With the PL/SQL compiler you can keep a single source and reduce even more the porting costs of your application and future versions.
In the next slides we’ll explore in more details the PL/SQL support in DB2 along with the new features that make it possible.
PL/SQL made its first appearance in Oracle Forms v3. A few years later, it was included in the Oracle Database server v7 (as database procedures, functions, packages, triggers and anonymous blocks) followed by Oracle Reports v2.
1988: Oracle RDBMS version 6 came out with support for PL/SQL embedded within Oracle Forms v3 (version 6 could not store PL/SQL in the database proper),
1992: Oracle version 7 appeared with support for referential integrity, stored procedures and triggers.
As of DB2 Version 7.2, a subset of SQL PL is supported in SQL functions and trigger bodies. ~2001
52. What Runs Directly in DB2 9.7? We want 90% of the objects to work 100% of the time (no change),
as opposed to 100% of the objects requiring 10% change.
Goal for ISVs: “one source”.
JDBC: Oracle has made extensions to support its proprietary types, e.g. ref cursors or VARRAY.
PACKAGE: not to be confused with a DB2 package; here PACKAGE == MODULE.
53. Concurrency and DB2 9.7 Oracle knows only two isolation levels:
Transaction level: new snapshot taken at the beginning of the transaction
Statement level: new snapshot taken at the beginning of the statement
The vast majority of code is written with statement-level isolation (the default).
So the claim that Oracle gives consistent data is only true for single-statement transactions; otherwise there is a similar risk of inconsistent results as with DB2 RS or CS.
Oracle rollback segments are logged, which means double the logging and I/O for every update, delete, or insert compared with DB2.
“No free lunch.”
54. Concurrency Control in DB2 9.7 Reads the currently committed version of a row
If uncommitted row-change found use currently committed version
Log based
No management overhead
No performance overhead
No wasted memory/storage (no undo tablespace). The assumption is that in most cases an uncommitted change has not yet been written to disk, so the log lookup is a memory lookup.
The page may have been flushed, however, in which case the log record is still close by (but on disk) and only one I/O away.
Note: in Oracle, I/O is needed as well to reassemble the version, perhaps multiple I/Os as the version ages. Also, versioning is at the page level (the rollback segment may need to be looked up even for rows that have not changed).
55. Currently Committed Competitive Advantage Only incur added processing when a reader and writer are working on the same row
No added overhead for a “just in case” collision
With Oracle past images are stored in the undo tablespace just in case there is a collision
DB2 uses existing log infrastructure to retrieve currently committed data in flight
Better performance
Lower overhead
Simplified management
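As a small illustration (the database name SAMPLE is only a placeholder), the currently committed behaviour is governed by the CUR_COMMIT database configuration parameter introduced in DB2 9.7:
db2 update db cfg for SAMPLE using CUR_COMMIT ON
db2 get db cfg for SAMPLE
Newly created 9.7 databases typically have currently committed semantics enabled by default, while databases upgraded from earlier releases generally need the parameter switched on explicitly.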
56. DB2's PL/SQL compatibility is excellent. We’re looking forward to integrating the current dual source code base into a single one. This will increase our development and testing productivity.
The compatibility level that DB2 Cobra achieved is also amazing. Now we can eliminate the differences in our program’s behavior that come from DBMS differences, and this will help us improve the quality of our package. DB2 9.7 Praise for SQL Compatibility
57. DB2 9.7: % of Supported PL/SQL Statements
58. Openbravo enabled with DB2 9.7 Map schema and data types
Native support
Move data
Run your shared code
Native support – NO EMULATION!!
Minor adjustments (a few exceptions)
Test and tuning. It is important to stress that we did NOT incorporate the MTK into our DB2 compiler. Instead, we closed the feature gap between Oracle and DB2 by incorporating the missing features that previously had to be emulated when translating applications to DB2: features like implicit casting, default parameters, anchored data types, etc.
Note that the manual steps are no longer necessary. Since DB2 accepts PL/SQL statements, there is no need to go through the application source code modifying SQL statements from Oracle syntax to DB2 syntax. Two advantages: (1) no modifications necessary to the application itself; (2) no need to maintain two sets of SQL code (simplifying development efforts).
* Of course, just as in older DB2 versions, some code changes are still needed; e.g. in a Java application you must change the connection strings and the JDBC driver in use.
59. Application Migration Evaluation Originally built by Extreme Blue students.
It analyzes procedural SQL in functions, procedures, and triggers, and DDL for tables, types, views, etc.
We use the cross compiler from EDB to examine source files.
The knowledge base examines the cross-compiler output for unsupported features and an HTML report is built. We only know what we learned from the development team and by performing the analysis manually for many ISVs.
The report provides information about each unsupported feature and the line number where it was found, to facilitate migration efforts.
60. Easily Import Other Vendors’ Schemas Easily map schemas and data types from one database to another
Simple drag-and-drop operation
Automatically map schemas and data types
Easy for developers to quickly start using DB2
61. Performance: Stability Plan lock down
Focus on static plan lock down cooperatively between DB2 and Data Studio
Data Studio will provide mechanism to capture dynamic workload and convert to static
ALTER PACKAGE statement to specify ACCESS PLAN REUSE and OPTIMIZATION PROFILE
BIND/REBIND statement to specify ACCESS PLAN REUSE
Optimizer Guideline Enhancements
New guidelines
Dynamic plan lockdown using Data Studio
Tooling focus to capture, modify and install profiles for queries
For BIND
The default is NO.
If YES is specified and there is an existing package, the query compiler will attempt to reuse the access plan for every statement that can be matched with a statement in the new bind file. For a statement to match, the statement text must be identical and the section number (SECTNO column in SYSCAT.STATEMENTS) for the statement in the existing package must match the section number for the statement in the new package.
For REBIND
The default is the value used on the previous BIND or REBIND command or ALTER PACKAGE statement. (The value can be determined by examining the APREUSE column for the package in SYSCAT.PACKAGES: 'N' for NO, 'Y' for YES.)
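A hedged sketch of how these options might be used (the schema, package, and bind-file names are invented):
ALTER PACKAGE myschema.mypkg ACCESS PLAN REUSE YES
db2 bind myapp.bnd apreuse yes
db2 rebind package myschema.mypkg apreuse yes
SELECT pkgschema, pkgname, apreuse FROM syscat.packages WHERE pkgname = 'MYPKG'
The final query simply confirms whether plan reuse is in effect for the package, as described above.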
62. Where is my DB application spending its time? The main problem is that our monitoring tools look only at activity on the data server itself.
However, there are many potential points of failure that we do not monitor at all, so it is extremely difficult for the DBA to prove that the problem is not in the data server.
In today‘s environment, applications no longer run exclusively on the data server system. The network between the data server and the application is sometimes a problem. Some applications also use techniques such as connection pooling and the connection concentrator, which can have a significant impact on the response time experienced, even when the data server itself is not the cause. And of course we also have to address problems caused by the application’s business logic or by overloaded client systems.
Therefore, we really need a solution that shows us all contributors to the response time the application is experiencing.
63. Moving away from System Monitor Begin to move away from system monitor and snapshot technology for database monitoring
Moving towards SQL access direct to internal memory
Continuing the trend of WLM table functions in DB2 9.5
New, parallel monitoring infrastructure introduced which is independent of system monitor infrastructure
i.e. not connected to existing system monitor infrastructure such as monitor switches
Aim is to replace most commonly used database snapshot mechanisms over time
Only a few will be explicitly deprecated in Cobra but alternatives will be provided
Snapshot still needed in future for instance level information
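As an indicative example of this SQL-based interface, DB2 9.7 exposes in-memory metrics through table functions such as MON_GET_WORKLOAD (the column list below is a sketch; the exact names should be confirmed against the documentation):
SELECT workload_name, total_cpu_time, total_wait_time, rows_returned
  FROM TABLE(MON_GET_WORKLOAD(NULL, -2)) AS t
 ORDER BY total_cpu_time DESC
The second argument (-2) requests data from all database partitions/members, and no monitor switches need to be turned on.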
64. “Time Spent” Metrics (example)
65. “Component Time” Metrics (example)
66. Monitoring and Problem Determination Levels of Monitoring
System Level Monitoring
Effort expended by DB2 to process application requests
Bufferpool, Commit, Compilation, Log, Package cache, etc.
Aggregation by Service Subclass, Connection, Workload Definition, Unit of Work
Example: Bufferpool Physical Read Wait Time
Activity Level Monitoring
Subset of metrics from above that measure statement level execution
Aggregation by section in package cache
Database Object Monitoring
Operations performed on particular database objects.
Table, index, buffer pool, tablespace, and container
Most metrics are counters; Example: # of table scans for Table A
Can drill down (e.g. buffer pool => tablespace => container OR table)
67. Peace of Mind with World Class Audit and Security Meet your compliance and governance needs
Understand risks and security issues
Auditing redesigned for fast performance
Easily find out Who, What, Where, When, and How you may have a breach
Reduce risk of unauthorized access
Precise Control
Encryption
Label-Based Access Control (LBAC) 44% of IT professionals cited authorized access as a key security challenge
– Computing Technology Industry Association (CompTIA) study
Top challenge for 43% of CFOs is governance, controls, and risk management
– IBM CFO Survey: Current state & future direction, IBM Business Consulting Services
Role membership is assigned to a user or group, and role privileges are available for use in any and all activities undertaken by the connection user.
The existence of roles allows SQL objects like views, packages, or triggers to be created by users who acquire the needed privileges through roles assigned to them.
Advantages of database roles:
Simplification of the administration of privileges and authorities in the database
Roles are managed inside the database and, unlike groups, are considered by DB2 when creating views, triggers, MQTs, static SQL and SQL routines
Database roles were introduced in DB2 9.5. Role membership is assigned to a user or group and role privileges are available for use in any and all activities undertaken by the connection user. The existence of roles allows SQL objects like views, packages, or triggers to be created by users who acquire the needed privileges through roles assigned to them, and simplifies overall security administration. A key point is that DBAs can now manage DB2 security using database roles without needing root authority for the OS groups.
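A minimal illustration (the role, table, and user names are placeholders):
CREATE ROLE reporting
GRANT SELECT ON TABLE sales.summary TO ROLE reporting
GRANT ROLE reporting TO USER joe
Because the role, unlike an OS group, is known to DB2, JOE can now build views or SQL routines over SALES.SUMMARY using the privilege he holds through the role.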
____
Effective and Efficient Auditing
DB2 gives you control over critical information about who, when, and from where people are accessing DB2
Who: tracking of all connections and authorizations
What: statement text and DDL tracking
Where: application id and TCP/IP of originating request
When: timestamp capture of key events
How: record all authorization checks for database actions
Why: that’s for the auditors to find out…
Identity Assertion
Easily group and track your users through an N-tier architecture
Trusted Context establishes a trust relationship between the database and an external entity such as an application server
This slide describes some of the enhancements made to DB2 9.5 that help it secure your data. The new DB2 9.5 audit facility efficiently and effectively produces all the audit records based on the audit configuration specified to control critical information of who, when, and from where people are accessing DB2. The slide shows examples of what can be audited. The audit facility has also been redesigned to focus on performance and fine grained auditing (FGA). The records generated from this facility are kept in an audit log file. The analysis of these records can reveal usage patterns which would identify system misuse. The audit facility provides the ability to audit at both the instance and the individual database level, independently recording all instance and database level activities with separate logs for each. The security administrator (SECADM) determines what should be audited by creating audit policies and then applying these audit policies at a very granular level to whatever database object they are interested in monitoring.
The new Identity Assertion capability provides a means whereby the end-user identity in a three-tier environment can be easily and efficiently propagated to the database server. It also introduces the concept of a trusted context between a database server and a specific application tier. Trusted context solves two key security problems: (1) the use of a common user ID by application servers when accessing the database and (2) the lack of control over when privileges are applied to a connected user. Identity Assertion can also be used to identify and prevent users coming in from certain locations (for example, a specific IP address).
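As a rough sketch of the policy-based approach (the policy name and category choices are illustrative), the security administrator might define and apply a policy like this:
CREATE AUDIT POLICY db_access_policy CATEGORIES AUDIT STATUS BOTH, CHECKING STATUS FAILURE, SECMAINT STATUS BOTH ERROR TYPE NORMAL
AUDIT DATABASE USING POLICY db_access_policy
The same AUDIT statement can instead be scoped to an individual table, user, group, role, or authority rather than the whole database.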
____
Label Based Access Control (LBAC)
A “label” is associated with both users and data objects
Rules for comparing user and data labels allow access controls to be applied at the row and column level
Labels may consist of multiple components
Hierarchical, group, or tree types
Row labels appear as a single additional column in a protected table, regardless of the number of label components
User labels are granted by a security administrator
DB2 also has Label Based Access Control (LBAC), which was introduced in DB2 9. Label based access control is a powerful form of content-based authorization. It is a flexible implementation of mandatory access control (MAC) and, like other MAC implementations, is based on the use of security labels. These security labels determine access to all data within an LBAC-protected table, and the database manager strictly enforces this.
LBAC is an additional layer of security built directly inside the database management system. It complements – not replaces – the other access control mechanisms of the database (authentication, authorization, etc.). It provides the ability to granularly control access at the individual row and column level, and to have the system enforce a model where no one – not even the DBA – has any inherent privilege to access the data inside the database.
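A condensed, illustrative setup (the component, policy, label, table, and user names are all invented):
CREATE SECURITY LABEL COMPONENT level ARRAY ['TOP SECRET', 'SECRET', 'UNCLASSIFIED']
CREATE SECURITY POLICY doc_policy COMPONENTS level WITH DB2LBACRULES
CREATE SECURITY LABEL doc_policy.secret COMPONENT level 'SECRET'
GRANT SECURITY LABEL doc_policy.secret TO USER joe FOR ALL ACCESS
CREATE TABLE docs (id INT, body VARCHAR(1000), tag DB2SECURITYLABEL) SECURITY POLICY doc_policy
With this in place, rows JOE inserts carry his SECRET label, and rows labelled TOP SECRET are invisible to him.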
68. Security - Separation of Duties Remove implicit DBADM from SYSADM
Remove ability to grant DBADM and SECADM from SYSADM
Allow SECADM to be granted to groups and roles
Allow SECADM to GRANT/REVOKE database and object auth
Set up a DBADM that does not have the capability to grant and revoke privileges or access data
GRANT DBADM ON DATABASE WITHOUT ACCESSCTRL TO USER JOE
GRANT DBADM ON DATABASE WITHOUT DATAACCESS TO USER JOE
Remove secondary grants implicitly done when DBADM granted
BINDADD, CONNECT, CREATETAB, IMPLICIT_SCHEMA, LOAD,…
Introduce new authorities
EXPLAIN, DATAACCESS, ACCESSCTRL, SQLADM, WLMADM authorities
SQLADM authority can perform event monitor commands, holds EXPLAIN privilege, and can execute RUNSTATS
Implicit secondary grants for DBADM:
BINDADD
CONNECT
CREATETAB
CREATE_EXTERNAL_ROUTINE
CREATE_NOT_FENCED_ROUTINE
IMPLICIT_SCHEMA
QUIESCE_CONNECT
LOAD
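To illustrate the new granular authorities listed above (user and role names are placeholders):
GRANT SECADM ON DATABASE TO ROLE security_admins
GRANT SQLADM ON DATABASE TO USER perf_analyst
GRANT EXPLAIN ON DATABASE TO USER app_dev
GRANT WLMADM ON DATABASE TO USER wlm_admin
Combined with the GRANT DBADM ... WITHOUT DATAACCESS / WITHOUT ACCESSCTRL statements shown above, this lets administration, security control, and data access be held by different people.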
69. Robust Operation with High Availability Manage planned and unplanned outages
System maintenance
Local outages
Easy to add
No rewrite of your application
No specialized hardware
Setup in only minutes with graphical wizard
Systems can be in different locations
Ultra-fast (seconds) HADR was first introduced in Informix around 1994, and was incorporated into DB2 in V8.2. The concept is simple: HADR keeps two servers in sync by shipping log entries from the primary to the secondary. If the primary server fails, the standby server is ready to take over and can do so in a very short time – typically 10 to 15 seconds. The primary goals of HADR were to deliver very fast failover and to be easy to administer. We also wanted to ensure the performance impact was minimal and that applications could fail over transparently. HADR also has advantages over other HA clusters like Oracle RAC because HADR supports rolling upgrades, can protect against storage failure, and delivers 100% performance if one node fails (none of these are available with Oracle RAC). Automatic client re-route is part of DB2. The standby server location is stored in the system database directory and loaded into the client upon establishing a database connection. If the primary server fails, a special SQL code is returned which automatically results in the client establishing a connection to the standby server. Working together with HADR, this provides transparent failover to the standby server.
Add High Availability to your Legacy applications
Easy to configure High Availability and Disaster Recovery
Easy to add AFTER you write your application without any coding changes
Setup in minutes – Graphical Wizard
Allows ultra-fast failover (seconds)
Systems can be in different geographic locations
What do I get with HADR?
Ability to manage planned and unplanned outages
Seamless DB2 and system upgrades
Multiple levels of HA to match your customers’ needs
Automatic client re-routing
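A hedged outline of a basic HADR setup from the command line (host names, ports, instance, and database name are placeholders; in practice the standby is seeded from a backup of the primary):
On both servers (with the local/remote values reversed on the standby):
db2 update db cfg for SALES using HADR_LOCAL_HOST hostA HADR_LOCAL_SVC 55001 HADR_REMOTE_HOST hostB HADR_REMOTE_SVC 55002 HADR_REMOTE_INST db2inst2 HADR_SYNCMODE NEARSYNC
On the standby: db2 start hadr on database SALES as standby
On the primary: db2 start hadr on database SALES as primary
To fail over (or fail back): db2 takeover hadr on database SALES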
70. HADR Delivers Fast and Easy Failover HADR provides extremely high availability with failover times measured in seconds. It also has advantages over other clusters like Oracle RAC because HADR supports rolling upgrades, can protect against storage failure and delivers 100% performance if one node fails (none of these are available with Oracle RAC).
The primary server starts on the left. But if it fails, the primary role moves over to the machine on the right automatically. If the left-side machine is repaired, you can start HADR on that server and it will automatically resynchronize itself and become a standby server.
71. Even Better Resource Utilization – In Development HADR provides extremely high availability with failover times measured in seconds. It also has advantages over other clusters like Oracle RAC because HADR supports rolling upgrades, can protect against storage failure and delivers 100% performance if one node fails (none of these are available with Oracle RAC).
The primary server starts on the left. But if it fails, the primary role moves over to the machine on the right automatically. If the left-side machine is repaired, you can start HADR on that server and it will automatically resynchronize itself and become a standby server.
72. Gain Control with DB2 Workload Management Regain Control of your Complex Systems
Adapt to new business workloads without hurting your existing workloads
Let the business tell you what’s critical
Help ensure you meet your Service Level Agreements
DB2 can automatically alter resource allocations and priorities as needs change throughout the day
Set controls based on users and applications
Improve critical responsiveness of heavily used systems
DB2 Workload Management
Deeply embedded in the database engine DB2 9.5 workload manager (WLM) is a resource management and monitoring tool that is built right into the database engine. It is brand new in DB2 9.5. The primary customer benefits include tangible relief in the areas of CPU control, detection and control of rogue queries (limiting excessive, unexpected resource consumption), and monitoring of work on the database. The new WLM architecture is also designed with integration of external WLM products such as AIX Workload Manager in mind, allowing DB2 to take advantage of their capabilities (on platforms where they exist) while improving the end-to-end workload management story. WLM is simple to administer, based on workload identification and then prioritization.
An excellent introductory article on DB2 9.5 workload management is available for download at http://www.ibm.com/developerworks/forums/servlet/JiveServlet/download/1116-166950-13965175-231542/Introduction%20to%20DB2%20workload%20management.pdf
These new workload management capabilities help to reduce the total cost of ownership by ensuring proper resource allocation and utilization, which can help meet service level agreements. DBAs won’t have to worry about someone hogging CPU time with a monster query, and they can be sure that high-priority work gets done first.
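As a small, illustrative configuration (the service class, workload, and application names, and the cost limit, are made up):
CREATE SERVICE CLASS reports
CREATE WORKLOAD report_wl APPLNAME('reportapp') SERVICE CLASS reports
GRANT USAGE ON WORKLOAD report_wl TO PUBLIC
CREATE THRESHOLD runaway_report FOR SERVICE CLASS reports ACTIVITIES ENFORCEMENT DATABASE WHEN ESTIMATEDSQLCOST > 1000000 STOP EXECUTION
Work from reportapp is identified, routed to its own service class, and any statement whose optimizer cost estimate exceeds the limit is stopped before it can hog the system.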
73. Integrated Workload Management
Two thresholds for queuing: (1) when to start queuing, (2) the size of the queue
Thresholds: temporary space, rows returned, elapsed time
Future: aggregate temporary space, rows read, CPU time, locks, log, etc.
74. Priority Aging Concept The basic setup is one where all incoming work comes into a service subclass defined as high priority (using agent priority). In this example, every activity gets to spend its first five seconds of CPU time in the high-priority service class. Then the threshold kicks in and the activity is remapped to the medium-priority service class. And so on.
When work reaches the lowest-priority service class, it could stay there until completion, or you could add an additional threshold on the low-priority class that moves the activity back up to the high-priority class after some other value of CPU time (e.g. 60 seconds) to start the whole cycle over again.
This is just one example. You can do the same thing using the ROWSREAD in Service Class threshold.
The advantage of this approach is a simpler configuration that still introduces a very natural prioritization scheme, biased in favour of short-running queries.
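A rough sketch of the remapping setup described above (the service-class names and numbers are invented, and the exact threshold and clause names should be verified against the DB2 9.7 documentation):
CREATE SERVICE CLASS main
CREATE SERVICE CLASS high UNDER main AGENT PRIORITY -10
CREATE SERVICE CLASS medium UNDER main AGENT PRIORITY 0
CREATE THRESHOLD age_to_medium FOR SERVICE CLASS high UNDER main ACTIVITIES ENFORCEMENT DATABASE PARTITION WHEN CPUTIMEINSC > 5 SECONDS CHECKING EVERY 1 SECOND REMAP ACTIVITY TO medium
A similar threshold on MEDIUM would remap long runners down to a LOW subclass, completing the aging chain.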
75. Increase Collaboration with Integrated Tools Integrated data management tools
IBM InfoSphere Data Architect, IBM Data Studio, IBM Optim solutions
Cover all phases of data lifecycle
Integrated tools allow collaboration across roles
Business analyst, database architect, database administrator, application developer, etc.
Support all major database platforms
IBM DB2, Informix, Oracle Database, SQL Server, etc.
IBM has a grand vision to create a single toolset that can manage every aspect of the data life cycle. This toolset would offer built-in compliance and easy integration with other IBM tools. The first step was taken in October 2007 with the release of Data Studio 1.1. The free download, the first incarnation of the vision, supported DB2 and Informix Dynamic Server and included entity relationship diagramming, an SQL builder, an XML editor, pureQuery for Java, security access tools, data management tools, and other features. Work is underway to broaden the Data Studio family to include all the tools needed in every phase of data's existence.
To understand the benefits of the complete toolset vision, it helps to picture the data life cycle as a whole. As IBM sees it, data goes through five distinct stages during its life: design, development, deployment, management, and governance. In the design stage, business analysts and database architects create the data models that will be used to develop applications and databases that meet the organization's needs. IBM's Data Studio tools for this stage currently include logical modeling and entity relationship diagramming, with physical modeling capabilities coming soon. The next two stages, develop and deploy, are closely linked, and the tasks associated with both stages are frequently performed by people whose titles include the word "developer." Under IBM's framework, the development stage refers specifically to application development, often involving Java, .Net, PHP, Ruby, or COBOL programming. The deploy stage covers database development, usually in SQL or another query language. Manage encompasses the tasks usually performed by a DBA. Those include things like day-to-day administration, configuration, performance tuning, change management, backup and recovery, and so on. In the final stage, govern, security administrators take responsibility both for the security of the organization's data resources and ensuring that the organization complies with all relevant regulations. This is also the area that includes auditing, encryption, archiving, and, ultimately, data destruction once the retention period has elapsed.
More components will be integrated into the various life-cycle stages IBM Data Studio manages in the coming year. Some will be components of the free download, others will be paid features. Here's what features are likely to be added in the near future. Rational Data Architect will become a fully integrated modeling component of Data Studio and will remain a paid feature. The Data Studio Administration Console, which provides monitoring capabilities, will be a component of the free offering and is currently available as a technology preview. Paid features for performance management, change management, query tuning, problem determination using pureQuery, and a test data generator are all likely additions in 2008.
76. DB2 Saves 37% for Retail Grocery retailer; 600+ stores
$10 billion sales; 50,000+ employees
Analytical data warehouse system
Databases 2,052.9 1,435.7
Disk systems 1,870.3 987.8
Tape systems 358.0 269.9
Database servers 1,020.8 532.1
DBA personnel 876.7 640.4
TOTAL ($000) 6,178.7 3,542.1
77. DB2 Saves 34% for Telecoms Wireless & Internet access services; 2 million subscribers
$1.5 billion sales; 5,000+ employees
Billing, CRM, operational systems & data store Databases 4,807.2 2,352.2
Disk systems 6,182.2 3,898.9
Tape systems 487.8 243.9
Database servers 2,146.4 1,589.5
DBA personnel 1,315.1 960.5
TOTAL ($000) 14,938.7 9,045.0
78. DB2 Saves 38% for Financial Services Diversified retail bank; 200+ branches; 5+ million accounts
$30 billion assets; 7,500 employees
Industry standards-based (XML) transactional system Databases 2,574.8 1,341.1
Disk systems 3,792.8 2,174.0
Tape systems 282.3 151.7
Database servers 1,128.7 740.7
DBA personnel 657.5 533.6
TOTAL ($000) 8,436.1 4,941.1
79. Attend IDUG 2009 – Europe (5-9 October, Rome, Italy)
Increase your skills with in-depth technical education
more than 100 technical sessions
Network with hundreds of DB2 professionals, experts, IBM executives and other top industry representatives
80. IBM DB2: Lower Cost Made Easy