1 / 79

DB2 9.7 Technology Update

lot
Download Presentation

DB2 9.7 Technology Update

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


    1. DB2 9.7 Technology Update Salvatore Vella, VP, Development

    2. Sports Examples – More Wins Per Dollar

    3. The Information Challenge

    4. Information On Demand Unlocking the Business Value of Information for Competitive Advantage

    5. Information On Demand End-to-End Capabilities

    6. IBM DB2: Lower Cost Made Easy Additional accolades: Software & Information Industry Association—2007 CODiE Award Winner, Best Database Management Solution Rapid development – also includes easier application porting (more of this is coming in 2009) Optimized for SOA: pureXML™ for seamless and efficient integration of XML and relational data Improved performance for dynamic warehousing and OLTP requirements Industry’s Highest Level of Scalability & Availability: Unmatched availability for superior information access Near-linear scalability for mission critical SAP, ERP and CRM requirements Manage Risk and Streamline Compliance: Unparalleled security capabilities for greater control and business flexibility Increased auditing capability and efficiency Additional accolades: Software & Information Industry Association—2007 CODiE Award Winner, Best Database Management Solution Rapid development – also includes easier application porting (more of this is coming in 2009) Optimized for SOA: pureXML™ for seamless and efficient integration of XML and relational data Improved performance for dynamic warehousing and OLTP requirements Industry’s Highest Level of Scalability & Availability: Unmatched availability for superior information access Near-linear scalability for mission critical SAP, ERP and CRM requirements Manage Risk and Streamline Compliance: Unparalleled security capabilities for greater control and business flexibility Increased auditing capability and efficiency

    7. Themes Lowest Cost Best performance with most efficient utilization of available resources Ongoing Flexibility Allow for continuous and flexible change management Service Level Confidence Expand your critical workloads confidently and cost effectively XML Insight Harness the business value of XML Break Free with DB2 Use the database server that gives you the freedom to choose Balanced Warehouse Create table ready warehouse appliance with proven high performance

    8. Lower Storage Costs with Deep Compression Best in industry Minimize storage costs Improve performance Easy to implement DB2 9.5 has industry leading data compression technology that is transparent to DBAs, users, and applications. We call it “Deep Compression”. DB2 automatically builds a compression dictionary, and uses it so that the most frequently repeated values in data rows are compressed. These repeating values can span columns, which makes the compression even more effective. The way it works is that these repeating patterns within the table are replaced by a smaller symbol, and the real values are then stored in the compression dictionary. In addition to reducing disk storage space, it can also lead to disk I/O savings since more rows are accessed per I/O operation which in turn leads to faster query performance. Deep Compression can also enable more data to be cached in the buffer pool, thereby leading to increased bufferpool hit ratios which again can improve performance. Customers such as SunTrust have reported compression rates over 80% which as you can imagine resulted in significant storage cost savings! In the SunTrust case the initial savings were more than $2 million which is a great example of how DB2 can lower your total cost of ownership! DB2 9.5 has industry leading data compression technology that is transparent to DBAs, users, and applications. We call it “Deep Compression”. DB2 automatically builds a compression dictionary, and uses it so that the most frequently repeated values in data rows are compressed. These repeating values can span columns, which makes the compression even more effective. The way it works is that these repeating patterns within the table are replaced by a smaller symbol, and the real values are then stored in the compression dictionary. In addition to reducing disk storage space, it can also lead to disk I/O savings since more rows are accessed per I/O operation which in turn leads to faster query performance. Deep Compression can also enable more data to be cached in the buffer pool, thereby leading to increased bufferpool hit ratios which again can improve performance. Customers such as SunTrust have reported compression rates over 80% which as you can imagine resulted in significant storage cost savings! In the SunTrust case the initial savings were more than $2 million which is a great example of how DB2 can lower your total cost of ownership!

    9. Automated Compression for All Form of Ingest Not only does DB2 compression save you storage space and deliver better performance, with DB2 9.5 you also get a completely automated solution. All that is required from a database administrator is to indicate to DB2 which tables in the database should be considered “compression eligible” and DB2 will do the rest. As data is added to these tables (either through insert, update, load, or import operations), DB2 will monitor the size of the table. If and when a table exceeds a given threshold size (as determined by DB2) a compression dictionary will be created automatically on the fly and any new data added to the table will be automatically compressed.Not only does DB2 compression save you storage space and deliver better performance, with DB2 9.5 you also get a completely automated solution. All that is required from a database administrator is to indicate to DB2 which tables in the database should be considered “compression eligible” and DB2 will do the rest. As data is added to these tables (either through insert, update, load, or import operations), DB2 will monitor the size of the table. If and when a table exceeds a given threshold size (as determined by DB2) a compression dictionary will be created automatically on the fly and any new data added to the table will be automatically compressed.

    10. Features currently in development: Multiple algorithms for automatic index compression Automatic compression for temporary tables Intelligent compression of large objects and XML Of course DB2 is not standing still in the area of storage savings either. Storage costs continue to be a concern for customers and so the upcoming release of DB2 codenamed Cobra will deliver even more storage savings. Two examples of this are in the area of index compression and TEMP table compression. In DB2 Cobra, you will not only be able to compress tables but DB2 will also provide automatic compression for indexes as well. In fact there are a number of ways to compress indexes based on the type of index and the data distribution within that index. DB2 will be able to examine the index data and automatically chose from a set of compression algorithms that provides an optimal compression for the index. Similarly DB2 will automatically compress temporary data. This can be a huge savings for data warehouse applications where large sorts and large intermediate result sets can consume a significant amount of storage. In DB2 Cobra, temporary tables used for sorting and intermediate query results will automatically be compressed if they consume a significant amount of storage thus saving you even more on your storage costs.Of course DB2 is not standing still in the area of storage savings either. Storage costs continue to be a concern for customers and so the upcoming release of DB2 codenamed Cobra will deliver even more storage savings. Two examples of this are in the area of index compression and TEMP table compression. In DB2 Cobra, you will not only be able to compress tables but DB2 will also provide automatic compression for indexes as well. In fact there are a number of ways to compress indexes based on the type of index and the data distribution within that index. DB2 will be able to examine the index data and automatically chose from a set of compression algorithms that provides an optimal compression for the index. Similarly DB2 will automatically compress temporary data. This can be a huge savings for data warehouse applications where large sorts and large intermediate result sets can consume a significant amount of storage. In DB2 Cobra, temporary tables used for sorting and intermediate query results will automatically be compressed if they consume a significant amount of storage thus saving you even more on your storage costs.

    11. Index Compression Algorithms implemented by the Database Engine (under-the-covers): RID List Compression, Prefix Compression, Variable Slot Directory Activated: When row compression is activated on a table CREATE INDEX with the new “COMPRESS YES” option ALTER INDEX COMPRESS [YES|NO] statement, followed by an index reorg Savings ADMIN_GET_INDEX_COMPRESS_INFO to estimate compression savings for uncompressed index COMPRESS and PCTPAGESSAVED in the SYSINDEXES catalog show if an index is defined as compressed and the percentage saved respectively Typical Savings: 35% - 55% Plans for a compression estimator to be used on pre-Db2 9.7 databases DB2 will use common prefix compression and RID list compression for index leaf pages, which represent the majority of the index space. A major benefit of compressing leaf pages is that there are less frequent page splits and smaller index levels. Because the number of index levels is reduced, an index search starting from the root page requires fewer I/O corresponding to the number of index levels. Similarly an index scan requires fewer I/O corresponding to the total number of leaf pages. Furthermore, more index keys can be cached in the buffer pool. The buffer pool hit ratios increase and the index page physical reads and writes decrease. On the other hand, since the index keys are now compressed, we will need extra CPU to uncompress the index keys back to the uncompressed format before returning the result back to the users, or to compress the index key to the compressed format before updating an index page.DB2 will use common prefix compression and RID list compression for index leaf pages, which represent the majority of the index space. A major benefit of compressing leaf pages is that there are less frequent page splits and smaller index levels. Because the number of index levels is reduced, an index search starting from the root page requires fewer I/O corresponding to the number of index levels. Similarly an index scan requires fewer I/O corresponding to the total number of leaf pages. Furthermore, more index keys can be cached in the buffer pool. The buffer pool hit ratios increase and the index page physical reads and writes decrease. On the other hand, since the index keys are now compressed, we will need extra CPU to uncompress the index keys back to the uncompressed format before returning the result back to the users, or to compress the index key to the compressed format before updating an index page.

    12. Index Compression: Measurements

    13. Compression Savings from SAP supplied tables

    14. Temp Table Compression Compression of temporary tables aims to: Reduce the amount of temporary disk space required Have no performance penalty as a result of the extra processing required for row compression. If there is an increase in query performance, that is an added side benefit Typical up to 40% reduction Applicable to User temporary tables and System temps (DGTT/CGTT) Sorts, HSJN, MGJN,NLJN, utilities, … If Deep Compression is licensed, then temporary tables will be compressed by default There is no additional action required by the user in order to use it. DB2 will evaluate the query and apply compression where appropriate. db2pd will report on temp tablespace usage

    15. Temp Compression: Measurements Test Details Analysis was done on 11/99 TPC-DS Queries affected by temp compression TPC-DS is the complex query workload with most temp activityTest Details Analysis was done on 11/99 TPC-DS Queries affected by temp compression TPC-DS is the complex query workload with most temp activity

    16. XML (XDA) Compression In addition to XML Inlining Approach will be very similar to that of Table Compression Static dictionary approach Classic/’Offline’ reorg table based ADC applicable No online dictionary support via DB2 INSPECT Enablement is via the table COMPRESS attribute i.e. piggyback data table setting – no specific compress setting for XML alone

    17. Inlining If a table possesses LOB (or XML) data that can be inlined, there are considerable benefits with respect to performance and storage Performance Whenever LOBs are inserted/retrieved, a disk I/O cost is incurred each time since this data is not buffered (unlike with base table data) When inlined, this I/O cost is reduced since this data get buffered along with the base table data they are inlined with Storage Storage allocated to the storage object is reduced by inlining XML/LOB data in the base table (though base table storage increases) Inlining small XML/LOBs can result in a noticeable decrease in net total storage since the decrease in storage size is greater than the increase in base table storage size XML/LOBs inlined within the base table can be compressed Why inline? There are considerable benefits with respect to performance and storage (LOB==XML) Unlike base table data, LOB data is not buffered This translates to going to disk each time the LOB object is accessed Insert LOB data, write to disk Query for LOB data, read from disk For inlined LOBs, the LOBs essentially become a part of the base table data and therefore become subject to buffering, reducing the I/O costs Storage-wise, the amount of space allocated for the LOB object decreases, though the storage for the base table increases by a similar amount However, depending on the size of your LOBs, the amount of storage decrease in the LOB object may be greater than the increase in base table storage size This would result in a net total storage savings Likely to happen for a table with small LOBs The likelihood of this behaviour diminishes with larger and larger LOBs For inlined LOBs, the data essentially becomes base table data and can therefore be compressed for storage savings (potentially – depends on the nature of the LOB data)Why inline? There are considerable benefits with respect to performance and storage (LOB==XML) Unlike base table data, LOB data is not buffered This translates to going to disk each time the LOB object is accessed Insert LOB data, write to disk Query for LOB data, read from disk For inlined LOBs, the LOBs essentially become a part of the base table data and therefore become subject to buffering, reducing the I/O costs Storage-wise, the amount of space allocated for the LOB object decreases, though the storage for the base table increases by a similar amount However, depending on the size of your LOBs, the amount of storage decrease in the LOB object may be greater than the increase in base table storage size This would result in a net total storage savings Likely to happen for a table with small LOBs The likelihood of this behaviour diminishes with larger and larger LOBs For inlined LOBs, the data essentially becomes base table data and can therefore be compressed for storage savings (potentially – depends on the nature of the LOB data)

    18. Index/TEMP Compression vs. Oracle and Microsoft Oracle and Microsoft support only prefix compression With Oracle you have to specify the number of columns to prefix CREATE INDEX ix1 ON t1(c1,c2,c3) COMPRESS 2; DB2 is the only vendor to do RID list compression DB2 is the only vendor to offer Temp Table compression DB2 is the only vendor to deliver compression on inlined XML and LOB data

    20. Schema Evolution: Example – updating a view definition Scenario: A table is decomposed into two tables A view (V1) must be updated accordingly: DROP VIEW V1 RESTRICT CREATE VIEW V1 AS (SELECT … FROM T1, T2 …) Another view (V2) references V1 Problem: The database will prevent the drop of V1 because it would make V2 invalid, even though the second statement would make V2 valid again.

    21. Schema Evolution: The solution Relax the object dependency model Allow changes that affect dependent objects to succeed Automatically revalidate dependent objects ALTER TABLE, ALTER COLUMN, DROP COLUMN, RENAME COLUMN CREATE OR REPLACE ALIAS, FUNCTION, NICKNAME, PROCEDURE, TRIGGER, VARIABLE, VIEW DROP FUNCTION, NICKNAME, PROCEDURE, SEQUENCE, TABLE, TRIGGER, TYPE, VARIABLE, VIEW, TABLE Extend to support RENAME COLUMN Support CREATE OR REPLACE syntax for views, functions, triggers, etc. Allow additional data type changes via ALTER COLUMN Between any types SQL runtime can cast

    22. Online Table Move ADMIN_MOVE_TABLE Move data in an existing table to a new table object Source table remains online: both read (select) and write (IUD) operations Final phase renames the tables: target table will have the original table name Use Cases Online table compression Online REORG or Redistribute Online conversion to LARGE tablespaces Move data/index/long data to new/different tablespaces Support for limited schema evolution: Add or remove columns, change column datatypes Add/change MDC dimensions, range partitioning or partitioning key

    23. Local (aka partitioned) indexes Can have both RID based local index as well as local MDC block indexCan have both RID based local index as well as local MDC block index

    24. Maximize availability during maintenance DPn = Data Partition nDPn = Data Partition n

    25. Maximize availability during maintenance DPn = Data Partition nDPn = Data Partition n

    26. Transportable Schema – In Development Efficient schema movement between databases Transport schema from a backup image Performance objective – 100 GB in under 20 minutes Restore will now do multiple operations Restore the syscatspace and specified table spaces from the backup image Roll them forward to a consistency point Validate the schemas specified Transfer ownership of the specified table spaces to the target DB Recreate the schema in the targetDB A small staging database consisting of the system catalog table space from the backup image will be created in the DFTDBPATH and dropped once the restore utility completes. This staging database is required to extract the DDL, to regenerate the objects that reside in the table space being transported. If an online backup image is used then the staging database will be rolled forward to the end of the backup. If an offline backup image is used then no rollforward processing will be performed The table space / schema set being transported must consist of self contained transport sets or the restore will fail. Once the schemas have been validated, the relevant system catalog entries will be created in the target DB which describes the objects in the table spaces that were transported. Upon completion of the schema recreation, the target DB will take ownership of the physical table space containers. STAGE IN Specifies the name of a temporary database to be used for the rollforward recovery of the backup image being transported from. If this parameter is specified the temporary staging database will not be automatically dropped, but is no longer required after the transport has completed and can be dropped by the DBA. If this parameter is not specified: DB2 will automatically generate a unique database name of the form 'SYSTGxxx' where 'xxx' is an integer value. A database with this name will be created and used during the transport operation. The temporary staging database will be dropped upon completion of the transport operation. A small staging database consisting of the system catalog table space from the backup image will be created in the DFTDBPATH and dropped once the restore utility completes. This staging database is required to extract the DDL, to regenerate the objects that reside in the table space being transported. If an online backup image is used then the staging database will be rolled forward to the end of the backup. If an offline backup image is used then no rollforward processing will be performed The table space / schema set being transported must consist of self contained transport sets or the restore will fail. Once the schemas have been validated, the relevant system catalog entries will be created in the target DB which describes the objects in the table spaces that were transported. Upon completion of the schema recreation, the target DB will take ownership of the physical table space containers. STAGE IN Specifies the name of a temporary database to be used for the rollforward recovery of the backup image being transported from. If this parameter is specified the temporary staging database will not be automatically dropped, but is no longer required after the transport has completed and can be dropped by the DBA. If this parameter is not specified: DB2 will automatically generate a unique database name of the form 'SYSTGxxx' where 'xxx' is an integer value. A database with this name will be created and used during the transport operation. The temporary staging database will be dropped upon completion of the transport operation.

    27. Lightning Fast for Transactions www.tpc.org as of 12/12/08 DB2 currently is 50% faster than Oracle on the tpc-c transcational benchmark http://www.tpc.org/tpcc/results/tpcc_perf_results.aspwww.tpc.org as of 12/12/08 DB2 currently is 50% faster than Oracle on the tpc-c transcational benchmark http://www.tpc.org/tpcc/results/tpcc_perf_results.asp

    28. Intel x86 World Record – 1.2M TPM-C

    29. Lightning Fast for Business Intelligence www.tpc.org as of 12/12/08 DB2 currently is 50% faster than Oracle on the tpc-c transcational benchmark http://www.tpc.org/tpcc/results/tpcc_perf_results.aspwww.tpc.org as of 12/12/08 DB2 currently is 50% faster than Oracle on the tpc-c transcational benchmark http://www.tpc.org/tpcc/results/tpcc_perf_results.asp

    30. XML Transaction Processing DB2 9.7 on Intel Xeon 5570 delivers Outstanding out-of-the-box performance TPoX benchmark results Default db2 registry, db or dbm configuration changes Excellent performance scalability of 1.78x from Xeon 5400 to Xeon 5500 series (2-socket, quad-core) Performance per Watt improves by 1.52x Hardware cconfiguration: Harpertown (Intel Xeon 5400, 3.16GHz, 2 sockets, eight cores, 32GB RAM) Nehalem (Intel Xeon 5500, 2.93 GHz, 2 socket, eight cores, 48GB RAM, hyperthreading enabled, turbo-mode enabled) Both used the following storage configuration: 1 Internal Drive + 120 Disk RAID0 Setup contains the DB2 database, 15 Disk RAID0 for logging 30 Disk RAID0 for raw XML data The software configurations was: DB2 9.7, s090430 SLES 10 SP2 (2.6.16.60-0.21-smp) TPoX 1.2 MX 10000 SF (1TB) Default DB2 environment variables Open Source Benchmark: TPoX (Transaction Processing over XML Data) http://tpox.sourceforge.net/ 1TB raw data, 360M XML Documents 70% Queries, 30% Insert/Update/Delete Intel® Xeon® Processor 5400 Series and Intel® Xeon® Processor 5500 Series are trademarks of Intel Corporation in the U.S. and other countries. Performance tests and ratings are measured using specific computer systems and/or components and reflect the approximate performance of Intel products as measured by those tests.  Any difference in system hardware or software design or configuration may affect actual performance.  Buyers should consult other sources of information to evaluate the performance of systems or components they are considering purchasing.  For more information on performance tests and on the performance of Intel products, visit Intel Performance Benchmark Limitations at http://www.intel.com/performance/resources/limits.htm. Hardware cconfiguration: Harpertown (Intel Xeon 5400, 3.16GHz, 2 sockets, eight cores, 32GB RAM) Nehalem (Intel Xeon 5500, 2.93 GHz, 2 socket, eight cores, 48GB RAM, hyperthreading enabled, turbo-mode enabled) Both used the following storage configuration: 1 Internal Drive + 120 Disk RAID0 Setup contains the DB2 database, 15 Disk RAID0 for logging 30 Disk RAID0 for raw XML data The software configurations was: DB2 9.7, s090430 SLES 10 SP2 (2.6.16.60-0.21-smp) TPoX 1.2 MX 10000 SF (1TB) Default DB2 environment variables Open Source Benchmark: TPoX (Transaction Processing over XML Data) http://tpox.sourceforge.net/ 1TB raw data, 360M XML Documents 70% Queries, 30% Insert/Update/Delete Intel® Xeon® Processor 5400 Series and Intel® Xeon® Processor 5500 Series are trademarks of Intel Corporation in the U.S. and other countries. Performance tests and ratings are measured using specific computer systems and/or components and reflect the approximate performance of Intel products as measured by those tests.  Any difference in system hardware or software design or configuration may affect actual performance.  Buyers should consult other sources of information to evaluate the performance of systems or components they are considering purchasing.  For more information on performance tests and on the performance of Intel products, visit Intel Performance Benchmark Limitations at http://www.intel.com/performance/resources/limits.htm.

    31. DB2 9.7 Delivers Even Faster OLTP with Statement Concentrator DB2 9.7 Optionally replace literals with parameter markers Increases section sharing and reduces compilation Reduces number of statements to be compiled

    32. Multiple Scanners pre DB2 9.7

    33. Multiple Scanners with DB2 9.7

    34. Scan Sharing Performance Test TPCH Q1 : CPU Intensive, Slow Query On Lineitem Table Using A Table Scan TPCH Q6 : IO Intensive, Fast Query On Lineitem Table Using A Table Scan Test Scenario : Queries executed in parallel in the following sequence Results : 34% Improvement In End to End Timing

    35. Lower Storage Costs with Deep Compression Best in industry Minimize storage costs Improve performance Easy to implement DB2 9.5 has industry leading data compression technology that is transparent to DBAs, users, and applications. We call it “Deep Compression”. DB2 automatically builds a compression dictionary, and uses it so that the most frequently repeated values in data rows are compressed. These repeating values can span columns, which makes the compression even more effective. The way it works is that these repeating patterns within the table are replaced by a smaller symbol, and the real values are then stored in the compression dictionary. In addition to reducing disk storage space, it can also lead to disk I/O savings since more rows are accessed per I/O operation which in turn leads to faster query performance. Deep Compression can also enable more data to be cached in the buffer pool, thereby leading to increased bufferpool hit ratios which again can improve performance. Customers such as SunTrust have reported compression rates over 80% which as you can imagine resulted in significant storage cost savings! In the SunTrust case the initial savings were more than $2 million which is a great example of how DB2 can lower your total cost of ownership! DB2 9.5 has industry leading data compression technology that is transparent to DBAs, users, and applications. We call it “Deep Compression”. DB2 automatically builds a compression dictionary, and uses it so that the most frequently repeated values in data rows are compressed. These repeating values can span columns, which makes the compression even more effective. The way it works is that these repeating patterns within the table are replaced by a smaller symbol, and the real values are then stored in the compression dictionary. In addition to reducing disk storage space, it can also lead to disk I/O savings since more rows are accessed per I/O operation which in turn leads to faster query performance. Deep Compression can also enable more data to be cached in the buffer pool, thereby leading to increased bufferpool hit ratios which again can improve performance. Customers such as SunTrust have reported compression rates over 80% which as you can imagine resulted in significant storage cost savings! In the SunTrust case the initial savings were more than $2 million which is a great example of how DB2 can lower your total cost of ownership!

    36. Automated Compression for All Form of Ingest Not only does DB2 compression save you storage space and deliver better performance, with DB2 9.5 you also get a completely automated solution. All that is required from a database administrator is to indicate to DB2 which tables in the database should be considered “compression eligible” and DB2 will do the rest. As data is added to these tables (either through insert, update, load, or import operations), DB2 will monitor the size of the table. If and when a table exceeds a given threshold size (as determined by DB2) a compression dictionary will be created automatically on the fly and any new data added to the table will be automatically compressed.Not only does DB2 compression save you storage space and deliver better performance, with DB2 9.5 you also get a completely automated solution. All that is required from a database administrator is to indicate to DB2 which tables in the database should be considered “compression eligible” and DB2 will do the rest. As data is added to these tables (either through insert, update, load, or import operations), DB2 will monitor the size of the table. If and when a table exceeds a given threshold size (as determined by DB2) a compression dictionary will be created automatically on the fly and any new data added to the table will be automatically compressed.

    37. IBM InfoSphere™ Balanced Warehouse Pre-built, Pre-packaged Warehouse Appliance Offering Reduces Risk Standard Pre-tested Configuration Ongoing Supported Package Predictable performance

    38. InfoSphere Warehouse is fully certified and ready for production use of Solid State Drives (SSD) Solid State Drives enable break through IO performance for transactional and data warehousing environments DB2’s Deep Compression further enhances the applicability of SSD flash DB2 optimizes for SSD devices with its industry-leading cost-based optimizer for improved query performance InfoSphere Balanced Warehouse exploiting flash in the near future

    39. IBM InfoSphere WarehouseTM - Powered by DB2 Addressing the needs for Dynamic Warehousing InfoSphere Warehouse Design Studio – Development environment for warehousing solutions Physical data modeling, DB2 SQL-based warehouse construction, OLAP cube modeling, and Data mining modeling Reporting and Data Mining on the single version of the truth Modeling, Visualization, and Scoring Support for Discovery and Predictive mining, and Multiple mining methods New analytical algorithms and wizard based data preparation for modeling Cubing Services for OLAP High Performanance 64-bit ROLAP solution with unlimited scalability Enhanced cubing model and cube security Integrated MDX/XMLA provider Supports maintream MDX client tools. ex: Excel Cognos 8 integration SQL Warehousing Tool Integrated DB2-based environment for life cycle of warehouse building and maintenance activities Rapid deployment of data movement processes Unstructured Data Analysis Extract Structured data and correlate with Structured data to gain business insight - Reporting and data mining on a single version of the truth - Design Studio Mining Editor - From Design to embedded applications - - Seamless integration of data modeling, SQL, mining in a common Data Warehouse project - Data exploration components give you an overview on the content and quality of your data. - Interactive flow editor: Stepwise execution. All SQL operators can be reused in the mining context - Two types of mining in InfoSphere Warehouse - Discovery Mining - Finding patterns in data that may be used to guide decisions. - Predictive Mining- Using known results to create models that can predict value InfoSphere Warehouse mining methods - Associations, sequences, Clustering, Classification, regression, ... Modeling. The data mining process starts with historical data being gathered and put through a series of mathematical functions to derive business rules such as: "Customers buying Gouda cheese are likely to also buy mixed nuts 18% of the time." The business rules are collected together into a Model. A model can have a few rule or tens of thousands of rules. Visualization. The business rules must be analyzed and verified by a business analyst to ensure the rules are accurate. IBM offers a unique Visualization tool to assist analysts in evaluating the business rules. Scoring. The verified business rules can be applied to new data to determine the appropriate predicted outcome. For example, a new bank transaction enters the system and the fraud detection rules are applied against the data. The rules will predict the probability that the record is fraudulent. This process of applying the business rules is called Scoring. Scoring in real time allows businesses to catch fraudulent records faster, segment new customers and offer them better service, and detect defects quicker. InfoSphere Warehouse Cubing Services - ROLAP Accelerator - Relational data represented dimensionally with planned query optimization - Provides highest data volumes, dimensions & user communities - Ideal for IT enterprise rollouts founded on InfoSphere Warehouse where volumes & scalability are more critical than query performance consistency - High Performance ROLAP solution - Near real time data access with high speed performance - 64bit - Unlimited scalability - Support many users with OLAP queries from many data sources - Integrated IT Tooling - Lower cost of ownership bringing design, query, and performance together - Cognos 8 integration with cubing services - XMLA Provider Support MDX access to cubing services - ie: Excel, Cognos, Alphablox, etc.. The Design Studio tooling enables you to create, edit, import, export, and deploy OLAP models over the relational warehouse schema. Easy to use wizards provide optimization recommendations to dramatically improve the performance of OLAP applications Integrated Data Management Environment with IBM Data Studio - Develop - logical data modeling, physical data modeling, Java Development, Routine Development, Query tuning, etc.. - Protect and Govern - Data masking, data access management, data encryption, etc.. - Deploy - Change Management - Archive - Data archiving, disposal Administer - Database Administration, Data movement and change, Object management, perfrmance management, etc.. Example of new analytical algoritms is Mining time series analysis - Reporting and data mining on a single version of the truth - Design Studio Mining Editor - From Design to embedded applications - - Seamless integration of data modeling, SQL, mining in a common Data Warehouse project - Data exploration components give you an overview on the content and quality of your data. - Interactive flow editor: Stepwise execution. All SQL operators can be reused in the mining context - Two types of mining in InfoSphere Warehouse - Discovery Mining - Finding patterns in data that may be used to guide decisions. - Predictive Mining- Using known results to create models that can predict value InfoSphere Warehouse mining methods - Associations, sequences, Clustering, Classification, regression, ... Modeling. The data mining process starts with historical data being gathered and put through a series of mathematical functions to derive business rules such as: "Customers buying Gouda cheese are likely to also buy mixed nuts 18% of the time." The business rules are collected together into a Model. A model can have a few rule or tens of thousands of rules. Visualization. The business rules must be analyzed and verified by a business analyst to ensure the rules are accurate. IBM offers a unique Visualization tool to assist analysts in evaluating the business rules. Scoring. The verified business rules can be applied to new data to determine the appropriate predicted outcome. For example, a new bank transaction enters the system and the fraud detection rules are applied against the data. The rules will predict the probability that the record is fraudulent. This process of applying the business rules is called Scoring. Scoring in real time allows businesses to catch fraudulent records faster, segment new customers and offer them better service, and detect defects quicker. InfoSphere Warehouse Cubing Services - ROLAP Accelerator - Relational data represented dimensionally with planned query optimization - Provides highest data volumes, dimensions & user communities - Ideal for IT enterprise rollouts founded on InfoSphere Warehouse where volumes & scalability are more critical than query performance consistency - High Performance ROLAP solution - Near real time data access with high speed performance - 64bit - Unlimited scalability - Support many users with OLAP queries from many data sources - Integrated IT Tooling - Lower cost of ownership bringing design, query, and performance together - Cognos 8 integration with cubing services - XMLA Provider Support MDX access to cubing services - ie: Excel, Cognos, Alphablox, etc.. The Design Studio tooling enables you to create, edit, import, export, and deploy OLAP models over the relational warehouse schema. Easy to use wizards provide optimization recommendations to dramatically improve the performance of OLAP applications Integrated Data Management Environment with IBM Data Studio - Develop - logical data modeling, physical data modeling, Java Development, Routine Development, Query tuning, etc.. - Protect and Govern - Data masking, data access management, data encryption, etc.. - Deploy - Change Management - Archive - Data archiving, disposal Administer - Database Administration, Data movement and change, Object management, perfrmance management, etc.. Example of new analytical algoritms is Mining time series analysis

    40. Lightning Fast & Optimized for SAP Better Performance with DB2 Average 40% better performance World record SAP benchmark, outperforming Oracle by 18% and needing only half the processing power! Lower TCO with DB2 Average 40% Storage Savings Average 25% DBA Labor Savings Deep sales and development partnership leading to great results 33% Growth in SAP clients choosing DB2 since 2006 >100 Customers migrated from Oracle to DB2 in past 12 months <Remove slide if customer does not use SAP><Remove slide if customer does not use SAP>

    41. Embrace XML with Industry-Leading XML Storage Easy development and integration No data transformations for XML No complex schemas No parsing upon retrieval Efficient storage Requires only 440GB of raw storage for 1TB XML benchmark Untouchable performance 6,763 XML transactions per second for 1TB XML benchmark Many relational database vendors in today’s competitive marketplace claim to provide “native” XML storage engines, yet in most cases, the XML data is either pushed into a large object (LOB) or shredded into a table. DB2 9.5 is different, it truly does store XML data in its native format and for that reason we call it pureXML. By combining pureXML together with the familiar relational model, DB2 9.5 can be considered a hybrid database and supports a new emerging model for developers. The XML developer will see a sophisticated XML repository that also supports SQL, while the SQL developer will see a sophisticated relational database that also supports XML. The best of both worlds! Furthermore, when compared to standalone XML storage solutions, DB2 provides advantages in terms of scalability and robustness with mature database services such as backup, recovery, and indexing. DB2 also added XQuery and extensions to SQL capabilities to efficiently search and update documents; and XML specific indexing to ensure high performance for those queries. Applications can leverage these facilities to manage and access their XML data efficiently along with any existing relational data using either SQL or XQuery as the query interface depending on their application needs and preferences. DB2 is clearly the leader in this space, enabling rapid development of new types of applications that can combine XML and relational data.Many relational database vendors in today’s competitive marketplace claim to provide “native” XML storage engines, yet in most cases, the XML data is either pushed into a large object (LOB) or shredded into a table. DB2 9.5 is different, it truly does store XML data in its native format and for that reason we call it pureXML. By combining pureXML together with the familiar relational model, DB2 9.5 can be considered a hybrid database and supports a new emerging model for developers. The XML developer will see a sophisticated XML repository that also supports SQL, while the SQL developer will see a sophisticated relational database that also supports XML. The best of both worlds! Furthermore, when compared to standalone XML storage solutions, DB2 provides advantages in terms of scalability and robustness with mature database services such as backup, recovery, and indexing. DB2 also added XQuery and extensions to SQL capabilities to efficiently search and update documents; and XML specific indexing to ensure high performance for those queries. Applications can leverage these facilities to manage and access their XML data efficiently along with any existing relational data using either SQL or XQuery as the query interface depending on their application needs and preferences. DB2 is clearly the leader in this space, enabling rapid development of new types of applications that can combine XML and relational data.

    42. XML on DPF: Scalability Each query run in 2 or 3 equivalent variants: Completely relational (“rel”) Completely XML (“xml”) XML extraction/predicates with relational joins (“xmlrel”) (join queries only) XML SCALES AS WELL AS RELATIONAL Test Environment: DPF Star-schema database with XML, 33M row fact table, 4 and 8 partitions on 12-way AIX machine. Two fact tables (one 33M rows, one smaller), 5 dimensions. Largest dimension 1M rows. All tables with XML. Test database is the same size in both cases -> expect a maximum of 2x speedup from 4 to 8 partitions; possibly more/less due to plan changes Two query sets: Simple: Single-table queries and updates, 2-table joins (left chart) Complex: Multi-table data warehouse queries (right chart) Three “flavors” of each query: (two if nonjoin) Relational XML, including XML joins XML, but with relational joins (recommended for DPF to take advantage of co-location) Speedup Results (Optimally, queries run 2x faster from 4 to 8 partitions) Simple query speedup: Relational avg = 1.77x, XMLrel avg = 1.70x Complex query speedup: XML avg= 1.67, XMLrel avg = 1.9. [Speedup influenced by differing execution plans] Examples of the simple queries: -- count on indexed column/element -- rel select count(*) from senglert.daily_sales where prodkey < 2000; -- xml select count(*) from senglert.daily_sales where xmlexists ('$SALDOC/fact/keys/prodkey[.< 2000]'); -- grouped aggregation -- rel select distinct(tax), count(*) from daily_sales where perkey = 1996001 group by tax; -- xml select distinct(sx.tax), count(*) from daily_sales s, xmltable('$SALDOC/fact/measures/tax' columns tax integer path '.') as sx where perkey = 1996001 group by sx.tax; -- colocated join (colocation does not matter in XML) -- rel SELECT p.category_desc, SUM(s.quantity_sold), SUM(s.shelf_cost), SUM(s.bin_cost_pct_of_sale) FROM senglert.PRODUCT p, senglert.DAILY_SALES s where p.category < 30 and s.perkey = 1996001 and p.PRODKEY=s.PRODKEY GROUP BY p.category_desc; -- xml SELECT px.category_desc, SUM(sx.quantity_sold), SUM(sx.shelf_cost), SUM(sx.bin_cost_pct_of_sale) FROM senglert.PRODUCT p, senglert.DAILY_SALES s, xmltable('$SALDOC/fact[keys/perkey = 1996001]/measures' columns quantity_sold integer path 'quantity_sold', shelf_cost decimal(9,2) path 'shelf_cost', bin_cost_pct_of_sale decimal(9,2) path 'bin_cost_pct_of_sale') as sx, xmltable('$PRDDOC/product[category < 30]/category_desc' columns category_desc char(70) path '.') as px where xmlexists('$SALDOC/fact[keys/prodkey/xs:double(.)=$PRDDOC/product/prodkey/xs:double(.)]') GROUP BY px.category_desc; -- xmlrel SELECT px.category_desc, SUM(sx.quantity_sold), SUM(sx.shelf_cost), SUM(sx.bin_cost_pct_of_sale) FROM senglert.PRODUCT p, senglert.DAILY_SALES s, xmltable('$SALDOC/fact[keys/perkey=1996001]/measures' columns quantity_sold integer path 'quantity_sold', shelf_cost decimal(9,2) path 'shelf_cost', bin_cost_pct_of_sale decimal(9,2) path 'bin_cost_pct_of_sale' ) as sx, xmltable('$PRDDOC/product[category < 30]/category_desc' columns category_desc char(70) path '.' ) as px where p.PRODKEY=s.PRODKEY GROUP BY px.category_desc; Examples of the complex queries: -- rel SELECT ITEM_DESC, SUM(QUANTITY_SOLD), SUM(SHELF_COST), SUM(BIN_COST_PCT_OF_SALE) FROM PERIOD per, DAILY_SALES s, PRODUCT prd, STORE st, PROMOTION pro WHERE per.PERKEY=s.PERKEY AND prd.PRODKEY=s.PRODKEY AND st.STOREKEY=s.STOREKEY AND pro.PROMOKEY=s.PROMOKEY AND CALENDAR_DATE BETWEEN '01/01/1996' AND '01/04/1996' AND STORE_NUMBER IN ('02', '03') AND PROMODESC IN ('Advertisement', 'Coupon', 'Manager''s Special', 'Overstocked Items') AND CATEGORY=42 GROUP BY ITEM_DESC; -- xml SELECT px.item_desc, SUM(sx.quantity_sold), SUM(sx.shelf_cost), SUM(sx.bin_cost_pct_of_sale) FROM PERIOD per, DAILY_SALES s, PRODUCT prd, STORE st, PROMOTION pro, xmltable('$SALDOC/fact/measures' columns quantity_sold integer path 'quantity_sold', shelf_cost decimal(9,2) path 'shelf_cost', bin_cost_pct_of_sale decimal(9,2) path 'bin_cost_pct_of_sale' ) as sx, xmltable('$PRDDOC/product[category=42]/item_desc' columns item_desc char(30) path '.') as px where xmlexists('$PERDOC/period/calendar_date[. >= xs:date("1996-01-01") and . <= xs:date("1996-01-04")]') and xmlexists('$STODOC/store[store_number = "02" or store_number = "03"]') and xmlexists('$PRODOC/promotion[promodesc = "Advertisement" or promodesc = "Coupon" or promodesc = "Manager''s Special" or promodesc = "Overstocked Items"]' ) and xmlexists('$SALDOC/fact/keys[perkey/xs:double(.) = $PERDOC/period/perkey/xs:double(.)]') and xmlexists('$SALDOC/fact/keys[prodkey/xs:double(.) = PRDDOC/product/prodkey/xs:double(.)]') and xmlexists('$SALDOC/fact/keys[storekey/xs:double(.) = $STODOC/store/storekey/xs:double(.)]') and xmlexists('$SALDOC/fact/keys[promokey/xs:double(.) = $PRODOC/promotion/promokey/xs:double(.)]') GROUP BY px.ITEM_DESC; --xmlrel SELECT px.item_desc, SUM(sx.quantity_sold), SUM(sx.shelf_cost), SUM(sx.bin_cost_pct_of_sale) FROM PERIOD per, DAILY_SALES s, PRODUCT prd, STORE st, PROMOTION pro, xmltable('$SALDOC/fact/measures' columns quantity_sold integer path 'quantity_sold', shelf_cost decimal(9,2) path 'shelf_cost', bin_cost_pct_of_sale decimal(9,2) path 'bin_cost_pct_of_sale' ) as sx, xmltable('$PRDDOC/product[category=42]/item_desc' columns item_desc char(30) path '.' ) as px where xmlexists('$PERDOC/period/calendar_date[. >= xs:date("1996-01-01") and . <= xs:date("1996-01-04")]') and xmlexists('$STODOC/store[store_number = "02" or store_number = "03" ]') and xmlexists('$PRODOC/promotion[promodesc = "Advertisement" or promodesc = "Coupon" or promodesc = "Manager''s Special" or promodesc = "Overstocked Items"]' ) and per.PERKEY=s.PERKEY AND prd.PRODKEY=s.PRODKEY AND st.STOREKEY=s.STOREKEY AND pro.PROMOKEY=s.PROMOKEY GROUP BY px.ITEM_DESC; Data collected by Susanne Englert Test Environment: DPF Star-schema database with XML, 33M row fact table, 4 and 8 partitions on 12-way AIX machine. Two fact tables (one 33M rows, one smaller), 5 dimensions. Largest dimension 1M rows. All tables with XML. Test database is the same size in both cases -> expect a maximum of 2x speedup from 4 to 8 partitions; possibly more/less due to plan changes Two query sets: Simple: Single-table queries and updates, 2-table joins (left chart) Complex: Multi-table data warehouse queries (right chart) Three “flavors” of each query: (two if nonjoin) Relational XML, including XML joins XML, but with relational joins (recommended for DPF to take advantage of co-location) Speedup Results (Optimally, queries run 2x faster from 4 to 8 partitions) Simple query speedup: Relational avg = 1.77x, XMLrel avg = 1.70x Complex query speedup: XML avg= 1.67, XMLrel avg = 1.9. [Speedup influenced by differing execution plans] Examples of the simple queries: -- count on indexed column/element -- rel select count(*) from senglert.daily_sales where prodkey < 2000; -- xml select count(*) from senglert.daily_sales where xmlexists ('$SALDOC/fact/keys/prodkey[.< 2000]'); -- grouped aggregation -- rel select distinct(tax), count(*) from daily_sales where perkey = 1996001 group by tax; -- xml select distinct(sx.tax), count(*) from daily_sales s, xmltable('$SALDOC/fact/measures/tax' columns tax integer path '.') as sx where perkey = 1996001 group by sx.tax; -- colocated join (colocation does not matter in XML) -- rel SELECT p.category_desc, SUM(s.quantity_sold), SUM(s.shelf_cost), SUM(s.bin_cost_pct_of_sale) FROM senglert.PRODUCT p, senglert.DAILY_SALES s where p.category < 30 and s.perkey = 1996001 and p.PRODKEY=s.PRODKEY GROUP BY p.category_desc; -- xml SELECT px.category_desc, SUM(sx.quantity_sold), SUM(sx.shelf_cost), SUM(sx.bin_cost_pct_of_sale) FROM senglert.PRODUCT p, senglert.DAILY_SALES s, xmltable('$SALDOC/fact[keys/perkey = 1996001]/measures' columns quantity_sold integer path 'quantity_sold', shelf_cost decimal(9,2) path 'shelf_cost', bin_cost_pct_of_sale decimal(9,2) path 'bin_cost_pct_of_sale') as sx, xmltable('$PRDDOC/product[category < 30]/category_desc' columns category_desc char(70) path '.') as px where xmlexists('$SALDOC/fact[keys/prodkey/xs:double(.)=$PRDDOC/product/prodkey/xs:double(.)]') GROUP BY px.category_desc; -- xmlrel SELECT px.category_desc, SUM(sx.quantity_sold), SUM(sx.shelf_cost), SUM(sx.bin_cost_pct_of_sale) FROM senglert.PRODUCT p, senglert.DAILY_SALES s, xmltable('$SALDOC/fact[keys/perkey=1996001]/measures' columns quantity_sold integer path 'quantity_sold', shelf_cost decimal(9,2) path 'shelf_cost', bin_cost_pct_of_sale decimal(9,2) path 'bin_cost_pct_of_sale' ) as sx, xmltable('$PRDDOC/product[category < 30]/category_desc' columns category_desc char(70) path '.' ) as px where p.PRODKEY=s.PRODKEY GROUP BY px.category_desc; Examples of the complex queries: -- rel SELECT ITEM_DESC, SUM(QUANTITY_SOLD), SUM(SHELF_COST), SUM(BIN_COST_PCT_OF_SALE) FROM PERIOD per, DAILY_SALES s, PRODUCT prd, STORE st, PROMOTION pro WHERE per.PERKEY=s.PERKEY AND prd.PRODKEY=s.PRODKEY AND st.STOREKEY=s.STOREKEY AND pro.PROMOKEY=s.PROMOKEY AND CALENDAR_DATE BETWEEN '01/01/1996' AND '01/04/1996' AND STORE_NUMBER IN ('02', '03') AND PROMODESC IN ('Advertisement', 'Coupon', 'Manager''s Special', 'Overstocked Items') AND CATEGORY=42 GROUP BY ITEM_DESC; -- xml SELECT px.item_desc, SUM(sx.quantity_sold), SUM(sx.shelf_cost), SUM(sx.bin_cost_pct_of_sale) FROM PERIOD per, DAILY_SALES s, PRODUCT prd, STORE st, PROMOTION pro, xmltable('$SALDOC/fact/measures' columns quantity_sold integer path 'quantity_sold', shelf_cost decimal(9,2) path 'shelf_cost', bin_cost_pct_of_sale decimal(9,2) path 'bin_cost_pct_of_sale' ) as sx, xmltable('$PRDDOC/product[category=42]/item_desc' columns item_desc char(30) path '.') as px where xmlexists('$PERDOC/period/calendar_date[. >= xs:date("1996-01-01") and . <= xs:date("1996-01-04")]') and xmlexists('$STODOC/store[store_number = "02" or store_number = "03"]') and xmlexists('$PRODOC/promotion[promodesc = "Advertisement" or promodesc = "Coupon" or promodesc = "Manager''s Special" or promodesc = "Overstocked Items"]' ) and xmlexists('$SALDOC/fact/keys[perkey/xs:double(.) = $PERDOC/period/perkey/xs:double(.)]') and xmlexists('$SALDOC/fact/keys[prodkey/xs:double(.) = PRDDOC/product/prodkey/xs:double(.)]') and xmlexists('$SALDOC/fact/keys[storekey/xs:double(.) = $STODOC/store/storekey/xs:double(.)]') and xmlexists('$SALDOC/fact/keys[promokey/xs:double(.) = $PRODOC/promotion/promokey/xs:double(.)]') GROUP BY px.ITEM_DESC; --xmlrel SELECT px.item_desc, SUM(sx.quantity_sold), SUM(sx.shelf_cost), SUM(sx.bin_cost_pct_of_sale) FROM PERIOD per, DAILY_SALES s, PRODUCT prd, STORE st, PROMOTION pro, xmltable('$SALDOC/fact/measures' columns quantity_sold integer path 'quantity_sold', shelf_cost decimal(9,2) path 'shelf_cost', bin_cost_pct_of_sale decimal(9,2) path 'bin_cost_pct_of_sale' ) as sx, xmltable('$PRDDOC/product[category=42]/item_desc' columns item_desc char(30) path '.' ) as px where xmlexists('$PERDOC/period/calendar_date[. >= xs:date("1996-01-01") and . <= xs:date("1996-01-04")]') and xmlexists('$STODOC/store[store_number = "02" or store_number = "03" ]') and xmlexists('$PRODOC/promotion[promodesc = "Advertisement" or promodesc = "Coupon" or promodesc = "Manager''s Special" or promodesc = "Overstocked Items"]' ) and per.PERKEY=s.PERKEY AND prd.PRODKEY=s.PRODKEY AND st.STOREKEY=s.STOREKEY AND pro.PROMOKEY=s.PROMOKEY GROUP BY px.ITEM_DESC; Data collected by Susanne Englert

    43. Real-time data access challenge for businesses Businesses ranging from telecom providers to financial services to web retailers are differentiating themselves by delivering new applications to offer their customers new advanced services To ensure the highest quality of service, these businesses must have extremely fast data access and continuous data availability In-memory database software can be a vital technology for any application that must instantly and reliably supply data while ensuring data consistency Businesses ranging from telecom providers to financial servicesto web retailers are differentiating themselves by delivering new applications to offer their customers new advanced services To ensure the highest quality of service, these businesses must have extremely fast data access and continuous data availability In-memory database software can be a vital technology for any application that must instantly and reliably supply data while ensuring data consistency

    44. Announcing IBM solidDB Universal Cache Our Universal cache is now a leading product in the industry, since it is the first to provide in-memory caching technology that accelerates access to data in most industry leading on disk databases, increasing the performance of these up to ten times. We have the extreme speed of our in memory technology, and we are able to achieve extremely high throughput and extremely low response time. In terms of adaptability, solidDB cache uses the familiar SQL language, it has many options for schema mapping and data transformation and the performance scales both vertically and horizontally. solidDB is ACID compliant, and so it preserves data durability. In addition it has a powerful HA implementation that provides subsecond failover. We’ll discuss all these things in more detail in the next slides.Our Universal cache is now a leading product in the industry, since it is the first to provide in-memory caching technology that accelerates access to data in most industry leading on disk databases, increasing the performance of these up to ten times. We have the extreme speed of our in memory technology, and we are able to achieve extremely high throughput and extremely low response time. In terms of adaptability, solidDB cache uses the familiar SQL language, it has many options for schema mapping and data transformation and the performance scales both vertically and horizontally. solidDB is ACID compliant, and so it preserves data durability. In addition it has a powerful HA implementation that provides subsecond failover. We’ll discuss all these things in more detail in the next slides.

    45. DB2 Virtual Appliances simplify a complex world

    46. DB2 9.7: Use Industry SQL Skills with DB2 DB2 now supports other vendor’s SQL statements, including PL/SQL Easy for your developers to use DB2 Easy for your applications to use DB2 DB2 supports: SQL PL SQL/PSM PL/SQL Packages Built-in packages JDBC SQL*Plus scripts

    47. New Features… Support other vendor’s SQL ? Easy for developers to query DB2 Native execution of all SQL statements ? Fast performance for queries Easily import other vendor’s schemas ? Easy for developers to set up DB2 Support other vendor’s concurrency models ? Easy for developers to use DB2 Support flexible data typing ? Easy for developers to work with DB2 And more…

    48. Babylonian Confusion (aka Lock-In)

    50. PL/SQL in DB2 9.7 Built in PL/SQL native compiler Source level debugging and profiling Both compiles part of core DB2 engine. Produce equivalent quality runtime code. No preference. Tooling/Monitoring hooks in with run time code. So no problems. Both compiles part of core DB2 engine. Produce equivalent quality runtime code. No preference. Tooling/Monitoring hooks in with run time code. So no problems.

    51. SQL Procedure Language (SQL PL) enhancements Advancements in DB2 PL/SQL New SQL, stored procedures and triggers Objective: Explain how is it possible for DB2 to actually support PL/SQL! First we summarized the main problems: (1)semantics and (2) syntax differences between PL/SQL and SQL PL languages. The first element in the solution is that DB2 introduced NATIVE support to most of the Oracle features that used to be a pain during the porting process. This is to show that DB2 has added many new features to Cobra that close the gap between Oracle and DB2 capabilities. Thus, showing that we didn’t just embedded our MTK and emulate the Oracle features like before. The second element and final ingredient solves the syntax problem… Even though you could convert your PL/SQL to the SQL PL syntax… why bother? With the PL/SQL compiler you can keep a single source and reduce even more the porting costs of your application and future versions. In the next slides we’ll explore in more details the PL/SQL support in DB2 along with the new features that make it possible. PL/SQL made its first appearance in Oracle Forms v3. A few years later, it was included in the Oracle Database server v7 (as database procedures, functions, packages, triggers and anonymous blocks) followed by Oracle Reports v2. 1988: Oracle RDBMS version 6 came out with support for PL/SQL embedded within Oracle Forms v3 (version 6 could not store PL/SQL in the database proper), 1992: Oracle version 7 appeared with support for referential integrity, stored procedures and triggers. As of DB2 Version 7.2, a subset of SQL PL is supported in SQL functions and trigger bodies. ~2001 Objective: Explain how is it possible for DB2 to actually support PL/SQL! First we summarized the main problems: (1)semantics and (2) syntax differences between PL/SQL and SQL PL languages. The first element in the solution is that DB2 introduced NATIVE support to most of the Oracle features that used to be a pain during the porting process. This is to show that DB2 has added many new features to Cobra that close the gap between Oracle and DB2 capabilities. Thus, showing that we didn’t just embedded our MTK and emulate the Oracle features like before. The second element and final ingredient solves the syntax problem… Even though you could convert your PL/SQL to the SQL PL syntax… why bother? With the PL/SQL compiler you can keep a single source and reduce even more the porting costs of your application and future versions. In the next slides we’ll explore in more details the PL/SQL support in DB2 along with the new features that make it possible. PL/SQL made its first appearance in Oracle Forms v3. A few years later, it was included in the Oracle Database server v7 (as database procedures, functions, packages, triggers and anonymous blocks) followed by Oracle Reports v2. 1988: Oracle RDBMS version 6 came out with support for PL/SQL embedded within Oracle Forms v3 (version 6 could not store PL/SQL in the database proper), 1992: Oracle version 7 appeared with support for referential integrity, stored procedures and triggers. As of DB2 Version 7.2, a subset of SQL PL is supported in SQL functions and trigger bodies. ~2001

    52. What Runs Directly in DB2 9.7? Want to have 90% of the objects work 100% of the time (no change). (as opposed of 100% of the objects require 10% of change) Goal for ISV: “One source” JDBC: Oracle has made extensions to support their proprietary types. E.g. ref-cursors or VARRAY PACKGAE: Not to be confused with DB2 PACKGAGE. PACKAGE == MODULE Want to have 90% of the objects work 100% of the time (no change). (as opposed of 100% of the objects require 10% of change) Goal for ISV: “One source” JDBC: Oracle has made extensions to support their proprietary types. E.g. ref-cursors or VARRAY PACKGAE: Not to be confused with DB2 PACKGAGE. PACKAGE == MODULE

    53. Concurrency and DB2 9.7 Oracle knows only two isolation levels: Transaction level: New snapshot taken at beg of TX Statement level: New snapshot taken at beg of stmt Vast majority of code written in Statement Level ISO (DEFAULT). So statement that Oracle gives consistent data only true for single statement transaction. Otherwise similar risk at inconsistent results as DB2 RS or CS. Oracle rollback segments are logged => Double the logging and IO for every UDI as compared to DB2 “No free lunch” Oracle knows only two isolation levels: Transaction level: New snapshot taken at beg of TX Statement level: New snapshot taken at beg of stmt Vast majority of code written in Statement Level ISO (DEFAULT). So statement that Oracle gives consistent data only true for single statement transaction. Otherwise similar risk at inconsistent results as DB2 RS or CS. Oracle rollback segments are logged => Double the logging and IO for every UDI as compared to DB2 “No free lunch”

    54. Concurrency Control in DB2 9.7 Reads the currently committed version of a row If uncommitted row-change found use currently committed version Log based No management overhead No performance overhead No wasted memory/storage (no undo tablespace) Assumption is that in most cases an uncommitted change has not been written to disk yet => Log lookup is a memory lookup. May have been flushed however in which case log record is still close (but on disk) and only 1 I/O away. Note: In Oracle I/O needs to be done as well to reassemble version. Perhaps multiple as version ages. Also versioning is at page level (May need to look up rollback segment for rows that have not changed!) Assumption is that in most cases an uncommitted change has not been written to disk yet => Log lookup is a memory lookup. May have been flushed however in which case log record is still close (but on disk) and only 1 I/O away. Note: In Oracle I/O needs to be done as well to reassemble version. Perhaps multiple as version ages.Also versioning is at page level (May need to look up rollback segment for rows that have not changed!)

    55. Currently Committed Competitive Advantage Only incur added processing when a reader and writer are working on the same row No added overhead for a “just in case” collision With Oracle past images are stored in the undo tablespace just in case there is a collision DB2 uses existing log infrastructure to retrieve currently committed data in flight Better performance Lower overhead Simplified management

    56. DB2's PL/SQL compatibility is excellent. We’re looking forward to integrating the current dual source code base into a single one. This will increase our development and testing productivity. The compatibility level that DB2 Cobra achieved is also amazing. Now we can stop our program’s different behaviors, which comes from DBMS’s differences, and this will help us improve the quality of our package. DB2 9.7 Praise for SQL Compatibility

    57. DB2 9.7: % of Supported PL/SQL Statements

    58. Openbravo enabled with DB2 9.7 Map schema and data types Native support Move data Run your shared code Native support – NO EMULATION!! Minor adjustments (Some few exceptions) Test and Tuning Important to focus that we did NOT incorporate the MTK in our DB2 compiler…. Instead, we closed the features gap between Oracle and DB2 by incorporating the missing features, which had to be emulate when translating applications to DB2. Features like: implicit casting, default parameters, anchored data types, etc. Note the manual steps are not necessary anymore. Since DB2 accepts PL/SQL statements, there will be no need to go through the application source code and modifying the SQL statements from Oracle syntax to DB2 syntax. 2 advantages: #1 no modifications necessary to the application itself; #2 no need to maintain 2 sets of SQL code (simplify development efforts) * Of course, just like in older DB2 versions, there is still need for some code changes, eg in a Java application, the need to change connection strings and the JDBC driver in use, for instance. Important to focus that we did NOT incorporate the MTK in our DB2 compiler…. Instead, we closed the features gap between Oracle and DB2 by incorporating the missing features, which had to be emulate when translating applications to DB2. Features like: implicit casting, default parameters, anchored data types, etc. Note the manual steps are not necessary anymore. Since DB2 accepts PL/SQL statements, there will be no need to go through the application source code and modifying the SQL statements from Oracle syntax to DB2 syntax. 2 advantages: #1 no modifications necessary to the application itself; #2 no need to maintain 2 sets of SQL code (simplify development efforts) * Of course, just like in older DB2 versions, there is still need for some code changes, eg in a Java application, the need to change connection strings and the JDBC driver in use, for instance.

    59. Application Migration Evaluation Originally built by Extreme Blue students. Procedural SQL in functions, procedures, and triggers, ddl for tables, types, views, etc. We use the cross compiler from EDB to examine source files. The knowledgebase examines the cross compiler output for unsupported features and an html report is built. We only know what we learned from the development team and by performing the analysis manually for many ISV’s. The report provides information about the unsupported feature, the line number it was found to facilitate migration efforts. Originally built by Extreme Blue students. Procedural SQL in functions, procedures, and triggers, ddl for tables, types, views, etc. We use the cross compiler from EDB to examine source files. The knowledgebase examines the cross compiler output for unsupported features and an html report is built. We only know what we learned from the development team and by performing the analysis manually for many ISV’s. The report provides information about the unsupported feature, the line number it was found to facilitate migration efforts.

    60. Easily Import Other Vendor’s Schemas Easily map schemas and data types from one database to another Simple drag-and-drop operation Automatically map schemas and data types Easy for developers to quickly start using DB2

    61. Performance: Stability Plan lock down Focus on static plan lock down cooperatively between DB2 and Data Studio Data Studio will provide mechanism to capture dynamic workload and convert to static ALTER PACKAGE statement to specify ACCESS PLAN REUSE and OPTIMIZATION PROFILE BIND/REBIND statement to specify ACCESS PLAN REUSE Optimizer Guideline Enhancements New guidelines Dynamic plan lockdown using Data Studio Tooling focus to capture, modify and install profiles for queries For BIND The default is NO. If YES is specified and there is an existing package, then the query compiler will attempt to reuse the access plan for every statement that can be matched with a statement in the new bind file. For a statement to match the statement text must be identical and the section number (SECTNO column in SYSCAT.STATEMENTS) for the statement in the existing package must match what the section number will be for the statement in the new package. For REBIND The default is the value used on the previous BIND or REBIND command or ALTER PACKAGE statement. (The value can be determined by examining the APREUSE column for the package in SYSCAT.PACKAGES – ‘N’ for NO, “Y’ for YES).For BIND The default is NO. If YES is specified and there is an existing package, then the query compiler will attempt to reuse the access plan for every statement that can be matched with a statement in the new bind file. For a statement to match the statement text must be identical and the section number (SECTNO column in SYSCAT.STATEMENTS) for the statement in the existing package must match what the section number will be for the statement in the new package. For REBIND The default is the value used on the previous BIND or REBIND command or ALTER PACKAGE statement. (The value can be determined by examining the APREUSE column for the package in SYSCAT.PACKAGES – ‘N’ for NO, “Y’ for YES).

    62. Where is my DB application spending its time? The main problem is that our monitoring tools are only looking at the activity on the data server itself. However, there are many potential points of failure, which we do not monitor at all. It is extremely difficult for the DBA to preclude that it is not a data server problem. In today‘s environment, applications are no longer running exclusively on the data server system. The network between data server and application is sometimes a problem. Also some applications have introduced techniques such as connection pooling and connection concentrator, which can have a significant impact on the experienced response time – even if it is not caused by the data server itself. And of course we also have to address problems which are caused by the applications business logic, or by overloaded client systems. Therefore, we really need a solution which shows us all contributors to the response time the application is experiencing.The main problem is that our monitoring tools are only looking at the activity on the data server itself. However, there are many potential points of failure, which we do not monitor at all. It is extremely difficult for the DBA to preclude that it is not a data server problem. In today‘s environment, applications are no longer running exclusively on the data server system. The network between data server and application is sometimes a problem. Also some applications have introduced techniques such as connection pooling and connection concentrator, which can have a significant impact on the experienced response time – even if it is not caused by the data server itself. And of course we also have to address problems which are caused by the applications business logic, or by overloaded client systems. Therefore, we really need a solution which shows us all contributors to the response time the application is experiencing.

    63. Moving away from System Monitor Begin to move away from system monitor and snapshot technology for database monitoring Moving towards SQL access direct to internal memory Continuing the trend of WLM table functions in DB2 9.5 New, parallel monitoring infrastructure introduced which is independent of system monitor infrastructure i.e. not connected to existing system monitor infrastructure such as monitor switches Aim is to replace most commonly used database snapshot mechanisms over time Only a few will be explicitly deprecated in Cobra but alternatives will be provided Snapshot still needed in future for instance level information

    64. “Time Spent” Metrics (example)

    65. “Component Time” Metrics (example)

    66. Monitoring and Problem Determination Levels of Monitoring System Level Monitoring Effort expended by DB2 to process application requests Bufferpool, Commit, Compilation, Log, Package cache, etc. Aggregation by Service Subclass, Connection, Workload Definition, Unit of Work Example: Bufferpool Physical Read Wait Time Activity Level Monitoring Subset of metrics from above that measure statement level execution Aggregation by section in package cache Database Object Monitoring Operations performed on particular database objects. Table, index, buffer pool, tablespace, and container Most metrics are counters; Example: # of table scans for Table A Can drill down (e.g. buffer pool => tablespace => container OR table)

    67. Peace of Mind with World Class Audit and Security Meet your compliance and governance needs Understand risks and security issues Auditing redesigned for fast performance Easily find out Who, What, Where, When, and How you may have a breach Reduce risk of unauthorized access Precise Control Encryption Label-Based Access Control (LBAC) 44% of IT cited authorized access as key security challenge ? Computing Technology Industry Association (CompTIA) study Top challenge for 43% of CFOs is governance, controls, and risk management ? IBM CFO Survey: Current state & future direction, IBM Business Consulting Services Role membership is assigned to a user or group and role privileges are available for use in any and all activities under taken by the connection user. The existence of roles will allow for SQL objects like views, packages, or triggers to be created by users who acquire the needed privileges through roles assigned to them. Advantages of database roles Simplification of the administration of privileges and authorities in the database Roles are managed inside the database and, unlike groups, are considered by DB2 when creating views, triggers, MQTs, static SQL and SQL routines Database roles have also been introduced in DB2 9.5. Role membership is assigned to a user or group and role privileges are available for use in any and all activities undertaken by the connection user. The existence of roles will allow for SQL objects like views, packages, or triggers to be created by users who acquire the needed privileges through roles assigned to them and will simplify overall security administration. A key point is that DBAs can now manage DB2 security using database roles without needing ROOT authority for the OS Groups. ____ Effective and Efficient Auditing DB2 gives you control over critical information of who, when and from where are people accessing DB2 Who: tracking of all connections and authorizations What: statement text and DDL tracking Where: application id and TCP/IP of originating request When: timestamp capture of key events How: record all authorization checks for database actions Why: that’s for the auditors to find out… Identity Assertion Easily group and track your users through an N-tier architecture Trusted Context establishes a trust relationship between the database and an external entity such as an application server This slide describes some of the enhancements made to DB2 9.5 that help it secure your data. The new DB2 9.5 audit facility efficiently and effectively produces all the audit records based on the audit configuration specified to control critical information of who, when, and from where people are accessing DB2. The slide shows examples of what can be audited. The audit facility has also been redesigned to focus on performance and fine grained auditing (FGA). The records generated from this facility are kept in an audit log file. The analysis of these records can reveal usage patterns which would identify system misuse. The audit facility provides the ability to audit at both the instance and the individual database level, independently recording all instance and database level activities with separate logs for each. The security administrator (SECADM) determines what should be audited by creating audit policies and then applying these audit policies at a very granular level to whatever database object they are interested in monitoring. The new Identify Assertion capability provides a means whereby the end-user identity in a three-tier environment can be easily and efficiently propagated to the database server. It also introduces the concept of a trusted context between a database server and a specific application tier. Trusted context solves two key security problems: (1) the use of a common user ID by application servers when accessing the database and (2) the lack of control on when privileges can be applied to a connected user. Identity Assertion can also be used to identify and prevent users from coming in from certain locations (for example, a specific IP address). ____ Label Based Access Control (LBAC) A “label” is associated with both users and data objects Rules for comparing users and data labels allows access controls to be applied at the row and column level Labels may consist of multiple components Hierarchical, group, or tree types Row labels appear as a single additional column in a protected table, regardless of the number of label components User labels are granted by a security administrator DB2 also has Label Based Access Control (LBAC) which was introduced in DB2 9. Label based access control is powerful form of content based authorization. It is a flexible implementation of Mandatory access control (MAC) and similar to other MAC implementations, is based on the use of security labels. These security labels determines access to all data within an LBAC protected table, and the database manager enforces strictly enforces this. LBAC is an additional layer of security built directly inside the database management system. It complements – not replaces – the other access control mechanisms of the database ( authentication, authorization, etc.). It provides the ability to granularly control access at the individual row and column level, and to have the system enforce a model where no one – not even the DBA – has any inherent privileges to access the data inside the database. 44% of IT cited authorized access as key security challenge ? Computing Technology Industry Association (CompTIA) study Top challenge for 43% of CFOs is governance, controls, and risk management ? IBM CFO Survey: Current state & future direction, IBM Business Consulting Services Role membership is assigned to a user or group and role privileges are available for use in any and all activities under taken by the connection user. The existence of roles will allow for SQL objects like views, packages, or triggers to be created by users who acquire the needed privileges through roles assigned to them. Advantages of database roles Simplification of the administration of privileges and authorities in the database Roles are managed inside the database and, unlike groups, are considered by DB2 when creating views, triggers, MQTs, static SQL and SQL routines Database roles have also been introduced in DB2 9.5. Role membership is assigned to a user or group and role privileges are available for use in any and all activities undertaken by the connection user. The existence of roles will allow for SQL objects like views, packages, or triggers to be created by users who acquire the needed privileges through roles assigned to them and will simplify overall security administration. A key point is that DBAs can now manage DB2 security using database roles without needing ROOT authority for the OS Groups. ____ Effective and Efficient Auditing DB2 gives you control over critical information of who, when and from where are people accessing DB2 Who: tracking of all connections and authorizations What: statement text and DDL tracking Where: application id and TCP/IP of originating request When: timestamp capture of key events How: record all authorization checks for database actions Why: that’s for the auditors to find out… Identity Assertion Easily group and track your users through an N-tier architecture Trusted Context establishes a trust relationship between the database and an external entity such as an application server This slide describes some of the enhancements made to DB2 9.5 that help it secure your data. The new DB2 9.5 audit facility efficiently and effectively produces all the audit records based on the audit configuration specified to control critical information of who, when, and from where people are accessing DB2. The slide shows examples of what can be audited. The audit facility has also been redesigned to focus on performance and fine grained auditing (FGA). The records generated from this facility are kept in an audit log file. The analysis of these records can reveal usage patterns which would identify system misuse. The audit facility provides the ability to audit at both the instance and the individual database level, independently recording all instance and database level activities with separate logs for each. The security administrator (SECADM) determines what should be audited by creating audit policies and then applying these audit policies at a very granular level to whatever database object they are interested in monitoring. The new Identify Assertion capability provides a means whereby the end-user identity in a three-tier environment can be easily and efficiently propagated to the database server. It also introduces the concept of a trusted context between a database server and a specific application tier. Trusted context solves two key security problems: (1) the use of a common user ID by application servers when accessing the database and (2) the lack of control on when privileges can be applied to a connected user. Identity Assertion can also be used to identify and prevent users from coming in from certain locations (for example, a specific IP address). ____ Label Based Access Control (LBAC) A “label” is associated with both users and data objects Rules for comparing users and data labels allows access controls to be applied at the row and column level Labels may consist of multiple components Hierarchical, group, or tree types Row labels appear as a single additional column in a protected table, regardless of the number of label components User labels are granted by a security administrator DB2 also has Label Based Access Control (LBAC) which was introduced in DB2 9. Label based access control is powerful form of content based authorization. It is a flexible implementation of Mandatory access control (MAC) and similar to other MAC implementations, is based on the use of security labels. These security labels determines access to all data within an LBAC protected table, and the database manager enforces strictly enforces this. LBAC is an additional layer of security built directly inside the database management system. It complements – not replaces – the other access control mechanisms of the database ( authentication, authorization, etc.). It provides the ability to granularly control access at the individual row and column level, and to have the system enforce a model where no one – not even the DBA – has any inherent privileges to access the data inside the database.

    68. Security - Separation of Duties Remove implicit DBADM from SYSADM Remove ability to grant DBADM and SECADM from SYSADM Allow SECADM to be granted to groups and roles Allow SECADM to GRANT/REVOKE database and object auth Setup up a DBADM that does not have the capability to grant and revoke privileges or access data GRANT DBADM ON DATABASE WITHOUT ACCESSCTRL TO USER JOE GRANT DBADM ON DATABASE WITHOUT DATAACCESS TO USER JOE Remove secondary grants implicitly done when DBADM granted BINDADD, CONNECT, CREATETAB, IMPLICIT_SCHEMA, LOAD,… Introduce new authorities EXPLAIN, DATAACCESS, ACCESSCTRL, SQLADM, WLMADM authorities SQLADM authority can perform event monitor commands, holds EXPLAIN privilege, and can execute RUNSTATS Implicit secondary grants for DBADM: BINDADD CONNECT CREATETAB CREATE_EXTERNAL_ROUTINE CREATE_NOT_FENCED_ROUTINE IMPLICIT_SCHEMA QUIESCE_CONNECT LOAD Implicit secondary grants for DBADM: BINDADD CONNECT CREATETAB CREATE_EXTERNAL_ROUTINE CREATE_NOT_FENCED_ROUTINE IMPLICIT_SCHEMA QUIESCE_CONNECT LOAD

    69. Robust Operation with High Availability Manage planned and unplanned outages System maintenance Local outages Easy to add No rewrite of your application No specialized hardware Setup in only minutes with graphical wizard Systems can be in different locations Ultra-fast (seconds) HADR was first introduced in Informix around 1994, and was incorporated into DB2 in V8.2. The concept is simple, as two servers are kept in sync by HADR by shipping log entries from the primary to the secondary. If the primary server fails, the standby server is ready to take over and can do so in a very short time – typically 10 to 15 seconds. The primary goals of HADR were to deliver very fast failover and to be easy to administer. We also wanted to ensure performance impact was minimal and that applications could fail over transparently. HADR also has advantages over other HA clusters like Oracle RAC because HADR supports rolling upgrades, can protect against storage failure and delivers 100% performance if one node fails (none of these are available with Oracle RAC). Automatic client re-route is part of DB2. The standby server location is stored in the system database directory, and loaded into the client upon establishing a database connection. If the primary server fails, a special SQLcode is returned which automatically results in the client establishing a connection to the standby server. Working together with HADR, this provides transparent failover to the standby server. Add High Availability to your Legacy applications Easy to configure High Availability and Disaster Recovery Easy to add AFTER you write your application without any coding changes Setup in minutes – Graphical Wizard Allows ultra-fast failover (seconds) Systems can be in different geographic locations What do I get with HADR? Ability to manage planned and unplanned outages Seamless DB2 and system upgrades Multiple levels of HA to match your customers needs Automatic client re-routing HADR was first introduced in Informix around 1994, and was incorporated into DB2 in V8.2. The concept is simple, as two servers are kept in sync by HADR by shipping log entries from the primary to the secondary. If the primary server fails, the standby server is ready to take over and can do so in a very short time – typically 10 to 15 seconds. The primary goals of HADR were to deliver very fast failover and to be easy to administer. We also wanted to ensure performance impact was minimal and that applications could fail over transparently. HADR also has advantages over other HA clusters like Oracle RAC because HADR supports rolling upgrades, can protect against storage failure and delivers 100% performance if one node fails (none of these are available with Oracle RAC). Automatic client re-route is part of DB2. The standby server location is stored in the system database directory, and loaded into the client upon establishing a database connection. If the primary server fails, a special SQLcode is returned which automatically results in the client establishing a connection to the standby server. Working together with HADR, this provides transparent failover to the standby server. Add High Availability to your Legacy applications Easy to configure High Availability and Disaster Recovery Easy to add AFTER you write your application without any coding changes Setup in minutes – Graphical Wizard Allows ultra-fast failover (seconds) Systems can be in different geographic locations What do I get with HADR? Ability to manage planned and unplanned outages Seamless DB2 and system upgrades Multiple levels of HA to match your customers needs Automatic client re-routing

    70. HADR Delivers Fast and Easy Failover HADR provides extremely high availability with failover times measured in seconds. It also has advantages over other clusters like Oracle RAC because HADR supports rolling upgrades, can protect against storage failure and delivers 100% performance if one node fails (none of these are available with Oracle RAC). The primary server starts on the left. But if it fails, the primary moves over to the machine on the right automatically. If the left side machine is repaired, you can start HADR on that server and it will automatically resynchronize itself and become a standby server.HADR provides extremely high availability with failover times measured in seconds. It also has advantages over other clusters like Oracle RAC because HADR supports rolling upgrades, can protect against storage failure and delivers 100% performance if one node fails (none of these are available with Oracle RAC). The primary server starts on the left. But if it fails, the primary moves over to the machine on the right automatically. If the left side machine is repaired, you can start HADR on that server and it will automatically resynchronize itself and become a standby server.

    71. Even Better Resource Utilization – In Development HADR provides extremely high availability with failover times measured in seconds. It also has advantages over other clusters like Oracle RAC because HADR supports rolling upgrades, can protect against storage failure and delivers 100% performance if one node fails (none of these are available with Oracle RAC). The primary server starts on the left. But if it fails, the primary moves over to the machine on the right automatically. If the left side machine is repaired, you can start HADR on that server and it will automatically resynchronize itself and become a standby server.HADR provides extremely high availability with failover times measured in seconds. It also has advantages over other clusters like Oracle RAC because HADR supports rolling upgrades, can protect against storage failure and delivers 100% performance if one node fails (none of these are available with Oracle RAC). The primary server starts on the left. But if it fails, the primary moves over to the machine on the right automatically. If the left side machine is repaired, you can start HADR on that server and it will automatically resynchronize itself and become a standby server.

    72. Gain Control with DB2 Workload Management Regain Control of your Complex Systems Adapt to new business workloads without hurting your existing workloads Let the business tell you what’s critical Help ensure you meet your Service Level Agreements DB2 can automatically alter resource allocations and priorities as needs change throughout the day Set controls based on users and applications Improve critical responsiveness of heavily used systems DB2 Workload Management Deeply embedded in the database engine DB2 9.5 workload manager (WLM) is a resource management and monitoring tool that is built right into the database engine. It is brand new with DB2 9.5. The primary customer benefits include tangible relief in the areas of CPU control, detection/control of rogue queries (limit excessive, unexpected resource consumption) and monitoring of work on the database. The new WLM architecture is also designed with integration of external WLM products such as AIX workload manager in mind, allowing DB2 to take advantage of their capabilities (on platforms where they exist) while improving the end-to-end workload management story. WLM is simple to administer, based on workload identification and then prioritization. An excellent introductory article on DB2 9.5 workload management is available for download at http://www.ibm.com/developerworks/forums/servlet/JiveServlet/download/1116-166950-13965175-231542/Introduction%20to%20DB2%20workload%20management.pdf These new workload management capabilities help to reduce the total cost of ownership by ensuring proper resource allocation and utilization, which can help meet service level agreements. DBA’s won’t have to worry about someone hogging CPU time with a monster query, and they can be sure that high priority work gets done first. DB2 9.5 workload manager (WLM) is a resource management and monitoring tool that is built right into the database engine. It is brand new with DB2 9.5. The primary customer benefits include tangible relief in the areas of CPU control, detection/control of rogue queries (limit excessive, unexpected resource consumption) and monitoring of work on the database. The new WLM architecture is also designed with integration of external WLM products such as AIX workload manager in mind, allowing DB2 to take advantage of their capabilities (on platforms where they exist) while improving the end-to-end workload management story. WLM is simple to administer, based on workload identification and then prioritization. An excellent introductory article on DB2 9.5 workload management is available for download at http://www.ibm.com/developerworks/forums/servlet/JiveServlet/download/1116-166950-13965175-231542/Introduction%20to%20DB2%20workload%20management.pdf These new workload management capabilities help to reduce the total cost of ownership by ensuring proper resource allocation and utilization, which can help meet service level agreements. DBA’s won’t have to worry about someone hogging CPU time with a monster query, and they can be sure that high priority work gets done first.

    73. Integrated Workload Management Two thresholds for queing: 1) start queing 2) size of queue Thresholds: temp, rows returned, elapsed time Future: aggregate temp, rows read, cpu time, locks, etc log Two thresholds for queing: 1) start queing 2) size of queue Thresholds: temp, rows returned, elapsed time Future: aggregate temp, rows read, cpu time, locks, etc log

    74. Priority Aging Concept The basic setup is one where all incoming work comes into a service subclass defined as high priority (using agent priority). In this example, every activity gets to spend its first five seconds of CPU time in the high priority service class. Then the threshold kicks in and the activity is remapped to the medium priority service class action. And so on. When work reaches the lowest priority service class, it could stay there till completion or your could add an additional threshold on the low priority class that moved the activity back up to the high priority class again after some other value of CPUTIME (e.g. 60 seconds) to start the whole cycle over again. This is just one example. You can do the same thing using the ROWSREAD in Service Class threshold. The advantages of this approach is that it is simpler configuration while still introducing a very natural prioritization approach which will bias itself in favour of short running queries. The basic setup is one where all incoming work comes into a service subclass defined as high priority (using agent priority). In this example, every activity gets to spend its first five seconds of CPU time in the high priority service class. Then the threshold kicks in and the activity is remapped to the medium priority service class action. And so on. When work reaches the lowest priority service class, it could stay there till completion or your could add an additional threshold on the low priority class that moved the activity back up to the high priority class again after some other value of CPUTIME (e.g. 60 seconds) to start the whole cycle over again. This is just one example. You can do the same thing using the ROWSREAD in Service Class threshold. The advantages of this approach is that it is simpler configuration while still introducing a very natural prioritization approach which will bias itself in favour of short running queries.

    75. Increase Collaboration with Integrated Tools Integrated data management tools IBM Infosphere Data Architect, IBM Data Studio, IBM Optim solutions Cover all phases of data lifecycle Integrated tools allows collaboration across roles Business analyst, database architect, database administrator, application developer, etc. Support all major database platforms IBM DB2, Informix, Oracle Database, SQL Server, etc. IBM has a grand vision to create a single toolset that can manage every aspect of the data life cycle. This toolset would offer built-in compliance and easy integration with other IBM tools. The first step was taken in October 2007 with the release of Data Studio 1.1. The free download, the first incarnation of the vision, supported DB2 and Informix Dynamic Server and included entity relationship diagramming, an SQL builder, an XML editor, pureQuery for Java, security access tools, data management tools, and other features. Work is underway to broaden the Data Studio family to include all the tools needed in every phase of data's existence. To understand the benefits of the complete toolset vision, it helps to picture the data life cycle as a whole. As IBM sees it, data goes through five distinct stages during its life: design, development, deployment, management, and governance. In the design stage, business analysts and database architects create the data models that will be used to develop applications and databases that meet the organization's needs. IBM's Data Studio tools for this stage currently include logical modeling and entity relationship diagramming, with physical modeling capabilities coming soon. The next two stages, develop and deploy, are closely linked, and the tasks associated with both stages are frequently performed by people whose titles include the word "developer." Under IBM's framework, the development stage refers specifically to application development, often involving Java, .Net, PHP, Ruby, or COBOL programming. The deploy stage covers database development, usually in SQL or another query language. Manage encompasses the tasks usually performed by a DBA. Those include things like day-to-day administration, configuration, performance tuning, change management, backup and recovery, and so on. In the final stage, govern, security administrators take responsibility both for the security of the organization's data resources and ensuring that the organization complies with all relevant regulations. This is also the area that includes auditing, encryption, archiving, and, ultimately, data destruction once the retention period has elapsed. More components will be integrated into the various life-cycle stages IBM Data Studio manages in the coming year. Some will be components of the free download, others will be paid features. Here's what features are likely to be added in the near future. Rational Data Architect will become a fully integrated modeling component of Data Studio and will remain a paid feature. The Data Studio Administration Console, which provides monitoring capabilities, will be a component of the free offering and is currently available as a technology preview. Paid features for performance management, change management, query tuning, problem determination using pureQuery, and a test data generator are all likely additions in 2008. IBM has a grand vision to create a single toolset that can manage every aspect of the data life cycle. This toolset would offer built-in compliance and easy integration with other IBM tools. The first step was taken in October 2007 with the release of Data Studio 1.1. The free download, the first incarnation of the vision, supported DB2 and Informix Dynamic Server and included entity relationship diagramming, an SQL builder, an XML editor, pureQuery for Java, security access tools, data management tools, and other features. Work is underway to broaden the Data Studio family to include all the tools needed in every phase of data's existence. To understand the benefits of the complete toolset vision, it helps to picture the data life cycle as a whole. As IBM sees it, data goes through five distinct stages during its life: design, development, deployment, management, and governance. In the design stage, business analysts and database architects create the data models that will be used to develop applications and databases that meet the organization's needs. IBM's Data Studio tools for this stage currently include logical modeling and entity relationship diagramming, with physical modeling capabilities coming soon. The next two stages, develop and deploy, are closely linked, and the tasks associated with both stages are frequently performed by people whose titles include the word "developer." Under IBM's framework, the development stage refers specifically to application development, often involving Java, .Net, PHP, Ruby, or COBOL programming. The deploy stage covers database development, usually in SQL or another query language. Manage encompasses the tasks usually performed by a DBA. Those include things like day-to-day administration, configuration, performance tuning, change management, backup and recovery, and so on. In the final stage, govern, security administrators take responsibility both for the security of the organization's data resources and ensuring that the organization complies with all relevant regulations. This is also the area that includes auditing, encryption, archiving, and, ultimately, data destruction once the retention period has elapsed. More components will be integrated into the various life-cycle stages IBM Data Studio manages in the coming year. Some will be components of the free download, others will be paid features. Here's what features are likely to be added in the near future. Rational Data Architect will become a fully integrated modeling component of Data Studio and will remain a paid feature. The Data Studio Administration Console, which provides monitoring capabilities, will be a component of the free offering and is currently available as a technology preview. Paid features for performance management, change management, query tuning, problem determination using pureQuery, and a test data generator are all likely additions in 2008.

    76. DB2 Saves 37% for Retail Grocery retailer; 600+ stores $10 billion sales; 50,000+ employees Analytical data warehouse system Databases 2,052.9 1,4357 Disk systems 1,870.3 987.8 Tape systems 358.0 269.9 Database servers 1,020.8 532.1 DBA personnel 876.7 640.4 TOTAL ($000) 6,178.7 3,542.1Databases 2,052.9 1,4357 Disk systems 1,870.3 987.8 Tape systems 358.0 269.9 Database servers 1,020.8 532.1 DBA personnel 876.7 640.4 TOTAL ($000) 6,178.7 3,542.1

    77. DB2 Saves 34% for Telecoms Wireless & Internet access services; 2 million subscribers $1.5 billion sales; 5,000+ employees Billing, CRM, operational systems & data store Databases 4,807.2 2,352.2 Disk systems 6,182.2 3,898.9 Tape systems 487.8 243.9 Database servers 2,146.4 1,589.5 DBA personnel 1,315.1 960.5 TOTAL ($000) 14,938.7 9,045.0 Databases 4,807.2 2,352.2 Disk systems 6,182.2 3,898.9 Tape systems 487.8 243.9 Database servers 2,146.4 1,589.5 DBA personnel 1,315.1 960.5 TOTAL ($000) 14,938.7 9,045.0

    78. DB2 Saves 38% for Financial Services Diversified retail bank; 200+ branches; 5+ million accounts $30 billion assets; 7,500 employees Industry standards-based (XML) transactional system Databases 2,574.8 1,341.1 Disk systems 3,792.8 2,174.0 Tape systems 282.3 151.7 Database servers 1,128.7 740.7 DBA personnel 657.5 533.6 TOTAL ($000) 8,436.1 4,941.1Databases 2,574.8 1,341.1 Disk systems 3,792.8 2,174.0 Tape systems 282.3 151.7 Database servers 1,128.7 740.7 DBA personnel 657.5 533.6 TOTAL ($000) 8,436.1 4,941.1

    79. Attend IDUG 2009 – Europe (5-9 October, Rome, Italy) Increase your skills with in-depth technical education more than 100 technical sessions Network with hundreds of DB2 professionals, experts, IBM executives and other top industry representatives

    80. IBM DB2: Lower Cost Made Easy

More Related