330 likes | 804 Views
Teradata Database Release 12.0. David Laband Certified Teradata Master, V2R3 & V2R5 423-508-0389(office); 423-400-7017 (cell) David.Laband@Teradata.Com November 6, 2007. Teradata 12.0: Today’s Agenda…. Introduction: Teradata Focus on Active Enterprise Intelligence
E N D
Teradata Database Release 12.0 • David Laband • Certified Teradata Master, V2R3 & V2R5 • 423-508-0389(office); 423-400-7017 (cell) • David.Laband@Teradata.Com • November 6, 2007
Teradata 12.0: Today’s Agenda… • Introduction: • Teradata Focus on Active Enterprise Intelligence • Teradata Technical “Roadmap” • Current Hardware Offerings Quick Summary • Teradata Database 12.0: • Features & Functions • Updated PS Offerings
Teradata’s Active Data WarehouseAn “Active” Extension of the Enterprise Data Warehouse • Active Load • Intra-day data acquisition; Mini-batch to near-real-time (NRT) trickle data feeds measured in minutes or seconds • Active Workload Management • Dynamically manage system resources for optimum performance and resource utilization supporting a mixed-workload environment Integrate Once “Active” • Active Access • Front-Line operational decisions or services supported by NRT access; Service Level Agreements of 5 seconds or less • Active Enterprise Integration • Integration into the Enterprise Architecture for delivery of intelligent decisioning services Operational Strategic • Active Events • Proactive monitoring of business activity initiating intelligent actions based on rules and context; to systems or users supporting an operational business process • Active Availability • Business Continuity to support the requirements of the business • (up to 7X24X365) Use Many
Teradata Roadmap 2007 – 2008 Teradata 12.0 Teradata 13.0 TeradataSoftware • Teradata Database 12.0 • Teradata Tools & Utilities • TTU 8.2 on Windows Vista -June GCA • TTU 12.0 on V2R6.2,6.1,6.0 - July GCA (excluding TASM) • TTU12.0 on TD 12.0 - Sept GCA (including TASM) Teradata Database 13.0 Teradata Tools & Utilities 13.0 UNIX OS Teradata Server 32-bit/64-bit SMP/MPP – MP-RAS – Linux BYNET Teradata Enterprise Storage 7.0 EMC DMX-3 Storage Teradata Server 32-bit/64-bit SMP/MPP – MP-RAS – Linux BYNET Teradata Enterprise Storage EMC DMX-3 Storage WindowsOS Teradata Server 64-Bit SMP/MPP – Windows Server 2003 BYNET Teradata Enterprise Storage EMC DMX-3 Storage Teradata Server 32-Bit/64-Bit SMP/MPP – Windows Server 2003 BYNET Teradata Enterprise Storage 7.0 EMC DMX-3 Storage BARStorage BakBone – NetVault 7.4.5 – Q2 BakBone – NetVault 8.0 – Q4 Veritas – NetBackup 6.5 IBM (TSM 5.4) Online Archive Protegrity BAR Encryption 2.0 BakBone Veritas IBM (TSM) Teradata Dual Active Solution 13.0 – Non-Data Object Replication - Teradata Multi-System Manager v1.0 (Re-planning date TBD) Dual Active Teradata Dual Active Solution 12.0 – Enhanced GoldenGate Replication Focus Active Enabled and Manageability Active Enabled and Manageability Note: Bold Text Indicates NEW Product
Teradata 5500 Server PlatformBuilt for a Purpose Strategic Intelligence Operational Intelligence 5500 Server – Teradata Platform Start Small and Grow • Performance: Up to 2.6X performance improvement of 5500H node over 5450H • Scalability: • Incremental growth per node of up to 1.47 TB User Data @ Design Center configuration • Scale up to 1024 nodes = 1.5 Petabytes • High Performance, Entry Level & Coexistence models • Availability:Performance continuity with “small clique” Hot Standby Node Offer • Investment Protection: Investment protection with previous generations
Dual BYNET Interconnects 5500 Platform Overview 5500 Node 5500 Node 5500 Node 5500 Node Processors Processors Processors Processors Storage NEW Intel Dual Core Technology • Dual Core Intel Xeon Processor • 2.66 GHz Clock • 4 MB Level 2 Cache • Extended Memory 64 Technology • New 5500 node chassis • 5400 Node cabinet compatible • Operating Systems supported: • SUSE Linux for 64 bit • MP-RAS 3.03 for 32 bit • Windows Server 2003 – 32 bit • Windows Server 2003 – 64 bit • BYNET V3 • 1 to 1024 nodes in a system • Disk Storage • Teradata Enterprise Storage - Teradata 6843 • EMC DMX-3 NEW NEW NEW
Introducing Multiple Core Terminology CORE CORE CORE CORE CORE CORE Dual Core Dual Cores New Direction in Microprocessor Industry • Multiplecomputing Cores to dramatically increase performance • Leverage Moore’s Law and use massive numbers of transistors to build multiple Core processors • Clock rate increases are no longer the prime driver of performance improvement 64 Bit Dual Core 53XX/54XX NODE • Teradata 53XX/54XX • “2-way” processor node • Single Core Processor • 2 Cores in Node SOCKET SOCKET Processor Processor 5500 NODE • Teradata 5500 • “2-way” node • Dual Core Processor • 2 or 4 Cores in Node SOCKET SOCKET Processor Processor
Teradata 12: Why A New Numbering Schema ? Consistency and Sanity. • One Release, One Number Begins with Teradata 12.0: • Version 1 had 5 major releases up to V1R5. • Version 2 had 6 major releases up to V2R6.2 • So, after……..11 major releases, Teradata12 debuts in 2007. • Teradata 12.0: • Great New Functionality. • Consistent Product Numbering. • One release number for all included software: • Integrated, Tested, & Consistent. • Includes Teradata RDBMS & all Teradata Tools and Utilities. • Q1-2008: Teradata 12.0 GCA (plan)
Teradata Database 12.0 Features Chart Version: February 27, 2007 • Performance • OCES (phase 3): Optimizer Cost Estimation Subsystem • Statistics enhancements: • Increase statistics intervals • Extrapolate statistics outside range (e.g. DATE) • Collect stats for multi-column NULL values • Collect AMP Level statistics values • Enhanced query rewrite capability • Parameterized statement caching improvements • Hash bucket expansion • Multi-level Partitioned Primary • Index (PPI) • Active Enable • Online Archive • Replication Scalability • Restartable Scandisk • Bulk SQL error logging tables • Full ANSI Merge-Into SQL capability • CheckTable utility performance enhancements • Quality & Supportability • Improved SQLGen/DataGen test tools • Dispatcher Fault Isolation • Ease of Use • Additional EXPLAIN plan • details • TASM enhancements: • Query Banding • Provide for Open API SQL capability for Teradata Dynamic Workload Management • Enhanced monitoring • Data collection: DBQL, Resusage • Collection of system conditions & actions • Dynamic load utility management • Index wizard support for PPI • SQL invocation via External Stored Procedures • Stored Procedure result sets • Dynamic Result Row Specification on Table Functions • Normalized AMPusage View for coexistence • Enterprise Fit • JAVA SP’s (with JDBC) (Linux and Windows) • Cursor positioning for multi-statement requests • UNICODE Data Dictionary • UNICODE support for password control and encryption • Custom password Dictionary support • New password encryption algorithm • UNIX/Linux Kerberos Authentication for Windows Clients • Support for SLES 10
Performance:Multi-Level Partitioned Primary Index • Description: • Extend the existing Partitioned Primary Index (PPI) capability to support and allow for the creation of a table or non-compressed join index with a Multi-Level Partitioned Primary Index (ML-PPI). • Benefit: • The use of ML-PPI on table(s) affords a greater opportunity for the Teradata Optimizer to achieve a greater degree of partition elimination at a more granular level which in turn results in achieving a greater level of query performance. • Considerations: • Modification of existing queries to take advantage of ML-PPI is not required, ever. The Teradata Optimizer determines whether or not the Index is usable as part of the best-cost query planning process and will engage the use of the Index as part of the plan to execute a given query automatically.
Performance:Increase Statistics Intervals • Description: • Improve the accuracy of collected statistics by increasing the number of statistics intervals from 100 to 200providing for a more detailed picture of the distribution of actual column data content that would be available to the optimizer for estimation purposes. • Benefit: • The increase in the number of statistics intervals will have a positive impact for queries that contain a large numbers of INLIST values or conditions with many distinct values. More intervals now can represent more distinct values more accurately which improves the optimizer single table estimations which are crucial inputs for the join planning process. • Considerations: • Increased stat intervals will not change current procedures for the collection/dropping of stats or affect the associated collection timings. However, Help Statistics will be expanded to include the additional statistics intervals for display purposes.
Performance:Extrapolate Statistics Outside of Range • Description: • Enhance the Teradata Optimizer to include a new extrapolation technique specifically designed to more accurately provide for a statistical estimate fordate range-based queries that specify a “future” date that is outside the bounds of the statistics that have been collected for the column. • Benefit: • The Optimizer extrapolation technique for date range-based queries that supply “future” dates will result in better query plans due to the fact that cardinality estimation will be much more accurate. (Because of the new extrapolation formula it is also possible that statistics for the associated date columns would not have to be re-collected as often.) • Considerations: • Extrapolation for date range-based queries will not change the procedure for dropping or collecting statistics nor will the help statistics features be affected. However, the information displayed within a query Explain plan will change because of the new numbers for estimated rows.
Performance:Collect Stats for Multi-Column With NULL Values • Description: • Enhance the internal statistics columns to accurately count and identify “all row” & partial NULL columns in multi-column statistics. • Benefit: • The Optimizer will give better estimates by understanding skew. • The inclusion of more accurate row identification and counting relative to row uniqueness will enhance the estimation the number of distinct hash values during a redistribution operation. • Considerations: • Collecting statistics for multi-column values will not change current procedures for the collection/dropping of statistics or affect the associated collection timings.
Performance:Collect AMP Level Statistics Values • Description: • Enhance Teradata statistics by calculating the exact actual average AMP-local RPV (Rows Per Value) by averaging the RPV from each AMP. • Benefit: • The new calculation formula replaces the existing probability model based calculation which is often underestimated. As a result, the cost estimation calculations for nested joins become much more accurate. • Considerations: • Addition of this feature will have little impact on the statistics collect process and will not affect collection timings. The permanent space used by the data dictionary will increase slightly (by few bytes) and Help Statistics is enhanced to display the Average AMP RPV column and data.
Performance:Parameterized Statement Caching Improvements • Description: • This feature exposes the actual query USING values for parameterized requests during plan generation so the Optimizer can generate a better plan. • Benefit: • The decision by the Optimizer as to whether or not to use a cached plan or to generate a different plan will ultimately lead to better query performance for parameterized requests. • Considerations: • This feature will also resolve the system value CURRENT_DATE before the optimizer phase to the actual date. The generated plan will be cached along with the specific date for which it is valid.
Enterprise Fit:Java Stored Procedures • Description: • Provide the database user with a means to define an external stored procedure (XSP) written in the Java language which can use JDBC to dynamically execute SQL within the same session. • Benefit: • Java applications will now be able to access data from the Teradata database directly. This feature leverages the ever-present Java skills in our customer base. • Considerations: • The current intent is to rely on Windows and Linux as the platform operating systems for this feature. Support for MP-RAS is not planned.
Active Enabled: Bulk SQL Error Logging Tables • Description: • Provide support for complex error handling capabilities during bulk SQL Insert, Update and Delete operations through the use of new SQL-based error tables. • Benefit • Complementary capability to using native Teradata utilities, this feature increases flexibility/opportunity in developing load strategies by allowing SQL to be used for batch updates that contain errors, provide error reporting similar to current load utilities while overcoming current restrictions on having unique indexes, join indexes and triggers resident on target tables. • Considerations • Considerationshould be given to re-evaluating current batch load/ETL processes to take advantage of bulk SQL load operations that are not currently considered due to current limitations/restrictions.
Active Enabled: Online Archive • Description: • Online archive allows the archival of a running database, i.e. in conjunction with concurrently executing update transactions for the tables in the database. Transactional consistency is maintained by tracking any changes to a table in a log such that changes applied to the table during the archive can be rolled back to the transactional consistency point after the restore. • Benefit: • Online archive removes the requirement of having a “window” where updates must be held up while backup procedures are executed. Additionally, object locking will be eased and the full-performance impact of permanent journals will be removed. • Considerations: • Online archive will be integrated into the Open Teradata Backup (OTB) suite of products associated with this release.
Active Enabled: Full ANSI Merge-Into SQL Capability • Description: • Enhance the Merge-Into SQL capability to support full ANSI functionality. This feature will allow the database to perform a true bulk UPSERT operation with a standard SQL statement. Additionally, this enhancement also provides for the non-ANSI extensions to support additional error-handling capabilities. • Benefits: • The new SQL Merge functionality lifts the current restriction of only supporting single-row merges and will allow multiple table rows to be processed in this fashion. Bulk UPSERT processing capability will no longer be limited to the Multiload utility and the extended error-handling capabilities will allow native SQL to become usable in load strategy scenarios and simultaneously overcoming current utility restrictions regarding unique indexes, join indexes and triggers resident on target tables. • Considerations: • Consideration should be given to re-evaluating current batch load/ETL processes to take advantage of full ANSI Merge-Into SQL capability for load operations that are not currently considered due to current limitations/restrictions
Enterprise Fit:Custom Password Dictionary Support • Description: • Implement a new password control rule that affects the creation and modification of passwords that optionally restricts a password from being created/modified that contains any word listed in a custom system administrator modifiable dictionary. • Benefit: • Meet the need for increased database logon security by ensuring that users are required to select and construct strong passwords and to provide for the enforcement of security rules. • Considerations: • A list of common words and names in English will be used to initialize the dictionary table. This table can be localized/internationalized with a new language equivalent or by adding new localized words to the existing list of English words.
Enterprise Fit:New Password Encryption Algorithm • Description: • This DBS Password Enhancement feature completes the full implementation of the Secure Hashing Algorithm (SHA) SHA-256 bit encryption for passwords. • Benefits: • User passwords are now stored and verified using the 64-bit DES encryption algorithm. In order to fulfill stronger customer security needs the more advanced and secure 256-bit SHA-256 algorithm has been chosen as a replacement for DES. • Considerations: • Support for the two forms of password encryption from previous releases, namely DES and SHA-256 truncated to 27 bytes will continue. All passwords created on old releases will continue to work and will be changed to full SHA-256 encryption when next modified. This feature operates the same on all platforms.
Enterprise Fit:UNIX/Linux Kerberos Authentication for Win Clients • Description: • A Windows client-based user – having authenticated to a Windows domain – can log onto Teradata UNIX/Linux without having to supply a username and password….i.e. single sign-on. • Benefit: • Single sign-on systems provide a single authentication service that allows users to logon to a network once and then transparently access each application and service to which they have been authorized; fewer user names and passwords have to be managed and remembered. • Eases the burden of security administrators. They can manage users and privileges in a centralized repository where changes can be made quickly and easily, and take effect immediately.
Ease of Use:Additional Explain Plan Details • Description: • Enrich the content of SQL explain plans by additional information to the explain output including spool size estimates, view names and actual column names for Hashing, Sorting or Grouping columns. • Benefits: • The enhancing of explain plan details facilitates explain output readability and understanding as well as aids in the debugging of complex queries and for identifying intermediate result spool skewing. • Considerations: • There is no special mechanism needed to acquire enhanced EXPLAIN plan details. A simple “Explain SQL” statement will generate all the aforementioned features.
Ease of Use … TASM: Dynamic Load Utility Management • Description: • TDWM is enhanced to provide control over certain load utility operations, that include: • A throttle rule for Arc/Restore jobs to enforce concurrency limits. • A classification rule for Arc/Restore jobs to assign to appropriate Workload Definitions. • Automatic delay of utility jobs when a concurrency limit is exceeded. • More information about load utility sessions is provided to the DBQL for workload analysis and management. • Benefits: • Aids in capacity planning and system utilization reporting. • Enables better management of mixed workloads to allow critical work to complete.
Ease of Use:SQL Invocation via External Stored Procedures • Description: • Extend the current External Stored Procedure (XSP) capability to provide an interface that allows an XSP to invoke and use SQL in the current session. • Benefits: • This feature will foster greater application development and enhance the ability of a client application to access and use the Teradata database directly.
Ease of Use…TASM:Query Banding • Description: • A set of Name/Value pairs that can be set on a Session and Transaction to identify the query's originating source. • Benefit: • Enables better Classification into Workload Definitions • More accurately identifies the query’s originating source and purpose. • Enables better Accounting • Increased granularity regarding request source buckets. • E.g. Not just a CRM query, but a specific area and/or user who generated it. • Enables grouping of queries into their “jobs”. • E.g. To generate a single “report”, MSI issues several SQL requests, but would like to account & manage those requests as a single entity. • Makes the request-generating application an integral part of workload management. • The application has a role in specifying who gets priority on the DBMS. • Extensibility
Ease of Use: Stored Procedure Result Sets • Description: • Provide the functionality that allows Stored Procedures to build, use, and return answer sets as a result of their execution. • Benefit: • Extending the Stored Procedure capability greatly simplifies application development against the Teradata database. • Considerations: • Currently, without the Stored procedure Result Set capability, temporary tables need to be created and used to store answer sets and follow the Stored Procedure CALL with a SELECT statement. • Consideration should be given to removing these intermediate steps from current applications. • JAVA SPs will not be able to return result sets.
Ease of Use…TASM:Enhanced Data Collection for DBQL • Description: • More information is added to the DBQL, including: • Logging Parsing Engine CPU time • Reporting the high and low AMP byte counts for spool • Cost estimates (CPU, I/O, network, heuristics) • Estimate processing time and row counts • More information on utilities • New and improved format for QueryID • Benefits: • DBQL is a valuable tool for query analysis, including capturing the SQL and resource usage. • The added information in the DBQL enables deeper query analysis for optimizing performance.
Ease of Use:Index Wizard Support for PPI • Description: • The enhanced Index Wizard will now provide support for PPI recommendations, incorporate UPDATE cost into index selection process and introduce a new TIMELIMIT option. • Benefits: • Provides the database community with assistance in defining PPIs on tables that greatly enhance query performance, supplies recommendations that consider index update related costs and presents the capability to end index analysis at a specified time limit and reports best recommendations up to that point. • Considerations: • The Index Wizard tool should be considered for use in situations where users are unsure of what kind/type of Index should be used/instantiated.
Ease of Use:Dynamic Result Row Specifications on Table UDFs • Description: • This feature extends current table function capability to support result row specification at run time. The user can specify the maximum number of output columns that a table function will ever want/need (within limits). • The actual output parameter name and its data type can be defined in the SELECT statement where a table function is referenced. • Benefits: • This feature provides flexibility and ease of use in that a table function can be used directly to feed data into a user table of variable columns without further modification to the table function. • Considerations: • The number of columns in the table must be within the pre-defined maximum number of output columns for the table function, and the data types and the order of data types are ones supported by the particular implementation of the table function.
TW 12.0 PS Service Offers Monitoring Administration Operational Control Performance Management WorkloadManagement ApplicationPerformance Data Collection Assessment CapacityPlanning • Dual Active Implementation Services (DAIS 12.0) • Minor Update • Dual Active Workshop • Requirements and Design • Solution Implementation Teradata System A Users/Applications Data Synchronization Teradata MultiSystems Manager TDAS Sync Solutions: ETL Vendor, Dual Load, Cross Feed, Tbl Shipping, Etc. Teradata QueryDirector Users/Applications Teradata QueryDirector Replication Users/Applications Teradata System B • Performance and Capacity Services • Major Update • Performance Management Assessment • Data Collection and Reporting • Workload Management • Application Performance • Capacity Planning