1 / 40

Teradata Platform Introduction

Teradata Platform Introduction. Hardware and Software Components in Enterprise Data Warehouse Derek Jones March 2005. Teradata in the Enterprise. Teradata is relational database management system Acts as central enterprise-wide database

ebravo
Download Presentation

Teradata Platform Introduction

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Teradata Platform Introduction Hardware and Software Components in Enterprise Data Warehouse Derek Jones March 2005

  2. Teradata in the Enterprise Teradata is relational database management system • Acts as central enterprise-wide database • Contains information extracted from operational systems • Central placement minimizes data duplication and provides single view of business

  3. Key Teradata Differentiators • Parallelism throughout platform • Shared Nothing Architecture • Proprietary intelligent system inter-connect

  4. Teradata Scales Linearly • Scaling achieved via ‘shared nothing’ architecture and unconditional parallelism • Power is in linear scalability, where slope = 1 • Scales with data • Scales with users • Scales with work More nodes More work More users More data Node Work Users Data

  5. The Teradata Difference“Multi-dimensional Scalability” Data Volume (Raw, User Data) Mixed Workload Query Concurrency Data Freshness Query Complexity Query Freedom Schema Sophistication Query Data Volume

  6. The Teradata Difference“Multi-dimensional Scalability” Data Volume (Raw, User Data) Mixed Workload Query Concurrency Competition can be Tuned to Meet a Static Environment Business Needs Change Data Freshness Query Complexity Schema Sophistication Query Freedom Query Data Volume

  7. The Teradata Difference“Multi-dimensional Scalability” Data Volume (Raw, User Data) Mixed Workload Query Concurrency Competition can be Tuned to Meet a Static Environment Business Needs Change Desire to Increase User/ Query Concurrency Data Freshness Query Complexity Competition Scales One Dimension at the Expense of Others But At the Expense of Another Dimension Schema Sophistication Query Freedom Query Data Volume

  8. The Teradata Difference“Multi-dimensional Scalability” Data Volume (Raw, User Data) Mixed Workload Query Concurrency Teradata can Scale Simultaneously Across Multiple Dimensions Driven by Business! Competition Scales One Dimension at the Expense of Others Limited by Technology! Data Freshness Query Complexity Schema Sophistication Query Freedom Query Data Volume

  9. Key Teradata Differentiators • Parallelism throughout platform • Shared Nothing Architecture • Proprietary intelligent system inter-connect

  10. Node Architecture (‘Shared Nothing’) Each Teradata Node is made up of hardware and software • Each node has CPUs, system disk, memory and adapters • Each node runs copy of OS and database SW

  11. Node Architecture (‘Shared Nothing’) PE vproc PE vproc AMP vproc AMP vproc AMP vproc AMP vproc AMP vproc AMP vproc AMP vproc AMP vproc V2 Virtual Processors (Vprocs) Vdisk Vdisk Vdisk Vdisk Vdisk Vdisk Vdisk Vdisk PDE UNIX Each Teradata Node is made up of hardware and software • Each node runs copy of OS, database SW, & virtual processes (above line) • Each node has CPUs, system disk, memory & adapters (below line)

  12. NCR 5400 Server Value Prop • Better Price/Performance • 20% Performance Improvement • 12% Price/Performance Improvement • Advanced Cabinet Design • Up to 10 Nodes Per Cabinet • Up to a 40% Reduction in Floor Space • Investment Protection • Multi Generation (5) Coexistence • 32-bit/64-bit Transition Platform

  13. NCR 5400 Server Key Messages#2 – Advanced Cabinet Design 1 3 1 3 1 3 1 3 1 3 1 3 1 3 1 3 1 3 1 3 1 3 Ethernet Switches BYNET V3 Switches FC Switches Up to 10 nodes within each cabinet Server Management Module (3GSM) Five UPS Modules • Revolutionary cabinet increases reliability and provides greater configuration flexibility. • up to 10 nodes per cabinet enable a 20% - 40% smaller footprint than the 5380 • 30% increase in system storage reliability with new advanced cooling mechanisms • Extend supported distance for large systems (65+ nodes) between cabinets to 300 – 600 meters with new BYNET V3. • Doubles the number of configurable nodes to 1,024

  14. Key Teradata Differentiators • Parallelism throughout platform • Shared Nothing Architecture • Proprietary intelligent system inter-connect

  15. Parallelism via BYNET Interconnect Vproc to Vproc Broadcast (1 to All) Vproc to Vproc Multicast (1 to Many) BYNET high-speed interconnect facilitates system communication • All nodes connected via BYNET • Hardware network • Software runs on each node Different communication paths facilitate system parallelism • 1 to 1 • 1 to Many • 1 to All

  16. MPP System Configuration Nodes grouped to increase data availability and system uptime • Not shared storage but access within group • Improves data availability • Improves system up time • Allows for VPROC migration

  17. Teradata Clique Node Node Node Node Disk Array Disk Array Disk Array Disk Array = VPROC Clique is group of nodes that access same arrays • VPROC smallest unit of parallelism • VPROC has assigned storage within clique • VPROCs can migrate within clique • Improves system up time, data availability, and ease of recovery

  18. Teradata Clique and VPROC Node Node = VPROC VPROC smallest unit of parallelism • VPROC smallest unit of parallelism or work • Data distributed by hash to all VPROCs • VPROC has assigned storage within clique • VPROCs can migrate within clique • Improves system up time, data availability, and ease of recovery • Data fully available at degraded performance until node returns. Node Node X Disk Array Disk Array Disk Array Disk Array

  19. Teradata Clique with Hot Standby Node Node Node Node Node Node Node Hot Standby X Fibre Channel Switches Disk Array Disk Array Disk Array Disk Array Disk Array Disk Array

  20. Teradata Optimizer • The Teradata Optimizer is the most robust in the industry • Optimizer is parallel-aware, understands available system components • Handles mixed work loads • Multiple complex queries • Joins per query • Unlimited ad-hoc processing • Output is least expensive plan (resources) to answer request

  21. Teradata Request Cycle REQUEST Parcel Yes CACHED? DBase AccessRights TVM TVFields Indexes DD No SYNTAXER RESOLVER SECURITY STATISTICS OPTIMIZER GENERATOR DATA parcel GNCAPPLY AMP STEPS Request flow diagram • Each request parcel contains at least one SQL statement • Six main component steps • Syntaxer • Resolver • Security • Optimizer • Generator • gncApply • AMP steps are instructions sent to AMP VPROCs to complete the request • Following completion each request generates a success/fail parcel with any necessary records.

  22. Data Protection (Object Locks) Locks protect data from simultaneous access • Vary by type • Exclusive, Write, Read, & Access • Vary by object locked • Database, Table, & Row Hash • Locks enforced by hierarchy

  23. Data Protection (RAID-1) RAID data protection • RAID-1 (disk mirroring) • Disk pair increases read performance and data availability • In failure scenario, mirrored drive re-built by array controller

  24. Data Protection (Fallback) • Fallback table data • Copy of table rows maintained by database on second AMP VPROC • Fallback copies grouped logically in CLUSTERS so data fully available when physical CLIQUE is off-line. • Fallback + RAID increase data availability

  25. Data Storage and Access Primary Index value = 25 Parsing Engine Hashing Algorithm Row Hash Bucket # Hash Map Message Passing Layer AMP AMP AMP AMP 25 Data stored by hash • Primary Index is chosen for data distribution, not same as primary key • Primary Index value hashed • Hash value creates bucket assignment • Hash Map assigns buckets to AMP VPROCs • AMP VPROCs reside on specific node • AMP VPROC writes row to disk • Data and algorithm exceptions require Uniqueness value for guaranteed unique Row ID

  26. Data Access by Primary Index Table-id Row-hash AMP #3 M a s t e r I n d e x Cyl 1 Index Cyl 2 Index Cyl 3 Index Cyl 4 Index Cyl 5 Index Cyl 6 Index Cyl 7 Index Cylinder # PI Value DATA BLOCK Data Row Data Row Data accessed by row hash value Need 3 pieces of information to find a row • Table ID • Row Hash of PI value • Output of hash algorithm on PI Value • PI Value • Operation involves only one AMP VPROC

  27. Unique Secondary Index (USI) Access CREATE UNIQUE INDEX (cust) on customer; Create USI SELECT *FROM customerWHERE cust = 56; Access via USI Data Access by Unique Secondary Index (USI) Hashing Algorithm Table ID Index Value Row Hash 100 56 602 Message Passing Layer Message Passing Layer AMP 1 AMP 2 AMP 3 AMP 4 USI Subtable USI Subtable USI Subtable USI Subtable RowID Cust RowID 135, 1 98 555, 6 296, 1 84 536, 5 602, 1 56 778, 7 969, 1 49 147, 1 RowID Cust RowID 175, 1 37 107, 1 489, 1 72 717, 2 838, 4 12 147, 2 919, 1 62 822, 1 RowID Cust RowID 244, 1 74 884, 1 505, 1 77 639, 1 744, 4 51 915, 9 757, 1 27 388, 1 RowID Cust RowID 288, 1 31 638, 1 339, 1 40 640, 1 372, 2 45 471, 1 588, 1 95 778, 3 AMP 1 AMP 2 AMP 3 AMP 4 Row Hash Unique Val Table ID 778 100 7 Base Table Base Table Base Table Base Table RowIDCust Name Phone USI NUPI 107, 1 37 White 555-4444 536, 5 84 Rice 666-5555 638, 1 31 Adams 111-2222 640, 1 40 Smith 222-3333 RowIDCust Name Phone USI NUPI 147, 1 49 Smith 111-6666 147, 2 12 Young 777-4444 388, 1 27 Jones 222-8888 822, 1 62 Black 444-5555 RowIDCust Name Phone USI NUPI 471, 1 45 Adams 444-6666 555, 6 98 Brown 333-9999 717, 2 72 Adams 666-7777 884, 1 74 Smith 555-6666 Customer Table ID = 100 USI Value = 56 PE USI Data access • Index is created on table • SQL uses USI by value • PE VPROC managing session uses same information as primary index access (Table ID, Row Hash, Index Value) • This process involves two AMP VPROC operations • After USI subtable lookup, process similar to primary index access RowIDCust Name Phone USI NUPI 639, 1 77 Jones 777-6666 778, 3 95 Peters 555-7777 778, 7 56 Smith 555-7777 915, 9 51 Marsh 888-2222

  28. CREATE INDEX (name) on customer; Create NUSI Hashing Algorithm SELECT *FROM customerWHERE name = ‘Adams’; Access via NUSI Data Access via Non-uniqueSecondary Index (NUSI) Table ID Index Value Row Hash 100 567 ‘Adams’ AMP 1 AMP 2 AMP 3 AMP 4 Message Passing Layer NUSI Subtable NUSI Subtable NUSI Subtable NUSI Subtable RowID Name RowID 432, 8 Smith 640, 1 448, 1 White 107, 1 567, 3 Adams 638, 1 656, 1 Rice 536, 5 RowID Name RowID 432, 3 Smith 884,1 567, 2 Adams 471,1 717,2 852, 1 Brown 555,6 RowID Name RowID 432, 1 Smith 147, 1 448, 4 Black 822, 1 567, 6 Jones 338, 1 770, 1 Young 147, 2 RowID Name RowID 155, 1 Marsh 915, 9 396, 1 Peters 778, 3 432, 5 Smith 778, 7 567, 1 Jones 639, 1 AMP 1 AMP 2 AMP 3 AMP 4 Base Table Base Table Base Table Base Table RowIDCust Name Phone NUSI NUPI 107, 1 37 White 555-4444 536, 5 84 Rice 666-5555 638, 1 31 Adams 111-2222 640, 1 40 Smith 222-3333 RowIDCust Name Phone NUSI NUPI 147, 1 49 Smith 111-6666 147, 2 12 Young 777-4444 388, 1 27 Jones 222-8888 822, 1 62 Black 444-5555 RowIDCust Name Phone NUSI NUPI 471, 1 45 Adams 444-6666 555, 6 98 Brown 333-9999 717, 2 72 Adams 666-7777 884, 1 74 Smith 555-6666 RowIDCust Name Phone NUSI NUPI 639, 1 77 Jones 777-6666 778, 3 95 Peters 555-7777 778, 7 56 Smith 555-7777 915, 9 51 Marsh 888-2222 Customer Table ID = 100 NUSI Value = ‘Adams’ PE • Index is created on table • SQL uses NUSI by value • PE VPROC managing session uses same information as primary index access (Table ID, Row Hash, Index Value) • This process involves all-AMP VPROC operations

  29. Teradata Structures Database structures • Users • Databases • Tables • Views • Macros • Triggers • Stored Procedures • User Defined Functions

  30. Teradata is an Open System Virtually any application or middleware framework can be integrated with Teradata. Messages Web JSM JSP IIOP ASP EJB JAVA CORBA .NET JDBC JDBC ODBC OLE-DB Message Bus Teradata Teradata Utilities Adapter(s) Publish & Subscribe Teradata Utilities Adapter(s) Queues

  31. 64-bit Teradata SolutionTeradata on SuSE Linux 2H 2005 • 64-Bit • 32-Bit 3rd Party Partners Client-Tier Client-Tier Teradata Application • Intel • Also • IBM/Power PC • SUN/SPARC • HP/PA-RISC • DELL • HP • IBM Teradata Tools & Utilities Teradata System Mgmt 64-BIT Application Server-Tier 32-BIT Application Server-Tier Teradata Database Operating System • Linux Intel Platform 2H 2005 Intel 64-Bit Database Server-Tier • Teradata Database on Intel 32-bit and 64-bit will support both 32-bit and 64-bit applications & clients concurrently

  32. Teradata’s Real-Time Enterprise Reference Architecture Enterprise Message Bus Legacy Environment Enterprise Users — (Browsers and/or Portal) Legacy Environment C/S EDI Consumers Suppliers Internal Partners EDI C/S WAN / VAN Internet / Intranet WAN / VAN Transactional Services Analytic & Decision Making Services NW MSG-MW MSG-MW MSG-MW MSG-MW MSG-MW MSG-MW NW ASP / JSP TX1 APPL TX2 APPL TX3 APPL TX4 APPL Strategic APPL Tactical APPL BI APPL BI APPL Service Brokers DA-MW DA-MW DA-MW DA-MW DA-MW DA-MW DA-MW DA-MW QD QD MSG-MW MSG-MW MSG-MW Event Notification Business Rules Event Detection EDW — A EDW — B DA-MW DA-MW DA-MW RS OLTP1 OLTP2 OLTP3 OLTP4 RDBMS Based Event Processing Business Process Automation Streaming Batch Transactional Repositories Data Acquisition & Integration Analytic & Decision Making Repositories

  33. Transactional Services Application Scope Applications have narrow scope. Tuned for specific book-keeping or transactional services. Transactional Services Transactional Application Services Applications that perform book-keeping or transactional services for the enterprise NW MSG-MW MSG-MW MSG-MW TX1 APPL TX2 APPL TX3 APPL TX4 APPL DA-MW DA-MW DA-MW DA-MW Data Access MiddlewareOccurs via standards, such as; ODBC, OLE-DB, JDBC, as well as proprietary techniques • OLTP Data Repositories • Data that reflects the current state of various business process • Limited history • Tuned for transaction workload OLTP1 OLTP2 OLTP3 OLTP4 Transactional Repositories

  34. Transactional User Base Enterprise Message Bus Legacy Environment Enterprise Users — (Browsers and/or Portal) Transactional User Base Consumers, Suppliers, Internal, and Trading Partners C/S EDI Consumers Suppliers Internal Partners WAN / VAN Internet / Intranet Transactional Services Service Brokers J2EE, CORBA, DCOM, Web Services NW MSG-MW MSG-MW MSG-MW ASP / JSP TX1 APPL TX2 APPL TX3 APPL TX4 APPL Service Brokers DA-MW DA-MW DA-MW DA-MW User-level Integration Occursvia standard EAI services, such as JAVA, Web Sphere, .NET, Tibco, and SeeBeyond OLTP1 OLTP2 OLTP3 OLTP4 Client/Server Styles 2-Tier and 3-Tier RPC style interfaces Transactional Repositories

  35. Data Warehouse Services Application Scope Strategic and Tactical decision making applications. Though BI tools or custom applications. Analytic & Decision Making Services MSG-MW MSG-MW MSG-MW NW Strategic APPL Tactical APPL BI APPL BI APPL Application Services Applications that provide predictive analysis and assisted decision making DA-MW DA-MW DA-MW DA-MW QD QD Data Access Middleware Occurs via standards, such as ODBC, OLE-DB, JDBC, as well as proprietary techniques EDW — A EDW — B RS RDBMS Based Event Processing • Enterprise Data Warehouse • Consolidated enterprise data • Crosses multiple business domains • Integrated data model Analytic & Decision Making Repositories

  36. Decision Support User Base Enterprise Message Bus Legacy Environment Enterprise Users — (Browsers and/or Portal) EDI C/S Consumers Suppliers Internal Partners WAN / VAN Internet / Intranet Analytic & Decision Making Services DW User Base Consumers, Suppliers, Internal, and Trading Partners MSG-MW MSG-MW MSG-MW NW ASP / JSP Strategic APPL Tactical APPL BI APPL BI APPL Service Brokers DA-MW DA-MW DA-MW DA-MW Service Broker Styles J2EE, CORBA, DCOM, Web Services QD QD User-level Integration occurs via standard EAI services, such as Web Services, JAVA, .NET, Tibco, and SeeBeyond EDW — A EDW — B RS Client/Server Styles 2-Tier and 3-Tier RPC style interfaces RDBMS Based Event Processing Analytic & Decision Making Repositories

  37. Data Acquisition Services • Data Extraction • Data is extracted from OLTP systems • Partner ETL tools are frequently used here • Data Transformation Services • Data cleansing • Data transformation (normalization) • Streaming data for frequent updates • Batch data moves for bulk operations • Partner ETL tools are typically used to perform these services • Data Load • Data is loaded into EDW system using Teradata Load tools • FastLoad • MultiLoad • TPump • Data Acquisition Options • Traditional load utilities (bulk or continuous loads) • Loads through – “in-flight” Message Passing • Replication – Table level replication from source to target QD QD EDW — A EDW — B RS OLTP1 OLTP2 OLTP3 OLTP4 RDBMS Based Event Processing Streaming Batch Transactional Repositories Data Acquisition & Integration Analytic & Decision Making Repositories

  38. Event-Driven Business Processes • Business Process Automation • Event Detection • Applied Business Rules • Event Notification • Messages are passed via P2P, Web Services or Enterprise Message Bus • RDBMS Based Event Processing • Real-time events are detected through a combination of Triggers, Stored Procedures, and UDFs • Event engine performs query • Messages are passed via P2P, Web Services or Enterprise Message Bus Enterprise Message Bus QD QD MSG-MW MSG-MW MSG-MW Event Notification Business Rules Event Detection EDW — A EDW — B DA-MW DA-MW DA-MW RS OLTP1 OLTP2 OLTP3 OLTP4 RDBMS Based Event Processing Business Process Automation Streaming Batch Transactional Repositories Data Acquisition & Integration Analytic & Decision Making Repositories

  39. Application Integration Enterprise Message Bus Decision Making Applications interact with bookkeeping applications via standard Enterprise services, such as Web Services, JAVA, .NET, -or- through the use of traditional client/server technology. Transactional Services Analytic & Decision Making Services NW MSG-MW MSG-MW MSG-MW MSG-MW MSG-MW MSG-MW NW ASP / JSP TX1 APPL TX2 APPL TX3 APPL TX4 APPL Strategic APPL Tactical APPL BI APPL BI APPL Service Brokers DA-MW DA-MW DA-MW DA-MW DA-MW DA-MW DA-MW DA-MW QD QD MSG-MW MSG-MW MSG-MW Event Notification Business Rules Event Detection EDW — A EDW — B DA-MW DA-MW DA-MW RS OLTP1 OLTP2 OLTP3 OLTP4 RDBMS Based Event Processing Business Process Automation Streaming Batch Transactional Repositories Data Acquisition & Integration Analytic & Decision Making Repositories

  40. Dual Active Solution • Replication Services • Changed data capture in V2R6 • Update propagation via “GoldenGate” Analytic & Decision Making Services • Teradata Query Director • Query routing control based on business rules • Business Continuity, workload sharing MSG-MW MSG-MW MSG-MW NW Strategic APPL Tactical APPL BI APPL BI APPL DA-MW DA-MW DA-MW DA-MW QD QD • Dual Data Load • Input data stream is split into two independent load streams • Input data is filtered so that only critical data is loaded on the Secondary “Active” system Secondary “Active” system does not need to be as large as primary system EDW — A EDW — B RS RDBMS Based Event Processing Streaming Batch Data Acquisition & Integration Analytic & Decision Making Repositories

More Related