Netezza’s Deep Dive: Getting Your Data Warehouse Up and Running in 24 hours. Chi-Chung Hui Consulting I/T Specialist Information Management Software, IBM HK. Simplicity, Flexibility, Choice IBM Data Warehouse & Analytics Solutions. Custom Solution. True Appliance.
Netezza’s Deep Dive: Getting Your Data Warehouse Up and Running in 24 hours Chi-Chung Hui Consulting I/T Specialist Information Management Software, IBM HK
Simplicity, Flexibility, Choice: IBM Data Warehouse & Analytics Solutions • True Appliance: Netezza • Flexible Integrated System: IBM Smart Analytics System • Custom Solution: IBM InfoSphere Warehouse • Warehouse Accelerators • Information Management Portfolio (Information Server, MDM, Streams, etc.) • Axis labels: Simplicity, Flexibility • The right mix of simplicity and flexibility
Netezza Value Proposition • Speed: Price/performance leader using hardware-based data streaming • Simplicity: Black-box appliance with no tuning or storage administration provides low TCO and fast time to value • Scalability: True MPP enables customers to conduct rapid queries and analytics on petabyte-sized data warehouses • Smart: Built-in advanced analytics pushed deep into the database deliver analytics to the masses
Netezza and ISAS • Choose Netezza: • If the best price/performance is required • If the customer cannot afford extensive tuning and administration • If the customer needs the fastest time to value • If the customer does not want to pay for a separate database software license • Choose ISAS: • If AIX is preferred • If SAN and remote mirroring are required • If the customer requires the warehouse to conform to the data center infrastructure standard • If the customer prefers a more customized warehouse • If the customer needs very specific, deep tuning techniques
Agenda • Netezza Solution Highlight • What is it? • Why is it good? • Netezza Tour • Specialized hardware and architecture • How does Netezza run database queries? • Netezza Simplicity • Netezza Performance
Netezza – Solution Highlight Summary • True Appliance • Hardware, software and storage pre-built for data warehousing • Specially designed hardware for high-performance advanced analytics operations • Hardware compression based on table columns • Very fast • Usually 10x to 100x faster than a traditional database • Minimal administration and tuning • Low TCO
Legacy DWH Architectures: Moving Large Amounts of Data Becomes the Bottleneck • Diagram: a query goes to the server (RDBMS software), which pulls data from storage and returns results • Large amounts of data are moved from disk, causing a bottleneck • Data is moved to memory, then the SQL is processed
Netezza Performance Server: Better at EDW Operations with Complex BI Queries • Diagram: a query goes to the SMP host (2-4 CPUs) of the Netezza Performance Server™, which returns results • CPU: 2% of existing systems • Network traffic: 1% of existing systems • Data is processed as streams from disk, before it is moved to memory
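The contrast between moving all data to memory first and filtering it as it streams off disk can be illustrated with a minimal Python sketch. This is not Netezza code: the row layout, the `read_from_disk` helper and the filter value are hypothetical, chosen only to count how many rows reach host memory under each pattern.

```python
from typing import Dict, Iterator, List, Tuple

Row = Dict[str, int]

def read_from_disk(n_rows: int) -> Iterator[Row]:
    """Hypothetical stand-in for a disk scan: yields rows one at a time."""
    for i in range(n_rows):
        # One row in every hundred belongs to the market we query for.
        yield {"id": i, "market": 509123 if i % 100 == 0 else 1}

def legacy_query(n_rows: int) -> Tuple[int, int]:
    """Legacy pattern: move every row into server memory, then filter."""
    in_memory: List[Row] = list(read_from_disk(n_rows))  # all rows are moved
    result = [r for r in in_memory if r["market"] == 509123]
    return len(in_memory), len(result)  # (rows moved, rows returned)

def streaming_query(n_rows: int) -> Tuple[int, int]:
    """Streaming pattern: filter rows as they come off disk."""
    result = [r for r in read_from_disk(n_rows) if r["market"] == 509123]
    return len(result), len(result)  # only matching rows reach memory

if __name__ == "__main__":
    print("legacy   :", legacy_query(10_000))
    print("streaming:", streaming_query(10_000))
```

Both functions return the same answer; the difference is the first number in each tuple, the amount of data that had to be moved before the filter was applied.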
The IBM Netezza TwinFin™ Appliance • Disk Enclosures: slice of user data; swap and mirror partitions; high-speed data streaming • SMP Hosts: SQL compiler; query plan; optimize; admin • Snippet Blades™ (S-Blades™): processor and streaming DB logic; high-performance database engine for streaming joins, aggregations, sorts, etc.
S-Blade™ Components • SAS Expander Module (two) • Dual-Core FPGA • DRAM • Intel Quad-Core • Netezza DB Accelerator • IBM BladeCenter Server
The IBM Netezza AMPP™ Architecture • Client side: Advanced Analytics, BI, ETL, Loader, Applications (ODBC/JDBC) • Netezza Appliance: Hosts • Network Fabric • S-Blades™ (each with FPGA, CPU and Memory) • Disk Enclosures
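The division of labor in this architecture (many blades each working on their own slice of data, with a host combining the results) can be sketched as a scatter-gather aggregation. The sketch below is a minimal Python illustration under stated assumptions, not Netezza's implementation: the round-robin distribution, the two-column row and the names `distribute`, `partial_sum`, `host_merge` and `run_query` are all hypothetical.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, List, Tuple

Row = Tuple[str, int]  # (district, nrx): hypothetical two-column row

def distribute(rows: List[Row], n_slices: int) -> List[List[Row]]:
    """Spread rows across data slices (round-robin is an assumption here)."""
    slices: List[List[Row]] = [[] for _ in range(n_slices)]
    for i, row in enumerate(rows):
        slices[i % n_slices].append(row)
    return slices

def partial_sum(slice_rows: List[Row]) -> Dict[str, int]:
    """Work done next to one data slice: aggregate locally."""
    totals: Dict[str, int] = defaultdict(int)
    for district, nrx in slice_rows:
        totals[district] += nrx
    return dict(totals)

def host_merge(partials: List[Dict[str, int]]) -> Dict[str, int]:
    """Work done on the host: combine the small per-slice results."""
    totals: Dict[str, int] = defaultdict(int)
    for partial in partials:
        for district, subtotal in partial.items():
            totals[district] += subtotal
    return dict(totals)

def run_query(rows: List[Row], n_slices: int = 4) -> Dict[str, int]:
    """Scatter rows to slices, aggregate in parallel, gather on the host."""
    slices = distribute(rows, n_slices)
    with ThreadPoolExecutor(max_workers=n_slices) as pool:
        partials = list(pool.map(partial_sum, slices))
    return host_merge(partials)
```

Only the per-slice subtotals cross the "network" to `host_merge`, which is the point of the design: the host sees a few small dictionaries rather than every row.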
Our Secret Sauce • Query: select DISTRICT, PRODUCTGRP, sum(NRX) from MTHLY_RX_TERR_DATA where MONTH = '20091201' and MARKET = 509123 and SPECIALTY = 'GASTRO' group by DISTRICT, PRODUCTGRP • Input: slice of table MTHLY_RX_TERR_DATA (compressed) • FPGA Core: Uncompress; Project (select DISTRICT, PRODUCTGRP, sum(NRX)); Restrict, Visibility (where MONTH = '20091201' and MARKET = 509123 and SPECIALTY = 'GASTRO') • CPU Core: Complex ∑, Joins, Aggs, etc. (sum(NRX))
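The stages named on this slide (uncompress, project, restrict, then aggregate) can be sketched as a chain of Python generators working on one compressed slice. This is a software illustration only, under loudly stated assumptions: zlib plus JSON stands in for the real on-disk format, the function names are hypothetical, grouping by DISTRICT and PRODUCTGRP is inferred from the select list, and the sketch applies the restriction before the projection so that the filter columns are still available.

```python
import json
import zlib
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple

def compress_slice(rows: List[dict]) -> bytes:
    """Store a slice as compressed bytes (zlib + JSON is an assumption)."""
    return zlib.compress(json.dumps(rows).encode("utf-8"))

def uncompress(blob: bytes) -> Iterator[dict]:
    """Stage 1: decompress the slice and stream its rows."""
    yield from json.loads(zlib.decompress(blob).decode("utf-8"))

def restrict(rows: Iterable[dict]) -> Iterator[dict]:
    """Stage 2: keep only rows matching the WHERE clause."""
    for r in rows:
        if (r["MONTH"] == "20091201" and r["MARKET"] == 509123
                and r["SPECIALTY"] == "GASTRO"):
            yield r

def project(rows: Iterable[dict]) -> Iterator[Tuple[str, str, int]]:
    """Stage 3: keep only the columns the SELECT list needs."""
    for r in rows:
        yield (r["DISTRICT"], r["PRODUCTGRP"], r["NRX"])

def aggregate(rows: Iterable[Tuple[str, str, int]]) -> Dict[Tuple[str, str], int]:
    """Stage 4: sum(NRX) per (DISTRICT, PRODUCTGRP)."""
    totals: Dict[Tuple[str, str], int] = defaultdict(int)
    for district, productgrp, nrx in rows:
        totals[(district, productgrp)] += nrx
    return dict(totals)

def run_snippet(blob: bytes) -> Dict[Tuple[str, str], int]:
    """Chain the stages; rows flow through one at a time."""
    return aggregate(project(restrict(uncompress(blob))))
```

Because each stage is a generator, rows that fail the restriction are discarded as they stream past and never accumulate in memory, which mirrors the idea of filtering before the data reaches the CPU.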
Netezza Eliminates the I/O Bottleneck Move the SQL to the hardware… to where the data lives “Just send the Answer, not Raw Data”
Why traditional database systems are not enough: Endless tuning • Business person: “Query performance is slow.” • Technical person: “I’ll add an index.” • Business person: “Load performance is slow. When can I access my data?” • Technical person: “I’ll investigate and get back to you …” • Technical person: “Okay… I will add an aggregate table to pre-calculate so that the report will run faster.” • Business person: “I want my report to be refreshed every hour.” • Technical person: “Oh… that is impossible… The report will be updated once a day, after the nightly batch…”
Why traditional database systems are not enough: Wasted effort
Solving the data load and query performance problem • “We act out the market every day to capitalize on opportunities. Complex merchandise reports that had taken days to process on the old platform now take five minutes on the new one. Simpler queries are even faster.” -- Chief Information Officer at a large US retailer
Netezza is Simple to Deploy Since it is so Fast • Operations • Simply load and go … it’s an appliance • Minimal DBA tuning • No configuration or physical modeling • No indexes: out-of-the-box performance • ETL Developers • No aggregate tables needed, so less ETL logic • Faster load and transformation times • Business Analysts • Train-of-thought analysis: 10 to 100x faster • True ad hoc queries: no tuning, no indexes • Ask complex queries against large datasets
Traditional Complexity … Netezza Simplicity • Traditional: 0. CREATE DATABASE TEST LOGFILE 'E:\OraData\TEST\LOG1TEST.ORA' SIZE 2M, 'E:\OraData\TEST\LOG2TEST.ORA' SIZE 2M, 'E:\OraData\TEST\LOG3TEST.ORA' SIZE 2M, 'E:\OraData\TEST\LOG4TEST.ORA' SIZE 2M, 'E:\OraData\TEST\LOG5TEST.ORA' SIZE 2M EXTENT MANAGEMENT LOCAL MAXDATAFILES 100 DATAFILE 'E:\OraData\TEST\SYS1TEST.ORA' SIZE 50M DEFAULT TEMPORARY TABLESPACE temp TEMPFILE 'E:\OraData\TEST\TEMP.ORA' SIZE 50M UNDO TABLESPACE undo DATAFILE 'E:\OraData\TEST\UNDO.ORA' SIZE 50M NOARCHIVELOG CHARACTER SET WE8ISO8859P1; 1. Oracle* table and indexes 2. Oracle tablespace 3. Oracle datafile 4. Veritas file 5. Veritas file system 6. Veritas striped logical volume 7. Veritas mirror/plex 8. Veritas sub-disk 9. SunOS raw device 10. Brocade SAN switch 11. EMC Symmetrix volume 12. EMC Symmetrix striped meta-volume 13. EMC Symmetrix hyper-volume 14. EMC Symmetrix remote volume (replication) 15. Days/weeks of planning meetings • Netezza: Low (ZERO) Touch: CREATE DATABASE my_db;
Netezza Delivers Simplicity • Up and running 6 months before being trained • 200X faster than Oracle system • ROI in less than 3 months • “Allowing the business users access to the Netezza box was what sold it.” -- Steve Taff, Executive Dir. of IT Services
POC - A Telco Company • Environment • Netezza TwinFin 12 full rack • Raw Data volume • Call Level Detail : 3TB (9 billion rows) • Financial Bill : 600GB (5.4 billion rows) • Customer Info : 60GB (91.1 million rows)
Catalina Marketing: Building loyalty one customer at a time • Marketing to a segment of one: 195 million US loyalty program members • Every coupon printed is unique to the individual customer • Customized based on three years' worth of purchase history • Increased staff productivity: from 50 to 600 new models per year • Increased efficiency: from 4 hours to score a model to 60 seconds
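The two improvements quoted for Catalina can be turned into ratios with a few lines of arithmetic. The sketch below only restates the slide's figures; the `improvement_factor` helper is a hypothetical name introduced for illustration.

```python
def improvement_factor(larger: float, smaller: float) -> float:
    """Return how many times larger the first figure is than the second."""
    return larger / smaller

# Scoring time: 4 hours before vs. 60 seconds after (both in seconds).
scoring_speedup = improvement_factor(4 * 60 * 60, 60)

# Modeling output: 600 new models per year after vs. 50 before.
model_output_gain = improvement_factor(600, 50)

if __name__ == "__main__":
    print(f"Model scoring is {scoring_speedup:.0f}x faster")      # 240x
    print(f"Model output grew {model_output_gain:.0f}x per year")  # 12x
```

The scoring figure (240x) sits above the "10x to 100x" range the deck quotes as typical, so it is best read as a single customer's result rather than a general expectation.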