TPC-H Queries on Teradata and PostgreSQL: Database Tuning and Performance Optimization

Project Partners: Amreek Singh (02329025) Chetan Vaity (02329901) Some TPC-H queries on Teradata and PostgreSQL

Motivation • Usage of real Database Systems • Gain some experience in database tuning • Work with Teradata machine in SIT Test Setup • Twin processors with 2GB RAM, proprietary parallel storage system • Windows 2000 Advanced Server • Teradata v4.1.2 • Twin Xeon processors with 2GB RAM, RAID 5 • Linux Kernel version 2.4.18-10smp • PostgreSQL v7.2.1

TPC-H Schema Part (200K rows) Order(1500K rows) Partsupp(800K rows) Lineitem(6000K rows) Supplier(10K rows) Customer(150K rows) Nation(24 rows) Region(5 rows) • A typical manufacturing concern database • Approximately 1GB of data

TPC-H Query 2 Supplier Part(p_size=15)(p_type=‘%BRASS’) Partsupp Region(r_name=‘EUROPE’) Region Nation Part SELECT s_acctbal, s_name, n_name, p_partkey, p_mfgr, s_address, s_phone, s_comment FROM part, supplier, partsupp, nation, region WHERE p_partkey = ps_partkey AND s_suppkey = ps_suppkey AND p_size = 15 AND p_type like '%BRASS' AND s_nationkey = n_nationkey AND n_regionkey = r_regionkey AND r_name = 'EUROPE' AND ps_supplycost = ( SELECT min(ps_supplycost) FROM partsupp, supplier, nation, region WHERE p_partkey = ps_partkey AND s_suppkey = ps_suppkey AND s_nationkey = n_nationkey AND n_regionkey = r_regionkey AND r_name = 'EUROPE' ) ORDER BY s_acctbal desc, n_name, s_name, p_partkey; Teradata Query Plan

Part(p_size=15)(p_type=‘%BRASS’) Partsupp Part Supplier Region(r_name=‘EUROPE’) Nation Region • Analysis of query execution plans of both systems • Added indexes (B-Tree indexes on all) • Rewrote the query using “explicit join” clause • Reduced query time from 40 minutes to 2 seconds PostgreSQL Query Plan

Query execution times After adding secondary index on n_nationkey on supplier table After Collect statistics

Conclusion: Query plans are very useful in database tuning Parallel architecture under full DBMS control performs Bibliography http://www.tpc.org PostgreSQL Documentation Teradata Documentation Database Tuning, Dennis Shasha

TPC-H Queries on Teradata and PostgreSQL: Database Tuning and Performance Optimization

TPC-H Queries on Teradata and PostgreSQL: Database Tuning and Performance Optimization

Presentation Transcript

PostgreSQL

Some Recursive Queries

Postgresql

C-Store: Introduction to TPC-H

Common Queries (and hopefully some answers)

PostgreSQL

Running TPC-H On Pig

Managing PostgreSQL on Windows

Some statistical studies on H  

PostgreSQL

Some Additional TPC Aspects to Consider

Surviving C and PostgreSQL