70 likes | 87 Views
This project focuses on analyzing and optimizing TPC-H queries on Teradata and PostgreSQL databases, aiming to improve database performance and gain valuable experience in database tuning. Various techniques such as adding indexes and rewriting queries will be explored to reduce query execution times. The findings and conclusions will be documented for future reference.
E N D
Project Partners: Amreek Singh (02329025) Chetan Vaity (02329901) Some TPC-H queries on Teradata and PostgreSQL
Motivation • Usage of real Database Systems • Gain some experience in database tuning • Work with Teradata machine in SIT Test Setup • Twin processors with 2GB RAM, proprietary parallel storage system • Windows 2000 Advanced Server • Teradata v4.1.2 • Twin Xeon processors with 2GB RAM, RAID 5 • Linux Kernel version 2.4.18-10smp • PostgreSQL v7.2.1
TPC-H Schema Part (200K rows) Order(1500K rows) Partsupp(800K rows) Lineitem(6000K rows) Supplier(10K rows) Customer(150K rows) Nation(24 rows) Region(5 rows) • A typical manufacturing concern database • Approximately 1GB of data
TPC-H Query 2 Supplier Part(p_size=15)(p_type=‘%BRASS’) Partsupp Region(r_name=‘EUROPE’) Region Nation Part SELECT s_acctbal, s_name, n_name, p_partkey, p_mfgr, s_address, s_phone, s_comment FROM part, supplier, partsupp, nation, region WHERE p_partkey = ps_partkey AND s_suppkey = ps_suppkey AND p_size = 15 AND p_type like '%BRASS' AND s_nationkey = n_nationkey AND n_regionkey = r_regionkey AND r_name = 'EUROPE' AND ps_supplycost = ( SELECT min(ps_supplycost) FROM partsupp, supplier, nation, region WHERE p_partkey = ps_partkey AND s_suppkey = ps_suppkey AND s_nationkey = n_nationkey AND n_regionkey = r_regionkey AND r_name = 'EUROPE' ) ORDER BY s_acctbal desc, n_name, s_name, p_partkey; Teradata Query Plan
Part(p_size=15)(p_type=‘%BRASS’) Partsupp Part Supplier Region(r_name=‘EUROPE’) Nation Region • Analysis of query execution plans of both systems • Added indexes (B-Tree indexes on all) • Rewrote the query using “explicit join” clause • Reduced query time from 40 minutes to 2 seconds PostgreSQL Query Plan
Query execution times After adding secondary index on n_nationkey on supplier table After Collect statistics
Conclusion: Query plans are very useful in database tuning Parallel architecture under full DBMS control performs Bibliography http://www.tpc.org PostgreSQL Documentation Teradata Documentation Database Tuning, Dennis Shasha