290 likes | 420 Views
Jack Parker Arten Technology Group. Calisthenics. How many here are Informix users? How many here are HPL users?. Why use HPL. It’s Fast It’s Flexible. Previous (and still extant) Loaders. LOAD 1GB/Hr. 2GB Limit. Long Transaction heaven. dbload 1GB/Hr. 2GB Limit. dbimport
E N D
Jack Parker Arten Technology Group
Calisthenics • How many here are Informix users? • How many here are HPL users?
Why use HPL • It’s Fast • It’s Flexible
Previous (and still extant) Loaders • LOAD • 1GB/Hr. 2GB Limit. Long Transaction heaven. • dbload • 1GB/Hr. 2GB Limit. • dbimport • 1GB/Hr. 2GB Limit. Entire database • Insert Cursor • 2GB/Hr. 2GB Limit. Complex. • High Performance Loader – Faster. More flexible. • See Load FAQ at www.artentech.com/downloads.htm
Why is it faster? • Other loaders • Single threaded • Use Buffers (the Wall) • HPL • Multi-threaded • Avoids Buffers • Light Scans / Light Appends
Light Scans and Light Appends • Light Scans use their own buffer pool and avoid the normal buffers • Light Appends write to their own disk and append the results if successful • Light scans occur when: • Table is bigger than buffers • Dirty Read • No Varchars
In reality… • HPL performance is dependent not only on CPU, but on disk I/O • The more disk you can put into the equation, the faster it will run. • My tests were run on a • Solaris 2 CPU machine • 1 disk • 31MB of RAM for Informix
More disk • Spread input/output files across filesystems (create three or four large data areas just for this). • Fragment tables across multiple dbspaces. (up to 3 per CPU, but at least 100MB/dbspace) • Size tables properly
Unload Metrics • Simple unload 2.4 GB/CPU/Hr • Wait a minute • Update statistics • Simple unload 4.9 GB/CPU/Hr
Flexible • Load or Unload • Disk • Tape • Pipe
This means that you can… • Load straight from (or to) disk • Prepare data to a pipe and Load • Unload to gzip (pipe) • Load from gzip (pipe) • Unload to another HPL load process • Or Pload • Even on another machine
Unload to Gzip Load from Gzip Internal format to Gzip Internal from Gzip HPL “Insert into table2 select * from table1” (vs. normal insert/select) 6 GB/CPU/HR 3 GB/CPU/HR 11.5 GB/CPU/HR 7.5 GB/CPU/HR 4.5 GB/CPU/HR .7 GB/CPU/HR A quick look at some of those:
You can use HPL for all sorts of things it wasn’t intended for • As a backup tool • As a reorg tool • As a data movement/migration tool • As an ETL tool • For more thoughts on this check out Raj Murali’s talk on ETL within the database (XPS) D31 • also see the IBM Developer zone DSS Application Processing http://www7b.boulder.ibm.com/dmdd/zones/informix/library/techarticle/parker/0502parker.html
Flexible on the inside too. • A “Natural” interface • ipload to start • $DISPLAY • xhost • INFORMIXSERVER=tcp connection (for some)
Format Specification
Query Specification
Other • $DBONPLOAD • $PLCONFIG • Create your own functions and link them in • onpladm
Summary The High Performance Loader is a powerful tool which can be used wherever you need to move a lot of data around efficiently. With this tool you can replace a lot of old and slow code and realize significant performance improvements. Best of all? It’s already on your machine.