PROGRESS SOFTWARE Engine Crew E2530: The OpenEdge™ RDBMS and Linux - A Great Combination -or- How I convinced my wife that going away and playing with Linux for a week was 'Work'. • John Harlow, BravePoint • Dan Foreman, BravePoint • Gus Björklund, Progress
Abstract Does Linux really work? Does it perform well? How does one pronounce "Leenux" anyway? How good is the 2.6 kernel? See how the OpenEdge 10 RDBMS and Linux make a robust, high-performance database server. Find out how easy it is to configure the system for outstanding performance, scalability and reliability. This how-to session, based on our most recent benchmark results, gives you the latest information on using OpenEdge on Linux.
The Location • We converged again for a third 'Secret Bunker' expedition • This time the bunker was cleverly hidden inside a mild-mannered office building • Two team members are in the picture
The Mission • Load operating systems • Tried and true versions • Mysterious new versions • Load database software • Again, tried and true versions • Very mysterious new version • See how well they perform • Consume beer
Topics • Goals • Setup • Linux Findings • Progress Findings • Summary
Linux Goals • Evaluate the installation process • Compare the performance of the 2.4 and 2.6 Linux kernels • Compare the 2.6 anticipatory and deadline I/O schedulers as well • Explore out-of-the-box filesystems (ext3fs, jfs, xfs, reiserfs) • Explore the 2.6 tools for tuning database workloads • Try SuSE, for a change
Progress Goals • Compare the performance of the latest Version 9 and 10 releases (9.1d and 10.0a) • Compare the performance of Type 1 and Type 2 storage areas and various storage area configurations • Analyze the performance of the (unreleased) 10.0b • Generally fiddle with various settings to measure their impact on performance
Setup - The Hardware • 3 Dell PowerEdge 6600 servers • Quad Xeon 1.4 GHz • 4 GB RAM • 8 × 73 GB Ultra3 SCSI 10k RPM disks • PERC 3/DC two-channel RAID controller
Setup - The Operating Systems • SLES 8.0 (SuSE Linux Enterprise Server) • 2.4.19 kernel • Installed on 1 system (hare02) • SuSE Linux 9.1 Professional Beta • Final release candidate • 2.6.4 kernel • Installed on 2 systems (hare01 & hare03)
Why SuSE? • We had access to supported releases of both kernel levels for comparison. • SuSE 9.1 support by PSC is just around the corner • SuSE 8 is already supported • We've never benchmarked it before • We like SuSE • We hate SCO
Planning the Installation • 8-drive array • 1 drive for all standard filesystems • 1 drive for BI (mounted as /bi) • 6 drives for database extents (RAID 0, mounted as /db) • 128 KB stripe size (the biggest allowed by the controller) • The same partitioning scheme was used on all three systems • Ext3fs was used for all root volume filesystems
Installing the OSes • The initial SLES 8 installation failed • A bit of googling revealed that we needed "acpi=oldboot" as part of the install parameters • The install then went fine • SuSE 9.1 installed without a problem • hare01 was set to boot with the anticipatory scheduler (the default) • hare03 was set to boot with the deadline scheduler (elevator=deadline), as sketched below
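For reference, a sketch of the boot settings involved; the device name and bootloader layout here are illustrative assumptions, not our exact configuration:

    # SLES 8: typed at the installer boot prompt to work around the ACPI failure
    linux acpi=oldboot

    # SuSE 9.1 (hare03): GRUB kernel line selecting the deadline I/O scheduler
    kernel /boot/vmlinuz root=/dev/sda2 elevator=deadline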
Installing Progress • The Progress install failed on SuSE 9.1 • Needed the environment variable LD_ASSUME_KERNEL=2.4.1 set • Fixed in 9.1D09 • Tells glibc to use the old threading scheme for your processes • This is really a glibc artifact, not a kernel one • Installed fine on SLES 8
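A minimal sketch of the workaround, assuming the installer is launched as proinst from the install media directory:

    # tell glibc to use the old LinuxThreads behavior for this process tree
    export LD_ASSUME_KERNEL=2.4.1
    ./proinst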
Planning the Benchmarks • We decided to perform a few similar benchmarks with each version of Progress on each system for comparison • Then we would explore different features on the different systems • We also decided that if the anticipatory scheduler was slower (as expected), we would dump it entirely
Initial Benchmark Runs • We prepared and loaded databases on each system • Type 1 storage areas were used • Each system was given the same baseline Progress tuning • The next step was to benchmark • The runs commenced at about 3 PM on Day 1
The ATM Benchmark • Simulates teller machine transactions • deposit or withdrawal • heavy database update workload (no think time) • Each transaction • retrieves and updates account, branch, and teller rows • creates a history row • Run "n" transaction generators • concurrently • for a fixed time period • count the total number of transactions performed
The ATM Database • 80,000,000 accounts • 80,000 tellers • 8,000 branches • 4 KB database blocksize • Six 2 GB fixed data extents (12 GB total) • plus 1 variable data extent • 1 variable BI extent, pre-grown with bigrow (sketched below)
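The variable BI extent was grown ahead of time so cluster formatting would not happen mid-benchmark. A sketch of the command, with a hypothetical database path and an illustrative cluster count:

    # pre-allocate 16 additional BI clusters while the database is offline
    proutil /db/atm -C bigrow 16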
The Baseline Tuning • BI cluster size: 16384 KB • BI blocksize: 16 KB • Server options: • -n 200 -L 10240 • -B 64000 • -spin 50000 -bibufs 32 • Page writers: 4 • BI writer: yes • AI writer: no
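One way to apply these settings, sketched with a hypothetical database path; the BI geometry is set offline with proutil, the rest goes on the broker command line:

    # set the BI cluster size (KB) and BI blocksize (KB) while the db is down
    proutil /db/atm -C truncate bi -bi 16384 -biblocksize 16

    # start the broker with the baseline options
    proserve /db/atm -n 200 -L 10240 -B 64000 -spin 50000 -bibufs 32

    # one BI writer, four asynchronous page writers
    probiw /db/atm
    for i in 1 2 3 4; do proapw /db/atm; done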
Analyzing the Results • In general the numbers were significantly higher than our last set of benchmarks on PC-grade, dual-processor servers • The overall differences between the various OS and Progress versions were measurable, but not overwhelming
End of Day One • As we expected, we saw better performance out of the 2.6 kernel than out of 2.4. • Additionally, the deadline scheduler outperformed the anticipatory scheduler • We ran a few more benchmarks. • Then we had beer
Benchmarking Continued • When we ran the same benchmarks on 10.0a and 10.0b we generally got similar results • At this point we began looking for more performance gains • The 2.6 kernel with the deadline scheduler stayed ahead of the 2.4 kernel, but still had issues • We dumped the anticipatory scheduler • Performance stayed below 800 TPS
Linux Performance Issues • We observed issues with buffer pool management in the 2.6.4 kernel • We saw long pauses in database activity • We saw system syncs (the sync command) take over 2 minutes to return • This led us to start investigating VM tuning
Tuning the VM • The behavior of the 2.6 VM is controlled by settings in /proc/sys/vm:
  /proc/sys/vm/max_map_count
  /proc/sys/vm/min_free_kbytes
  /proc/sys/vm/lower_zone_protection
  /proc/sys/vm/nr_hugepages
  /proc/sys/vm/swappiness
  /proc/sys/vm/nr_pdflush_threads
  /proc/sys/vm/dirty_expire_centisecs
  /proc/sys/vm/dirty_writeback_centisecs
  /proc/sys/vm/dirty_ratio
  /proc/sys/vm/dirty_background_ratio
  /proc/sys/vm/page-cluster
  /proc/sys/vm/overcommit_ratio
  /proc/sys/vm/overcommit_memory
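As a sketch of the kind of experiment these knobs allow, here is one way to push dirty pages out earlier and in smaller bursts; the values are illustrative, not the settings we settled on:

    # how long a page may stay dirty before pdflush writes it out (centisecs)
    echo 1000 > /proc/sys/vm/dirty_expire_centisecs
    # how often the pdflush threads wake up (centisecs)
    echo 500 > /proc/sys/vm/dirty_writeback_centisecs
    # % of memory dirty before background writeback starts
    echo 5 > /proc/sys/vm/dirty_background_ratio
    # % of memory dirty before writers are forced to write synchronously
    echo 20 > /proc/sys/vm/dirty_ratio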
Maybe -directio Will Help • We continued to tune the buffer pool with no significant gains • At the time, we didn't try tuning swappiness; that is on our list for next time • What about -directio? • -directio opens files with O_DSYNC • We have buffer pool problems • So we tried it (again)
-directio doesn't help • The difference is significant

Three runs with -directio:
 Cl  Time  Trans   Tps    Conc   Avg R  Min R  50% R  90% R  95% R  Max R
 --- ----  ------  -----  -----  -----  -----  -----  -----  -----  -----
 150  300   70733  235.8  149.8    0.6    0.0    0.6    1.0    1.2    3.2
 150  300   76648  255.5  128.4    0.5    0.0    0.5    1.0    1.1    2.9
 150  300   72231  240.8  149.6    0.6    0.0    0.6    1.0    1.2    6.6

Three runs without -directio:
 Cl  Time  Trans   Tps    Conc   Avg R  Min R  50% R  90% R  95% R  Max R
 --- ----  ------  -----  -----  -----  -----  -----  -----  -----  -----
 150  300  191806  639.4  149.4    0.2    0.0    0.1    0.5    1.0   17.2
 150  300  210064  700.2  149.7    0.2    0.0    0.0    0.2    0.5   38.9
 150  300  203901  679.7  142.8    0.2    0.0    0.1    0.2    0.5   39.9
Linux Filesystems • 4 journaling filesystems are now supported by most distributions • Ext3fs (journaling ext2, the RedHat standard) • Reiserfs (journaling fs, the SuSE standard) • Xfs (journaling fs from SGI) • Jfs (journaling fs from IBM) • Ext3, reiser and jfs were tested • Xfs had problems formatting the BI volume • A SuSE 9.1 beta bug, since fixed in 9.1 FCS
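Recreating the database volume for each filesystem test looks roughly like this; the device name is an assumption:

    # reformat the volume with the filesystem under test, one of:
    mkfs.ext3 /dev/sdb1
    mkfs.reiserfs /dev/sdb1
    mkfs.jfs /dev/sdb1
    mkfs.xfs -f /dev/sdb1    # this was the step that failed on the 9.1 beta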
Other Filesystem Info • All filesystems were mounted noatime • But no gain has been measured with this option • Probably because of the fixed DB extents • Tests were run using the 'data=writeback' mount option for the db files on ext3 • BI was mounted 'data=ordered' • No performance gain was detected from this
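Illustrative /etc/fstab entries for the combination described above; the device names are assumptions:

    # BI volume: default (ordered) journaling, no access-time updates
    /dev/sdb1  /bi  ext3  noatime,data=ordered    1 2
    # DB volume: metadata-only (writeback) journaling
    /dev/sdc1  /db  ext3  noatime,data=writeback  1 2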
Reiser vs. Ext3fs

Best Reiser Run
 No  Cl  Time  Trans   Tps     Conc  Avg R  Min R  50% R  90% R  95% R  Max R
 --  --- ----  ------  ------  ----  -----  -----  -----  -----  -----  -----
  3  50   300  420611  1402.0  48.7    0.0    0.0    0.0    0.1    0.1   56.6

Best Ext3 Run
 No  Cl  Time  Trans   Tps     Conc  Avg R  Min R  50% R  90% R  95% R  Max R
 --  --- ----  ------  ------  ----  -----  -----  -----  -----  -----  -----
 32  55   300  370781  1235.9  54.8    0.0    0.0    0.0    0.1    0.1   15.7
Filesystems Findings • Reiser again beats ext3fs • 13% higher TPS with Reiser (statistically significant) • 360% higher max R (also statistically significant) • What is 'max R', Gus? • Concurrency is lower on Reiser too • Yet 95% of the transactions finished just as quickly
More Linux Info • At lunch one day we glanced in the window of a beer store and discovered this! We liberated it and installed it in the Bunker
Type 2 Storage Areas • These are new in 10.0a • They were formerly called "data clusters" • Require a dump/load to implement • New syntax for the structure file; the value after the semicolon is the blocks-per-cluster setting • Type 1 • d "data":7,256 /db/100at2/atm_7.d1 f 2000000 • Type 2 • d "data":7,256;512 /db/100at2/atm_7.d1 f 2000000
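Given such a structure file, the database itself is built with prostrct and seeded with an empty schema; a sketch with hypothetical file names:

    # create the physical database from the structure description
    prostrct create /db/100at2/atm atm.st -blocksize 4096
    # copy in the 4 KB-block empty database shipped with the product
    procopy $DLC/empty4 /db/100at2/atm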
Type 2 Storage Areas • Massive improvement! • The first run jumped from under 800 to over 1,000 TPS! • From this point on, we used Type 2 areas for all benchmarks • We also did not experience the severe Linux buffer issues when using Type 2 areas
Progress Tuning: -spin [chart not reproduced] • baseline • 2.4 Linux • 10.0a • Type 1 areas
Progress Tuning: buffer size (-B) [chart not reproduced] • baseline • 2.4 Linux • 10.0a • Type 1 areas
Extent Sizes [chart not reproduced] • baseline • 2.4 Linux • 10.0a
BI Cluster Size [chart not reproduced] • baseline • 2.6 Linux • 10.0a • Type 1 areas
Varying User Count • Ran several long sessions where the user count varied from 10 to 550 • This was on 2.6 Linux with the deadline scheduler • Settings were • BI cluster size: 16384 KB • BI blocksize: 16 KB • Server options: • -n 600 -L 10240 • -B 128000 • -spin 50000 -bibufs 64 • Page writers: 8, BI writer: yes, AI writer: no
Varying the User Count [chart not reproduced] • Note: no "think time"
50 to 2,000 Users [chart not reproduced] • Note: 0 to 3 sec random "think time"
Well… what did we learn? • The SuSE 9.1 beta is very nice • Gus prefers it over RedHat • The 2.6 kernel has many improvements • We need to learn a lot more about 2.6 • We don't understand the new VM yet