ASE 113 How to make Penguins Fly: Maximizing Adaptive Server Enterprise Performance on Linux. Girish Vaitheeswaran, Staff Software Engineer, Girish@sybase.com. August 15-19, 2004.
The Enterprise. Unwired. Industry and Cross Platform Solutions Manage Information Unwire Information Unwire People • Adaptive Server Enterprise • Adaptive Server Anywhere • Sybase IQ • Dynamic Archive • Dynamic ODS • Replication Server • OpenSwitch • Mirror Activator • PowerDesigner • Connectivity Options • EAServer • Industry Warehouse Studio • Unwired Accelerator • Unwired Orchestrator • Unwired Toolkit • Enterprise Portal • Real Time Data Services • SQL Anywhere Studio • M-Business Anywhere • Pylon Family (Mobile Email) • Mobile Sales • XcelleNet Frontline Solutions • PocketBuilder • PowerBuilder Family • AvantGo Sybase Workspace
Contents • No Introduction to Linux • System Performance 101 • De-mystifying Processors • Let’s not forget Memory • Reading and Writing [Disk I/O] • Sending and Receiving [Network I/O] • Identifying Overall System Performance Issues. • Known problems and Solutions • More information • Conclusion
No Introduction to Linux • Does Linux need an introduction? • Monolithic (non-microkernel) kernel architecture • Multitasking, secure, virtual-memory OS • Different flavors of Linux available • Red Hat, SuSE, Red Flag, etc. • Supported on various hardware platforms • Intel (Xeon, Pentium, Itanium), AMD (Athlon, Opteron), IBM, SGI, Sun, etc. • Importantly, appealing to Bean Counters
Linux Trends • Consolidation from big boxes to commodity hardware • Migration to 4-CPU P3s with 4/8GB RAM running some flavor of Linux • Exploiting Moore’s law • Moving from 750MHz CPUs to 1.5GHz CPUs with hyperthreading • Sizing based on increased clock speeds.
ASE on Linux • ASE on Linux supported since 11.0.3.3 • ASE 12.5.0.3 supports RH 2.1 • ASE 12.5.1/12.5.2 will support RH 3.0 • SuSE/RedFlag supported as well. • More info available at www.sybase.com/linux/ase
System Performance 101 • How do I get my hardware and software to do more work without buying more hardware? • System performance mainly depends on 3 areas: Processor, Memory, I/O
De-mystifying Processors • Classes of processors • Xeon with HT Technology (CISC) • Pentium with HT Technology (CISC) • Itanium EPIC Architecture • AMD Opteron • What is hyper threading? • Doing more work in each clock cycle by providing thread-level parallelism in each processor. • Itanium EPIC Architecture • Explicitly Parallel Instruction Computing means that the compiler decides which sets of instructions execute together. Used for high-end database and ERP applications. • The AMD Opteron Story • Provides the ability to run 32-bit and 64-bit applications on the same hardware.
Consolidation • Consolidation Guidelines • One 1.5GHz CPU does not necessarily yield the same performance as two 750MHz CPUs. Parameters to account for are • L1/L2/L3 cache sizes (internal CPU caches) • Memory latencies • Cycles per instruction • Number of engines to run • On HT-enabled processors, 1 physical CPU can run 2 engines • Having an HT-enabled processor is not equivalent to having 2 physical processors • Use hyper threading carefully, as it has been known to hurt performance in quite a number of instances. • Benchmark carefully to evaluate whether HT works for your application.
Processor Tips • Identifying Number of processors/Clock speed/Hyper threading % cat /proc/cpuinfo
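The raw /proc/cpuinfo output repeats one block per logical CPU, which gets long on multi-CPU boxes. A minimal sketch of a summary helper follows; the function name `cpu_summary` is hypothetical, and it assumes the x86 field names `processor` and `cpu MHz`, which may differ on other architectures.

```shell
#!/bin/sh
# Summarize a cpuinfo-style file: number of logical CPUs and the clock
# speeds reported for them. Reads /proc/cpuinfo unless a file is given.
# cpu_summary is a hypothetical helper name, not a standard tool.
cpu_summary() {
    file="${1:-/proc/cpuinfo}"
    awk -F':' '
        /^processor[ \t]*:/ { n++ }                               # one entry per logical CPU
        /^cpu MHz[ \t]*:/   { gsub(/[ \t]/, "", $2); mhz[$2]++ }  # tally each reported speed
        END {
            printf "logical_cpus=%d\n", n
            for (m in mhz) printf "mhz=%s count=%d\n", m, mhz[m]
        }
    ' "$file"
}
```

Calling `cpu_summary` with no argument reads the live system; passing a saved copy of the file makes it easy to compare machines during a consolidation exercise.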
Processor Tips • Determining cpu usage (mpstat, top, vmstat) % mpstat 5 5
Processor Tips • Enabling Hyperthreading • Enabled by default in most processors • Note that “ht” appearing in the cat /proc/cpuinfo output does not mean HT is enabled. • Disabling Hyperthreading • Can be disabled during BIOS setup • This has been useful in cases where there are multiple engines and the load on the CPUs is close to 100%
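Since the “ht” flag alone says nothing about whether HT is switched on, one way to check is to compare how many logical CPUs the kernel sees against how many physical packages or cores exist. The sketch below is a heuristic under stated assumptions: the helper name `ht_active` is hypothetical, it relies on the `physical id`, `siblings` and `cpu cores` fields of /proc/cpuinfo, and on kernels that do not report `cpu cores` it falls back to comparing logical CPUs with physical packages, which would misreport on multi-core parts.

```shell
#!/bin/sh
# Print "yes" if hyperthreading appears to be active, "no" otherwise.
# ht_active is a hypothetical helper; it reads /proc/cpuinfo by default.
ht_active() {
    file="${1:-/proc/cpuinfo}"
    awk -F':' '
        { key = $1; val = $2; gsub(/[ \t]+$/, "", key); gsub(/[ \t]/, "", val) }
        key == "processor"   { logical++ }                      # logical CPUs seen by the kernel
        key == "physical id" { if (!(val in seen)) { seen[val] = 1; packages++ } }
        key == "siblings"    { siblings = val + 0 }             # logical CPUs per package
        key == "cpu cores"   { cores = val + 0 }                # physical cores per package
        END {
            if (siblings > 0 && cores > 0) active = (siblings > cores)
            else if (packages > 0)         active = (logical > packages)  # fallback, single-core assumption
            else                           active = 0
            print (active ? "yes" : "no")
        }
    ' "$file"
}
```

If HT is disabled in the BIOS, the kernel sees fewer logical CPUs, so this prints `no` even though the `ht` flag is still listed.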
ASE Engines Unleashed • ASE Engines are Linux processes that schedule tasks • ASE Engines are multi-threaded • ASE Performs automatic load balancing of tasks • ASE has automatic task affinity management • Tasks tend to run on the same engine that they last ran on to improve locality • Linux does not have an ability to explicitly bind engines to processors. (Not Yet) • RH AS 2.1 has built in process-processor affinity • Internal benchmarks have demonstrated ASE’s ability to scale to 64 engines.
ASE Engines Unleashed • 2 configuration parameters control the number of engines • The sp_engine stored procedure can be used to “online” or “offline” engines dynamically • Tune the number of engines based on the “Engine Busy Utilization” values presented by sp_sysmon • Extra dataserver threads [RH 7.2 only] • For Posix aio support
Monitoring and Tuning Engines
• sp_sysmon’s kernel section reports utilization as shown
  sp_sysmon "00:02:00", kernel
Monitoring and Tuning Engines • Influencing kernel utilization • CPU bound tasks • I/O bound tasks • Tuning runnable process search count • I/O polling process count
Logical Process Management • Logical process management can be used to influence the priority of tasks or to do load balancing by using engine groups. • E.g. Housekeeper tuning for aggressive garbage collection
Logical Process Management • I/O bound tasks and CPU bound tasks can be balanced by using engine groups. E.g. a mixed-workload scenario running a resource-hogging reporting application and an online reservation application at the same time. Step 1: Create 2 engine groups and associate engines to engine groups
Logical Process Management
• Step 2: Display information about execution objects
  exec sp_showcontrolinfo
• Step 3: Create 2 execution classes, onl_reservation_execlass and reporting_execlass
  exec sp_addexeclass onl_reservation_execlass, MEDIUM, 0, onl_reservation_engroup
  exec sp_addexeclass reporting_execlass, MEDIUM, 0, reporting_engroup
Logical Process Management
• Step 4: Bind application logins to the respective execution class
  exec sp_bindexeclass "onl_sa", LG, NULL, onl_reservation_execlass
  exec sp_bindexeclass "reporting_sa", LG, NULL, reporting_execlass
• Step 5: Validate binding information
  exec sp_showexeclass
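Step 1 names the action but shows no commands. A minimal sketch of how the two engine groups might be populated follows. It assumes the `sp_addengine engine_number, engine_group` form, which creates the group when the first engine is added. The helper name `engine_group_sql` and the engine numbers are hypothetical, while the group names are the ones used in Step 3.

```shell
#!/bin/sh
# Print a T-SQL batch that adds each listed engine to one engine group.
# Usage: engine_group_sql GROUP ENGINE_NUMBER...
# engine_group_sql is a hypothetical helper; it only generates text.
engine_group_sql() {
    group="$1"; shift
    for engine in "$@"; do
        printf 'exec sp_addengine %s, %s\n' "$engine" "$group"
    done
    printf 'go\n'
}

# Example split (engine numbers are illustrative, not a recommendation):
#   engine_group_sql onl_reservation_engroup 0 1
#   engine_group_sql reporting_engroup 2 3
```

The generated batch can be reviewed first and then run through isql. Generating the text rather than executing it keeps the engine-to-group split easy to audit before it touches the server.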
Some more Engine related Tunes • Runnable process search count determines the number of times ASE engines loop looking for runnable tasks before yielding to the OS. • Default value is good in general • Tune this parameter only if all of the below are true • There are multiple applications running on the same machine and you require ASE to yield to the OS so that the other applications can be scheduled • The average cpu busy utilization is < 5%
Some more Engine related Tunes • I/O polling process count determines the number of processes ASE runs before checking for Network or Disk I/O. • Increase the value only if all of the following conditions are met • Total I/O checks are very high while Avg Disk I/Os per check or Avg Net I/Os per check is very low • Average CPU utilization is between 70% and 90%
Let’s take a Checkpoint • Hyperthreading is not equivalent to having a physical processor • Just clock speed does not give performance • Add more engines if Engine busy utilization is high • Logical process management for priority scheduling and mixed workloads
Let’s not forget “Memory” • Memory is critical to overall system performance • Every disk I/O saved is performance gained. • Tools to monitor and manage memory
Using up to 2.7G in 12.5.1 • Users can use 2.7G of memory out of the box by just changing the “max memory” parameter in ASE 12.5.1. Note that 2 shared memory segments have been created: one of 1.98G and one of 667M
Using up to 2.7G [12.5.0.3 and below] • In ASE 12.5.0.3 and below, to use up to 2.7G on Linux do the following • Max configurable shared memory • 2.7GB addressable memory
Large Memory Support (12.5.2) • Configuring more than 2.7GB of memory on 32 bit Linux Systems. • Configuring ASE to use the maximum memory available on the system. • Linux on IA32 with PAE can support up to 64G • AS 2.1 can go up to 16G • RHEL 3 can go up to 64G • Before ASE 12.5.2, ASE can access only 2.7G • To use additional memory, use FS devices
Advantages of Large Memory Support • Extra memory can now be used by data cache. • Reduced number of physical I/Os (read and write) • Efficient usage of memory • Better response time. • Improved throughput. • Efficient use of resources. • Flexible usage • Dynamically configurable • Auto Tuning
Customer (Implementer) Usage • Configuring /dev/shm • 8GB memory space reserved to create shm files. • Configuring ‘extended cache’ • ‘extended cache size’ – new config parameter • Size in ‘2K’ pages. • create, extend, delete are all dynamic • sp_configure, sp_helpconfig
Customer (Implementer) Usage
• Startup
  [Cache Manager]
  extended cache size = 102400 # 2G
• Run-time
  • creation
    sp_configure 'extended cache size', 102400
    go
  • extension (to 4G)
    sp_configure 'extended cache size', 204800
    go
  • deletion
    sp_configure 'extended cache size', 0
    go
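Because 'extended cache size' is given as a page count, it helps to convert between pages and a target size before changing it. The sketch below assumes the 2K page unit stated on the previous slide; the helper names are hypothetical. Under that assumption 102400 pages works out to 200 MB rather than the 2G in the annotation above, so the unit is worth confirming against the sp_configure documentation for your ASE version before copying the values.

```shell
#!/bin/sh
PAGE_KB=2   # assumed unit: the slides give the size in 2K pages

# Number of pages needed for a cache of the given size in megabytes.
pages_for_mb() {
    echo $(( $1 * 1024 / PAGE_KB ))
}

# Size in megabytes represented by a page count (integer division).
mb_for_pages() {
    echo $(( $1 * PAGE_KB / 1024 ))
}
```

For example, `pages_for_mb 2048` gives the page count for a 2 GB extended cache under the 2K assumption, and `mb_for_pages 102400` shows what the value on the slide would represent.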
Performance Comparison • Order Processing benchmark • 144 warehouse 10G database • Mixed Read and write • Multiple named caches • 2.7G to primary data caches • 3 G of Extended cache • Devices used • Raw • File system
Monitoring Memory • View memory parameters [in KB] % free -k % cat /proc/meminfo
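When sizing ASE memory, only a few of the /proc/meminfo fields are usually of interest. A minimal sketch that extracts them in megabytes follows; the helper name `meminfo_mb` is hypothetical, and it assumes the `MemTotal:`, `MemFree:` and `Cached:` lines are reported in kB, as on typical Linux kernels.

```shell
#!/bin/sh
# Print total, free and page-cache memory in MB from a meminfo-style file.
# meminfo_mb is a hypothetical helper; it reads /proc/meminfo by default.
meminfo_mb() {
    file="${1:-/proc/meminfo}"
    awk '
        $1 == "MemTotal:" || $1 == "MemFree:" || $1 == "Cached:" {
            sub(":", "", $1)                  # drop the trailing colon from the label
            printf "%s=%dMB\n", $1, $2 / 1024 # values are in kB; truncate to whole MB
        }
    ' "$file"
}
```

The `Cached` figure is the one to watch when file system devices are used as a secondary cache, since it shows how much RAM the OS is devoting to file system caching.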
Configuring ASE memory • sp_configure "max memory" to tune memory configured for ASE. [Dynamic option since 12.5] • Tune this parameter based on ASE resource requirements • Remaining memory does not go to “default data cache” starting with ASE 12.5 • Do I have extra memory ?
Monitoring and Tuning ASE Parameters • To tune various ASE memory parameters: sp_monitorconfig "all" • Watch out if the Reused column shows “yes”.
Memory Tuning Tips
• Lock Shared memory
• Guarantees shared memory to be in RAM
• Improves performance
• Tune through sp_configure interface
• Static option
• Validate through message in errorlog
  11:23:08.33 kernel Locking shared memory into physical memory.
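The errorlog validation can be scripted so it runs after every restart. A minimal sketch follows; the helper name `shm_locked` and the errorlog path in the usage line are hypothetical, and the match string is the message text shown above.

```shell
#!/bin/sh
# Succeed (exit status 0) if the given ASE errorlog contains the message
# confirming that shared memory was locked into physical memory.
# shm_locked is a hypothetical helper name.
shm_locked() {
    errorlog="$1"
    grep -q 'Locking shared memory into physical memory' "$errorlog"
}

# Usage (path is illustrative):
#   shm_locked /path/to/errorlog && echo "shared memory is locked"
```

Because the option is static, the message only appears after ASE has been restarted with the new setting.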
Named Caches • Sizing Caches key to improved performance • Cache Partitions improve performance and scalability • How ? • Create the Named cache with required size • Bind the cache to the hot table
Named Caches • What to bind • Transaction Log • Tempdb • Hot objects • Hot indexes • When to use Named caches • sp_sysmon “Data Cache Management” section reports > 10% spinlock contention • sp_sysmon provides the recommendation to do so. • Hot lookup tables, frequently used indexes, tempdb activity, high transaction throughput applications are all good scenarios for using named caches. • How to determine what is hot ?? • Cache Wizard
Cache Partitions • Cache Partitions help improve scaling • Decomposes the cache spinlock • Recommendation is to use as many cache partitions as there are engines. • How
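The "as many partitions as engines" recommendation needs one adjustment in practice if the partition count has to be a power of two. The sketch below assumes ASE accepts only powers of two from 1 to 64 for the `cache_partition` option of sp_cacheconfig (worth checking in the documentation for your version). The helper name `cache_partitions_for` is hypothetical, and rounding up rather than down is a design choice, not an ASE rule.

```shell
#!/bin/sh
# Smallest power of two that is >= the engine count, capped at 64.
# cache_partitions_for is a hypothetical helper name.
cache_partitions_for() {
    engines="$1"
    p=1
    while [ "$p" -lt "$engines" ] && [ "$p" -lt 64 ]; do
        p=$(( p * 2 ))
    done
    echo "$p"
}
```

For a 6-engine server this suggests 8 partitions; the value would then be supplied as the `cache_partition=N` argument to sp_cacheconfig for the cache in question. Rounding up favors lower spinlock contention at the cost of slightly smaller partitions.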
Named Caches Vs Cache Partitions • Which one should I use ? • Named caches • Easily identifiable hot objects, indexes • Transaction Log • Tempdb • Cache Partitions • Complex applications with many objects • Named cache with heavy spinlock contention • Both • Best fit is to have named caches and cache partitions
Auto tuning of cache partitions • To reduce spinlock contention • Pre 12.5.2, default cache partitions = 1 • With 12.5.2, auto tuning of cache partitions • Number of engines is greater than or equal to 2. • Only for default data cache • When ‘local’ and ‘global’ = DEFAULT and size > 100M. • sp_helpcache, sp_cacheconfig should show the new tuned cache partitions. • Configuration file will not be written with the new value. • New message added in errorlog.
Let’s take a Checkpoint • Every disk I/O saved is performance gained • sp_monitorconfig to tune procedure cache, worker threads etc • Named caches help improve performance and scalability • Cache partitions + named caches are the best combination • If the application has a large number of objects, use as many cache partitions as engines • Tempdb, transaction log, hot indexes, hot objects are ideal candidates for named caches
Reading ’n’ Writing [Disk I/O] • I/O avoided is Performance Gained • ASE buffer cache has algorithms to avoid/delay/Optimize I/Os whenever possible • LRU replacement • MRU replacement • Tempdb writes delayed [Improved select into performance] • Write ahead logging [Only Log is written immediately] • Group commit to batch Log writes • Coalescing I/O using Large buffer pools • UFS support • Raw devices and File systems supported • Asynchronous I/O is supported
Raw Devices • Raw devices provide exceptional write performance and good read performance • Recommended for Transaction Log
File Systems • File System Caching can be effectively used to improve performance (especially reads). • File Systems as Secondary cache for ASE • Enables ASE to use > 2.7GB • Very useful as pages not fitting in ASE cache are accommodated in FS Cache • Helps avoid expensive disk I/O • Many File System Flavors on Linux • extfs, xfs, IBM's JFS, and ReiserFS • Recommended file system • EXT2 • EXT3 with journaling disabled
What file system to use • EXT3 with journaling disabled 24%
File system vs Raw devices
• When to use File system
  • Frequent reads
  • Infrequent writes
  • E.g. tempdb [with 'dsync' off] using the sp_deviceattr stored procedure
    1> sp_deviceattr "tmpdbdev","dsync","false"
    2> go
    'dsync' attribute of device 'tmpdbdev' turned 'off'. Restart Adaptive Server for the change to take effect.
• When to use Raw devices
  • Frequent writes
  • Infrequent reads
  • E.g. Transaction log
• How does one compare against the other ?