ASE133: Performance Tuning of ASE with special emphasis on Linux. How to make Penguins Fly. Girish Vaitheeswaran, Staff Software Engineer, Sybase, Inc. girish@sybase.com
Contents • No Introduction to Linux • System Performance 101 • De-mystifying Processors • Let’s not forget Memory • Reading and Writing [Disk I/O] • Sending and Receiving [Network I/O] • Conclusion
No Introduction to Linux • Does Linux need an introduction? • Monolithic kernel (not a microkernel-based architecture) • Multitasking, secure, virtual memory OS • Different flavors of Linux available • Red Hat, SuSE, Red Flag etc. • Supported on various hardware platforms • Intel (Xeon, Pentium, Itanium), AMD (Athlon), IBM, SGI, Sun etc. • Importantly, appealing to bean counters
Linux Trends • Consolidation from big boxes to commodity hardware • Migration to 4-CPU P3s with 4/8GB RAM running some flavor of Linux • Exploiting Moore’s law • Moving from 750MHz CPUs to 1.5GHz CPUs with hyperthreading • Sizing based on increased clock speeds
ASE on Linux • ASE on Linux supported since 11.0.3.3 • ASE 12.5.0.3 supports RH 2.1 • ASE 12.5.1 will support RH 3.0 • SuSE/RedFlag supported as well. • More info available at www.sybase.com/linux/ase
System Performance 101 • How do I get my hardware and software to do more work without buying more hardware? • System performance mainly depends on 3 areas • Processor • Memory • I/O
De-mystifying Processors • Classes of processors • Xeon with HT Technology • Pentium with HT Technology • What is hyper threading • Doing more work in each clock cycle by providing thread level parallelism in each processor • Advantages of hyper threading • Support for multi-threaded code and multi-tasking operations through better utilization of processor resources. • Multiple threads/tasks running simultaneously to increase the number of transactions that can be executed. • Improved reaction and response times for end users. • Increased number of users a server system can support.
Consolidation • Consolidation Guidelines • One 1.5GHz CPU does not necessarily yield the same performance as two 750MHz CPUs. Parameters to account for are • L1/L2/L3 cache sizes (internal CPU caches) • Memory latencies • Cycles per instruction • Number of engines to run • On HT-enabled processors, 1 physical CPU can run 2 engines • An HT-enabled processor is not equivalent to 2 physical processors
Processor Tips • Identifying Number of processors/Clock speed/Hyper threading % cat /proc/cpuinfo
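The fields of interest in /proc/cpuinfo can be pulled out with a few one-liners. This is a minimal sketch: the function names are hypothetical, and it assumes the kernel emits `processor`, `physical id` and `cpu MHz` lines (some kernels and virtual machines omit `physical id`, in which case the physical count comes back as 0).

```shell
# Helpers for reading a cpuinfo-style file.
# The file argument defaults to /proc/cpuinfo; function names are illustrative only.

count_logical_cpus() {
    # One "processor : N" stanza exists per logical processor.
    grep -c '^processor' "${1:-/proc/cpuinfo}"
}

count_physical_cpus() {
    # Distinct "physical id" values identify physical packages (assumed present).
    grep '^physical id' "${1:-/proc/cpuinfo}" | sort -u | wc -l | tr -d ' '
}

cpu_clock_mhz() {
    # Clock speed reported for the first processor listed.
    awk -F': *' '/^cpu MHz/ { print $2; exit }' "${1:-/proc/cpuinfo}"
}
```

Run with no argument to inspect the live system, e.g. `count_logical_cpus`; more logical than physical processors is a first hint that hyperthreading may be active.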
Processor Tips • Determining cpu usage (mpstat, top, vmstat) % mpstat 5 5
Processor Tips • Enabling Hyperthreading • Enabled by default in most processors • Note that “ht” appearing in the cat /proc/cpuinfo output does not by itself mean HT is enabled. • Disabling Hyperthreading • Can be disabled during BIOS setup • Disabling has been useful in cases where there are multiple engines and the load on the CPUs is close to 100%
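Since the “ht” flag only shows capability, one way to check whether HT is actually active is to compare logical siblings against cores per package. This is a sketch under an assumption: it relies on the `siblings` and `cpu cores` fields that current kernels print in /proc/cpuinfo, which the slides do not mention and which older kernels may not provide. The function name is hypothetical.

```shell
# Report whether hyperthreading appears active: "yes", "no" or "unknown".
# Assumes "siblings" and "cpu cores" lines exist in the cpuinfo-style file.
ht_enabled() {
    awk -F': *' '
        /^siblings/  { siblings = $2 }
        /^cpu cores/ { cores = $2 }
        END {
            # Without both fields no conclusion can be drawn.
            if (siblings == "" || cores == "") { print "unknown"; exit }
            # More logical siblings than cores means each core runs 2+ threads.
            print ((siblings + 0 > cores + 0) ? "yes" : "no")
        }' "${1:-/proc/cpuinfo}"
}
```

Calling `ht_enabled` with no argument reads /proc/cpuinfo; "unknown" means the kernel did not report the fields and the BIOS setting has to be checked instead.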
ASE Engines Unleashed • ASE Engines are Linux processes that schedule tasks • ASE Engines are multi-threaded • ASE Performs automatic load balancing of tasks • ASE has automatic task affinity management • Tasks tend to run on the same engine that they last ran on to improve locality • Linux does not have an ability to explicitly bind engines to processors. (Not Yet) • RH AS 2.1 has built in process-processor affinity • Internal benchmarks have demonstrated ASE’s ability to scale to 64 engines.
ASE Engines Unleashed • 2 configuration parameters control the number of engines • The sp_engine stored procedure can be used to “online” or “offline” engines dynamically • Tune the number of engines based on the “Engine Busy Utilization” values presented by sp_sysmon • Extra dataserver threads [RH 7.2 only] • For Posix aio support
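As an illustration of the engine controls above, the sketch below prints an isql batch rather than running it, so it can be reviewed first. It assumes the two parameters meant are `max online engines` and `number of engines at startup` (the slide does not name them), and the function name and engine counts are hypothetical.

```shell
# Print an isql batch that sets the engine limits and brings one more engine online.
# Assumption: the two parameters are "max online engines" and
# "number of engines at startup"; the slide does not name them.
engine_config_sql() {
    max_engines=$1
    startup_engines=$2
    # Startup engines cannot exceed the configured maximum.
    if [ "$startup_engines" -gt "$max_engines" ]; then
        echo "startup engines ($startup_engines) exceeds max online engines ($max_engines)" >&2
        return 1
    fi
    cat <<EOF
sp_configure "max online engines", $max_engines
go
sp_configure "number of engines at startup", $startup_engines
go
sp_engine "online"
go
EOF
}
```

A possible use is `engine_config_sql 4 2 | isql -Usa -SMYSERVER`, where MYSERVER is a placeholder server name; check “Engine Busy Utilization” in sp_sysmon before and after the change.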
Monitoring and Tuning Engines • sp_sysmon’s kernel section reports utilization as shown • sp_sysmon "00:02:00", kernel
Monitoring and Tuning Engines • Influencing kernel utilization • CPU bound tasks • I/O bound tasks • Tuning runnable process search count • I/O polling process count
Logical Process Management • Logical process management can be used to influence the priority of tasks or to do load balancing by using engine groups. • E.g. housekeeper tuning for aggressive garbage collection
Logical Process Management • I/O bound tasks and cpu bound tasks can be balanced by using engine groups. E.g. Mixed work load scenario running a resource hogging reporting application and an Online reservation at the same time. Step1 : Create 2 engine groups and associate engines to engine groups
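Step 1 can be sketched as follows. The helper prints `sp_addengine` calls, which create the engine group on first use and add each engine to it. The group names are the ones used in the later steps; the engine numbers and the function name are hypothetical.

```shell
# Print sp_addengine calls that place the given engines into one engine group.
# Usage: engine_group_sql GROUP ENGINE [ENGINE ...]
# Engine numbers passed in are examples, not values taken from the slides.
engine_group_sql() {
    group=$1
    shift
    for engine in "$@"; do
        printf 'exec sp_addengine %s, %s\ngo\n' "$engine" "$group"
    done
}
```

For a 4-engine server one split could be `engine_group_sql onl_reservation_engroup 0 1` and `engine_group_sql reporting_engroup 2 3`, with the output piped into isql.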
Logical Process Management • Step 2 : Display information about execution objects exec sp_showcontrolinfo • Step 3 : Create 2 execution classes onl_reservation_execlass and reporting_execlass exec sp_addexeclass onl_reservation_execlass, MEDIUM, 0, onl_reservation_engroup exec sp_addexeclass reporting_execlass, MEDIUM, 0, reporting_engroup
Logical Process Management • Step 4 : Bind application logins to the respective execution class exec sp_bindexeclass "onl_sa", LG, NULL, onl_reservation_execlass exec sp_bindexeclass "reporting_sa", LG, NULL, reporting_execlass • Step 5 : Validate binding information exec sp_showexeclass
Some more Engine related Tunes • Runnable process search count determines the number of times ASE engines loop looking for runnable tasks before yielding to the OS. • Default value is good in general • Tune this parameter only if all of the below are true • There are multiple applications running on the same machine and you require ASE to yield to the OS so that the other applications can be scheduled • The average cpu busy utilization is < 5%
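The two conditions above can be written as a small guard before touching the parameter. This is only a sketch: the function names are hypothetical, the inputs (number of other applications on the machine, average CPU busy percentage from sp_sysmon) are supplied by hand, and the new value is left to the administrator.

```shell
# Succeed only when both conditions from the slide hold:
#   1. other applications share the machine, and
#   2. average CPU busy utilization is below 5%.
# Usage: should_tune_rpsc OTHER_APP_COUNT AVG_CPU_BUSY_PCT
should_tune_rpsc() {
    other_apps=$1
    avg_busy=$2
    [ "$other_apps" -gt 0 ] && awk -v b="$avg_busy" 'BEGIN { exit !(b < 5) }'
}

# Print the sp_configure call for a chosen value (the value is the caller's decision).
rpsc_sql() {
    printf 'sp_configure "runnable process search count", %s\ngo\n' "$1"
}
```

For example, `should_tune_rpsc 1 3.5 && rpsc_sql 100` prints the batch only when tuning is justified; the value 100 is a placeholder, not a recommendation.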
Some more Engine related Tunes • I/O polling process count determines the number of processes ASE runs before checking for Network or Disk I/O. • Increase the value only if all the following conditions are met • Total I/O checks is very high • Avg Disk I/Os per check or Avg Net I/Os per check is very low • Avg CPU utilization is between 70-90%
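Only the CPU condition has a numeric threshold on the slide, so the sketch below checks just that one and leaves the "very high" and "very low" judgments on the sp_sysmon I/O figures to the reader. Function names and the example value are hypothetical.

```shell
# Succeed when average CPU utilization lies in the 70-90% band named on the slide.
cpu_in_io_polling_band() {
    awk -v c="$1" 'BEGIN { exit !(c >= 70 && c <= 90) }'
}

# Print the sp_configure call for a chosen value (the value is the caller's decision).
io_polling_sql() {
    printf 'sp_configure "i/o polling process count", %s\ngo\n' "$1"
}
```

A possible use is `cpu_in_io_polling_band 82 && io_polling_sql 20`, where 20 is a placeholder value to be validated with another sp_sysmon run.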
Let’s take a Checkpoint • Hyperthreading is not equivalent to having a physical processor • Just clock speed does not give performance • Add more engines if Engine busy utilization is high • Logical process management for priority scheduling and mixed workloads
Let’s not forget “Memory” • Memory is a very critical parameter to obtain overall system performance • Every disk I/O saved is performance gained. • Tools to monitor and manage memory
Using Large Memory • Users can use 2.7G of memory out of the box in ASE 12.5.1 by just changing the “max memory” parameter. • Note that 2 shared memory segments are created, one of 1.98G and one of 667M.
Using Large memory [12.5.0.3 and below] • In ASE 12.5.0.3 and below, to use Large memory on Linux do the following • Max configurable shared memory • 2.7GB addressable memory
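The exact steps from this slide are not reproduced here. As a related check, the sketch below tests whether the kernel's System V shared memory limit is large enough for a given ASE size. It assumes the limit in question is `kernel.shmmax` (readable at /proc/sys/kernel/shmmax); the function name is hypothetical.

```shell
# Succeed when the shared memory segment limit is at least REQUIRED_BYTES.
# Usage: shmmax_allows REQUIRED_BYTES [FILE]
# FILE defaults to /proc/sys/kernel/shmmax (assumed to be the relevant limit).
shmmax_allows() {
    required_bytes=$1
    shmmax_file=${2:-/proc/sys/kernel/shmmax}
    # awk compares as floating point, so very large limits do not overflow.
    awk -v need="$required_bytes" '
        NR == 1 { ok = ($1 + 0 >= need + 0) }
        END     { exit !ok }' "$shmmax_file"
}
```

For instance, `shmmax_allows 2147483648 || echo "raise kernel.shmmax"` flags a limit below 2 GB; raising it is done as root, for example with `sysctl -w kernel.shmmax=<bytes>`.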
Monitoring Memory • View memory parameters [in KB] % free -k % cat /proc/meminfo
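Individual values can be picked out of /proc/meminfo for scripting. A minimal sketch, assuming the usual `Key:   value kB` line layout; the function name is hypothetical.

```shell
# Print the value (in kB) of one /proc/meminfo key such as MemTotal or MemFree.
# Usage: meminfo_kb KEY [FILE]   (FILE defaults to /proc/meminfo)
meminfo_kb() {
    awk -v key="$1:" '$1 == key { print $2; exit }' "${2:-/proc/meminfo}"
}
```

For example, `meminfo_kb MemFree` prints the free memory in kB, which can be compared against the memory configured for ASE.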
Configuring ASE memory • sp_configure “max memory” to tune memory configured for ASE. [Dynamic option since 12.5] • Tune this parameter based on ASE resource requirements • Remaining memory does not go to “default data cache” starting ASE 12.5 • Do I have extra memory ?
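A common slip when setting “max memory” is the unit: the sketch below assumes the value is given in 2 KB pages, so 1 MB corresponds to 512 pages. The function name and the example size are hypothetical.

```shell
# Print the sp_configure call that sets "max memory" from a size in megabytes.
# Assumption: the parameter is expressed in 2 KB pages (1 MB = 512 pages).
max_memory_sql() {
    mb=$1
    pages=$((mb * 512))
    printf 'sp_configure "max memory", %s\ngo\n' "$pages"
}
```

`max_memory_sql 2048` produces the call for a 2 GB limit (1048576 pages); the output can be piped to isql once reviewed.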
Monitoring and Tuning ASE Parameters • To tune various ASE memory parameters sp_monitorconfig "all" • If the Reused column shows “yes”, watch out.
Memory Tuning Tips • Lock Shared memory • Guarantees shared memory to be in RAM • Improves performance • Tune through sp_configure interface • Static option • Validate through message in errorlog 11:23:08.33 kernel Locking shared memory into physical memory.
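The setting and its validation can be sketched as below: one helper prints the `sp_configure` call and another searches the errorlog for the message quoted on the slide. The function names and the errorlog path in the usage note are hypothetical.

```shell
# Print the call that enables the static "lock shared memory" option.
lock_shm_sql() {
    printf 'sp_configure "lock shared memory", 1\ngo\n'
}

# Succeed when the errorlog contains the confirmation message from the slide.
# Usage: shared_memory_locked ERRORLOG_PATH
shared_memory_locked() {
    grep -q 'Locking shared memory into physical memory' "$1"
}
```

Because the option is static, restart ASE after applying it, then run something like `shared_memory_locked /path/to/errorlog && echo locked`.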
Named Caches • Sizing Caches key to improved performance • Cache Partitions improve performance and scalability • How ? • Create the Named cache with required size • Bind the cache to the hot table
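The two steps above can be sketched with `sp_cacheconfig` and `sp_bindcache`. The helper prints the batch for review; the cache, database and table names used in the usage note are placeholders, and the function name is hypothetical.

```shell
# Print a batch that creates a named cache and binds one table to it.
# Usage: named_cache_sql CACHE SIZE DATABASE TABLE
named_cache_sql() {
    cache=$1
    size=$2
    db=$3
    table=$4
    cat <<EOF
sp_cacheconfig "$cache", "$size"
go
sp_bindcache "$cache", "$db", "$table"
go
EOF
}
```

For example, `named_cache_sql hot_cache 100M salesdb orders` creates a 100 MB cache and binds the hypothetical `orders` table in `salesdb` to it.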
Named Caches • What to bind • Transaction Log • Tempdb • Hot objects • Hot indexes • When to use Named caches • sp_sysmon “Data Cache Management” section reports > 10% spinlock contention • sp_sysmon provides the recommendation to do so. • Hot lookup tables, frequently used indexes, tempdb activity, high transaction throughput applications are all good scenarios for using named caches. • How to determine what is hot ?? • Cache Wizard
Cache Wizard • A new option to sp_sysmon “cache wizard” has been added in 12.5.1 to help in • Identifying hot objects in a cache • Evaluating effectiveness of Large buffer pools • Sizing data caches. • Evaluating effectiveness of APF
Cache Wizard : Usage • Usage sp_sysmon interval [, cache wizard [, top_N [, filter] ] ] • Ranking Criterion • LogicalReads / sec • Always in decreasing order of PhysicalReads / sec • Filter clause • Caches containing ‘filter’ pattern
Cache Wizard : Examples • sp_sysmon '00:05:00', 'cache wizard', '2', 'default data cache' • [Output for default data cache, not reproduced here: Buffer Pool Information, Object Statistics, Cache Occupancy Information]
Cache Wizard : Recommendations • Identifying “hot” objects [from the Object Statistics and Cache Occupancy Information sections for default data cache] • If Cache Hit% is low, then for each object • If LR/sec is high and Obj hit% is low, move the object to a new cache • OR add memory to the cache.
Cache Wizard : Recommendations • Effectiveness of large buffer pools, APF [from the Buffer Pool Information section for default data cache] • If Pool usage is high and Pool Hit% is low, add memory to the buffer pool • APF% effectiveness shows how many of the pages brought in by APF actually got used. • If Pool hit % is low and APF effectiveness is high, consider increasing the APF percentage.
Cache Partitions • Cache Partitions help improve scaling • Decomposes the cache spinlock • Recommendation is to use as many cache partitions as there are engines. • How
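One way to apply the "as many partitions as engines" guideline is sketched below. It assumes the partition count must be a power of two no larger than 64, so the helper rounds the engine count down to the nearest power of two; rounding down rather than up is a choice made here, and the function names are hypothetical.

```shell
# Largest power of two that is <= ENGINES and <= 64.
# Assumption: cache partition counts must be powers of two, at most 64.
partitions_for_engines() {
    engines=$1
    p=1
    while [ $((p * 2)) -le "$engines" ] && [ $((p * 2)) -le 64 ]; do
        p=$((p * 2))
    done
    echo "$p"
}

# Print the sp_cacheconfig call that partitions CACHE for ENGINES engines.
# Usage: cache_partition_sql CACHE ENGINES
cache_partition_sql() {
    printf 'sp_cacheconfig "%s", "cache_partition=%s"\ngo\n' \
        "$1" "$(partitions_for_engines "$2")"
}
```

For a 6-engine server, `cache_partition_sql "default data cache" 6` prints a call with `cache_partition=4`.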
Named Caches Vs Cache Partitions • Which one should I use ? • Named caches • Easily identifiable hot objects, indexes • Transaction Log • Tempdb • Cache Partitions • Complex applications with many objects • Named cache with heavy spinlock contention • Both • Best fit is to have named caches with cache partitions
Let’s take a Checkpoint • Every disk I/O saved is performance gained • sp_monitorconfig to tune procedure cache, worker threads etc. • Named caches help improve performance and scalability • Cache partitions + named caches are the best combination • If the application has a large number of objects, have as many cache partitions as engines • Tempdb, transaction log, hot indexes, hot objects are ideal candidates
Reading ’n’ Writing [Disk I/O] • I/O avoided is Performance Gained • ASE buffer cache has algorithms to avoid/delay/Optimize I/Os whenever possible • LRU replacement • MRU replacement • Tempdb writes delayed [Improved select into performance] • Write ahead logging [Only Log is written immediately] • Group commit to batch Log writes • Coalescing I/O using Large buffer pools • UFS support • Raw devices and File systems supported • Asynchronous I/O is supported
Raw Devices • Raw devices provide exceptional write performance and good read performance • Recommended for Transaction Log
File Systems • File System Caching can be effectively used to improve performance (especially reads). • File Systems as Secondary cache for ASE • Enables ASE to use > 2.7GB • Very useful as pages not fitting in ASE cache are accommodated in FS Cache • Helps avoid expensive disk I/O • Many File System Flavors on Linux • extfs, XFS, IBM's JFS and ReiserFS • Recommended file system • EXT2 • EXT3 with journaling disabled
What file system to use • EXT3 with journaling disabled 24%
File system vs Raw devices • When to use File system • Frequent reads • Infrequent writes • E.g. tempdb [with 'dsync' off] using the sp_deviceattr stored procedure 1> sp_deviceattr "tmpdbdev", "dsync", "false" 2> go 'dsync' attribute of device 'tmpdbdev' turned 'off'. Restart Adaptive Server for the change to take effect. • When to use Raw devices • Frequent writes • Infrequent reads • E.g. Transaction log • How does one compare against the other ?
And the winner is…. • Bottom line : Use mix of File System and Raw devices. 60%
Asynchronous I/O • Enables ASE to service user tasks after I/O is issued • The recommended scheme for doing I/O • AIO supported on Raw devices and File Systems on Linux • Enabled by default • Posix aio and Kernel Supported AIO [RH AS 2.1] are supported • What should I use ??
Asynchronous I/O tunes • fs.aio.max-size specifies the maximum block size performed by one aio read or aio write • For optimal create database, alter database performance this should be tuned to 1048576 (1MB) • To tune this parameter
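The command from this slide is not reproduced here; the sketch below only builds a plausible `sysctl` invocation. The value is 1 MB computed as 1024 × 1024 = 1048576 bytes. The key name is taken as spelled on the slide, which is an assumption: the spelling can differ between kernels, so confirm it with `sysctl -a | grep aio` first. The function name is hypothetical.

```shell
# Print (not run) a sysctl command that sets the AIO maximum request size to 1 MB.
# Usage: aio_max_size_cmd [KEY]
# KEY defaults to the spelling on the slide; verify it on the target kernel.
aio_max_size_cmd() {
    key=${1:-fs.aio.max-size}
    echo "sysctl -w $key=$((1024 * 1024))"
}
```

Review the printed command and run it as root; to keep the setting across reboots the same key and value would normally also go into /etc/sysctl.conf.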