REP707 Benchmarking Sybase Large Scale Replicated Systems for Success
Doug Levy - Sr. Technical Lead - Mail Operations
Kobi Lifshitz - Sr. Technical Advisor - Mail Operations
Dongsheng Liu - Benchmarking & Disk Subsystem Specialist - Mail Operations
dlevy64671@AOL.com
Benchmarking Sybase Replicated Systems Introductions • Doug Levy • Sr. Technical Lead, DBA Team Lead - Mail Operations, AOL • Kobi Lifshitz • Sr. Technical Advisor for Mail Operations - AOL • Dongsheng Liu • Benchmarking & Disk Subsystem Specialist - Mail Operations, AOL
Benchmarking Sybase Replicated Systems Table of Contents • Preface & Purpose • Base Assumptions • Why and when to Benchmark • Approach To Effective Benchmarking • Benchmark Case Study
Benchmarking Sybase Replicated Systems Preface • WHY would you run your own benchmarks when… • Vendors already perform industry standard TPC-C/H/R/W? • Benchmarks can be very expensive to set up and run? • Benchmarks can be a very time consuming effort? • It is often very difficult to mimic a production environment?
Benchmarking Sybase Replicated Systems Preface - Definition of Benchmark • Webster.com on-line resource defines benchmark as follows: • A point of reference from which measurements may be made. • Something that serves as a standard by which others may be measured or judged. • A standardized problem or test that serves as a basis for evaluation or comparison (as of computer system performance).
Benchmarking Sybase Replicated Systems Preface - Total Cost of Operations (TCO) • TCO from a systems (macro) perspective: Training (Developers, Operations support); Storage Environment (Hardware Capacity, Performance/# of Disks, RAID, Mirroring, Striping, Machine Class, Storage Choices); Operations (Software Maintenance, Rollout, Support); Operating System (3rd Party, Home grown); Connectivity (Type of Network, Speed of Network, Switches & Routers)
Benchmarking Sybase Replicated Systems Preface - Total Cost of Operations (TCO) • TCO from a database (micro) perspective: Security/Data Integrity (Data Protection, Redundancy, Firewalls, Encryption, etc.); Host/ASE Configuration (Memory/CPU, Other Resource Allocation, "Compartmentalization"); Transaction Profile (Rates, Data Transfer Volumes, Resource Contention); Transportability (Scalability - Unit of scaling, Ease of scaling; Platform Variety, Data Transparency)
Benchmarking Sybase Replicated Systems Preface - Cost/Benefit Analysis • PURPOSE of cost/benefit analysis: evaluate tradeoffs between resources and processes. • Benefit = throughput/capacity • TCO = cost per unit effort [Chart: resource and process curves shifting from the current equilibrium to a new equilibrium.]
Benchmarking Sybase Replicated Systems Preface - Cost/Benefit Analysis • COST/BENEFIT analysis - Different Models [Charts: cost and benefit curves with differing equilibrium points for the different models.]
Benchmarking Sybase Replicated Systems Purpose of this presentation • In today’s business environment it is imperative to have a mechanism for assessing the costs and benefits of any given strategy; a key mechanism for this assessment is the benchmark.
Benchmarking Sybase Replicated Systems Purpose of this presentation • In this presentation we will provide a best practices framework for benchmarking large scale replicated systems, and illustrate how to implement this framework with a case study from our own experience.
Benchmarking Sybase Replicated Systems Base Assumptions • SYSTEM is large scale: implies a large install base; implies some degree of application uniformity; implies some degree of install base diversity; implies a synergistic relationship with vendors. • SYSTEM is replicated: implies high availability needs; implies a 7 x 24 operating environment. • RESOURCES are limited: no permanent benchmark environment; limited time for benchmarking; limited human resources for benchmarking. • FUNCTIONALITY is established via QA: benchmark and application are loosely coupled. • PRODUCTION READINESS is established by benchmark: focus on ASE servers and repservers, ASE server hosts, and back end storage devices; evaluated for performance, fault tolerance, production viability, and TCO.
Benchmarking Sybase Replicated Systems Why Benchmark? • DEV/QA is not production: often a smaller, more resource constrained environment; may not have some components - replication for example; host/disk resources may differ from production; ASE/RS versions/configuration may not be like production. • HELPS minimize unexpected production downtime: find out about hardware/software faults in advance; provide a reproducible test case to vendors. • HELPS avoid emergency rollouts and capacity issues: determine system storage limits in advance; establish useful 'fullness models'; establish useful 'degradation models'. • HELPS avoid performance issues in production. • HELPS determine system resource limits in advance: determine system bottlenecks in advance; determine optimal configurations in advance; establish connection/network capacity limits; establish memory/cache utilization capacity limits; establish i/o channel/service time capacity limits; establish CPU utilization capacity limits.
Benchmarking Sybase Replicated Systems When To Benchmark? • WHENEVER system characteristics are expected to change: usage patterns; number of simultaneous users; system components beyond your direct control. • WHENEVER you change: host or storage hardware/software; host or storage vendor; major host or storage OS revisions; ASE or Repserver version(s); underlying database schema or stored procedures; application functionality; ASE/RS/Host settings (for P&T or other reasons).
Benchmarking Sybase Replicated Systems Approach To Effective Benchmarking • POSSESS a highly automated installation process. • Allows rapid deployment. • Takes advantage of available hardware. • Optimizes human resources. • Minimizes TCO. • DEFINE operational parameters. • IDENTIFY benchmark goals. • Goals need to be well defined and clearly stated. • GENERATE an effective cost/benefit model. • Evaluate pursuing stated goals based on cost/benefit analysis.
Benchmarking Sybase Replicated Systems Approach To Effective Benchmarking • DEFINE the most critical evaluation parameters. • DEFINE benchmark metrics - in general, metrics collection should be evaluated case by case. • Metrics depend on a defined goal. • Each environment has different needs. • THERE do, however, appear to be some higher level metrics that are almost universally applicable…
Benchmarking Sybase Replicated Systems Approach To Effective Benchmarking • Host/OS/Network Metrics: CPU (Average, Max); Memory (Utilization, Swapping); Network (Bandwidth, Balancing, Response times)
Benchmarking Sybase Replicated Systems Approach To Effective Benchmarking • Back End Storage Metrics: I/O Rates (Average IOps, Maximum IOps, Service Times); Cache Utilization (Average Utilization, Saturation Limits); Channel Load (Load Balancing, Throughput); Disk Activity (Load Distribution, Hot Spots)
Benchmarking Sybase Replicated Systems Approach To Effective Benchmarking • Sybase Server Metrics: Engines (Average Utilization, Max Utilization); Cache (Data, Procedure, ULC); Throughput & Response Time (R/W/U/D times, TPS, Total Volume)
Benchmarking Sybase Replicated Systems Approach To Effective Benchmarking • Database Metrics: Fullness (Capacity, Fragmentation); Data Profiles (Size Profile - data size changes over time; Distribution Profile - data distribution changes over time; Optimizer Profile - costing changes over time; I/O Profile - changes in i/o characteristics over time)
Benchmarking Sybase Replicated Systems Approach To Effective Benchmarking • Replication Metrics: Replicate Activity (CPU, Memory); Transaction Log Profile (Average Sizes, Max Sizes); Latency (Average/Max, Queue Depths)
Benchmarking Sybase Replicated Systems Approach To Effective Benchmarking • Replication Metrics & Latency Issues: Replicate queue depths (DSI performance; maintenance user - replicate ASE/DB); Primary queue depths (SQM/SQT performance; RSI performance); Transaction logs (Repagent performance; long running xacts)
Benchmarking Sybase Replicated Systems Approach To Effective Benchmarking • SELECTING Monitoring Tools • Identify the right tools for the job. • Choose the tools that cover all key metrics. • Build your own tools if you have to. • Parsing scripts may also be needed. • Use the same monitoring tools for benchmarking as production. • Provides comparison between benchmark and production. • Provides a convergence between benchmark and production. • Allows comparison between expected and actual results. • Minimizes TCO. • Takes advantage of existing tools. • Reduces effort involved in benchmarking.
Benchmarking Sybase Replicated Systems Approach To Effective Benchmarking • Host/OS/Network Monitoring Tools: VMSTAT, NETSTAT, SAR, TOP, GLANCE, PERFORMANCE VIEW, 3RD PARTY • Metrics covered: CPU (Average, Max); Memory (Utilization, Swapping); Network (Bandwidth, Balancing, Response times)
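As an illustration of the "build your own tools / parsing scripts" point above, here is a minimal sketch of a wrapper that samples host CPU and memory during a benchmark run by driving vmstat and archiving the raw lines for later parsing. The function name, interval, and output file are our own placeholders, and vmstat column layouts differ between Solaris, HP-UX and Linux, so treat the parsing as a starting point rather than a finished tool.

```python
# Minimal sketch: sample host CPU/memory during a benchmark run by wrapping
# a standard OS tool (vmstat).  Column layouts differ between Solaris,
# HP-UX and Linux, so the raw lines are archived and parsed offline.
import csv
import subprocess
import time

def collect_host_metrics(interval=10, samples=360, outfile="host_metrics.csv"):
    """Run vmstat for the duration of a benchmark and log raw samples."""
    proc = subprocess.Popen(
        ["vmstat", str(interval), str(samples + 1)],
        stdout=subprocess.PIPE, text=True)
    with open(outfile, "w", newline="") as fh:
        writer = csv.writer(fh)
        writer.writerow(["timestamp", "raw_vmstat_line"])
        for line in proc.stdout:
            line = line.strip()
            if line and line[0].isdigit():          # skip the header lines
                writer.writerow([time.time(), line])

if __name__ == "__main__":
    collect_host_metrics(interval=10, samples=6)    # one-minute smoke test
```

The same shape works for sar or netstat by swapping the command list; the key point is keeping the raw per-interval samples so benchmark runs can be compared line for line.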
Benchmarking Sybase Replicated Systems Approach To Effective Benchmarking • Back End Storage Monitoring Tools: PROPRIETARY VENDOR SOFTWARE • Metrics covered: I/O Rates (Average IOps, Maximum IOps, Service Times); Cache Utilization (Average Utilization, Saturation Limits); Channel Load (Load Balancing, Throughput); Disk Activity (Load Distribution, Hot Spots)
Benchmarking Sybase Replicated Systems Approach To Effective Benchmarking • Back End Storage Monitoring Tools - Proprietary Vendor Software: IBM (Shark) - ESS Expert; EMC (Symmetrix) - Workload Analyzer; HITACHI (XP1024)/HP - Performance Advisor XP; HITACHI (SE9980)/Sun - Graph Track/Performance Monitor
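The array-side tools listed above are proprietary and not scriptable in a generic way, so as a hedged host-side proxy one can at least sample per-device I/O rates and service times with iostat's extended statistics. Column names and positions vary by OS and iostat version; this sketch simply archives the raw output for offline parsing and is not a substitute for the vendor tools.

```python
# Host-side proxy for back-end storage metrics: sample extended device
# statistics with "iostat -x".  This does not replace the array vendor's
# own tools, and column layouts vary by platform, so the output is kept
# raw for offline parsing.
import subprocess
import time

def sample_iostat(interval=10, count=6, outfile="iostat.out"):
    out = subprocess.run(["iostat", "-x", str(interval), str(count)],
                         capture_output=True, text=True, check=True)
    with open(outfile, "a") as fh:
        fh.write(f"--- sample at {time.strftime('%Y-%m-%d %H:%M:%S')} ---\n")
        fh.write(out.stdout)

if __name__ == "__main__":
    sample_iostat()
```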
Benchmarking Sybase Replicated Systems Approach To Effective Benchmarking • Sybase ASE Monitoring Tools: SP_SYSMON, SP_MONITOR, SP_MONITORCONFIG, SP_COUNTMETADATA, MONSERVER/HISTSERVER, SYBASE CENTRAL, DBXRAY, SPINMON • Metrics covered: Engines (Average Utilization, Max Utilization); Cache (Data, Procedure, ULC); Throughput & Response Time (R/W/U/D times, TPS, Total Volumes)
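A minimal sketch of how an sp_sysmon report might be captured for each benchmark interval by driving isql from a script and filing the report with the run id; the server name, login, interval and file names below are placeholders, not our production values.

```python
# Minimal sketch: capture an sp_sysmon report per benchmark interval via isql
# and archive it with the run id.  Server, login and interval are placeholders.
import subprocess

def run_sysmon(server, user, password, interval="00:10:00", outfile="sysmon.out"):
    sql = f"sp_sysmon '{interval}'\ngo\n"
    with open(outfile, "w") as fh:
        subprocess.run(["isql", "-S", server, "-U", user, "-P", password, "-w", "200"],
                       input=sql, stdout=fh, text=True, check=True)

if __name__ == "__main__":
    run_sysmon("BENCH_ASE1", "sa", "***", interval="00:10:00",
               outfile="run42_sysmon.out")
```

The same wrapper shape works for sp_monitor or sp_monitorconfig by substituting the SQL text, which keeps one collection mechanism across all ASE-side metrics.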
Benchmarking Sybase Replicated Systems Approach To Effective Benchmarking • Database Monitoring Tools: DBCC COMMANDS, TRACE FLAGS, SET STATISTICS, SET SHOWPLAN, SET FMTONLY, SP_SPACEUSED, SP_ESTSPACE, OPTDIAG • Metrics covered: Fullness (Capacity, Fragmentation); Data Profiles (Size Profile - data size changes over time; Distribution Profile - data distribution changes over time; Optimizer Profile - costing changes over time; I/O Profile - changes in i/o characteristics over time)
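A sketch of snapshotting database fullness with sp_spaceused and dumping optimizer/distribution statistics with optdiag once per run, so the "changes over time" profiles above can be built by diffing snapshots. The database and table names are placeholders for your own schema.

```python
# Sketch: per-run snapshots of database fullness (sp_spaceused) and
# optimizer/distribution statistics (optdiag).  Database, table, server
# and login names are placeholders.
import subprocess

def space_snapshot(server, user, password, database, outfile):
    sql = f"use {database}\ngo\nsp_spaceused\ngo\nsp_spaceused syslogs\ngo\n"
    with open(outfile, "w") as fh:
        subprocess.run(["isql", "-S", server, "-U", user, "-P", password],
                       input=sql, stdout=fh, text=True, check=True)

def optdiag_snapshot(server, user, password, database, table, outfile):
    subprocess.run(["optdiag", "statistics", f"{database}..{table}",
                    "-U", user, "-P", password, "-S", server, "-o", outfile],
                   check=True)

if __name__ == "__main__":
    space_snapshot("BENCH_ASE1", "sa", "***", "content_db", "run42_space.out")
    optdiag_snapshot("BENCH_ASE1", "sa", "***", "content_db", "messages",
                     "run42_optdiag.out")
```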
Benchmarking Sybase Replicated Systems Approach To Effective Benchmarking • Replication Monitoring Tools: HISTSERVER, OS TOOLS, HEARTBEATS, REPSERVER MGR, SP_SPACEUSED, HOME GROWN TOOLS • Metrics covered: Replicate Activity (CPU, Memory); Transaction Logs (Average Sizes, Max Sizes); Latency (Avg/Max, Queue Depths)
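The "heartbeats / home grown tools" entry above can be as simple as the following sketch: insert a timestamped row into a small replicated heartbeat table at the primary and poll for its arrival at the replicate. It assumes a table such as rep_heartbeat(seq int, src_time datetime) has already been created and marked for replication; the database, table, login and server names are placeholders.

```python
# Minimal home-grown heartbeat sketch for end-to-end replication latency.
# Assumes a small replicated table prime_db..rep_heartbeat(seq int,
# src_time datetime) exists and is marked for replication; all names
# here are placeholders.  Queue depths can be cross-checked with the
# Replication Server command "admin who, sqm".
import subprocess
import time

ISQL = ["isql", "-U", "bench_user", "-P", "***"]

def sql(server, text):
    out = subprocess.run(ISQL + ["-S", server], input=text + "\ngo\n",
                         capture_output=True, text=True, check=True)
    return out.stdout

def measure_latency(primary, replicate, seq):
    sql(primary, f"insert prime_db..rep_heartbeat values ({seq}, getdate())")
    start = time.time()
    while True:
        out = sql(replicate,
                  f"if exists (select 1 from prime_db..rep_heartbeat "
                  f"where seq = {seq}) print 'ARRIVED'")
        if "ARRIVED" in out:              # row has reached the replicate
            return time.time() - start
        time.sleep(1)

if __name__ == "__main__":
    print("latency (s):", measure_latency("PRIMARY_ASE", "REPLICATE_ASE", 1))
```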
Benchmarking Sybase Replicated Systems Approach To Effective Benchmarking • CAUTIONARY notes - metrics and tools • Be aware of the "Heisenberg Uncertainty Principle" - the monitoring itself consumes resources and can alter the behavior being measured. • Don't try to analyze too much data. • Attempting to track and analyze too many variables becomes confusing. • Focus on metrics within the scope of the goal. • But… perform sanity checks on metrics that may indirectly impact your goal. • Remember that benchmarks are not production. • Baseline benchmarks need to be run. • Don't get too caught up in discontinuities between benchmark and production data.
Benchmarking Sybase Replicated Systems Approach To Effective Benchmarking • BENCHMARK driver host considerations • Separate the driver host(s) from the server host(s). • Configure the driver host 'generously.' • Consider a permanent driver box. • Provide lots of memory - otherwise swapping may throw the numbers off. • Provide as much CPU as possible - otherwise task contention may throw the numbers off. • Provide a large amount of disk space to store output. • Keep the driver host patched and up to date. • Periodically monitor the driver host to prevent data skewing.
Benchmarking Sybase Replicated Systems Approach To Effective Benchmarking • BENCHMARK driver application design considerations • Use multiple threads to simulate multiple users. • Parent application spawns children. • Multiple shell/perl scripts from shell/perl/other driver. • Consider dedicated R/W/D/U processes. • May keep things simpler, but may be unrealistic. • Try to minimize the amount of time and money spent: • Port production code as much as possible. • User interface should be present and easy to use, but minimal. • Emphasize recording statistics and generating data. • Raw per transaction data should be available. • Provide a debug mode with verbose output.
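A minimal sketch of the parent/children driver shape described above: worker threads simulate concurrent users and record raw per-transaction timings that the parent summarizes at the end. The transaction body is a stub standing in for ported production code, and all names are our own, not the AOL driver.

```python
# Skeleton of the driver shape: a parent spawns reader/writer threads, each
# records raw per-transaction timings, and the parent summarises at the end.
# do_transaction() is a stub -- a real driver would call ported production code.
import random
import statistics
import threading
import time

def do_transaction(kind):
    """Stub standing in for one read/write/update/delete against ASE."""
    time.sleep(random.uniform(0.005, 0.020))        # placeholder for real work

def worker(kind, n_txns, results):
    timings = []
    for _ in range(n_txns):
        t0 = time.time()
        do_transaction(kind)
        timings.append(time.time() - t0)            # keep raw per-transaction data
    results[kind] = timings

def run(n_readers=8, n_writers=4, txns_per_thread=100):
    results, threads = {}, []
    for i in range(n_readers):
        threads.append(threading.Thread(target=worker,
                                        args=(f"read{i}", txns_per_thread, results)))
    for i in range(n_writers):
        threads.append(threading.Thread(target=worker,
                                        args=(f"write{i}", txns_per_thread, results)))
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    all_times = [t for v in results.values() for t in v]
    print(f"txns={len(all_times)} avg={statistics.mean(all_times)*1000:.1f}ms "
          f"max={max(all_times)*1000:.1f}ms")

if __name__ == "__main__":
    run()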
Benchmarking Sybase Replicated Systems Approach To Effective Benchmarking • BENCHMARK driver runtime design considerations • Use KEY files as much as possible/necessary. • Use to simulate read and update distribution patterns. • Use to simulate delete patterns (if needed). • Use DISTRIBUTION files as much as possible/necessary. • Use key files to simulate write value histograms. • Use distribution files to simulate write sizes. • Use RANDOMIZATION to get closer to production (if possible). • Consider using random key generation for reads if possible…. • Or try to randomize key distribution inside reader threads. • Randomize key/size distribution inside writer threads.
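A sketch of how key files and distribution files might feed the worker threads above; the file formats shown (one key per line, "size weight" pairs for the write-size histogram) are invented here for illustration rather than taken from any production driver.

```python
# Sketch of key-file and distribution-file handling for the driver threads.
# File formats are illustrative only: the key file holds one key per line,
# the distribution file holds "size_bytes weight" pairs approximating the
# production write-size histogram.
import random

def load_keys(path):
    with open(path) as fh:
        return [line.strip() for line in fh if line.strip()]

def load_size_histogram(path):
    sizes, weights = [], []
    with open(path) as fh:
        for line in fh:
            size, weight = line.split()
            sizes.append(int(size))
            weights.append(float(weight))
    return sizes, weights

def next_read_key(keys):
    return random.choice(keys)                       # randomize read distribution

def next_write(keys, sizes, weights):
    # choose key and payload size according to the production-like histogram
    return random.choice(keys), random.choices(sizes, weights=weights, k=1)[0]
```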
Benchmarking Sybase Replicated Systems Approach To Effective Benchmarking • BENCHMARK server data design considerations • If possible, run benchmarks with data in the target db’s. • More realistic behavior allows anticipation of scaling issues. • Running with little data will not provide a realistic view of ‘steady state’. • Sparsely populated data does not reveal potential optimizer issues (especially important for replicated environments). • Populate databases with data as close to production as possible. • Production dump/load - this is best but not always possible. • BCP - consider running ‘primer’ benchmarks to ‘mess up’ data. • Populate with the benchmark driver using distribution files. • Preserve populated data for reuse in later benchmarks if possible.
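Where a production dump/load is not possible, seeding via BCP flat files is one option; below is a sketch that generates a delimited file whose payload sizes follow a distribution file. The two-column layout and the bcp command line in the comment are placeholders for your actual table and environment.

```python
# Sketch: generate a pipe-delimited flat file whose payload sizes follow the
# distribution file, then load it with bcp.  The two-column layout
# (msg_id | payload) is a placeholder for your real table; a bcp load would
# look roughly like:
#   bcp content_db..messages in seed_data.dat -Usa -P*** -SBENCH_ASE1 -c -t '|'
import random

def generate_seed_file(path, n_rows, sizes, weights):
    with open(path, "w") as fh:
        for msg_id in range(1, n_rows + 1):
            size = random.choices(sizes, weights=weights, k=1)[0]
            payload = "x" * size                     # dummy content of realistic size
            fh.write(f"{msg_id}|{payload}\n")

if __name__ == "__main__":
    generate_seed_file("seed_data.dat", 1000,
                       sizes=[512, 2048, 8192], weights=[0.6, 0.3, 0.1])
```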
Benchmarking Sybase Replicated Systems Approach To Effective Benchmarking • BENCHMARK process - overall considerations • Build structure into the process. • Well defined goals must direct the process. • Tightly controlled process management minimizes scope creep. • Controlled change management keeps the effort on track. • Benchmark evaluation/direction should be semi-formal. • Focus group should evaluate goals. • Focus group should determine a plan. • Focus group should evaluate results periodically. • Follow well structured and documented procedures. • Benchmark tests must have identical sets of monitoring tools. • Checklists minimize human error and ensure correct monitoring.
Benchmarking Sybase Replicated Systems Approach To Effective Benchmarking • BENCHMARK process considerations - baseline • Determine acceptance criteria. • Clearly differentiate between static and dynamic variables (e.g. disk = static, configuration = dynamic). • Establish a baseline: start with a configuration as close to production as possible (device layouts, database layouts, schema and procedures); run baseline benchmarks. • Collect and verify all data (OS configuration files, ASE configuration files, RS configuration dumps, all monitoring output, specifics of the benchmark). • Compare baseline data to production.
Benchmarking Sybase Replicated Systems Approach To Effective Benchmarking • BENCHMARK process considerations - change mgmt. • Benchmarks should be conducted by varying only a single variable at a time (if possible). • Multiple runs should be performed for a single configuration to eliminate random anomalies. • Collect, verify and catalog all data - centralized/indexed. • Run periodic consistency checks to verify integrity: DBCC checkstorage, DBCC checkdb, DBCC checkalloc, RS_SUBCMP (see the sketch below). • Focus group monitors and controls process flow: focus group meets and reviews data; focus group determines next steps; focus group decides when the process must conclude. • Process must conclude at some point.
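A sketch of the periodic consistency pass referenced above: run the listed dbcc checks through isql and archive the output per run; rs_subcmp is driven from its own site-specific configuration file, so it is only noted in a comment. Server, login and database names are placeholders.

```python
# Sketch of the periodic consistency pass: run the dbcc checks listed above
# through isql and archive the output with the run id.  Server, login and
# database names are placeholders; rs_subcmp (primary/replicate row
# comparison) is driven by its own configuration file and is not shown.
import subprocess

CHECKS = ["dbcc checkstorage(content_db)",
          "dbcc checkdb(content_db)",
          "dbcc checkalloc(content_db)"]

def run_consistency_checks(server, user, password, run_id):
    for i, check in enumerate(CHECKS):
        with open(f"{run_id}_check{i}.out", "w") as fh:
            subprocess.run(["isql", "-S", server, "-U", user, "-P", password],
                           input=check + "\ngo\n", stdout=fh, text=True, check=True)

if __name__ == "__main__":
    run_consistency_checks("BENCH_ASE1", "sa", "***", run_id="run42")
```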
Benchmarking Sybase Replicated Systems Benchmark Case Study - Email Content Benchmark at AOL Benchmarking ASE 12.5 4K page sizes AOL Mail Case Study
Benchmarking Sybase Replicated Systems Benchmark Case Study - Email Content Benchmark at AOL • AOL Mail Architecture (partial): AOL Client → Internet → Mail Processes/Gateways → Email Hosts → ASE Dataservers → Back End Storage
Benchmarking Sybase Replicated Systems Benchmark Case Study - Email Content Benchmark at AOL • AOL Email Content Subsystem • ONE of a number of systems comprising Mail. • OUR role - to store message content for AOL member R/W operations. • OUR mission - to provide email access at the fastest speed possible at 7x24 availability for AOL members. • LARGE install base of fully replicated, cloned, mid-range servers. • ASE 11.9 - 12.5; RS 11.5 - 12.5. • Varied OS hardware/software install base. • Large amount of high end storage units. • Process ~500,000,000 recipients daily.
Benchmarking Sybase Replicated Systems Benchmark Case Study - Email Content Benchmark at AOL • AOL Email Content Subsystem Replication Architecture • Highly customized replication environment. • Fully capable of 100% bi-directional replication. • Currently there are true primaries and replicates. • All components are geographically separated. • "Open Switch"-like application interface for managing P/R access. [Diagram: Primary Site ↔ Replicate Site, bi-directional replication.]
Benchmarking Sybase Replicated Systems Benchmark Case Study - Email Content Benchmark at AOL • AOL Email Content Architecture
Benchmarking Sybase Replicated Systems Benchmark Case Study - Email Content Benchmark at AOL • HYPOTHESIS : • Given our current knowledge of data distribution and processing flow, upgrading to ASE 12.5.0.3 with 4K page sizes should improve throughput and more efficiently utilize backend resources subsystem wide.
Benchmarking Sybase Replicated Systems Benchmark Case Study - Email Content Benchmark at AOL • BENCHMARK goals: • Validate that 4K page sizes will enable us to reach higher overall 'steady state' levels per database with no increase in latency. • Determine if we can change our scaling model to be more cost effective (vertical vs. lateral) within acceptable latency and throughput limits. • Determine component failure rates to minimize initial risks associated with this architecture, if we decide to implement it. • Reduce our cost figure per stored message (cost per unit effort).
Benchmarking Sybase Replicated Systems Benchmark Case Study - Email Content Benchmark at AOL • EVALUATE goal and effort - make decision • Focus group analyzes proposed benchmark plan: • ‘Seed’ initial environment with production-like data. • Run benchmarks - establish 2K 12.5 baselines. • Use custom C program (BCP API) to move data to 4K servers. • Run 4K 12.5 benchmarks. • Modify schema to take advantage of 4K pages. • Use another custom C BCP program to migrate data to new schema. • Run 4K 12.5 modified schema benchmarks. • Run DBCCs periodically to verify integrity.
Benchmarking Sybase Replicated Systems Benchmark Case Study - Email Content Benchmark at AOL • EVALUATE goal and effort - make decision • Focus group analyzes cost/benefit of implementing proposed changes. • Costs of implementation - • Migration effort 12.0.0.X to 12.5.0.3. • Upgrade tools, upgrade rollout procedures. • Human resource opportunity costs. • Potential benefits - • Lower infrastructure costs. • Better hardware utilization. • New features to take advantage of.
Benchmarking Sybase Replicated Systems Preface - Cost/Benefit Analysis • Cost/Benefit Analysis: Evaluate Tradeoffs • Benefit = throughput/capacity; TCO = cost per unit effort. • Performance and tuning costs get a 'lift' due to establishment of a new 4K baseline, shifting the P&T curve "left". • Hardware costs shift "left" due to establishment of new, more efficient baselines for existing hardware with 4K. • A new equilibrium is established at a higher throughput/capacity per unit. [Chart: HW Investment and P&T Effort curves shifting from the current equilibrium to the new equilibrium.] • Cost per unit effort = Total cost of maintaining the system# / [ (number of messages per db at steady state*) x (number of databases per host) x (number of hosts in subsystem) x 2 for replication ]. • * Where steady state = maintained, targeted balance between inserts, deletes and storage capacity. • # Where total cost of maintaining the system includes all hardware, software, and operational expenses.
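To make the formula concrete, here is a small worked example. All numbers are plainly hypothetical placeholders, not AOL figures.

```python
# Worked example of the cost-per-stored-message figure defined above.
# Every number here is a hypothetical placeholder, not an AOL figure.
messages_per_db_steady_state = 2_000_000
databases_per_host           = 8
hosts_in_subsystem           = 20
replication_factor           = 2          # each message stored at primary and replicate
total_cost_of_system         = 5_000_000  # hardware + software + operations

stored_messages = (messages_per_db_steady_state * databases_per_host *
                   hosts_in_subsystem * replication_factor)
print(f"cost per stored message: ${total_cost_of_system / stored_messages:.4f}")
```

Reaching a higher steady state per database with 4K pages raises the denominator, so the cost per stored message falls even if the total cost of the system stays flat.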
Benchmarking Sybase Replicated Systems Benchmark Case Study - Email Content Benchmark at AOL • CHOOSE metrics: Host/ASE CPU/Memory utilization (Average, Max); Fullness (Capacity, Fragmentation); Latency (Averages, Queue Depths); Response times (Read Times, Write Times, Delete Times); Back End (Cache, IOps); Throughput (Total Volume)
Benchmarking Sybase Replicated Systems Benchmark Case Study - Email Content Benchmark at AOL • CHOOSE Monitoring Tools • Performance View - Host CPU/Memory utilization (Average, Max)
Benchmarking Sybase Replicated Systems Benchmark Case Study - Email Content Benchmark at AOL • CHOOSE Monitoring Tools • Performance View is HP performance analysis software, but works with most systems. • Combines monitoring of CPU, memory, network and disk I/O into one single graphic interface. • Centralized monitoring - monitors multiple hosts from a single point of view. • Provides archiving - stores historical performance data, which can be exported to text files. • Graphic representations of all performance data over any specified period. • Can do comparison graphs between different hosts, or plot multiple metrics on the same graph.