1 / 43

Evaluating and testing storage performance for Oracle DBs

Explore testing and performance evaluation methods for Oracle DB storage at CERN, including hardware setup, RAID configuration, and the benefits of using Oracle ASM with commodity hardware.

ptanner
Download Presentation

Evaluating and testing storage performance for Oracle DBs

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Evaluating and testing storage performance for Oracle DBs Luca Canali, CERN Dawid Wojcik, CERN UKOUG, Birmingham, December 1st, 2009

  2. Testing Storage Performance for Oracle DBs – L.Canali, D.Wojcik CERN and LHC experiments

  3. Testing Storage Performance for Oracle DBs – L.Canali, D.Wojcik Outline • Storage for Physics DB Services @ CERN • Architecture: 10g RAC + ASM on commodity HW • DBAs and storage – our working model • Performance testing and (new) HW evaluation • Tools: ORION, SQL-based test harness • Discussion of some test results • What is interesting to measure • FC commodity HW, tests of iSCSI 1 and 10Gig • Measuring IO performance from prod DBs • Our experience of production workload measurement • Metrics for capacity planning • Testing for robustness • Pre-prod stress testing

  4. Testing Storage Performance for Oracle DBs – L.Canali, D.Wojcik CERN Set-up • Dual-CPU quad-core 2950 DELL servers, 16GB memory, Intel 5400-series “Harpertown”; 2.33GHz clock • Dual power supplies, mirrored local disks, 4 NIC (2 private/ 2 public), dual HBAs, “RAID 1+0 like” with ASM

  5. Compressing Very Large Data Sets in Oracle, Luca Canali Mirroring Striping Striping Oracle Cluster Storage, ASM • ASM for mirroring and striping across storage arrays • Allows the use of commodity HW (mirror across arrays) • Disks can be added and removed online for scheduled and • unscheduled changes • Example: • disk groups: data and flash recovery DATADG RECODG

  6. Testing Storage Performance for Oracle DBs – L.Canali, D.Wojcik Why ASM and commodity HW • Storage market is very conservative • Enterprise storage typically priced much higher than consumer storage • Opportunities • Commodity HW/grid-like solutions • provide order of magnitude gain in • cost/performance

  7. Testing Storage Performance for Oracle DBs – L.Canali, D.Wojcik Service Management and DBAs • DBAs at CERN work across the full stack of activities necessary to run the DB service: • User-facing: • match users requirements with DB deployments • help developers to optimize their DB usage • Technology-facing: • Involved also in configuration and tuning of ‘lower level stack’ • Guarantee SLA, tune SQL execution, backup, security, replication • Database software installation, monitoring, patching

  8. Testing Storage Performance for Oracle DBs – L.Canali, D.Wojcik DBAs and Storage Testing • Opportunity for testing new HW • Test for performance • Test for stability • Measure what we know are the critical metrics • Often this means small-read random IOPS

  9. Testing Storage Performance for Oracle DBs – L.Canali, D.Wojcik HD – the basic element • Hard disk technology • Basic block of storage since 40 years • Main intrinsic limitation: latency

  10. Testing Storage Performance for Oracle DBs – L.Canali, D.Wojcik HD specs • HDs are limited • In particular seek time is unavoidable (7.2k to 15k rpm, ~2-10 ms) • 100-200 IOPS • Throughput ~100MB/s, typically limited by interface • Capacity range 300GB -2TB • Failures: mechanical, electric, magnetic, firmware issues. • In our experience with ~2000 disks in prod we have about 1 disk failure per week

  11. Testing Storage Performance for Oracle DBs – L.Canali, D.Wojcik Enterprise disks? • Performance • Enterprise disks offer more ‘performance’ • They spin faster and have better interconnect protocols (e.g. SAS vs SATA) • Typically of low capacity • Our experience: often not competitive in cost/perf vs. SATA • Reliability • Evidence that low-end and high end disks don’t differ significantly • Still an open question

  12. Testing Storage Performance for Oracle DBs – L.Canali, D.Wojcik Scaling out the disk • The challenge for storage systems • Scale out the disk performance to meet demands • Throughput • IOPS • Latency • Capacity • Sizing storage systems • Sizing must focus on critical metric(s) • Depend on the workload of the DB • Avoid ‘capacity trap’

  13. Testing Storage Performance for Oracle DBs – L.Canali, D.Wojcik RAID and redundancy • Storage arrays with RAID are the traditional approach • implement RAID to protect data. • Parity based: RAID5, RAID6 • Stripe and mirror: RAID10 • Scalability problem of RAID • For very large configurations the time between two failures can become close to RAID volume rebuild time (!) • That’s also why RAID6 is becoming more popular than RAID5

  14. Testing Storage Performance for Oracle DBs – L.Canali, D.Wojcik Oracle ASM way • Physics DB storage uses Oracle ASM • Volume manager and cluster file system integrated with Oracle • Oracle files are divided in chunks • Chunks are distributed evenly across storage • Chunks are written in multiple copies (2 or 3 it depends on file type and configuration) • Allows the use of low-cost storage arrays: does not need RAID support (JBOD is enough)

  15. Testing Storage Performance for Oracle DBs – L.Canali, D.Wojcik VLDB and storage interconnect • Throughput challenge • It takes 1 day to copy/backup 10 TB over 1 GBPS network

  16. Testing Storage Performance for Oracle DBs – L.Canali, D.Wojcik Techology • Several technologies available with different characteristics • SAN • NAS • iSCSI • Direct attach

  17. Testing Storage Performance for Oracle DBs – L.Canali, D.Wojcik Fiber channel SAN • FC SAN is currently most used architecture for enterprise level storage • Fast and low overhead on server CPU • Used for physics DBs at CERN and Tier1s • SAN networks with 64 ports at low cost • Measured: 8 Gbps transfer rate (4+4 dual ported HBAs for redundancy and load balancing) • Proof of concept FC backup (LAN free) reached full utilization of tape heads • Scalable: proof of concept ‘Oracle supercluster’ of 410 SATA disks, and 14 dual quadcore servers

  18. Testing Storage Performance for Oracle DBs – L.Canali, D.Wojcik How to Test the Whole stack? • Oracle workload based tests • Testing using production copies and application-like behaviour • Best option but very difficult • Benchmarking • Not the real thing but can save a lot of time • Oracle’s ORION • Tests only the access to storage • Uses async IO as ASM would do • Results from our tests prove it reliable to compare storage solutions

  19. Testing Storage Performance for Oracle DBs – L.Canali, D.Wojcik Testing storage with ORION • ORION • Oracle utility that has proven to give similar results as more complex SQL-based tests • In the following some examples of results • IOPS measured for various disk types • FC results • iSCSI 1 Gbps and 10 GigE results

  20. Testing Storage Performance for Oracle DBs – L.Canali, D.Wojcik Metrics of interest • Basic IO metrics measured by ORION • IOPS for random I/O (8KB) • MBPS for sequential I/O (in chunks of 1 MB) • Latency associated with the IO operations • Simple to use • Getting started:./orion_linux_em64t -run simple -testname mytest -num_disks 2 • More info: https://twiki.cern.ch/twiki/bin/view/PDBService/OrionTests

  21. Testing Storage Performance for Oracle DBs – L.Canali, D.Wojcik ORION output, an example

  22. Testing Storage Performance for Oracle DBs – L.Canali, D.Wojcik ORION results,small random read IOPS

  23. Testing Storage Performance for Oracle DBs – L.Canali, D.Wojcik Example of HW testing • 14 servers

  24. Testing Storage Performance for Oracle DBs – L.Canali, D.Wojcik Test hardware • 8 FC switches: 4Gbps (10Gbps uplink)

  25. Testing Storage Performance for Oracle DBs – L.Canali, D.Wojcik Test hardware • 26 storage arrays (16 SATA disks each)

  26. Testing Storage Performance for Oracle DBs – L.Canali, D.Wojcik Results • Measured, sequential I/O • Read: 6 GB/sec • Read-Write: 3+3 GB/sec • Measured, small random IO • Read: 40K IOPS (8 KB read ops) • Note: • 410 SATA disks, 26 HBAS on the storage arrays • Servers: 14 x 4+4Gbps HBAs, 112 cores, 224 GB of RAM

  27. SQL-Based Custom Testing • A custom SQL-based DB workload: • IOPS: Probe randomly a large table (several TBs) via several parallel queries slaves (each reads a single block at a time) • MBPS: Read a large (several TBs) table with parallel query • The test table was 5 TB in size • To reduce storage caching effects • Test table was probed by a small table via nested loop join • V$systat and v$system_event data used to ensure each row of the ‘probe table’ when joined caused a random read on the large test table

  28. SQL-Based vs. ORION • SQL-based vs. ORION • SQL-based uses more CPU • Closer to a ‘real workload’ • Each physical IO in Oracle needs a logical IO too, that is CPU + latch, etc • ORION just stresses async IO of the OS • Recent experience • Tested a config of 371 raptor disks of 150 random IOPS per disk • SQL-based test maxed out CPU (60% usr, 40% sys) at 42K random IOPS • Should have gone up to 55K IOPS

  29. Testing Storage Performance for Oracle DBs – L.Canali, D.Wojcik iSCSI Storage • iSCSI is interesting for cost reduction • Some trends to get rid of ‘specialized’ FC network • Reuse existing infrastructure and know-how • Not applicable to all systems • IP interconnect throughput • CPU usage • Adoption seems to be only for low throughput systems at the moment • However 10GigE tests are very promising

  30. Testing Storage Performance for Oracle DBs – L.Canali, D.Wojcik iSCSI 1 Gbps • Orion scalability tests, small IO, FC vs. iSCSI • ./orion_linux_x86-64 -run advanced -write 0 -matrix point -duration 120-testname mytest -simulate raid0 -num_disks 24 -cache_size 500-num_small 1000 -num_large 0 Double the number of actual disks to push them to their limits

  31. Testing Storage Performance for Oracle DBs – L.Canali, D.Wojcik iSCSI 1 Gbps • Orion scalability tests, large IO, FC vs. iSCSI • ./orion_linux_x86-64 -run advanced -write 0 -matrix point -duration 120-testname mytest2 -simulate raid0 -num_disks 24 -cache_size 500-num_small 0 -num_large 1000

  32. Testing Storage Performance for Oracle DBs – L.Canali, D.Wojcik iSCSI on 10 GigE • We have recently tested custom iSCSI 10GigE storage • ‘CERN-made’ disk servers that export storage as iSCSI over 10 GigE • 2 Clovertown quad-core processors of 2.00 GHz • 8 GB of RAM • 16 SATA-II drives of 500 GB, 7'200 rpm • RAID controller 3ware 9650SE-16ML • Intel 10GigE dual port server adapter PCIexpress (EXPX9502CX4 - Oplin) • HP Procurve 10GigE switch Data: H. Meinhard, CERN

  33. Testing Storage Performance for Oracle DBs – L.Canali, D.Wojcik iSCSI tests, 10 GigE • Preliminary results • ORION tests with up to 3 disk arrays (14 disks each) • Almost linear scalability • Up to 42 disks tested -> 4000 IOPS at saturation • 85% CPU idle during test • IOPS of a single disk: ~110 IOPS • Promising interconnect technology, if it can be made rock solid like FC • Performance of iSCSI stacks varies a lot from version to version Data: A. Horvath, CERN

  34. Testing Storage Performance for Oracle DBs – L.Canali, D.Wojcik Disclaimer • Simple testing with ORION is not the same as production workload testing • CPU cycles also important for IOPS • You need to combine different IO patterns • ... see next slides • Testing has many sides • We need reasonable numbers quickly • Results should allow comparison between platforms and HW types • Results needs to be confronted with real life performance

  35. Testing Storage Performance for Oracle DBs – L.Canali, D.Wojcik Storage Capacity Planning • Capacity planning – iterative process, also for ‘stable’ services • Size is not the only metric for storage • Measure and understand current system utilization • Identify bottlenecks (if any) • Prepare forecast for future growth • Plan for HA • Try to plan for the ‘unexpected’ • Service expansion • Plan stability issues • HW failures • Consolidate if possible – better resource utilization

  36. Testing Storage Performance for Oracle DBs – L.Canali, D.Wojcik Storage Capacity Planning • Capacity planning requires historical data • Diagnostic Pack (AWR – DBA_HIST_SYSMETRIC_SUMMARY) • Custom probing (V$SYSMETRIC_SUMMARY) • Choose some important metrics: • Physical Read Total Bytes Per Sec • Physical Read Total IO Requests Per Sec • Physical Write Total Bytes Per Sec • Physical Write Total IO Requests Per Sec • Redo Generated Per Sec • Redo Writes Per Sec • ... • DB size

  37. Testing Storage Performance for Oracle DBs – L.Canali, D.Wojcik Capacity Planning – plots

  38. Testing Storage Performance for Oracle DBs – L.Canali, D.Wojcik Capacity Planning – plots

  39. Testing Storage Performance for Oracle DBs – L.Canali, D.Wojcik Reproducing IO load pattern • Historical data allows to set-up simple IO tests that will resemble production workload • Reproduce different IO sizes (large vs small IO ratio) • Reproduce mixed workload (read/write ratio) • Reproduce cluster-wide IO load (nodes IO ratios) • Single node testing trap - parallel (multi-node) IO tests required to detect some potential SCSI level or storage issues (both performance and stability) • Test for spikes (short term and long term) • Test on storage of similar size • Cache influence, uneven platter performance • Watch for storage cache • Disable, purge, warm-up

  40. Testing Storage Performance for Oracle DBs – L.Canali, D.Wojcik Testing for Robustness • Our experience • Putting in production new HW and OS versions needs extensive testing • Especially true with commodity HW • Lessons learned after few years of running different tests • Mixed read/write (if possible) • Mixed IO size • Simulate workload from several nodes • Periodic testing: high load, low load, high load ... • Proper testing should also detect most of the 'infant mortality’ cases • Beware – disk or storage firmware issues can take long time to debug

  41. Testing Storage Performance for Oracle DBs – L.Canali, D.Wojcik Testing for Robustness • Some issues can still appear only on a running production system • Be proactive – schedule for tests! • ASM normal redundancy – backup validate check logical is not enough! • Corruption can affect secondary extents only! • We have developed a tool that scans all ASM files and compares primary and secondary extents • Set-up media scans on the storage arrays in low activity periods • Test HA features (new kernel version may have bugs)

  42. Testing Storage Performance for Oracle DBs – L.Canali, D.Wojcik Conclusions • Storage is key for successful • DB implementations • Testing different technologies • DBAs can profit by getting a deeper knowledge of their storage subsystem • Typical case: new HW evaluation, pre-prod stress testing • There are simple tools for testing, ORION • Monitoring and measuring production environment • Helps in keeping service level/ production performance • Provides data for capacity planning

  43. Testing Storage Performance for Oracle DBs – L.Canali, D.Wojcik Acknowledments • Many thanks to Physics DB team at CERN, in particular for this work: • Maria Girone, Jacek Wojcieszuk • http://phydb.web.cern.ch/phydb/ • Physics DB@UKOUG later today: • Eva Dafonte Pérez, 13:15, “Worldwide distribution of experimental physics data using Oracle Streams” • Luca Canali, 15:55, “Compressing very large data sets in Oracle”

More Related