Storage Network Designs for OLTP Business Continuity

Learn the best storage network designs for OLTP systems to ensure continuous business operations. Understand redundancy methods, storage technologies, and HA approaches for optimal performance.



Presentation Transcript


  1. Storage Network Designs for OLTP Business Continuity • Marc Farley, President, Building Storage Networks, Inc.

  2. Agenda • The Vendor Neutral Approach • Overview of OLTP & High Availability • I/O Redundancy Methods • Storage Network Technologies • Storage Networking for HA OLTP

  3. Vendor Neutral Approach • Generic terms, not vendor terms • Assumed basic knowledge of SAN, NAS, RAID

  4. And now, for something completely different…..

  5. OLTP Environments • Mission critical business applications • Business in real-time • Expensive equipment and software • Aggressive performance objectives • Highly skilled IT staff • Hands-on computing operations

  6. OLTP Database Software • Oracle • 8i Oracle Parallel Server (OPS) • 9i Real Application Clusters (RAC) • IBM • DB2 UDB • Informix • MS SQL Server • Sybase, MySQL, others

  7. OLTP OS Platforms • IBM S/390 MVS • Unix Systems • Windows 2000+ • HA Linux

  8. OLTP Requirements • 99.999% uptime • Non-degrading response time • High transaction rates • Seamless scalability • Cost relief
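
The 99.999% uptime target above translates into a concrete downtime budget. The following is a minimal arithmetic sketch in Python; the function name is illustrative and the calculation assumes a 365-day year.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes; leap years ignored


def downtime_minutes_per_year(availability):
    """Return the yearly downtime budget implied by an availability fraction."""
    if not 0.0 <= availability <= 1.0:
        raise ValueError("availability must be a fraction between 0 and 1")
    return (1.0 - availability) * MINUTES_PER_YEAR


if __name__ == "__main__":
    for target in (0.999, 0.9999, 0.99999):
        budget = downtime_minutes_per_year(target)
        print(f"{target:.3%} uptime allows {budget:.2f} minutes of downtime per year")
```

At five nines the budget is roughly 5.26 minutes per year, which is why every later design choice in the deck centers on redundancy and fast failover.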

  9. Database Storage Approaches • Raw partitions • Bypass OS I/O buffering • File system • Facilitates data management • NFS mounted • Offload DB server, NTAP + Oracle

  10. ACID Properties of OLTP • Atomicity– No partial transactions • Consistency– All tables are in a consistent state before and after a completed transaction • Isolation– One transaction cannot contaminate other transactions • Durability–Transactions are complete only when the database updates are written to disk storage
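
Atomicity from the list above can be illustrated with a small sketch. This example uses Python's standard `sqlite3` module with an in-memory database as a stand-in for an OLTP database; the `accounts` table and the `transfer` helper are hypothetical and are not taken from the presentation.

```python
import sqlite3


def make_demo_db():
    """Create an in-memory database with two hypothetical accounts."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER)")
    conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("a", 100), ("b", 0)])
    conn.commit()
    return conn


def transfer(conn, src, dst, amount):
    """Move amount from src to dst as one transaction: both updates or neither."""
    with conn:  # commits on success, rolls back if an exception escapes
        conn.execute(
            "UPDATE accounts SET balance = balance - ? WHERE name = ?", (amount, src)
        )
        row = conn.execute(
            "SELECT balance FROM accounts WHERE name = ?", (src,)
        ).fetchone()
        if row is None or row[0] < 0:
            raise ValueError("insufficient funds")
        conn.execute(
            "UPDATE accounts SET balance = balance + ? WHERE name = ?", (amount, dst)
        )


def balances(conn):
    """Return the current balances as a dict keyed by account name."""
    return dict(conn.execute("SELECT name, balance FROM accounts"))
```

If `transfer` raises partway through, the debit already applied is rolled back, so no partial transaction is ever visible.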

  11. Challenges of OLTP • Major systems integration effort • Intricate tuning and monitoring • Little tolerance for errors • Complex data structures & relationships • Time and sequence-sensitive processes • Must be adhered to for data integrity • Shifting workloads and bottlenecks

  12. OLTP Database Files • Data files • Database data, tablespaces • Redo log files, archive log files • Reconstruct or rollback transactions • Control files • File layout information

  13. OLTP Table Space Storage • Use many spindles to distribute hot spots • RAID 0+1 recommended • File system recommended over raw partitions • Easier data management

  14. Striping for Performance (diagram: RAID controller above six disk drives) • RAID controller: microsecond performance • Disk drives: millisecond performance, from rotational latency and seek time
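
To make the striping idea concrete, here is a minimal Python sketch of how a striped (RAID 0 style) layout might map a logical block to a member drive. The stripe unit size and the function name are assumptions for illustration; real controllers use their own layouts.

```python
def locate_block(lba, n_disks, stripe_unit_blocks):
    """Map a logical block address to (disk index, block offset on that disk).

    Consecutive stripe units rotate across the disks, so a run of sequential
    or scattered I/O is spread over many spindles.
    """
    if lba < 0 or n_disks < 1 or stripe_unit_blocks < 1:
        raise ValueError("invalid striping parameters")
    stripe_unit = lba // stripe_unit_blocks      # which stripe unit holds the block
    disk = stripe_unit % n_disks                 # units rotate round-robin over disks
    row = stripe_unit // n_disks                 # full stripes that precede this unit
    offset = row * stripe_unit_blocks + lba % stripe_unit_blocks
    return disk, offset


if __name__ == "__main__":
    # Six drives as in the slide; an 8-block stripe unit is an assumed value.
    for lba in (0, 8, 16, 47, 48):
        print(lba, locate_block(lba, n_disks=6, stripe_unit_blocks=8))
```

Because neighboring stripe units land on different drives, hot regions of a tablespace are served by several spindles at once instead of queuing on one.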

  15. My Personal Favorite, RAID 0+1 (diagram: RAID controller above ten disk drives, grouped and numbered 1 to 5) • Mirrored Pairs of Striped Members

  16. OLTP Redo Log Storage • Raw partitions recommended • Sequential high speed writes • Separate mirror pairs per log file group • Capacity for 30 – 60 minutes of data • Goal is to limit disk contention for current and active log files
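
The 30 to 60 minute capacity guideline above can be turned into a size estimate once a redo generation rate is known. This is a rough sizing sketch in Python; the rate used in the example is a hypothetical figure, not a number from the presentation.

```python
def redo_log_capacity_mb(redo_rate_mb_per_s, minutes):
    """Return the log capacity (MB) needed to hold `minutes` of redo data."""
    if redo_rate_mb_per_s < 0 or minutes < 0:
        raise ValueError("rate and duration must be non-negative")
    return redo_rate_mb_per_s * minutes * 60


if __name__ == "__main__":
    rate = 2  # MB/s, assumed peak redo rate for illustration only
    low = redo_log_capacity_mb(rate, 30)
    high = redo_log_capacity_mb(rate, 60)
    print(f"At {rate} MB/s, plan for {low} to {high} MB of redo log capacity")
```

Measuring the actual peak redo rate on the target system is the real input here; the formula only scales it.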

  17. OLTP Archive Log Storage • File system or NFS mounting is required • NFS mounting is recommended • Mirroring or RAID • Goal is to have easy access in case they are needed for reconstruction

  18. High Availability • The ability of a system or application to immediately continue its mission after loss of or damage to system components, systems, facilities, or data

  19. Availability Threats • Expected • Scaling limitations • Processor • Storage capacity • Network • Consolidations • Product life cycles • Unexpected • Failures • Bugs • Virus • Operator errors • Disasters

  20. HA Engages All Elements • Systems • Application • Network connections • Network services • Storage and I/O subsystems

  21. Scoping the Risks

  22. Managing the Risks • Local copies of data • Immediate availability • (Remote) Nearby • Immediate availability to several hours • Remote Far away • One to several days availability

  23. Disaster/Availability Radii • Local • Remote Nearby • Remote Far Away

  24. Nobody Expects….. • Weird things to happen to them • Disintegration of media • Underground flooding through tunnels • Fires in Telco switching centers

  25. High Availability for OLTP • Duplication of functions • Without degrading performance • Without risking data integrity • Brute force techniques • Automation and efficiency • Cost is always an issue • And high availability DOES cost

  26. A Long Time Ago in a Job Not So Far Away… • Jedi Jim Gast and Marc "Skyfaller" Farley • "You must learn to be a master of redundancy if you are going to be a storage geek. Remember Marc, there is only one concept: REDUNDANCY!" • "Redundancy." • "Again!" • "Got it Jim. Let's eat!" • "Whatever."

  27. Eventually, I Learned to Appreciate His Teachings… • REDUNDANCY • NSPoF (No Single Point of Failure) • Don't get the giant spicy Polish for lunch – it's too much for the digestion

  28. OLTP HA Requires Complete Redundancy Protection • Client network • Server systems and components • Application modules • I/O Channels and Networks • Storage subsystems and components • Data

  29. A Quick Look At Clustered Storage • Shared Everything: both servers share control of a common storage address space • Shared Nothing: each server controls its own storage address space

  30. Examples of OLTP Clusters (diagram: Microsoft SQL Server and Oracle 9.1 RAC) • Data is exchanged between servers • Data is accessed directly from storage • Failover paths only

  31. One more time, with subsystems… (diagram: Microsoft SQL Server and Oracle 9.1 RAC) • All storage is shared by all cluster nodes • Same subsystem but different address spaces

  32. I/O Redundancy • Host to subsystem • Mirroring: Host to independent targets • Multi-pathing: Host to a single target • Subsystem to subsystem • Store and forward: • Local • Remote

  33. Disk Mirroring: Redundant Storage Targets • Independent, identically sized storage address spaces • One controller • Two controllers

  34. Disk Mirroring: I/Os to 2 Targets • “Brute force” redundancy: fast and simple • Both read and write I/Os • Overlapped reads for performance • Local connections • Limited capacity* • I/O Bottlenecks* for random I/O activity • * if targets are disk drives
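
The mirroring behavior listed above (every write goes to both targets, reads can be spread across them) can be sketched in a few lines of Python. The class and method names are hypothetical, and simple round-robin read selection stands in for the overlapped reads a real volume manager or controller would issue.

```python
class MirroredVolume:
    """Toy host-based mirror over two independent, identically sized targets."""

    def __init__(self, size_blocks):
        # Two separate address spaces of the same size.
        self.targets = [[None] * size_blocks, [None] * size_blocks]
        self._next_read = 0

    def write(self, block, data):
        """Brute-force redundancy: every write is sent to both targets."""
        for target in self.targets:
            target[block] = data

    def read(self, block):
        """Alternate reads between targets; returns (target index, data)."""
        index = self._next_read
        self._next_read = (index + 1) % len(self.targets)
        return index, self.targets[index][block]


if __name__ == "__main__":
    volume = MirroredVolume(size_blocks=16)
    volume.write(3, b"redo record")
    print(volume.read(3))
    print(volume.read(3))
```

Writes cost two I/Os while reads can be balanced, which matches the slide's point that mirroring is fast and simple but bounded by what the two targets can absorb.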

  35. Disk Mirroring for Redo Log Files • Log files are a common bottleneck • Use raw partitions • Redundancy is required • Mirroring is adequate • Use highest RPM with lowest seek times • Put on a separate channel from database I/O • Use separate mirrored pairs per group

  36. Mirroring to Storage Subsystems (diagram: two storage subsystems) • Independent, identically sized storage address spaces • Two controllers

  37. Mirroring to Subsystems • Targets are subsystems, not disks • Separate address spaces • Capacity scales to subsystem max • Double level redundancy • Mirroring plus RAID • Multiple disk spindles reduce I/O bottlenecks

  38. Disk Mirroring Datafiles from Host to Storage Subsystems • Disk mirroring + subsystem RAID • Excellent capacity scaling • Adjacent and across campus/town • One subsystem outside site radius • Requires longer distance cabling • Reads and writes both transmitted

  39. Multi-Pathing: Redundant Paths Between a Host & Subsystem (diagram: application data volume, failed path marked X) • Pathing software determines that a transmission error has occurred and switches to a redundant path

  40. Multi-pathing vs Mirroring • Mirroring assumes independent, but similar storage targets • Multi-pathing assumes multiple paths to the exact same target • Mirroring can use a single HBA, multi-pathing needs two HBAs

  41. Path Failures (diagram: application data volume with failure points 1, 2, and 3) • 1. HBA problem • 2. Link, switch or network problem • 3. Subsystem controller problem

  42. I/O sent to storage • No ack received • Transmission failures are recognized after SCSI timeouts are exceeded • The I/O is retried and eventually an error is passed back to the process that issued the I/O

  43. Path Failover for OLTP I/O • Redundant path resources take over activities for a failed path to sustain operations without disrupting service or risking data integrity
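
The timeout, retry, and failover sequence from the preceding slides can be sketched as follows. This is a simplified Python model, not any vendor's multi-pathing driver: paths are plain callables, `PathFailed` stands in for an exceeded SCSI timeout, and the retry count is an assumed parameter.

```python
class PathFailed(Exception):
    """Raised by a path when an I/O receives no acknowledgement in time."""


class MultipathDevice:
    """Toy multi-pathing layer: several paths to the exact same target."""

    def __init__(self, paths, retries_per_path=2):
        self.paths = list(paths)          # each path is a callable: path(io) -> ack
        self.retries_per_path = retries_per_path
        self.active = 0                   # index of the path currently in use

    def submit(self, io):
        """Send an I/O, retrying and failing over before reporting an error."""
        for _ in range(len(self.paths)):
            path = self.paths[self.active]
            for _ in range(self.retries_per_path):
                try:
                    return path(io)
                except PathFailed:
                    continue              # retry on the same path first
            # Retries exhausted: switch to the next redundant path.
            self.active = (self.active + 1) % len(self.paths)
        # Every path failed; only now does the error reach the caller.
        raise IOError("all paths to the target failed")
```

The issuing process sees an error only after every path has been tried, which is the behavior that lets OLTP I/O continue through an HBA, link, or controller failure.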

  44. Store and Forward (diagram: Host, subsystems A and B) • Independent, identically sized storage address spaces

  45. Store & Forward: One Host I/O and Two Copies of Data • Only real option for remote copies • Does not forward read I/Os • Proprietary protocols and methods • Standards are emerging, e.g. FC/IP • First step to storage snapshots

  46. Store and Forward: Acknowledgements (diagram: Asynchronous and Synchronous panels, each showing I/O, Forward, and ACK flows between subsystems A and B)

  47. Trade-offs with Acknowledgement Handling • Synchronous • Always preferred • Slowest performance • State of copy is precise • Asynchronous • Fastest performance • Least precise knowledge of copy status
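
The synchronous versus asynchronous trade-off above comes down to when the host receives its acknowledgement. The Python sketch below models that difference with in-memory dictionaries; the class, its methods, and the `drain` step are hypothetical simplifications of what a storage subsystem does.

```python
from collections import deque


class StoreAndForward:
    """Toy model of subsystem A forwarding writes to subsystem B."""

    def __init__(self, synchronous):
        self.synchronous = synchronous
        self.local = {}          # subsystem A's copy
        self.remote = {}         # subsystem B's copy
        self.pending = deque()   # writes acknowledged but not yet forwarded

    def write(self, block, data):
        """Store locally, then forward now (sync) or later (async)."""
        self.local[block] = data
        if self.synchronous:
            # ACK is withheld until B has the data: slower, but precise.
            self.remote[block] = data
        else:
            # ACK returns immediately; B lags until the queue is drained.
            self.pending.append((block, data))
        return "ACK"

    def drain(self):
        """Forward all queued writes to B in their original order."""
        while self.pending:
            block, data = self.pending.popleft()
            self.remote[block] = data

    def in_sync(self):
        """True when B holds exactly what A holds."""
        return not self.pending and self.local == self.remote
```

In asynchronous mode `in_sync()` can be false right after an acknowledged write, which is the "least precise knowledge of copy status" the slide warns about.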

  48. Store & Forward: Local and Remote Copies • Local & nearby copy techniques • Synchronous • Fiber optic cabling, optical/DWDM services • Remote-far away copy techniques • Asynchronous • ATM gateways, OC-12 or less, FC/IP

  49. Mirroring vs Synchronous Store and Forward for Local & Nearby Copies • Mirroring: async I/O; reads and writes; no snapshot tie-in; uses more host slots; least costly • Store and Forward: async or sync I/O; writes only; snapshot ready; may conserve host I/O slots; most costly

  50. Combining Mirroring with Store and Forward (diagram: Mirroring Radius and Store and Forward Radius drawn over the Local, Nearby, and Remote Far Away zones)
