350 likes | 363 Views
Get an overview of storage requirements, concepts, options, and priorities for SQL servers. Understand storage strategies, HA/DR implications, and general guidance. Suitable for DBAs responsible for small to medium SQL deployments.
E N D
SQL Storage for the Real World Brett Roux brettroux.blogspot.com
Overview • Requirements • Storage concepts • Storage options and priorities • SAN / DAS • HA / DR implications • Decide on a storage strategy • General guidance • References and more information • Q/A
Who Am I ? • I am a DBA • I am not a sales person • What qualifies me? • I don’t have all the answers! • Personal opinion: • Almost any reliable storage solution could be used effectively for your SQL server • Almost any solution could be a bad purchase if it is implemented incorrectly or does not fit in with your overall storage strategy
Who Should Be Here? • Aimed at DBA’s responsible for small to medium SQL deployments • Will NOT be focusing on extreme capacity or performance solutions • Going to avoid tuning, as Kevin covered this in the previous session • Focus on overall strategy and options available – so you can narrow your own focus in the future
Why Change Your Storage? • Is your storage just old? • Unable to extend warranty? • Is your storage the bottleneck? • Have you got enough RAM? • Do you just need more capacity? • Are you standardising or looking to fit in with the rest of the company’s storage plan? • Do you have a new application to support?
Know Your Data! • Understanding your data and storage usage is the key to everything • Is your environment OLTP, DSS or OLAP? • Do you need bandwidth or IO’s? • What read/write ratio? • How much data is there? • How much of that data is active? • Daily trends and monitoring • Use perfmon and DMV’s to investigate • Are you really using your TempDB?
Storage Concepts • What are IO’s? • 8KB – 1024KB • What is disk latency? • Measured in ms • 20ms limit not always relevant • What is bandwidth? • Measured in Gb • Cache • Random vs sequential • No way to truly measure • Very little chance of absolutes • Virtualisation tends to make all IO’s random • Queue Depth • Read/Write Ratio • Important with SSD’s
IO Size vs Throughput http://download.intel.com/support/network/sb/inteliscsiwp.pdf
Direct Attached Storage • Easy to implement • Known quantity and easier to troubleshoot • No longer limited in terms of performance • Small footprint • Potentially the cheapest solution • Cannot be used for clustering by default • Several SAN benefits can be achieved when using storage virtualisation on DAS • Can’t be affected by other systems
PCIe SSD • New standard in terms of latency • Completely changed perceptions of DAS • Limited capacity • High cost/capacity, but low relative to performance • Fantastic for Denali TempDB • Can affect CPU and RAM utilisation • Multiple vendors with varying performance and quality
Storage Area Networks • Getting more affordable • Massive benefits in large deployments • Powerful functionality • Snapshots • Online capacity expansion • Automated tiering • SAN replication • Thin provisioning • Boot from SAN can increase availability • Increased redundancy is possible • Shared environment, so others can affect SQL
Framed vs Frameless SAN Framed SAN Frameless SAN • Has dedicated head unit • Most common • High initial cost • Lower expansion cost • Capacity expansion only • Head units can become a bottleneck • Expensive head unit upgrades • Each shelf has controller • Newer generation • Lower entry cost • Higher expansion cost • Increased performance with expansion • Limited capacity / scalability
Rackmount SSD • Generally independent units • Although some can act as a SAN • Individual unit capacity is limited, and is usually bought up front • Incredibly fast but often lacking additional functionality • Alternatively very feature rich, but without track record • Possibly higher latency than PCIe SSD, but allows for clustering or expansion when slots are limited
Choosing A Vendor • Give all the vendors the same set of requirements and let them propose solutions • For example: IO requirement, usable capacity, connectivity, form factor, read/write ratio, lifespan and budget for the project • Feedback to vendors and allow them to improve the solution iteratively – they won’t get it right first time • Make sure you are getting the best pricing! • Some vendors first quote is 50% off retail... • Build a relationship – storage is never once off! • Make sure the solution is both balanced and redundant
General Guidance • Separate Data, Logs and TempDB • Separate DB’s if performance is justified • Put logs on fastest disk possible • For write heavy DB’s or OLTP • Remember that log writes are sequential... • OLAP and DSS generally like bandwidth and can handle more latency • Make data files of equal size • Pre-size data and log files • Autogrow loves to work at the worst times
TempDB • TempDB can be very IO heavy, but is easy to move • Check for contention on TempDB • Consider adding data files if required • Start at 0.25 per core and move up • Additional log files are not beneficial • Storage engine blog on allocation bottlenecks: • http://bit.ly/q1JDJN • Denali allows for TempDB on local storage in clustered instances
Storage Configuration • Write cache is generally more useful to SQL • You may need to change the number of data files to get the best from your storage • Check for bottlenecks in the data path • CHECK PARTITON ALIGNMENT !!! • http://msdn.microsoft.com/en-us/library/dd758814(v=sql.100).aspx • Check your MPIO is working • Update storage firmware and use latest drivers
Storage Testing • Benchmark to validate, baseline and test • Test the redundancy – pull some cables and drives... • Ensure you test more than your cache • SQLIO • http://bit.ly/p85dJj • IOMeter • http://bit.ly/oewi6g • SQLIOSim • http://bit.ly/rbXt6A • AS SSD Benchmark • http://bit.ly/pTlk0k
Diagnostic Tools • Logical Disk • Avg Disk Sec/Read (Write) - Latency • Disk Read Bytes/sec (Write) – Bandwidth • Disk Reads/sec (Writes) – IO’s • Avg Disk Queue Length • Deprecated but high numbers can indicate room for improvement • Boot LUN doesn’t count unless page file is being hit or you are booting from SAN • sys.dm_io_virtual_file_stats(DB_ID(), NULL) • Itzik Ben-Gan on IO stats: http://bit.ly/rcy2TX • sys.dm_os_wait_stats • Look for PAGEIOLATCH_* • sys.dm_db_index_usage_stats
References • Storage Search • storagesearch.com • MS Storage Best Practices: • http://bit.ly/qLfyli • http://bit.ly/oajks8 • http://bit.ly/oZsZ71 • Glenn Berry’s blog: • sqlserverperformance.wordpress.com • Intel Whitepaper: Simple, Reliable Performance for iSCSI Connectivity. • http://bit.ly/o95NTL • My Blog: • brettroux.blogspot.com • Brent Ozar: • brentozar.com • MCM Training video’s: • http://bit.ly/p7IlSR • Storage Mojo • storagemojo.com • AnandTech • anandtech.com • Paul Randall’s blog: • sqlskills.com/BLOGS/PAUL • Joe Chang’s blog: • sqlblog.com/blogs/joe_chang/
Summary • Requirements • Storage concepts • Storage options and priorities • SAN / DAS • HA / DR implications • Decide on a storage strategy • General guidance