430 likes | 446 Views
This presentation by Grzegorz Łyp at SQLBits 2018 delves into strategies for efficient database maintenance, covering topics such as index fragmentation, statistics, and speeding up maintenance tasks for extensive databases. Discover practical demonstrations using the World Wide Importers database and learn essential maintenance techniques for improving database performance and reliability.
E N D
Parallel Database Maintenance with 24/7 Systems and Huge DBs SQLBits 2018
Grzegorz Łyp • MS SQL Architect & Optimizer • MS SQL, .NET & Java Developer • Product and Project Manager • 15 years in professional IT • 10 years in MS SQL • Member of Data Community Poland • Blog: www.lyp.pl/blog • Mail: grzegorz.lyp@assecobs.pl • Twitter: @GrzegorzLyp • LinkedIn: https://www.linkedin.com/in/grzegorzlyp/ • Facebook: https://www.facebook.com/profile.php?id=100001054878385
Agenda • Background • Database Maintenance • Speedupmaintenancetasks • Lockingduringdatabasemaintenance • Creatingspecificstatistics
DEMOS • WorldWideImporters • https://github.com/Microsoft/sql-server-samples/releases/tag/wide-world-importers-v1.0
Background • Once upon a time there was… a database • At the moment of start it worked for 8 hours a day… • DBA had rest of the time for maintenance
Background • As the databaseownercompanygrewdatabaseusagegrew • Aftercertainamount of time the maintenacetime went down to „0” • DBA losthisjob as he could not do the databasemaintenance in suchconditions • Did he try hard ? • Let’stry to satisfy 24/7 system maintenanceneeds!
Database Maintenance • Indexes • Statistics
Indexes • Index fragmentation • Whatis the actualvalue of fragmentation ? • Isfragmentationbadorgood ? • DEMO
Indexes • Page splitting does not directly depend on index fragmentation • Read ahead is off when reading data from pages that are not neighbours • Data sizenecessary on physicalstorage and in memoryhigher for higherfragmentation • avg_fragmentation_in_percent is not enough for saving space - avg_page_space_user_in_percent gives additional details • Typically we assumethatat 5-20% of fragmentation REORGANIZE index and over 20% we should REBUILD
Conslusions • Gatheringinformationaboutfragmentationcan be slow • REBUILD canonly run on Enterprise systems with ONLINE = ON and need a lot of freespace • REOGRANIZE isslow but itis not blockingeven in standard edition • It canblock on SQL Server 2014 SP2 (fixed in CU) • How to speedup index fragmentationgatheringprocess ? • How to speedup REORGANIZE ?
Statistics • Settings • Auto CreateStatistics – verydangerous in case of unexpectedstatistics • Youcan’tseewhichstatisticsareused in query plan before SQL 2017 • Auto Update Statistics – unpredictible moment of statistics update • Causesrecompilations and possiblechanges in executionplans • For big tablesrefreshistoorare – especially for estimatorbefore SQL 2014 • Same for Auto Update StatisticsAsynchronously • Manual statisticsmaintenance • Periodical • Updatingstatisticscantakesometime • Defaultsample for update can be not enough for rarerecords • DEMO
Conclusions • Auto CreateStatistics Off • Auto Update Statistics Off • Periodicalstatistics update (eveneveryday) • How to getpropersample ? • How to reducetime of statistics update ?
How to getproper update statisticssample ? • Important for seeking with directvalues • Allseekvaluesshould be in the sample • For optimizer 120 orhighervalues out of statisticsare not of 0 cardinalitywhen no directvalueisused, but itmight be not enough
Typicalmaintenanceprocess • Iterateacrossindexes • For each index • Get index fragmenation • Defragmentifnecessary (REORGANIZE or REBUILD) • Update statisticsifnecessary
Proposedmaintenaceprocess • Iterateacrossindexes and makequeue • Spreadacrossnumber of threadseach index in queue • Single threadtask • Get index fragmenation • Defragmentifnecessary • Update statisticsifnecessary
Proposedmaintenaceprocess 4 1 2 3 5 6 7 8 9 10 11 12 13 TIME
Proposedmaintenace proces - queue 1 2 3 4 5 6 7 8 9 10 11 12 13
Proposedmaintenace proces4 threads timing 9 Thread 1 13 1 5 Thread 2 6 2 10 7 8 12 3 Thread 3 11 4 Thread 4 TIME
Proposedmaintenace proces4 threads timing vs Single thread Thread 1 13 9 5 1 Thread 2 2 6 10 7 8 12 3 Thread 3 11 4 Thread 4 TIME Single thread 4 1 2 3 5 6 7 8 9 10 11 12 13 TIME
Proposedmaintenace proces – queue – order by length 13 4 11 3 5 6 9 10 1 2 12 7 8
Proposedmaintenace proces4 threads timing – queueordered by length Thread 1 10 13 7 Thread 2 4 9 12 6 Thread 3 11 2 Thread 4 5 8 3 1 TIME
Proposedmaintenace proces4 threads timing – comparison Thread 1 10 13 7 Thread 2 4 9 12 6 Thread 3 11 2 Thread 4 5 8 3 1 TIME Thread 1 13 9 5 1 Thread 2 2 6 10 7 8 12 3 Thread 3 11 4 Thread 4 TIME
Proposedmaintenace proces • DEMO
Proposedmaintenance proces • Optimalnumber of threadscan be calculatedbased on real index operationtimes • The shortestpossibletimeis the time of the longest index to maintain • Morethreadsareapplicable as each index mainenanceis uniform to eachother
Proposedmaintenaceprocess – v2 • Iterateacrossindexes and makequeue • Spreadacrossnumber of threadseach index in queue • Single threadperform • Get index fragmenation • Store index fragmentation in the table • Removeindexesbelowfragmentationlimits • Spreadacross numer of treadseach index in queue • Single threadperform • Do index maintenance (fragmentation & statistics)
Proposedmaintenaceprocess – v2 • Have to measuretime for index fragmentationgathering and index maintenanceseparately • Should we split index defragmentation and statistics update ? • Thismight be version 3 – tryyourself • Isitbetter ?
Split maintenancetaskintoparts Read currentfragmentation Do index defragmentation Update statistics 4 1 2 3 5 6 7 8 9 10 11 12 13 TIME
Three stages – betterdecisions 9 2 12 1 6 12 5 Thread 1 1 3 10 2 9 10 Thread 2 6 2 7 9 13 5 13 3 4 13 Thread 3 11 7 8 4 Thread 4 5 7 8 11 11
Proposedmaintenaceprocess – summary • Parallelmaintenanceshortensprocessingtime • Eachmaintenancetasksplit to 2 or 3 stages for betterresourcesuse • Operations timegather for betterthreadtasksallocation • Processcan be fullyautomated
Paralleltasks – how to implement ? • Jobs • Need to createjobs • No performance balancebuilt-in • Service broker • Need to create service broker objects (queues, services) • Number of parallelexecutionsconfigurable and performance dependent • Createdrequeststreated as system requests (cannotkill) • CLR • Unsafeassembly with multithreadimplementation • Need to implementbreakingmethod (killingdoes not work)
Service Broker – paralleltasksimplementation DEMO T-SQL SourceQueue Destination Queue T-SQL T-SQL T-SQL
Service Broker - conclusions • Easy to implement • Somedifficulties in error handling • Parallelismcontrolled by server with MAX_QUEUE_READERS suggestion • Fasterthreadallocationcan be obtained with CLR
Conclusions • Database index maintenancecanconsumetime and resources • For 24/7 systemsthereare no technicalbreaks and everytime we expect plan stability • Typicalmaintenance proces is single-threaded • Makingmaintenancemulti-threadreducetime and makebetteruse from resources • The finalsolutioncan be parametrizedaccording to specific hardware abilities (number of cores, storagetype) • We have non-blocking, adaptive, multi-threadedmaintenaceprocessthatcanworkanytime
If we havemoretime… • BONUS DEMO
Just like Jimi Hendrix … We love to get feedback Please complete the session feedback forms
SQLBits - It's all about the community... Please visit Community Corner, we are trying this year to get more people to learn about the SQL Community, equally if you would be happy to visit the community corner we’d really appreciate it.
Q&A ThankYou for the attention