DBA421: Building Very Large Databases with SQL Server 2000 (32/64-bit)
Gert E.R. Drapers, Software Architect, Customer Advisory Team, SQL Server Development
SQL Server Development Customer Advisory Team
• We invest in large-scale SQL Server projects across the world
• Technical assistance with design, performance tuning, etc.
• Product group programs
• We use our customer experiences to make a better product
• We share the knowledge learned from these challenging implementations so that more can be built
Session Objective
• Show that large-scale projects on the Microsoft database platform are not unique
• Give you some thoughts on clever design and implementation tips for large-scale SQL Server applications
• Make you feel comfortable that it absolutely can be done
Scaling up / out with SQL Server – it can be done!
Agenda • General comments on building large scale applications • Scaling up Large Scale OLTP • Best Practices • Case Study Example • Scaling up Large Scale Data Warehouse • Best Practices • Case Study Example • Scale out - Clarification • 64-bit – Awesome, but understand the value • Summary / Q & A
What is large scale?
• Any database application that requires special configuration work beyond the defaults
• Numbers do not mean anything, but I always get asked, so here are some guidelines:
  • TB+ size databases
  • 1000(s) of database transactions/sec
  • 1000(s) of connected users
  • I/O subsystem throughput exceeding 80 MB/sec
  • More than 8 processors
  • More than 8 GB RAM
• Large scale is when you think it is challenging and your vendor says it is a piece of cake. This is when you start worrying.
Building Large Scale SQL Server applications
• The general implementation rules change slightly (true for all RDBMSs)
• It's not quite auto-tuning
• You need to understand your entire platform: hardware, OS and database server configuration
• You may have to change defaults that work great for 99% of other applications (see the sketch after this list)
  • Degree of parallelism (DOP)
  • Recovery interval
  • Query / index hints
  • etc.
• Experience and know-how are what make large-scale applications work
• It's not a piece of cake, but it ABSOLUTELY CAN BE DONE!
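The deck does not show how these defaults are changed; as a minimal sketch, the instance-level ones go through sp_configure, while query and index hints are applied per statement. The values, table and index names below are placeholders, not recommendations from the session.

  -- Cap the degree of parallelism instead of letting every query use all CPUs
  EXEC sp_configure 'show advanced options', 1
  RECONFIGURE
  GO
  EXEC sp_configure 'max degree of parallelism', 4   -- placeholder value
  RECONFIGURE
  GO
  -- Query/index hints are per statement, for example:
  -- SELECT ... FROM dbo.SomeTable WITH (INDEX (SomeIndex)) OPTION (MAXDOP 1)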
Some interesting SQL Server implementations in production
• 25+ applications of a terabyte or more in production
• We are currently active on at least as many as we speak
  • 7+ TB telco billing reporting system growing to 15 TB
  • 3 TB GIS mapping application growing to 10-20 TB
  • 13 TB debt-processing financial application
  • Currently working on a system that will grow to 50 TB
• Many high-volume, mission-critical applications in production on SQL Server
  • Trading application exceeding 60,000+ database transactions/sec
  • Credit card processing system handling 3,000 authentications/sec
  • Banking application processing more than 40 million money-transfer transactions/day while reporting is active
Agenda • General comments on building large scale applications • Scaling up Large Scale OLTP • Best Practices • Case Study Example • Scaling up Large Scale Data Warehouse • Best Practices • Case Study Example • Scale out - Clarification • 64-bit – Awesome, but understand the value • Summary / Q & A
OLTP Large Scale Tips and Best Practices
• Eliminate log bottlenecks
• Eliminate ineffective stored procedure recompiles
• Eliminate lock escalation that causes performance issues or deadlocks
• Understand your scheduler and what hardware resources are needed
Eliminate Log Bottleneck
• Put the log on its own drive or RAID 10 stripe set
• Assign it its own controller, or its own LUN(s) if on a SAN
• Set the controller or I/O subsystem cache for the log to 100% write-through (unless you are reading the log, e.g. with replication or a lot of trigger activity)
• Monitor wait statistics to ensure the log is not waiting:
    DBCC SQLPERF (waitstats, clear)
    GO
    DBCC SQLPERF (waitstats)
Create Waitstats Script
  dbcc sqlperf(waitstats, clear)            -- clear out wait statistics
  go
  if object_id('waitstats') is not null     -- drop any previous copy of the table
      drop table waitstats
  go
  create table waitstats (
      Wait_Type        varchar(80),
      Requests         numeric(18,1),
      Wait_Time        numeric(18,1),
      Signal_Wait_Time numeric(18,1),
      timenow          datetime default getdate())
  go
  declare @start int, @finish int
  select @start = 1, @finish = 10
  while (@start < @finish)
  begin
      begin transaction
      insert into waitstats (Wait_Type, Requests, Wait_Time, Signal_Wait_Time)
          exec ('dbcc sqlperf(waitstats)')
      commit
      select @start = @start + 1
      waitfor delay '00:00:10'              -- every 10 seconds
  end
  select * from waitstats order by Wait_Time desc
Eliminate ineffective Stored Procedure Recompiles
• In some cases, stored procedure recompilation behavior adversely affects performance
  • Multiple recompilations
  • The entire stored procedure query plan is recompiled (especially costly for very large stored procedures)
  • Recompiles cause serialization: other executions of the procedure wait while it is recompiled
• Recompiles are triggered by
  • SET options
  • Mixed DML/DDL
  • TempDB objects
  • Table cardinality changes
  • Etc.
• Customers often miss the recompilation aspect of performance
• Possible solutions / suggestions (see the sketch after this list)
  • Whitepaper on MSDN and KB article Q243586
  • Execute the proc with KEEPFIXED PLAN at the statement level
  • Statement-level recompilation (Yukon)
  • Be on the lookout for recompilations using Perfmon
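As a hedged illustration of the statement-level hint mentioned above (the procedure, table and column names are invented for the example):

  -- KEEPFIXED PLAN asks SQL Server not to recompile this statement because of
  -- statistics or cardinality changes in the referenced tables.
  CREATE PROCEDURE dbo.usp_GetRecentOrders
      @CustomerId int
  AS
  BEGIN
      SELECT OrderId, OrderDate, Amount
      FROM dbo.Orders
      WHERE CustomerId = @CustomerId
      OPTION (KEEPFIXED PLAN)
  END
  GO
  -- Recompilations can be watched in Perfmon under
  -- SQLServer:SQL Statistics -> SQL Re-Compilations/sec.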
Locking / Lock Escalation
• Lock escalation
  • SQL Server escalates to a table lock when a specific percentage of memory is used for locking or the system allocates 2500 locks (5000 in SP2)
  • Table lock escalation saves memory allocation, but can cause deadlocks and other locking problems
• Resolution (see the sketch after this list):
  • Force page locks, which allocate fewer locks
  • Or get clever, like one customer did:
    • Insert a dummy record (9999 99 99999 999 999)
    • Open a thread and grab an update lock on the dummy record
    • The next thread cannot escalate, because escalation to a table lock is not possible while another session holds a lock on the same table
  • Worst case: there is a trace flag to shut escalation off; call Microsoft support prior to using this option
• Monitor through the normal locking tools: Profiler, sp_who, sp_who2, sp_lock and sp_blockinfo
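A minimal sketch of both mitigations described above, assuming a hypothetical dbo.Orders table and key values; this illustrates the pattern, it is not the customer's actual code.

  -- Mitigation 1: force page locks on a statement so far fewer lock structures are taken
  UPDATE dbo.Orders WITH (PAGLOCK)
  SET    Status = 'P'
  WHERE  OrderDate < '20030101'

  -- Mitigation 2 (the "dummy record" trick): from a separate connection, hold an
  -- update lock on a dummy row for the duration of the batch window, which keeps
  -- other sessions from escalating to a table lock on dbo.Orders.
  BEGIN TRANSACTION
  SELECT OrderId
  FROM   dbo.Orders WITH (UPDLOCK, HOLDLOCK)
  WHERE  OrderId = 999999999        -- the dummy record
  -- ...keep the transaction open while the heavy work runs elsewhere...
  -- COMMIT TRANSACTION when the window is over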
Locking script • Send email to gertd@microsoft.com for: • sp_blockinfo script • waitstats script • Other documentation
SQL Server Scheduler (UMS) – understand it!
[Diagram: 5 BCP.exe jobs connecting to 4 UMS schedulers on a 4-way CPU]
• Connections get assigned to a UMS scheduler
• UMS schedules across processors
• A connection stays on its UMS scheduler for the life of the thread
• Two heavy threads will fight on the same UMS scheduler
SQL Server Scheduler – continued
[Diagram: 5 BCP.exe jobs on a 4-way CPU; tasks 1 and 5 share one UMS scheduler, while tasks 2, 3 and 4 each get their own]
• Each load job takes 2 minutes to run
• Tasks 2, 3 and 4 each finished in 2 minutes
• Tasks 1 and 5 took closer to 4 minutes
• Using 4 threads instead of 5 would be a better solution for this example
• Monitor using DBCC SQLPERF (UMSSTATS)
UMSSTATS example
  DBCC SQLPERF (UMSSTATS, clear)
  go
  waitfor delay '00:00:15'
  go
  DBCC SQLPERF (UMSSTATS)
OLTP Case Study – 60,000+ database transactions/sec
• Financial trading messages: bids/asks come off a Tandem onto a message queue at a rate of 12,000-15,000/sec
• Each message goes to a SQL Server stored procedure and makes approximately 5 database transactions
• Simultaneously, real-time queries are reading from the active data
Topology
[Diagram: brokers feed a Tandem NonStop SQL system; messages are read from a message queue by an 8-way Dell application tier that calls SQL Server stored procedures; Microsoft SQL Server hosts 6 different databases, and real-time queries read from the same databases]
OLTP Lessons learned
• Challenges
  • Scheduling a mixed workload evenly across schedulers
  • A database log that can handle 60,000+ database transactions/sec
  • Real-time reporting while loading data
• It can be done with SQL Server 2000
• Multiple database logs
• Read-only queries
• Priority scheduling of connections
Agenda • General comments on building large scale applications • Scaling up Large Scale OLTP • Best Practices • Case Study Example • Scaling up Large Scale Data Warehouse • Best Practices • Case Study Example • Scale out - Clarification • 64-bit – Awesome, but understand the value • Summary / Q & A
Data Warehouse Lessons Learned
• Loading large quantities of data quickly
• Maintaining very large databases
• Optimizing TempDB
• Optimizing your I/O subsystem
Data Loading Strategy For VLDB
• Bulk Insert with no indexes or minimal indexes appears fastest (hundreds of thousands of rows/sec), but is not applicable to larger systems with incremental data loads
• Bulk Insert with a single nonclustered index is fairly linear
• Bulk Insert with a clustered index is less linear due to fragmentation
• Parallel Bulk Insert can be fastest, though not linear, with sorted data and partitioned load files
• No contention when using multiple Bulk Inserts (one per table) if the physical model works for data access (e.g. horizontal partitioning)
• A hedged sketch of such a parallel load follows
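A minimal sketch of a parallel bulk load, assuming hypothetical hourly tables and file paths (none of these identifiers come from the deck); each statement runs on its own connection so the loads proceed in parallel.

  -- Connection 1: load hour 00 into its own table, no indexes yet, table-level lock
  BULK INSERT dbo.CDR_H00
  FROM 'D:\loads\cdr_h00.dat'
  WITH (TABLOCK, FIELDTERMINATOR = '|', ROWTERMINATOR = '\n', BATCHSIZE = 100000)

  -- Connection 2: load hour 01 the same way into a different table
  BULK INSERT dbo.CDR_H01
  FROM 'D:\loads\cdr_h01.dat'
  WITH (TABLOCK, FIELDTERMINATOR = '|', ROWTERMINATOR = '\n', BATCHSIZE = 100000)

  -- Indexes are created after the load completes, and the tables are only
  -- exposed to users once they are indexed.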
Maintenance Tips: Reorg & Create Index
• Run DBCC SHOWCONTIG and look at Scan Density (logical extent density) and Avg. Page Density (average free bytes per page); if these fall below 30-50%, a reorg is recommended
• Run DBCC INDEXDEFRAG if your data is not interleaved (that is, a single table/index in its own file group)
• Run DBCC DBREINDEX if you want all clustered and nonclustered indexes rebuilt
• Run CREATE INDEX ... WITH DROP_EXISTING if you only want the clustered index rebuilt and INDEXDEFRAG is not helping
• CREATE INDEX runs fastest with more memory
• A hedged sketch of these commands follows
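A sketch of the commands above against a hypothetical table and index (the database, table and index names are illustrative only):

  -- 1. Measure fragmentation
  DBCC SHOWCONTIG ('dbo.CDR_Daily', 'IX_CDR_Daily_CallDate')

  -- 2. Defragment in place when the object lives alone in its file group
  DBCC INDEXDEFRAG ('CDRWarehouse', 'dbo.CDR_Daily', 'IX_CDR_Daily_CallDate')

  -- 3. Rebuild every index on the table (more intrusive, fully contiguous result)
  DBCC DBREINDEX ('dbo.CDR_Daily')

  -- 4. Rebuild just the clustered index in place
  CREATE CLUSTERED INDEX IX_CDR_Daily_CallDate
  ON dbo.CDR_Daily (CallDate)
  WITH DROP_EXISTING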
Index Defrag with Interleaved Data
[Diagram: a single file group in which extents of Table A and Table B are interleaved, and Table A's pages (1-5, 6-10, 11-15, 16-20, 21-25) sit out of logical order across the file group; DBCC INDEXDEFRAG is run against Table A]
Index Defrag with Interleaved Data (continued)
[Diagram: the same file group during the defrag; Table A's pages are being swapped into logical order while Table B's extents stay in place]
• Starts with the first extent and swaps it with the next smallest extent
• If pages are not at the fill factor, they are filled up to the fill factor
• If pages are above the fill factor, records will NOT be removed to meet the fill factor
• Other objects in the file group are not reorganized, so the data will not be contiguous after the defrag
Index Defrag with Interleaved Data (continued)
[Diagram: the file group after the defrag; Table A's pages are now in logical order, but still interleaved with Table B's extents]
• When the defrag completes, the data is in correct order
• The data may or may not be contiguous
• Free space is used up but not given back
• DBCC DBREINDEX will put the data back in contiguous order, but it takes longer to run and is more intrusive
TempDB slows down when busy!
• Challenge
  • Performance degradation during peak load
  • Many concurrent users creating temporary tables
  • Performance starts to degrade
  • Sysprocesses shows locks on DBID 2 (TempDB)
• Resolution (see the sketch below)
  • More files for the TempDB file group
  • Place TempDB across more spindles
  • Could possibly use a trace flag to disable single page allocation, but please, please, please call Microsoft support before turning this trace flag on
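A sketch of adding files to tempdb; the paths, sizes and number of files are placeholders rather than recommendations from the session.

  -- Add extra data files to tempdb so allocation contention is spread
  -- across more files (and, ideally, more spindles).
  ALTER DATABASE tempdb
  ADD FILE (NAME = tempdev2, FILENAME = 'E:\tempdb\tempdev2.ndf', SIZE = 1024MB, FILEGROWTH = 256MB)

  ALTER DATABASE tempdb
  ADD FILE (NAME = tempdev3, FILENAME = 'F:\tempdb\tempdev3.ndf', SIZE = 1024MB, FILEGROWTH = 256MB)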
Challenge: New to SAN technology
• Working with SAN technology for the first time can be challenging
  • Do the data layout rules change?
  • What does the cache really buy me?
  • How do I monitor?
  • How do I determine whether the problem is the SAN or the database?
• Some thoughts:
  • Test your SAN throughput before any database activity; use IOMETER or just copy large files
  • You should be getting approximately 120 MB/sec throughput per channel
  • On a heavily updated system, weight the cache toward writes rather than reads; this significantly helps log writes (you should see around 1 ms per write on the Avg. Disk sec/Write counter)
  • Still make sure data is spread across as many drives as possible; individual disks are still the slowest component of database activity
Case Study Data Warehouse: Large telco processing call detail records (CDRs)
• Bulk loads 350 million call detail records per day into SQL Server
• Must keep a history
• Reporting runs all day long over daily, weekly, monthly, quarterly and yearly intervals
• Very partitionable by period of time
• 5-7 TB database
• This was based on a prototype; the project is now in development
Case Study – Data Warehouse design
• 3 types of databases
  • Type 1 contains 24 one-hour tables
  • Type 2 contains 7 one-day tables
  • Type 3 contains 7 one-week tables
• Online consolidation was much more difficult, so all 3 types are built in parallel; this is a good design if the hardware has headroom
• Data is loaded without indexes and then indexed
• Data is made available to users once it is indexed
• Yesterday, last week and the last 7 weeks are available while today and this week are being loaded
Challenges for data warehouse case study
• Real-time reporting
• Partition management – history
• Maintenance of very large databases
• Auto-updating statistics on new high-volume data coming in
Design using 1 server
[Diagram: one SQL Server instance holding the Today, Yesterday, This week, Last week ... Week n-7 tables, exposed through a partitioned view (PView); see the sketch after the 7-server design]

Design using 4 servers
[Diagram: the same Today, Yesterday, This week and Last week ... Week n-7 tables spread across 4 servers]

Design using 7 servers
[Diagram: the same tables spread across 7 servers]
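The deck does not show the view definition; as a hedged illustration, a local partitioned view over day-level tables might look like the following (table, column and constraint names are invented for the example):

  -- Each member table holds one day of CDRs and carries a CHECK constraint on the
  -- partitioning column so the optimizer can eliminate partitions.
  CREATE TABLE dbo.CDR_20030616 (
      CallDate   datetime NOT NULL CHECK (CallDate >= '20030616' AND CallDate < '20030617'),
      CallId     bigint   NOT NULL,
      DurationMs int      NOT NULL,
      CONSTRAINT PK_CDR_20030616 PRIMARY KEY (CallDate, CallId)
  )
  GO
  CREATE TABLE dbo.CDR_20030617 (
      CallDate   datetime NOT NULL CHECK (CallDate >= '20030617' AND CallDate < '20030618'),
      CallId     bigint   NOT NULL,
      DurationMs int      NOT NULL,
      CONSTRAINT PK_CDR_20030617 PRIMARY KEY (CallDate, CallId)
  )
  GO
  -- The partitioned view unions the members; queries against PView that filter on
  -- CallDate touch only the relevant member tables.
  CREATE VIEW dbo.PView
  AS
  SELECT CallDate, CallId, DurationMs FROM dbo.CDR_20030616
  UNION ALL
  SELECT CallDate, CallId, DurationMs FROM dbo.CDR_20030617
  GO

In the 4-server and 7-server variants, the member tables would live on different servers and be referenced through linked servers, turning this into a distributed partitioned view.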
Case Study Data Warehouse – Summary
• Large database with a very demanding number of rows loaded per day
• Loaded in parallel into different tables, with the hardware being the determining bottleneck
• Partitioned on a time value
• Could be partitioned using partitioned views or partitioned within the application
• Data partitioning allowed easy scalability to multiple databases, instances or servers with minimal code changes
• Maintenance decisions were not made during the prototype phase; they are being made now
Agenda • General comments on building large scale applications • Scaling up Large Scale OLTP • Best Practices • Case Study Example • Scaling up Large Scale Data Warehouse • Best Practices • Case Study Example • Scale out - Clarification • 64-bit – Awesome, but understand the value • Summary / Q & A
What is Scale Out?
• Multiple nodes
• Evenly distributed data, as in data partitioning
  • The data warehouse telco example is data partitioning
  • Distributed partitioned views are data partitioning
• Evenly distributed workload/data, as in function partitioning
  • Online shopping is function partitioning
  • Break up unrelated functions across resources, rather than breaking up related data
Scale Out Case Study: 13 TB Data Analysis System
• Provides bad loan/credit collection services and reporting (online and batch)
• Designed to create an environment where credit grantors, debt buyers, and collection agencies can meet their business objectives
• Large volumes of data inserted every night from several sources
• Heavy analytical reporting/data mining performed all day long to predict and quantify the likelihood of collecting each delinquent account in a client's portfolio
• Less than 24-hour turnaround on customer requests
Scale Out Case Study: Challenges
• Manage 13 TB of data
• Manage large files from several external customers per day
• Load millions of rows from multiple files and run batch reporting at the same time
• How do I manage different SLAs from different customers on the same system?
Scale Out Case Study: Function Partitioning
[Diagram: daily customer source files feed an account/function tracking database; data is split into individual account databases and function-specific databases (Function X, Y and Z tables), with a common database holding the shared function procedures and data]
Function Partitioning
• Partitioning is a VERY good idea
• Partitioning by business function is very logical
• The application must be written to route to the partitioned data, rather than relying on database technology to access it. NOT DIFFICULT (see the sketch below)
• Putting all the data in one single database doesn't make you a hero
• No matter how you look at it, 13 TB of data is 13 TB of data used to solve a business problem
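As a hedged sketch of what "the application routes to the partitioned data" can look like, assuming a hypothetical tracking table that maps each customer to the database holding its data (none of these names come from the case study):

  -- Tracking table in the common/tracking database
  CREATE TABLE dbo.CustomerDatabaseMap (
      CustomerId   int     NOT NULL PRIMARY KEY,
      DatabaseName sysname NOT NULL
  )
  GO
  -- The application (or a routing procedure) looks up the target database and
  -- builds the fully qualified object name before running its query.
  DECLARE @db sysname, @sql nvarchar(4000)
  SELECT @db = DatabaseName FROM dbo.CustomerDatabaseMap WHERE CustomerId = 42
  SET @sql = N'SELECT COUNT(*) FROM ' + QUOTENAME(@db) + N'.dbo.Accounts WHERE CustomerId = 42'
  EXEC sp_executesql @sql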
Agenda • General comments on building large scale applications • Scaling up Large Scale OLTP • Best Practices • Case Study Example • Scaling up Large Scale Data Warehouse • Best Practices • Case Study Example • Scale out - Clarification • 64 bit – Awesome, but understand the value • Summary / Q&A
64-bit – Is it an upgrade lock?
• What is the hype around 64-bit?
  • Produced world-record benchmarks
  • New advancements in CPU – Itanium
• It raises the questions:
  • Should you automatically upgrade?
  • Is it a performance guarantee?
  • Is it an easy upgrade?
64-Bit Clear Winners
• OLAP applications with large dimensions
  • Breaks through the 3 GB limit
  • Essential for dimensions with millions of members
  • Can also leverage a large file system cache
• Relational applications requiring extreme memory (> 32 GB)
  • VLDBs with random access patterns that can be cached in database buffers for performance
  • Any database requiring > 64 GB of buffers
• Alternative to 32-bit applications using AWE (see the sketch below)
  • No pressure on 'real' memory
  • No overhead for mapping AWE pages
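For reference, this is roughly what the 32-bit AWE alternative mentioned above looks like on SQL Server 2000; the memory value is a placeholder, and AWE additionally requires the Windows "Lock Pages in Memory" privilege and the /PAE boot option.

  -- Enable AWE so the 32-bit instance can address memory beyond its normal
  -- virtual address space, then cap it with a fixed max server memory.
  EXEC sp_configure 'show advanced options', 1
  RECONFIGURE
  GO
  EXEC sp_configure 'awe enabled', 1
  EXEC sp_configure 'max server memory', 16384   -- MB; placeholder value
  RECONFIGURE
  GO
  -- The SQL Server service must be restarted for 'awe enabled' to take effect.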
64-bit winners…
• If you need to keep the procedure/statement cache in the virtual address space
• If you need to keep page headers in the virtual address space; the more memory used via AWE, the more page headers must be kept, and the less room for real data in the virtual address space
• If you need to keep memory associated with server-side cursors in the virtual address space
• Memory for hash joins and sort operations can only be taken from the virtual address space; AWE memory cannot be used for these operations, so they spill to TempDB
• Going beyond 16 GB, the virtual address space must be reduced to 2 GB; this is counterproductive, since there is then even more data (page headers) to hold in the virtual address space
• Concurrency issues of AWE under Windows 2000 were improved in Windows Server 2003; the problem showed up on 32-way hardware
Some realities!
• Documentation states that Itanium has twice the CPU power/speed of Xeon despite a slower clock speed
• You pay a premium for a slower-clocked Itanium compared to Xeon
• CPU-bound relational SQL workloads will see a performance improvement on the same number of CPUs, BUT this does NOT mean a 4-way Itanium equals an 8-way Xeon
• Very large memory can make shutdown take a long time, because the final system checkpoint has a lot of data to flush; recommendation – keep the checkpoint interval small (for example, 1 minute), as in the sketch below
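The checkpoint interval is governed by the 'recovery interval' setting; a minimal sketch of keeping it short, as suggested above (1 minute is the example value from the slide, not a universal recommendation):

  -- Keep checkpoints frequent so a shutdown on a very-large-memory box does not
  -- have to flush an enormous amount of dirty data at once.
  EXEC sp_configure 'show advanced options', 1
  RECONFIGURE
  GO
  EXEC sp_configure 'recovery interval', 1   -- target upper bound, in minutes
  RECONFIGURE
  GO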
Takeaways
• For relational DBMS workloads, 64-bit is not a slam dunk
  • Often better to have more CPUs
  • Often better to have faster CPUs
• Qualify before recommending; look for
  • Memory-intensive workloads beyond what can be gracefully handled with AWE today (e.g. >> 32 GB)
  • Large-scale OLAP workloads
Agenda • General comments on building large scale applications • Scaling up Large Scale OLTP • Best Practices • Case Study Example • Scaling up Large Scale Data Warehouse • Best Practices • Case Study Example • Scale out - Clarification • 64-bit – Awesome, but understand the value • Summary / Q & A
Summary
• We can build VLDB systems with SQL Server today, and we are doing it
• The Design Wins Program may be able to assist; send email to your local Microsoft representative or dsignwin@microsoft.com
• It is not auto-tuning or auto-configuration; it will take effort from you, and Microsoft MCS can help
• Having this knowledge will help you be more successful in building large, scalable systems