530 likes | 746 Views
DAT309. Implementing Scale-Out Solutions with Microsoft SQL Server 2005. Don Vilen Program Manager SQL Server Microsoft Corporation – Redmond. Who Am I?. Program Manager on SQL Server’s Storage Engine Working on SQL Server at Microsoft for 12+ years
E N D
DAT309 Implementing Scale-Out Solutions with Microsoft SQL Server 2005 Don Vilen Program Manager SQL Server Microsoft Corporation – Redmond
Who Am I? • Program Manager on SQL Server’s Storage Engine • Working on SQL Server at Microsoft for 12+ years • Responsible for Database Mirroring and High Availability • Previously: • PM for Filestream, other features • Training for PSS Engineers on SQL Server around the world • DonV@microsoft.com
Agenda Technologies to Implement Scale-out Which Technology to Use? Classes of Data and Scale-out Deployment Scenarios Use Cases
Scaling Database Systems • This presentation is about scaling • Second 10,000 users as fast as first 10,000 • With twice as much horsepower – or less • Two ways to scale • Scale up: bigger hardware, software • Scale out: more small boxes • Scale-out has potential savings • Scale-up: redundant big server • Scale-out: 100% redundancy not required
Scale-Out For Everyone? • Scale-out is not the answer to every problem • Similar to dividing processing among threads • If scale is not required, solution will just be: • Unnecessarily more complex • Slower • Some scaling technologies require changing • Programs • Procedures • Best to plan this into the design ahead of time • We’ll assume you’ve addressed basic tuning • Appropriate indexes, stored procs, etc.
Linked Servers and Distributed Queries • Define Linked Servers and connection attributes • Use 4-part names to reference <server>.<database>.<schema>.<object> • Query optimizer will split query or DML into appropriate queries to be sent to each linked server • Each linked server processes its query and returns result set • Result sets are combined / joined • Distributed transactions for DML
Linked Servers and Distributed Queries Executing a Query 3. Queries run asynchronously on involved linked servers 2. Sends appropriate query / DML to linked servers, eliminating those with no relevant data 4. Data is returned by linked servers and combined 1. User issues query / DML with DPV
Distributed Partitioned View – DPV • Define sub-tables with appropriate constraints on the various Linked Servers • Create a UNION-ALL view SELECTing from all the sub-tables • Query optimizer will split query or DML into appropriate queries to be sent to each linked server, with dynamic elimination of uninvolved partitions • Each linked server processes its query and returns result set • Result sets are combined / joined • Distributed transactions for DML
Distributed Partitioned View – DPV Setting it up 1 - 999 1000 - 1999 2000 - 2999 1. Create tables with range constraints 2. Create linked servers 3. Create view with UNION ALL
Distributed Partitioned Views Executing a Query 1 - 999 1000 - 1999 2000 - 2999 3. Queries run asynchronously on involved linked servers 2. Sends appropriate query / DML to linked servers, eliminating those with no relevant data 4. Data is returned by linked servers and combined 1. User issues query / DML with DPV
Distributed Partitioned View How it Provides Scale-out 1 - 999 1000 - 1999 2000 - 2999 3000 - 3999 • Performs the work across involved nodes asynchronously • Add additional nodes as needed
Distributed Partitioned Views • Available since SQL Server 7.0 • SQL Server 2005 adds the asynchronous processing • Usable for read or write • Requires MSDTC for distributed transactions • Considerations for scalability • All queries and updates are affected • Response is determined by the slowest part • Locks are held for whole transaction • Not feasible across security/organization boundaries
Scalable Shared Database – SSD • One copy of database attached to multiple instances • New in SQL Server 2005 – Enterprise Edition only • Requires a SAN • Database itself is read-only • Refresh on a schedule • Rolling refresh possible • Sweet spot is connection by 2 to 8 instances
Scalable Shared Database – SSD Setting it up 1. Create database, fill with data, and detach 2. Set volumes Read Only 3. Attach the same database to multiple instances 4. Periodically attach refreshed database to instances, perhaps via SAN operations
Scalable Shared Database – SSD Executing a Query User queries on multiple instances use same reference data from single copy
Scalable Shared Database – SSD Refreshing the Data 1. Create a Clone of hot data, with recovery Hot Data Clone 2. Set Clone volume to Read Only Vn+1 Vn 4. Refresh Vn, and create Vn+2 3. For each node: Detach Instance from Vn, and attach Vn+1
Scalable Shared Database How it Provides Scale-out • Multiple instances access the same database files • Add additional nodes as needed
Setting Up SSD • Build process • Unmask SAN unit number to make volume accessible • Mount volume as read-write with DISKPART • Build or refresh one copy of the database • Attach process • Detach the database from the SQL Server instance that built it, using the DETACH command • Mark each volume as read-only by using DISKPART and dismount the disk volume from the production server • Unmask the SAN logical unit number to make the database available to each reporting server • Mount the disk volume on each reporting server by using DISKPART • Attach to one or more instances of SQL Server
Replication • Provides for distribution of all or subset of data • Available with various capabilities since SQL Server 7.0 • All use Publisher – Distributor – Subscriber model • All incur latency; data changes appear some time, perhaps seconds, minutes, hours, later • Snapshot Replication • Merge Replication • Transactional Replication
Snapshot Replication • Periodic refreshing of whole sets of data • Equivalent to BCP-out from Publisher and BCP-in to Subscriber(s) • Perhaps some rows haven’t changed at all; others have changed perhaps millions of times Publisher / Distributor Subscriber(s) Mechanism: BCP to copy whole sets of data
Merge Replication • Periodic exchange of changed rows in both directions • Changed rows are tracked; net changes are sent • No guarantee of transactional consistency • Can update in multiple places • Built-in conflict detection Publisher / Distributor Subscriber(s) Mechanism: Track changed rows, then exchange net changes
Transactional Replication • Periodically sends individual changes, maintaining transactional integrity • Changes are read from log, converted into individual T-SQL statements, and sent preserving order and transaction grouping • Guarantees transactional consistency Publisher / Distributor Subscriber(s) Mechanism: Translate each log record to T-SQL, send in same order, and with similar transactions
Peer-to-Peer Replication – P2P • Variety of transactional replication • New in SQL Server 2005 – Enterprise Edition only All Publish their ‘owned’ data and subscribe to others’ data
Peer-to-Peer Replication – P2P • Variety of transactional replication • New in SQL Server 2005 – Enterprise Edition only • Allows ring-type replication configuration • More flexible than hierarchical topologies • Allows Read and Write operations • More replication paths than bus-type • No built-in conflict detection • Each row must have an "owner"
Peer-to-Peer Replication – P2P How it Provides Scale-out • Clients work with a server with a ‘local’ copy • Add additional nodes to increase capacity • Also increases availability All Publish their ‘owned’ data and subscribe to others’ data
SQL Server Service Broker – SSB • Transactional messaging, asynchronous processing, and disconnected systems • Separates input from execution • Routing among instances is built in • Provides service-oriented infrastructure • Available in all SQL Server 2005 editions • Even SQL Server Express • Requires application changes / programming
Service Broker – SSB Setting it up 1. Enable Service Broker 2. CREATE QUEUE 3. Rather than performing all the processing, it SENDs a message 4. Message is routed to receiver who RECEIVEs it from the queue and processes it
Service Broker – SSB How it Provides Scale-out • Distributes the work • Processing is done elsewhere and/or later • Can specify an Activation procedure and number of instances • Can further divide the work across multiple services • Add more resources as needed
Query Notifications – QN • Notify my client / cache when row is changed • Notification only; client must refresh cached data • Implemented in-server using: • Indexed View-based notification technology • Service Broker for communications • Requires SQL Server 2005 – any edition • Requires "new" client stack • Used in ASP.NET "Cache Sync" 2. Data is changed on server 1. Client caches data and registers for notification Client cache 3. SSB message notifies client of a change to result set
Query Notifications – QN How it Provides Scale-out • Intelligent cache notifications avoid continuous polling to check for changes • Effectively “scales-out” to the clients Client cache
Data Dependent Routing – DDR • More architectural concept than feature • Not built in to SQL Server 2005 • Can be implemented: • By application • With SSB using built-in routing service with custom code • With middle-tier component • Can be very helpful with: • Transactional replication • Distributed Partitioned Views
Agenda Technologies to Implement Scale-out Which Technology to Use? Classes of Data and Scale-out Deployment Scenarios Use Cases
Data Usage-Based Scale-Out • Scaling choice is based on data usage • Classify the data • Determine the database deployment topology • Determine the application topology • Choose the appropriate technology
Classifying Data • Four main data classifications • Resource data: customers • Traditional "data processing" data • Reference data: catalogs, price lists • Lookup tables • Service interaction data • Original order forms • Activity data • Pick lists • Shopping carts
Traditional Deployment • Single-tier or client-server • All data together in database • All data treated the same • All data updated directly • Service interaction is part of transaction
Deployment in Tiers • Multi-tiers • Reference Data can be cached and replicated • Activity and Service Interaction coupled • Reside in application servers • Direct interaction between activity, service, and resource • Multi-operation transaction
Deployment with Services • Some data separated into services • Activity data internal but asynchronous • Not tightly coupled to resource data • Managed through messaging • Service interactions can be disconnected • Managed through messaging • Some data replicated • Traditional transactional replication • Smart, active cache • Only resources need synchronous processing
Scaling and Data Types • Techniques differ with data • Reference data • Scalable Shared Database • Snapshot replication • Query notifications • Activity data • Dynamic partitioning • Data dependent routing • Service interaction data • Inherently asynchronous • Resource data • Static partitioning • Peer-to-peer or bi-directional replication
Storage and Data Types • Techniques differ with data • Reference data • Master copy in central database • Replicas online • Activity data • Keep copies for activity tracking, trending, mining • Service interaction data • Keep EXACT copies of request and response • For retry and recovery • Sometimes a legal requirement • Resource data • Divide processes • Updates can be async with other categories
Use Cases • Distributed Partitioned Views • “SQL Server Megaservers: Scalability, Availability, Manageability” (Jim Gray and Richard Waymire)http://www.microsoft.com/technet/prodtechnol/sql/2000/plan/ssmsam.mspx • Scalable Shared Database • Corporate Phone Book, Product Catalog • Peer-to-Peer Replication • “Inside Microsoft.com: Managing SQL Server 2005 Peer-to-Peer Replication” (David Lindquist)http://www.microsoft.com/technet/technetmag/issues/2006/07/InsideMSFT/default.aspx • Service Broker • “Inside MSDN: Building the MSDN Content Aggregation System”(John Mollman)http://msdn.microsoft.com/msdnmag/issues/06/07/InsideMSDN/default.aspx • Query Notifications • Cache information about server status, Windows mobile server update
Summary • Scale-out more achievable with SQL Server 2005 • Multiple features geared toward scale-out • Service-oriented, replicated, or distributed • Which technologies to use depends on: • Deployment scenarios • Whether app or process can be changed • Data classes used • Scale-out technologies can be combined • Data Dependent Routing useful adjunct to all
Resources “Scaling Out SQL Server 2005” Roger Wolter http://msdn2.microsoft.com/en-us/library/aa479364.aspx “Planning, Implementing, and Administering Scaleout Solutions with SQL Server 2005” (Bob Beauchemin) http://www.microsoft.com/technet/prodtechnol/sql/2005/implementingscaleout.mspx Bob Beauchemin's Blog: http://www.SQLskills.com/blogs/bobb “Architecting Service Broker Applications” (Roger Wolter) http://msdn2.microsoft.com/en-us/library/aa964144.aspx “Achieving Scalability and Availability with Peer-to-Peer Transactional Replication” (Michael Blythe) http://www.microsoft.com/technet/prodtechnol/sql/2005/p2ptranrepl.mspx Scalable Shared Databases Are Supported by SQL Server 2005 http://support.microsoft.com/default.aspx?scid=kb;en-us;910378 “Service-Oriented Data Architecture: App-Server:Lite?” (David Campbell) http://research.microsoft.com/research/pubs/view.aspx?tr_id=983 “Data on the Outside vs. Data on the Inside: An Examination of the Impact of Service Oriented Architectures on Data” (Pat Helland) http://msdn2.microsoft.com/en-us/library/ms954587.aspx
Resources Technical Communities, Webcasts, Blogs, Chats & User Groups http://www.microsoft.com/communities/default.mspx Microsoft Learning and Certification http://www.microsoft.com/learning/default.mspx Microsoft Developer Network (MSDN) & TechNet http://microsoft.com/msdn http://microsoft.com/technet Trial Software and Virtual Labs http://www.microsoft.com/technet/downloads/trials/default.mspx
SQL Down Under Code Camp • Biggest SQL Server community event in Australia • 14 SQL Server sessions over two days • Special guest presentation from Kevin Kline • President of PASS, Director of Technology – Quest Software • When? • Weekend of the 13th and 14th of October • Where? • Charles Sturt University, Wagga Wagga • What does it cost? • This event is FREE • http://www.sqldownunder.com