The need for speed
SharePoint Optimisation and recovery from a database perspective
By Tom Bizannes, Microsoft Certified Professional
MacroView Business Technology, Level 12, 171 Clarence St, Sydney, Australia, +612 9249 2700
Tom Bizannes
A Database Guy caught in the Land of the Lost: SharePoint
Working with the leading Document Management Software for SharePoint (WISDOM)
Why • Because Microsoft released a white paper on optimising SharePoint that verified many of our findings – see the references at the end
What we will cover • Brief overview of SharePoint infrastructure • SharePoint recovery – backing up / restoring • Optimising – the need for speed! • Extras: integrating Reporting Services 2008 on IIS7, and Reporting Services with SharePoint lists
Recovery
• Full recovery – SharePoint Central Admin, stsadm and psconfig, SQL Server, other tools
• Granular recovery – stsadm, versioning, the Recycle Bin (did you know it has two levels?), third-party tools
Backing up
For restoring from failure or corruption:
• SQL Server
• SharePoint Central Administration
• Windows and stsadm, via scripts
• SharePoint Designer
• Third-party tools
For item-level restore:
• Recycle Bin and versioning
• stsadm scripts
• Third-party tools
Normal SQL Server and Windows • If MOSS 2007, back up the config database as well. • There is a fix if you forgot to back up the config… but you will pull your hair out first! • Know what service pack and hotfix you are up to with SharePoint (schema related). • With the recycle bin and versioning, you may not need a more granular backup.
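As a minimal sketch of the SQL Server side (WSS_Content and SharePoint_Config are the default database names and may differ in your farm; the backup path is a placeholder):

BACKUP DATABASE [WSS_Content]
    TO DISK = N'E:\Backups\WSS_Content.bak'        -- placeholder path
    WITH INIT, STATS = 10;

BACKUP DATABASE [SharePoint_Config]                -- MOSS 2007: back this up too
    TO DISK = N'E:\Backups\SharePoint_Config.bak'
    WITH INIT, STATS = 10;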
Windows and stsadm • You can get clever with scripts to back up various sites. • More granular if you have lots of scripts against different sites. • SharePoint Designer creates the same sort of backup as stsadm but needs to be run manually.
Back up and restore by using Stsadm
When you back up by using the Stsadm command-line tool, you can back up individual aspects of your SharePoint Products and Technologies deployment. For example, you can back up an individual site collection, or you can back up the entire farm.
Backup
• To export sites from your deployment: stsadm -o export -url <URL> -filename <FileName>.cmp
• To back up a site collection: stsadm -o backup -url <URL of the site collection> -filename <Name of the backup file>
• To back up an individual database, Web application, or the entire farm: stsadm -o backup -directory <UNC path or local drive> -backupmethod <Full or Differential>
Restore
• To import sites into your deployment: stsadm -o import -url <URL> -filename <FileName>.cmp
• To restore a site collection: stsadm -o restore -url <URL of the site collection> -filename <Name of the backup file>
• To restore an entire farm: stsadm -o restore -directory <UNC path or local drive> -restoremethod <overwrite or new>
Third-party tools
• EMC Backup Manager for SharePoint – uses the SQL backup to restore a file or site, etc.
• Microsoft Data Protection Manager – can do incremental backups every 15 minutes
• AvePoint DocAve – great for restoring a site or document library; very granular
• Idera Point Backup
Tuning • Indexes • IIS • Database • Fragmentation • Database Files
Need to work together
The SharePoint administrator needs to work together with the Microsoft SQL Server administrator and the Windows network administrator.
What do you need to know and what do you need to get the others to do?
Typical scenarios • 2 million documents, 2 terabytes of data, 400 staff in head office, 50 in 4 remote branches • 100,000 documents, 20 staff, 100 MB of documents and growing, one branch with 100 external clients logging in • 20 million documents, 20 terabytes of data, 4 countries and 50 branches around the world
Our analysis • Databases get fragmented very quickly • Searching is the biggest CPU hog • Virtualising SharePoint is great, but beware of virtualising SQL Server…
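A quick way to verify the fragmentation claim yourself is the index physical stats DMV (SQL Server 2005 and later). A minimal sketch, assuming the default content database name of WSS_Content:

USE [WSS_Content];   -- run inside the content database
SELECT OBJECT_NAME(ips.object_id) AS table_name,
       i.name AS index_name,
       ips.avg_fragmentation_in_percent
FROM sys.dm_db_index_physical_stats(DB_ID(N'WSS_Content'), NULL, NULL, NULL, 'LIMITED') AS ips
JOIN sys.indexes AS i
  ON i.object_id = ips.object_id AND i.index_id = ips.index_id
WHERE ips.avg_fragmentation_in_percent > 30   -- rebuild candidates
ORDER BY ips.avg_fragmentation_in_percent DESC;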
Tuning – what do you need to know? • IIS settings • Setting indexed columns • The structure of the SharePoint implementation – good planning and structure make SharePoint more responsive • Database tuning = very important
Optimise your Search Indexer
Many factors in the SharePoint crawling process can impact indexing performance, and there are steps you can take to improve it. Here are the common causes and their resolutions:
• Indexing performance is set to Reduced – a common mistake on the configuration screen for the index service. See Central Administration > Operations > Services on Server > Office SharePoint Server Search Service Settings and set it to Maximum.
• Number of connections – by default the indexer runs a limited number of simultaneous threads (usually 6). This can be increased manually by adding specific Crawler Impact Rules for each host. You can really improve speed by setting a large file server to 32 connections, but watch your network for bottlenecks.
• Crawled systems are slow or hosted on remote networks – not a lot to be done here, except moving those files closer.
• Overlapping crawls – SharePoint gives priority to the first running crawl, so if you are already indexing one system it will hold up the indexing of a second and increase crawl times. Solution: schedule your crawl times so there is no overlap. Full crawls take the longest, so run those exclusively.
• IFilter issues – the Adobe PDF IFilter can only filter one file at a time, which slows crawls down. Consider using a PDF IFilter from pdflib.com or Foxit instead.
• Not enough memory allocated to the filter process – you can increase the memory allocation by adjusting the following registry keys:
HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Office Server\12.0\Search\Global\Gathering Manager: set DedicatedFilterProcessMemoryQuota = 200000000 (Decimal)
HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Office Server\12.0\Search\Global\Gathering Manager: set FilterProcessMemoryQuota = 200000000 (Decimal)
• Bad file retries – a registry setting controls the number of times a file is retried on error. The default is 100, which severely slows down incremental crawls. Adjust the retry count with this key:
HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Office Server\12.0\Search\Global\Gathering Manager: set DeleteOnErrorInterval = 4 (Decimal)
• General architecture issues – ensure that you have at least 2 GB of free memory available before your crawl even starts, and at least 2 real processors available.
• Disk health – the nature of the indexing process causes extensive fragmentation of the file system on both the index server and the database server. Schedule defrags routinely and after all full crawls, and always ensure you have enough disk space.
• Run a 64-bit OS – the jury is still out on this one; I personally haven't seen much difference given enough memory and the same processor types, but Microsoft recommends it for large deployments.
More questions for the SharePoint administrators
• Versioning – this impacts the size, but makes restoring less of an issue. You can delete minor versions and do all sorts of other things if the database gets too big.
• Auditing – this can grow out of control.
• Have you set indexed columns on large document libraries for performance?
• How many site collections do you have, and how big will they grow? You can set a separate database for each one – e.g. you might keep historical data on one, or have a site collection for clients A to M and another for clients N to Z.
Indexed columns in a SharePoint library or list
• Every list item that has an indexed column creates a new row in the NameValuePair table (the NameValuePair_Latin1_General_CI_AS table in the English-language version of SharePoint).
• The table also needs to be updated whenever indexed columns change their value, or when items are added or deleted. E.g. a list with 100,000 items and one indexed column means you also add 100,000 rows to the NameValuePair table – and every time you update your list, NameValuePair needs to be updated as well.
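To see how big this table has grown, a read-only count is enough. A sketch – the table name varies with the database collation as noted above, and any query against a content database should stay strictly read-only:

USE [WSS_Content];   -- your content database name may differ
SELECT COUNT(*) AS IndexedValueRows
FROM dbo.NameValuePair_Latin1_General_CI_AS;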
The databases
• The main database is WSS_Content.
• Know what service packs and hotfixes have been applied if you want to restore!
• Set a 100 GB limit per site collection? (This is as per the Microsoft white paper.) Someone suggested 300 GB is the new 100 GB limit – it all depends on your infrastructure.
What version are you running? Go into Control Panel and check the version field.
Run this against your SQL Server configuration database:

SELECT [VersionId], [Version], [Id], [UserName], [TimeStamp],
       [FinalizeTimeStamp], [Mode], [ModeStack], [Updates], [Notes]
FROM [SharePoint_Config].[dbo].[Versions]
WHERE VersionId = '00000000-0000-0000-0000-000000000000'
ORDER BY Id DESC
Relevant versions for V3 / 2007
Using the SharePoint Central Administration Web site, the SharePoint HTML Site Settings admin pages, or the HTTP Headers tab of the web site's properties in IIS Manager, virtual servers (once extended) will show the following version numbers:
• MOSS 2007 or WSS 3.0 Service Pack 2 – 12.0.0.6421
• MOSS 2007 or WSS 3.0 Cumulative update (KB956056 & KB956057) – 12.0.0.6327
• MOSS 2007 or WSS 3.0 Infrastructure Update (KB951695 & KB951297) – 12.0.0.6318
• MOSS 2007 or WSS 3.0 SP1 – 12.0.0.6219
• MOSS 2007 or WSS 3.0 RTM – 12.0.0.4518
Always back up your content store
• The main database is WSS_Content.
• You can create a content store per site collection, e.g. WSS_Content_HR, WSS_Content_Admin, etc.
• You can get your SharePoint administrator to set up an alert via SharePoint Administration for when a content database gets close to the capacity you set.
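A sketch of the SQL side of that capacity check, assuming your content databases follow the WSS_Content* naming convention shown above:

SELECT d.name AS database_name,
       SUM(mf.size) * 8 / 1024 AS size_mb   -- size is counted in 8 KB pages
FROM sys.master_files AS mf
JOIN sys.databases AS d ON d.database_id = mf.database_id
WHERE d.name LIKE N'WSS_Content%'
GROUP BY d.name
ORDER BY size_mb DESC;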
Other networking things
• Enable IIS compression – as a general best practice, the SharePoint Operations team verified that IIS compression for static content was enabled on front-end servers. Enabling static compression is especially helpful for serving content to users over slower links.
• Get fast network cards (1 Gbps) on all servers.
Service packs and hotfixes
• Searching had a few bugs that were fixed with Service Pack 1 – the crawler never finished if indexes were rebuilt.
• Service Pack 2 had even more fixes for searching.
• Always note what service pack and hotfix you are up to: you will need these if you ever restore, or the schemas will cause issues.
Optimising the databases
• Set a fill factor of 70% on all indexes. Q: Where do you set a default fill factor? (See the sketch below.)
• Use indexed columns on large document libraries for performance (for your SharePoint guy).
• Create different site collections and/or document libraries on different content databases.
• Separate databases – see the following slides.
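A: either server-wide or per index. A minimal sketch – the table name is a placeholder, and note the server-wide default only applies to indexes created or rebuilt afterwards:

-- Server-wide default fill factor
EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;
EXEC sp_configure 'fill factor (%)', 70;
RECONFIGURE;   -- takes effect after the SQL Server service restarts

-- Per index, when rebuilding (dbo.MyLargeTable is a placeholder)
ALTER INDEX ALL ON dbo.MyLargeTable REBUILD WITH (FILLFACTOR = 70);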
Separate tempdb first
• Office SharePoint Server 2007 farm performance can be significantly impeded by insufficient disk I/O for tempdb. To avoid this issue, allocate dedicated disks for tempdb. If a high workload is projected or monitored – that is, the average read or write operation requires more than 20 milliseconds (ms) – you may need to ease the bottleneck by either separating the files across disks or replacing your disks with faster ones.
• For best performance, place tempdb on a RAID 10 array. The number of tempdb data files should equal the number of core CPUs, and the tempdb data files should be set at an equal size. Count dual-core processors as two CPUs for this purpose; count each processor that supports hyper-threading as a single CPU.
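A sketch of that layout, assuming four core CPUs (file paths and sizes are placeholders; tempdev is SQL Server's default logical name for the first tempdb file):

ALTER DATABASE tempdb
    MODIFY FILE (NAME = tempdev, SIZE = 4096MB, FILEGROWTH = 512MB);
ALTER DATABASE tempdb
    ADD FILE (NAME = tempdev2, FILENAME = N'T:\TempDB\tempdev2.ndf',
              SIZE = 4096MB, FILEGROWTH = 512MB);
-- Repeat ADD FILE for tempdev3 and tempdev4: one file per core,
-- all files the same size.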
Separate and prioritize your data among disks
• When prioritizing data among faster disks, use the following ranking: 1. tempdb data and transaction logs; 2. database transaction log files; 3. search database; 4. database data files.
• In a heavily read-oriented portal site, prioritize data over logs.
• Ideally, place tempdb, the content databases, and the SQL Server 2005 transaction logs on separate physical hard disks.
Separate and prioritize your data among disks – continued
• Only create files in the primary filegroup for the database.
• Distribute the files across separate disks.
• Create data files of equal size.
• Separate database data and transaction log files across different disks. If files must share disks – because the files are too small to warrant a whole disk or stripe, or you have a shortage of disk space – put files that have different usage patterns on the same disk to minimize simultaneous access requests.
• Consult your storage hardware vendor for information about how to configure all logs and the search databases for write optimization for your particular storage solution.
• Allocate dedicated spindles for the search database.
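As a sketch of what that file layout looks like for a new content database (names, paths, and sizes are placeholders; in practice you would then attach the database to SharePoint rather than populate it directly):

CREATE DATABASE [WSS_Content_Clients]
ON PRIMARY
    (NAME = WSS_Content_Clients_1,
     FILENAME = N'D:\Data\WSS_Content_Clients_1.mdf', SIZE = 10240MB),
    (NAME = WSS_Content_Clients_2,
     FILENAME = N'E:\Data\WSS_Content_Clients_2.ndf', SIZE = 10240MB)
LOG ON
    (NAME = WSS_Content_Clients_log,
     FILENAME = N'L:\Logs\WSS_Content_Clients_log.ldf', SIZE = 2048MB);
-- Equal-sized data files in the primary filegroup, spread across two
-- data disks, with the transaction log on its own disk.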
Microsoft's conclusions, part 1
Best practices for the front end include:
• Run IIS 7.0 on 64-bit servers – memory and CPU are common performance optimization factors for SharePoint Server. Using 64-bit hardware increases the amount of usable memory, which helps to maintain a healthy system state for worker processes.
• Use a front-end and back-end NIC configuration for IIS – during peak load times, as many people access SharePoint sites, NIC traffic increases. Using dedicated NICs for connections to the SQL Server back end and to clients provides better load distribution and more accurate statistics, and helps with troubleshooting traffic congestion by segregating front-end and back-end traffic.
• Load balance client traffic – the SharePoint Operations team uses NLB for balancing client traffic. It is a best practice to load balance incoming traffic for optimal user experience and server utilization.
• Use IIS compression for static content – the SharePoint Operations team ensures that static compression is enabled to conserve traffic and server resources.
• Enable caching – page output caching reduces CPU utilization on front-end servers by storing compiled ASP.NET pages in RAM; enabling this setting resulted in performance gains for the SharePoint Operations team. BLOB caching helps relieve load on back-end servers by caching static content so databases are not accessed when it is requested.
Microsoft's conclusions, part 2
Best practices for the back end include:
• Limit database size to enhance manageability – when databases grow, they become less manageable for backup and restore operations, or for troubleshooting. The SharePoint Operations team uses a 100-GB limit.
• Allocate storage for versioning and the recycle bin – when designing the environment, an organization should consider business needs, such as versioning, and ensure that adequate disk space and I/O are available to accommodate them.
• Use quota templates to manage storage – Microsoft IT uses standardized configuration templates in all possible and practical scenarios, including quotas. Quota templates help preserve a standard environment, which reduces administrative overhead.
• Manage large lists for performance – having large lists is not by itself a performance issue, but when SharePoint Server renders the many items in those lists, it can cause spikes in render times and database blocking. One way to mitigate large lists is to use subfolders and create a hierarchical structure where each folder or subfolder has no more than 3,000 items.
• Separate and prioritize data among disks, and create disk groups for specific data – because available disk I/O throughput is so important for optimal SQL Server performance, identifying the read/write patterns of services and dedicating SAN LUNs to them gives better performance than serving many service types from the same disk group. The SharePoint Operations team takes this idea a step further and uses dedicated partitions for data.
The tools Microsoft talks about
• Event Viewer – especially useful for understanding underlying behavior by evaluating application errors and warnings, or investigating system events that occur before, during, and after a performance incident.
• Dump file analysis – an advanced troubleshooting and analysis approach that provides low-level information about critical system errors and memory dumps. It enables the SharePoint Operations team to examine the data in memory and analyze the possible causes of issues such as memory leaks and invalid pointers.
• System Monitor – the SharePoint Operations team uses tools such as Event Viewer and dump file analysis to investigate specific incidents and performance issues. The team uses System Monitor in the Windows Server 2003 operating system (called Performance Monitor in Windows Server 2008) for establishing a performance baseline, tracking trends, and compiling data on resulting performance after making changes.
• SQL Server Profiler – a graphical user interface to SQL Trace for monitoring an instance of the SQL Server Database Engine or SQL Server Analysis Services. Microsoft IT and other teams use this tool to evaluate SQL Server performance aspects such as query times, stored procedure run times, and deadlocks. It is especially useful for analyzing the underlying calls to SQL Server databases housed on the storage area network (SAN).
• Custom tool for client-based URL ping – the SharePoint Operations team created a custom tool that records the time to first byte for URLs hosted on SharePoint servers. It is one of the most useful tools because it enables comparison of statistics before and after implementing configuration changes to the environment.
• Log Parser – the SharePoint Operations team uses logging extensively when determining root causes of issues, including SharePoint trace logs and IIS and Unified Logging Service (ULS) application and service logs. Microsoft IT uses Log Parser to monitor traffic, determine traffic-source distribution, and establish performance baselines. This free tool parses IIS logs, event logs, and many other kinds of structured data by using syntax similar to Structured Query Language (SQL). For more information, see the Script Center resource at http://www.microsoft.com/technet/scriptcenter/tools/logparser/default.mspx.
• Fiddler – helpful for measuring caching, page sizes, authentication, and general performance issues. For more information, visit http://www.fiddler2.com/fiddler2/.
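As a sketch of the Log Parser style of query (the log path is a placeholder for your IIS log directory, and AVG(time-taken) assumes the time-taken field is enabled in your IIS W3C logging):

LogParser.exe -i:IISW3C "SELECT TOP 10 cs-uri-stem,
    COUNT(*) AS Hits, AVG(time-taken) AS AvgMs
  FROM C:\WINDOWS\system32\LogFiles\W3SVC1\ex*.log
  GROUP BY cs-uri-stem ORDER BY Hits DESC"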
Tools for monitoring
• Performance Dashboard for Microsoft SQL Server 2005 – a nice set of reports (note there is an update for SQL Server 2008)
• Your own scripts (watch my blog for new ones) – see the sketch below
• SQL Server 2008 Performance Studio – great for benchmarking, but beware of how large its database can grow
• SQL Server 2008 Activity Monitor
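In the "your own scripts" spirit, a minimal sketch that lists the top CPU consumers from the plan cache – standard SQL Server 2005+ DMVs, nothing SharePoint-specific assumed:

SELECT TOP 10
       qs.total_worker_time / 1000 AS total_cpu_ms,
       qs.execution_count,
       SUBSTRING(st.text, (qs.statement_start_offset / 2) + 1,
           ((CASE qs.statement_end_offset
                  WHEN -1 THEN DATALENGTH(st.text)
                  ELSE qs.statement_end_offset END
             - qs.statement_start_offset) / 2) + 1) AS statement_text
FROM sys.dm_exec_query_stats AS qs
CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) AS st
ORDER BY qs.total_worker_time DESC;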
Summary • 90% of it is the database! • Map out what is happening • Check the fill factor • Split databases for performance (tempdb, logs, WSS_Content, etc.) • Have good benchmarking / reporting in place • Check the growth and CPU usage • Work together with all parties • Plan and keep on planning • Check out some tools to make life easier
References
• SharePoint Performance Optimization: How Microsoft IT Increases Availability and Decreases Rendering Time of SharePoint Sites – discusses how Microsoft IT discovered opportunities to enhance SharePoint optimization
• Manage lists and libraries with many items
• Performance Recommendations and Best Practices
• Data protection and recovery for Microsoft Office SharePoint Server 2007
• The company I work for – MacroView Business Technology
• Document Management for SharePoint