SharePoint 2010 Search Deep Dive
Corey Erkes, Manager Consultant, Sogeti USA
About Me
• Manager Consultant within Sogeti SharePoint Practice
• Worked with SharePoint since V2
• MCTS: Microsoft SharePoint 2010, Configuring
• Co-Leader of Omaha SharePoint User Group
• Coauthor of SharePoint 2010 Governance Book
• Member of UNO IS&T Alumni Board
Agenda
• SharePoint 2010 Search Versions
  • SharePoint 2010 Foundation
  • Search Server Express
  • Search Server
  • SharePoint 2010 Server
  • FAST
• Search 2010 Architecture
  • How to Configure
  • Crawl Component
  • Query Component
  • Associated Databases
• How to Scale Out
SharePoint 2010 Search Versions
Wait, there are different flavors of Search?
• SharePoint Foundation 2010
• Search Server 2010 Express
• Search Server 2010
• SharePoint Server 2010
• FAST Search Server 2010 for SharePoint
• Search Server 2010 Express is a separate product outside of SharePoint 2010, but when installed with SharePoint Foundation 2010 it can provide a lot of functionality
SharePoint 2010 Search Functionality Breakdown
SharePoint 2010 Search Functionality Breakdown - Continued
SharePoint 2010 Index Size Capabilities
• SharePoint Foundation 2010 can be scaled out beyond roughly 10 million items by adding search servers and assigning each one to crawl different content databases
Available Search Repositories
Search Manageability
So wait, Search Server Express is free?
* Assumes SQL Server and not SQL Server Express
Really, Search Server Express is free?
That’s a lot of goodness for free!
Unfortunately, FAST Search is not free!
SharePoint 2010 Search Architecture
Goodbye SSP, Hello SharePoint Search Service!
• Search Service Application
  • The Search Service Application and its proxy can be provisioned in one of three ways:
    • Central Administration: Manage Service Applications page
    • Central Administration: Farm Configuration Wizard
    • PowerShell (how the cool kids do it!)
• Creation of Search Service Application PowerShell Walk-Thru
  • http://blogs.msdn.com/b/russmax/archive/2009/10/20/sharepoint-2010-configuring-search-service-application-using-powershell.aspx
SharePoint Search Roles
• Four unique roles involved in Search
  • Web server role
    • Provides interface for searching
  • Query server role
    • Serves search results to web server(s)
  • Crawl server role
    • Responsible for crawling content
  • Database server role
    • Hosts the three databases associated with search
      • Property database
      • Crawl database
      • Search administration database
Search Components
[Diagram; labels: Web Front End, Search Service Application Proxy, WCF Call, Query Server / Query Processor, Query Component, Index, Index Propagation, Index Server, Crawler, Connector(s), Property Store Database, Search Administration Database, Crawl Database. Content Data Sources: SharePoint Web Sites, Shared Folders, External Web Sites, Custom Databases, Other Systems]
Database Role
• A minimum of three databases is required to support Search:
  • Property databases
    • Contain metadata or associated custom properties for all crawled items
  • Crawl databases
    • Contain the history of the crawl
    • Manage start and stop points of crawls
    • A crawl database can have more than one crawler associated with it, but a single crawler can be associated with only one crawl database
  • Search Administration database
    • Stores search configuration data such as scopes and refiners
    • Contains security information for the crawled content
Database Sizing
• Calculations for sizing databases
  • Property databases
    • 0.046 x (sum of content databases)
  • Crawl databases
    • 0.015 x (sum of content databases)
  • Search Administration database
    • Allocate 10 GB
• Database Characteristics
  • Property databases
    • Write-heavy, 1:2 ratio (read:write)
  • Crawl databases
    • Read-heavy, 3:1 ratio (read:write)
    • Should not be collocated with Property DB
  • Search Administration database
    • Equal read/write
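The sizing rules on this slide are simple arithmetic, so a small Python sketch can make them concrete. The multipliers and the 10 GB allocation come from the slide; the function name, the dictionary keys, and the three sample content database sizes are made up for illustration.

```python
# Multipliers and the fixed allocation are taken from the slide.
PROPERTY_DB_FACTOR = 0.046
CRAWL_DB_FACTOR = 0.015
SEARCH_ADMIN_DB_GB = 10.0


def estimate_search_db_sizes(content_db_sizes_gb):
    """Return estimated sizes in GB for the three search databases."""
    total_content_gb = sum(content_db_sizes_gb)
    return {
        "property_db_gb": PROPERTY_DB_FACTOR * total_content_gb,
        "crawl_db_gb": CRAWL_DB_FACTOR * total_content_gb,
        "search_admin_db_gb": SEARCH_ADMIN_DB_GB,
    }


# Hypothetical farm with three content databases (sizes in GB).
sizes = estimate_search_db_sizes([100, 250, 150])
for name, gb in sizes.items():
    print(f"{name}: {gb:.1f} GB")
```

For 500 GB of content this works out to about 23 GB for the property database and 7.5 GB for the crawl database, plus the fixed 10 GB for the Search Administration database.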
Crawl Role
• Purpose of the crawl server is to index content
• Crawl runs under MSSearch.exe (SharePoint Server Search 14)
• Crawl server does not contain a copy of the index; the index is streamed/propagated to the Query server
• No longer a single point of failure
• Crawler component needs to be mapped to a SQL crawl database
• Possible to create multiple Crawl databases and Crawler components
Crawl Architecture
[Diagram: the search component diagram from the Search Components slide]
Crawl Role – Fault Tolerance
• Can be achieved by provisioning a secondary crawl component on a secondary server
  • Can be mapped to the same SQL Crawl database
• Having more crawl databases than Crawl components doesn’t make sense and wastes system resources
• Crawl database fault tolerance should be handled through SQL mirroring
Crawl Role – Performance
• Performance improves when additional Crawl components are added, because two or more components crawl content instead of one
• Load is distributed across both Crawl components
• Overlapping does not occur, because items are crawled in batches by both crawlers
Crawl Role – Distribution
• Can be accomplished by doing the following:
  • Crawl Component 1 → Crawl DB 1
  • Crawl Component 2 → Crawl DB 2
• Each web application host is assigned to a crawl component, and the load is distributed as evenly as possible across crawl databases
  • sales.company.com → Crawl Component 1 → Crawl DB 1
  • hr.company.com → Crawl Component 2 → Crawl DB 2
• Distribution is based on the number of items (doc IDs) stored in each crawl DB
Crawl Role – Distribution Example
• Let’s say you have two web applications
  • sales.company.com → Crawl Component 1 → Crawl DB 1
  • hr.company.com → Crawl Component 2 → Crawl DB 2
• Crawl DB 1 contains 3,000 items
• Crawl DB 2 contains 10,000 items
• New web application is provisioned: finance.company.com
  • No need to create an additional crawl component or crawl DB
• Which crawl DB will the new host be associated with?
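One way to reason about the question is to read the distribution rule from the previous slide (distribution is based on the number of items stored in each crawl DB) as "a new host goes to the crawl DB that currently holds the fewest items". That reading is an assumption, since the slide does not print the answer, and the helper name `pick_crawl_db` is made up for this sketch.

```python
def pick_crawl_db(item_counts):
    """Return the name of the crawl database that holds the fewest items."""
    return min(item_counts, key=item_counts.get)


# Item counts and host names are the ones from the slide.
crawl_dbs = {"Crawl DB 1": 3000, "Crawl DB 2": 10000}
hosts = {
    "sales.company.com": "Crawl DB 1",
    "hr.company.com": "Crawl DB 2",
}

# Assumed rule: the new host lands on the least-populated crawl database.
hosts["finance.company.com"] = pick_crawl_db(crawl_dbs)
print(hosts["finance.company.com"])  # Crawl DB 1
```

Under that assumption, finance.company.com would be associated with Crawl DB 1, because it holds 3,000 items against 10,000 in Crawl DB 2.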
Query Role
• Purpose of the query server is to serve up queries to the WFE
• Index is stored on the Query server(s)
• Query server(s) contain one or more Query Components
• A Query Component is mapped to only one Property Store DB
• The Query Component is where the index propagated from the Crawler resides
Query Architecture
[Diagram: the search component diagram from the Search Components slide]
Query Component – Fault Tolerance
• Highly recommended to make the index fault tolerant by mirroring a Query component onto another server in the farm
• Check “Fail-over Query Component” if you only want fault tolerance and not an increase in query performance
Query Component – Sizing the Index
• Index will be approximately 3.5% of the content database size
• Don’t forget about the size needed for the mirror
• Additional space is needed for master merge
• Example:
  • 100 GB Content Database
  • Index partition: 100 GB x 3.5% = 3.5 GB
  • Index partition mirror: 100 GB x 3.5% = 3.5 GB
  • Space for master merge: All index partitions x 3
  • Total Space = (3.5 x 2) x 3 = 21 GB
• Recommend having enough memory to fit 33% of the index in RAM
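The worked example on this slide can be reproduced with a few lines of Python. The 3.5% ratio, the master merge multiplier of 3, and the 33% RAM guideline are the slide's figures; the function names are invented here, and applying the RAM guideline to a single index copy is an assumption.

```python
INDEX_RATIO = 0.035          # index is roughly 3.5% of content size (slide)
MASTER_MERGE_MULTIPLIER = 3  # extra room for master merge (slide)
RAM_FRACTION = 0.33          # share of the index that should fit in RAM (slide)


def estimate_index_disk_gb(content_gb, mirrored=True):
    """Disk space in GB for one index partition, its optional mirror, and master merge."""
    partition_gb = content_gb * INDEX_RATIO
    copies = 2 if mirrored else 1
    return partition_gb * copies * MASTER_MERGE_MULTIPLIER


def recommended_ram_gb(index_gb):
    """RAM in GB needed to hold 33% of one index copy (assumed scope)."""
    return index_gb * RAM_FRACTION


content_gb = 100  # the slide's example content database
print(f"Disk: {estimate_index_disk_gb(content_gb):.1f} GB")
print(f"RAM:  {recommended_ram_gb(content_gb * INDEX_RATIO):.2f} GB")
```

With a 100 GB content database and a mirror, the disk estimate matches the slide's 21 GB total.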
Query Component – Performance
• Index size is the main bottleneck for query performance
  • Index contains 10 million documents = Avg. of 2 seconds per query
  • Index contains 20 million documents = Avg. of 4 seconds per query
• Creating multiple index partitions is the key to reducing query times and reducing bottlenecks. A new index partition can be added through Search Application Topology in Central Administration.
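To see why partitioning helps, the two data points above can be turned into a rough back-of-the-envelope estimate. This is only a sketch: it assumes query time grows linearly with the number of documents in a partition, that documents split evenly across partitions, and that partitions are queried in parallel. None of these assumptions is stated on the slide, and the function name is made up.

```python
# From the slide: 10 million documents is about 2 seconds per query.
SECONDS_PER_10M_DOCS = 2.0


def estimated_query_seconds(total_docs, partitions=1):
    """Rough per-query time, assuming linear scaling and an even split."""
    if partitions < 1:
        raise ValueError("partitions must be at least 1")
    docs_per_partition = total_docs / partitions
    return SECONDS_PER_10M_DOCS * docs_per_partition / 10_000_000


print(estimated_query_seconds(20_000_000))                # 4.0
print(estimated_query_seconds(20_000_000, partitions=2))  # 2.0
```

Under these assumptions, splitting a 20 million document index into two partitions brings the estimate back to the 10 million document figure.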
Property Store DB – Fault Tolerance & Performance
• Fault Tolerance
  • SQL mirroring should be used to achieve fault tolerance
• Performance
  • Add an additional Property Store DB if bottlenecks occur
  • Must first create the new Property Store DB, then create a new Query component and map it to the new Property Store DB
  • The additional Query component should not be created as a mirror if a performance gain is wanted
  • You will need to reset the index and re-crawl, because a new Query component (index partition) is created
Property Store DB – Add Query Component
Property Store DB must be created before adding the Query Component so that it appears in the dropdown
Query Processor
• Runs under the w3wp.exe process
• Processes a query by retrieving results from the index\Query Components
• Utilizes the Property Store DB and Search Administration DB to obtain metadata and perform security trimming
• Will load balance requests if more than one Query Component (mirrored) exists within the same Index Partition
• Query Processor connects to every Property Store DB and Query Component to retrieve results
• Unlike MOSS 2007, where the Query Processor ran on the WFE, any server can run the Query Processor in SharePoint 2010
Query Processor – Fault Tolerance & Performance
• Add an additional Query Processor service to another machine in the farm
  • Doesn’t have to be a WFE
• Requests will be load balanced in a round-robin fashion across the Query Processors
• The Search Query and Site Settings Service can be found in Central Administration under Services on Server
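Round-robin load balancing simply hands each new request to the next server in a fixed rotation. The Python sketch below illustrates the idea only; it is not how SharePoint implements it, and the server names and the `route` helper are hypothetical.

```python
from itertools import cycle

# Hypothetical servers running the Query Processor service.
query_processors = ["APP01", "APP02", "WFE01"]
rotation = cycle(query_processors)


def route(query):
    """Assign the query to the next Query Processor in the rotation."""
    return next(rotation), query


assignments = [route(q)[0] for q in ["q1", "q2", "q3", "q4"]]
print(assignments)  # ['APP01', 'APP02', 'WFE01', 'APP01']
```

The fourth request wraps around to the first server, which is the behavior the slide describes as round-robin.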
Overall Search Architecture
[Diagram: the search component diagram from the Search Components slide]
Scale-out Decision Points
http://www.microsoft.com/download/en/details.aspx?id=20066
Performance Metrics Thoughts
http://www.microsoft.com/download/en/details.aspx?id=20066
Small Farm Topology
http://www.microsoft.com/download/en/details.aspx?id=20066
Medium Farm Topology
http://www.microsoft.com/download/en/details.aspx?id=20066
Medium Search Farm Topology
http://www.microsoft.com/download/en/details.aspx?id=20066
Medium Dedicated Search Farm Topology
http://www.microsoft.com/download/en/details.aspx?id=20066
Large Dedicated Search Farm Topology
http://www.microsoft.com/download/en/details.aspx?id=20066
References
• Search Technologies for SharePoint 2010 Products
  http://download.microsoft.com/download/0/0/0/00015E0A-67CD-490C-9C1B-DCFA8E9BAEFC/Search%20Model%201%20of%204%20-%20Search%20Technologies.pdf
• SharePoint Brew – Search 2010 Architecture and Scale, Part 1 Crawl
  http://blogs.msdn.com/b/russmax/archive/2010/04/23/search-2010-architecture-and-scale-part-1-crawl.aspx
• SharePoint Brew – Search 2010 Architecture and Scale, Part 2 Query
  http://blogs.msdn.com/b/russmax/archive/2010/04/23/search-2010-architecture-and-scale-part-2-query.aspx