Search can be Your Best Friend – You just Need to Know How to Talk to it. IW 306, Ágnes Molnár. About me: Ágnes Molnár, MOSS MVP, MCSD, MCT, Senior Consultant, R&D Director, L&M Solutions, Budapest, HUNGARY. http://aghy.dotneteers.net molnar.agnes@lmsolutions.hu
Search can be Your Best Friend – You just Need to Know How to Talk to it • IW 306 • Ágnes Molnár
About me • Ágnes Molnár, MOSS MVP, MCSD, MCT • Senior Consultant, R&D Director L&M Solutions, Budapest, HUNGARY • http://aghy.dotneteers.net • molnar.agnes@lmsolutions.hu
WHY? • Information overload • Findability • Gartner: 8 hours / week / information worker • IDC: 9.5 hours / week / information worker • Searching without finding: 3.5 hours / week / information worker COMPLETELY WASTED!
Business Requirements What? Where? How?
Physical Architecture: Server Roles • Index Server • Crawling • Indexing [Diagram labels: Content Sources; Protocol handlers (HTTP, FTP, File, BDC, Lotus Notes, Custom); iFilters; Word Breaker; Noise Word Removal; Content; Full Text Index; Metadata and Permissions; Database]
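The word breaker, noise word removal and full text index named on this slide can be illustrated with a minimal Python sketch. It is not SharePoint's implementation: the noise word list, the regular-expression word breaker and the function names are all assumptions made for illustration.

```python
import re
from collections import defaultdict

# Hypothetical noise word list (an assumption for this sketch).
NOISE_WORDS = {"a", "an", "and", "the", "of", "to", "is", "in"}

def break_words(text):
    """Very rough word breaker: lowercase runs of letters and digits."""
    return re.findall(r"[a-z0-9]+", text.lower())

def build_index(documents):
    """Map each non-noise term to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for term in break_words(text):
            if term not in NOISE_WORDS:
                index[term].add(doc_id)
    return index

# Hypothetical crawled content, keyed by document id.
docs = {
    "doc1": "The search architecture of the portal",
    "doc2": "Crawling and indexing the content sources",
}
index = build_index(docs)
```

Here `index["search"]` contains only `doc1`, and noise words such as "the" never become index entries.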
Physical Architecture: Server Roles • Query Server • Accept search queries from users • Build return set • Return results [Diagram: numbered request flow (steps 1 to 5) between Web Front-End Servers, Query Servers, Index Server and Database Server]
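The three query server duties (accept the query, build the return set, return results) can be sketched as follows. This is a simplified illustration, assuming AND semantics and a plain dictionary standing in for the index; `run_query` and `sample_index` are hypothetical names.

```python
def run_query(index, query):
    """Return ids of documents that contain every query term (AND semantics)."""
    terms = query.lower().split()
    if not terms:
        return []
    # Build the return set by intersecting the posting set of each term.
    result = None
    for term in terms:
        postings = index.get(term, set())
        result = postings if result is None else result & postings
    return sorted(result)

# Hypothetical index content: term -> ids of documents containing it.
sample_index = {
    "search": {"doc1", "doc3"},
    "scopes": {"doc3"},
    "crawling": {"doc2"},
}

hits = run_query(sample_index, "search scopes")
```

Real query servers also apply ranking and security trimming, which this sketch leaves out.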
Physical Architecture: Scaling out • More Index Servers • More than 50 (10) million documents • Crawl time is too long • Second SSP needed • More Query Servers • Need to include content that cannot be crawled • Query demand is rising
Search Scopes • Refine the queries • Scope Rules • Web address • Property query • Content source • All content
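The scope rule types listed above can be modelled as simple predicates over an indexed item. The sketch below is a simplification: it treats every rule as an "include" rule, and the `Item` class, the URLs and the property names are hypothetical.

```python
from dataclasses import dataclass, field

@dataclass
class Item:
    url: str
    content_source: str
    properties: dict = field(default_factory=dict)

def web_address_rule(prefix):
    """Match items whose URL starts with the given address."""
    return lambda item: item.url.startswith(prefix)

def property_rule(name, value):
    """Match items whose property has the given value."""
    return lambda item: item.properties.get(name) == value

def content_source_rule(source):
    """Match items crawled from the given content source."""
    return lambda item: item.content_source == source

def all_content_rule():
    """Match every item in the index."""
    return lambda item: True

def in_scope(item, rules):
    """An item is in scope if at least one rule matches it."""
    return any(rule(item) for rule in rules)

# Hypothetical scope: everything under /hr/ plus everything written by one author.
hr_scope = [
    web_address_rule("http://portal/hr/"),
    property_rule("Author", "Agnes"),
]
policy = Item("http://portal/hr/policy.docx", "Portal")
faq = Item("http://portal/it/faq.aspx", "Portal", {"Author": "Agnes"})
note = Item("file://share/note.txt", "FileShare")
```

With these sample items, `policy` and `faq` fall inside `hr_scope` and `note` does not.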
Keywords and Best Bets • Keyword: • to mark specific items as more relevant • they show up more prominently in the search results • Best Bet: • relevant items that you can choose for a subject
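One way to picture the relation between keywords and Best Bets is a lookup table that is consulted before the ranked results are shown. This is only a sketch: the keyword table, the URLs and the function names are made up for illustration.

```python
# Hypothetical keyword definitions: keyword -> Best Bet URLs chosen by an administrator.
BEST_BETS = {
    "vacation": ["http://portal/hr/vacation-policy.aspx"],
    "expenses": ["http://portal/finance/expense-form.aspx"],
}

def best_bets_for(query):
    """Collect the Best Bets of every keyword that appears in the query."""
    hits = []
    for term in query.lower().split():
        for url in BEST_BETS.get(term, []):
            if url not in hits:
                hits.append(url)
    return hits

def results_with_best_bets(query, ranked_results):
    """Show Best Bets first, then the ranked results without duplicates."""
    bets = best_bets_for(query)
    return bets + [url for url in ranked_results if url not in bets]

ranked = ["http://portal/blog/my-vacation.aspx", "http://portal/hr/vacation-policy.aspx"]
page = results_with_best_bets("Vacation request", ranked)
```

The Best Bet appears at the top of `page` even though the ranking placed it second.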
Federated Search • Advantages • Conserve resources by not crawling and indexing the content • Can include content that cannot be crawled • Latest information from different content sources
Federated Search • Disadvantages • Unable to configure ranking within the result set • Unable to control which results appear in the result set • Cannot scope the results • Cannot combine the results into a single result set • The more search Web Parts on the same page, the longer the page takes to load
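A federated location returns its results as an RSS or Atom feed, which is displayed as it arrives. The sketch below parses a small RSS response with Python's standard library; the sample feed, its URLs and the `parse_results` helper are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical RSS response from a remote search server.
SAMPLE_RSS = """<rss version="2.0">
  <channel>
    <title>Remote results</title>
    <item>
      <title>First hit</title>
      <link>http://example.com/1</link>
      <description>Summary of the first hit</description>
    </item>
    <item>
      <title>Second hit</title>
      <link>http://example.com/2</link>
      <description>Summary of the second hit</description>
    </item>
  </channel>
</rss>"""

def parse_results(rss_text):
    """Turn the feed's items into dictionaries, keeping the remote server's order."""
    root = ET.fromstring(rss_text)
    results = []
    for item in root.iter("item"):
        results.append({
            "title": item.findtext("title", default=""),
            "link": item.findtext("link", default=""),
            "summary": item.findtext("description", default=""),
        })
    return results

results = parse_results(SAMPLE_RSS)
```

The items keep the order chosen by the remote server, which is why ranking cannot be configured locally and why each source ends up in its own result set.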
Federate or not?
YES • remote site’s robots.txt blocks SharePoint’s crawler • you need results only with specific keywords and/or keyword patterns in the query • content changes very often, immediate crawling needed • queries under different security context • infrequently queried content • >500 content sources
NO • you don’t have enough bandwidth • content changes very often, but immediate crawling NOT needed • content that is not indexed by the remote server • remote server does not return RSS or Atom
Findability Best Practices • Use Scopes • Use Master Site Directory • Train your users • URLs and Managed Paths • Content Types – Describing and Tagging • My Sites and User Profiles • Blog, Wiki • Collaboration • Knowledge Sharing
User Interface • UI scenarios: • SharePoint • Browser integration • Custom application • Separate several search results • Easy to search – easy to use • Use RSS / e-mail alerts
The Magic Word: SEO (Search Engine Optimization) • The process of optimizing sites and pages for search engines to result in better relevance and ranking for the site.
SEO Best Practices – DO • Use keywords • Place your content as high up in the page as possible to give it more relevance • Use a clear site hierarchy: every page has to be reachable • Check for broken links • Use a text browser (e.g. Lynx) to examine your site • Test your site in different browsers
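The "every page has to be reachable" rule can be checked with a breadth-first walk over the site's link graph. A minimal sketch, assuming the links have already been collected into a dictionary; the sample site and the function name are hypothetical.

```python
from collections import deque

def unreachable_pages(links, home):
    """Return the pages that cannot be reached by following links from home."""
    seen = {home}
    queue = deque([home])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in seen:
                seen.add(target)
                queue.append(target)
    # Every page that appears either as a link source or as a link target.
    all_pages = set(links) | {t for targets in links.values() for t in targets}
    return sorted(all_pages - seen)

# Hypothetical site: page -> pages it links to.
site = {
    "/": ["/products", "/about"],
    "/products": ["/products/a"],
    "/about": [],
    "/orphan": ["/"],
}
orphans = unreachable_pages(site, "/")
```

In this sample, `/orphan` links to the home page but nothing links to it, so neither a crawler nor a user starting from the home page would find it.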
SEO Best Practices – DO (continued) • Use proper semantic markup: • <title> and <meta> tags (title, description) • Headlines (<h1>, <h2>, ...) • List items (<ol>, <ul>, <dl>) • Images: alt and title attributes • Use descriptive text in your hyperlinks • Use descriptive page titles • Build a site map • Use valid HTML and XML
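Some of the markup points on this slide can be checked automatically. The sketch below uses Python's standard `html.parser` to look for a page title, a meta description, an `<h1>` headline and image alt text; the class name, the messages and the sample page are assumptions, and the check is far from a full validator.

```python
from html.parser import HTMLParser

class SeoChecker(HTMLParser):
    """Collect a few facts about the semantic markup of one page."""

    def __init__(self):
        super().__init__()
        self.has_title = False
        self.has_description = False
        self.h1_count = 0
        self.images_missing_alt = 0

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self.has_title = True
        elif tag == "meta" and (attrs.get("name") or "").lower() == "description":
            if attrs.get("content"):
                self.has_description = True
        elif tag == "h1":
            self.h1_count += 1
        elif tag == "img" and not attrs.get("alt"):
            self.images_missing_alt += 1

def check_page(html):
    """Return a list of human-readable problems found in the page."""
    checker = SeoChecker()
    checker.feed(html)
    problems = []
    if not checker.has_title:
        problems.append("missing <title>")
    if not checker.has_description:
        problems.append("missing meta description")
    if checker.h1_count == 0:
        problems.append("no <h1> headline")
    if checker.images_missing_alt:
        problems.append(f"{checker.images_missing_alt} image(s) without alt text")
    return problems

# Hypothetical page with a title and headline but no description or alt text.
PAGE = """<html><head><title>Search tips</title></head>
<body><h1>Search tips</h1><img src="logo.png"></body></html>"""
problems = check_page(PAGE)
```

For the sample page, the checker reports the missing meta description and the image without alt text.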
SEO Best Practices – DO NOT • Don’t name all pages with the same page title • Don’t load your pages with irrelevant keywords • Don’t use complex URLs • Don’t use temporary redirects • Don’t use complex pages • Avoid web spammers
SEO Best Practices MAKE YOUR PAGES PRIMARILY FOR USERS, NOT FOR SEARCH ENGINES!!!
Thank you for attending! Please be sure to fill out your session evaluation!