Learn about Explorit, a federated search solution by Deep Web Technologies. Explore features, benefits, customization options, and integration capabilities.
Swets EPM Training Explorit Research Accelerator Focus Deep. Get Results.
Agenda Goals Background Hosted vs. Installed Features and Benefits Customization Connectors Authentication Administration SLA Pilots The Future Additional Resources Questions & Answers
Training Goals Present Explorit to potential customers Respond to detailed Q&A Respond to tender/RFP
About Deep Web Technologies... Founded in 2002 by Abe Lederman, a co-founder of Verity Pioneered federated search technology Over $3M in R&D Production applications since 1999 Based in Santa Fe, New Mexico 22-person company with strong executive team
Customers Include... • Boeing • Defense Technical Information Center • DOE Office of Scientific & Technical Information • George Mason University • Intel Corporate Library • Missouri Digital Heritage • National Agricultural Library • Science.gov Alliance • Scitopia.org • Stanford University • UCSF Medical School • WorldWideScience Alliance
Sites Running Explorit Scitopia.org WorldWideScience.org Science.gov Nutrition.gov Biznar Mednar ScienceResearch.com
What is “Federated Search”? “Federated Search is an application or service that allows users to submit a real-time search in parallel to multiple, distributed information sources and retrieve aggregated, ranked and de-duplicated results.”
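The definition above can be illustrated with a minimal Python sketch. It is not Explorit's implementation: the two source functions, the result fields (`title`, `url`, `score`) and the choice of URL as the de-duplication key are all assumptions made for the example.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed

# Hypothetical stand-ins for real connectors. Each takes a query string and
# returns a list of result dicts; real sources would be reached over a network.
def search_catalog(query):
    return [{"title": "Deep Web Search", "url": "http://example.org/a", "score": 0.9}]

def search_news(query):
    return [
        {"title": "Deep Web Search", "url": "http://example.org/a", "score": 0.7},
        {"title": "Search Trends", "url": "http://example.org/b", "score": 0.4},
    ]

def federated_search(query, sources):
    """Query every source in parallel, then merge, de-duplicate and rank."""
    merged = {}
    with ThreadPoolExecutor(max_workers=max(1, len(sources))) as pool:
        futures = [pool.submit(source, query) for source in sources]
        for future in as_completed(futures):
            try:
                results = future.result()
            except Exception:
                continue  # one failing source should not fail the whole search
            for result in results:
                key = result["url"]  # assumed de-duplication key
                best = merged.get(key)
                if best is None or result["score"] > best["score"]:
                    merged[key] = result
    # Rank the aggregated results, highest score first.
    return sorted(merged.values(), key=lambda r: r["score"], reverse=True)

results = federated_search("deep web", [search_catalog, search_news])
```

Because results are merged as each source completes, the same loop could also feed an incremental display rather than waiting for the slowest source.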
In Other Words…One Search, Many Sources Begin Search Subscription Sources News Internal Sources E-Books Library Catalogs Public Web Sources
What is the Best Solution for a Customer? Hosted, Uses Explorit UI Licensed, Uses Explorit UI Hosted, Accessed via Web Services Licensed, Accessed via Web Services
Explorit Web Services API Robust standards-based API Easy to integrate into an institution’s own portal Highly scalable Locally deployed or provided as SaaS Supports all Explorit functionality through a number of Web Services: Search Service (asynchronous) Collection Service User Service Alerts Service Statistics Service Web-based Administrative Console
Requirements for Licensed Solutions Explorit runs on Linux (CentOS 5.4, SUSE Enterprise 11, Red Hat Linux 5) Explorit runs on Windows Server Explorit requires Tomcat Application Server Version 5.5 or later Explorit requires MySQL or SQL Server
Major Advantages of an Explorit Solution Rich, easy-to-use, AJAX-based Web 2.0 Interface Incremental display of search results Sophisticated/powerful connector building capability Rich set of search capabilities Retrieval of 50-100 or more results per source Relevance ranking (Admin tunable)
Major Advantages of an Explorit Solution (cont.) Smart clustering and grouping of results Alerts delivered via email or RSS SearchBuilder which enables custom search pages to be created Selection of search results and export through Direct Connect Administrative Console including rich set of graphics-based metrics
Explorit Scalability Searches more collections in parallel than any other federated search Handles large number of concurrent searches Able to federate other federated search engines
Explorit Search Features Basic Search Advanced Search Grouping of Collections into 2-level Menu Boolean Searching (AND, OR, NOT) Fielded Searching Author Normalization Date Range Searches Stemming / Wildcard truncation
Explorit Result Page Features Incremental Results Merged/Ranked Results Sorting Results Refine Results Limiting Results to a Single Source Smart Clustering Selection of Results Email / Printing of Results Export of Results (see Direct Connect)
Relevance Ranking – How it Works Is based on density of search terms in title/snippet Search terms at beginning/end of title boost score Higher/lower weights can be assigned to sources Higher weights can be assigned to more recent results Weighting of proximity of words
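As a rough illustration of how the factors listed above might combine, here is a Python sketch. The numeric boosts, the recency decay and the function signature are assumptions chosen for the example, not Explorit's actual formula, and the tokenizer simply splits on whitespace without stripping punctuation.

```python
def relevance_score(query, title, snippet, source_weight=1.0, age_years=0.0):
    """Toy score built from the factors on the slide; all constants are assumed."""
    terms = query.lower().split()
    title_words = title.lower().split()
    words = title_words + snippet.lower().split()
    if not terms or not words:
        return 0.0

    # Density: share of title/snippet words that are search terms.
    hits = [i for i, word in enumerate(words) if word in terms]
    score = len(hits) / len(words)

    # Boost when a search term opens or closes the title.
    if title_words and (title_words[0] in terms or title_words[-1] in terms):
        score += 0.2

    # Proximity: boost once when two different search terms are adjacent.
    for a, b in zip(hits, hits[1:]):
        if b - a == 1 and words[a] != words[b]:
            score += 0.1
            break

    # Recency: more recent results keep more of their score.
    score *= 1.0 / (1.0 + 0.1 * age_years)

    # Per-source weight, of the kind an administrator might tune.
    return score * source_weight

strong = relevance_score("federated search", "Federated search explained",
                         "An overview of search tools")
weak = relevance_score("federated search", "An overview of tools",
                       "Something about federated things and search")
```

In this sketch `strong` scores higher than `weak` because its terms are denser, adjacent, and open the title.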
Deduplication Configurable through Admin Interface Deduplicate on a field or combination of fields Assign deduplication order priority to sources Example: URL or “Title and Date”
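One way to picture field-based deduplication with source priority is the Python sketch below. The function name, the record fields and the case-insensitive key normalization are illustrative assumptions, not the Admin Interface's actual behavior.

```python
def deduplicate(results, key_fields, source_priority):
    """Keep one result per key, preferring sources earlier in source_priority."""
    rank = {source: i for i, source in enumerate(source_priority)}
    unranked = len(rank)  # sources not listed sort after all listed ones
    kept = {}
    for result in results:
        # The key is a single field (e.g. URL) or a combination (title + date).
        key = tuple(str(result.get(field, "")).strip().lower() for field in key_fields)
        current = kept.get(key)
        if current is None or (
            rank.get(result["source"], unranked) < rank.get(current["source"], unranked)
        ):
            kept[key] = result
    return list(kept.values())

records = [
    {"source": "B", "title": "Solar Cells", "date": "2009", "url": "http://b.example/1"},
    {"source": "A", "title": "solar cells", "date": "2009", "url": "http://a.example/9"},
    {"source": "B", "title": "Wind Power", "date": "2008", "url": "http://b.example/2"},
]
unique = deduplicate(records, key_fields=["title", "date"], source_priority=["A", "B"])
```

Here the two "Solar Cells" records collapse into one and the copy from source A is kept, since A is listed first in the priority order.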
Explorit Add-on Products: Alerts Users register and maintain an Alerts Home Page where they can: Create/Update/Delete Alerts View up to 6 most current Alert results Delivered daily, weekly or monthly Can be set up to generate a table of contents Delivered via email, RSS or ATOM Maintains a database of past results seen
Explorit Add-on Products: Search Builder/Direct Connect Search Builder: Create custom search boxes and search pages easily Enable/disable collections and search fields Integrates with Course Management software Direct Connect: Integrates with Selections capability Export result metadata Seamless integration with RefWorks and EndNote
Supported Browsers Internet Explorer Version 7.0 or higher Firefox Version 2.0 or higher Google Chrome Safari Supports screen resolution of 1024 x 768
Explorit Customer Customization Incorporate customer header/footer, including logo Specify fields to search on Specify formatting of results Select fields and order for Smart Clustering Advanced Search page Results list page Results email Alerts Search Builder
Swets Branded Explorit interface Swetswise Searcher UI look-and-feel Admin interface look-and-feel Graphics Advanced Search page Results list page Results email Alerts Search Builder
Connectors – Protocols Supported Web Services (including SRU and SRW) XML Gateways Z39.50 “Screen Scraping” (HTTP)
Connectors – Access Methods Public sources requiring no authentication IP-based authentication User name / password An organization’s proxy server Coming: Use of individual user names and passwords
Z39.50 Connectors Robust Z39.50 connector that searches 20+ fields Z39.50 connector tailored to work with major catalogs such as: SirsiDynix – Unicorn SirsiDynix – Horizon SirsiDynix – Classic Dynix Ex Libris – Voyager Ex Libris – Aleph Innovative – Millennium VTLS – Virtua Polaris
Connector Monitoring Proactively monitor connectors Performed by dedicated software maintenance engineers Monitor: source health, speed, responsiveness and errors Generally errors are discovered by our team before users ever notice a problem
Authentication We use Acegi Security to integrate with the following: LDAP Shibboleth Athens
Explorit Proxy Server The Proxy Server supports forward proxy configuration for searching sources that require IP Authentication. The Proxy Server supports reverse proxy configuration for searching sources that require any kind of HTTP authentication (Basic and/or form based). The Proxy Server provides links to documents for sources that require authentication to retrieve search results and to retrieve full text documents from search results.
Proxy Configuration The proxy URL and configuration information (username/password, etc.) may be configured through the Strata admin interface (connector parameter configuration).
Proxy Configuration (cont.) Proxy result URLs: search result lists are populated with result links that pass through the reverse Proxy to the protected content. Each set of result URLs generated for a specific user/search is cryptographically hashed and stored for only a server-configurable amount of time so that access to protected content may not be abused.
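The hashed, time-limited result links described above could work along the lines of the following Python sketch. The class name, the salted SHA-256 token and the in-memory store are assumptions for illustration; the slides do not say which hash or storage the Proxy Server actually uses.

```python
import hashlib
import secrets
import time

class ProxyLinkStore:
    """Hypothetical store mapping opaque tokens to protected target URLs."""

    def __init__(self, ttl_seconds=3600):
        self.ttl_seconds = ttl_seconds        # server-configurable lifetime
        self._salt = secrets.token_bytes(16)  # keeps tokens unguessable
        self._links = {}                      # token -> (target_url, expires_at)

    def register(self, user_id, search_id, target_url, now=None):
        """Hash the user/search/URL triple and remember it until it expires."""
        now = time.time() if now is None else now
        material = f"{user_id}|{search_id}|{target_url}".encode("utf-8")
        token = hashlib.sha256(self._salt + material).hexdigest()
        self._links[token] = (target_url, now + self.ttl_seconds)
        return token

    def resolve(self, token, now=None):
        """Return the target URL, or None if the token is unknown or expired."""
        now = time.time() if now is None else now
        entry = self._links.get(token)
        if entry is None:
            return None
        target_url, expires_at = entry
        if now >= expires_at:
            del self._links[token]  # drop expired links so they cannot be reused
            return None
        return target_url

store = ProxyLinkStore(ttl_seconds=60)
token = store.register("user1", "search1", "http://example.org/doc", now=1000.0)
```

A result list would then carry links containing `token` instead of the protected URL, and the reverse proxy would call `resolve` before fetching the content.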
Open URL – Link Resolving Works with major link resolvers (SFX, Serials Solutions, Openly) Passes DOI or combination of: Journal title or ISSN Article Volume & Page Number
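A link to a resolver can be sketched in Python as follows. The resolver base URL and the function name are hypothetical; the query keys (`id`, `genre`, `issn`, `title`, `volume`, `spage`) follow the OpenURL 0.1 convention, which is an assumption here since the slide does not say which OpenURL version Explorit emits.

```python
from urllib.parse import urlencode

def build_openurl(resolver_base, doi=None, journal=None, issn=None,
                  volume=None, start_page=None):
    """Build a resolver link from a DOI, or from journal/ISSN plus volume and page."""
    if doi:
        params = {"id": f"doi:{doi}"}
    elif issn or journal:
        params = {"genre": "article"}
        if issn:
            params["issn"] = issn
        else:
            params["title"] = journal  # journal title in OpenURL 0.1
        if volume:
            params["volume"] = volume
        if start_page:
            params["spage"] = start_page
    else:
        raise ValueError("need a DOI, an ISSN or a journal title")
    return f"{resolver_base}?{urlencode(params)}"

# Hypothetical resolver address used only for the example.
RESOLVER = "http://resolver.example.edu/openurl"
by_doi = build_openurl(RESOLVER, doi="10.1000/182")
by_citation = build_openurl(RESOLVER, issn="1234-5678", volume="12", start_page="34")
```

Preferring the DOI when one is present keeps the link short and leaves metadata matching to the resolver.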
Explorit Administration Console Enable/disable connectors Connector monitoring & testing Usage metrics Click-through statistics Tune relevance ranking Configure de-duplication
Explorit Metrics Graphics-based or tabular Single day (hourly breakdown) or entire month Downloadable to spreadsheet Reports include: Number of queries run Number of results retrieved per source Average time to retrieve results from a source Average rank of results retrieved per source Timeouts/errors by source Searches run (query strings) Coming: COUNTER compliance
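Several of the per-source reports listed above reduce to simple aggregation over a search log. The Python sketch below assumes a hypothetical log format (one entry per source per query, with `source`, `results`, `seconds` and `error` fields); it is not the format Explorit stores.

```python
from collections import defaultdict

def summarize_by_source(log):
    """Aggregate queries, results, average retrieval time and errors per source."""
    totals = defaultdict(lambda: {"queries": 0, "results": 0, "seconds": 0.0, "errors": 0})
    for entry in log:
        stats = totals[entry["source"]]
        stats["queries"] += 1
        stats["results"] += entry["results"]
        stats["seconds"] += entry["seconds"]
        if entry["error"]:
            stats["errors"] += 1  # timeouts and other failures
    return {
        source: {
            "queries": stats["queries"],
            "results": stats["results"],
            "avg_seconds": stats["seconds"] / stats["queries"],
            "errors": stats["errors"],
        }
        for source, stats in totals.items()
    }

# Hypothetical log entries for illustration only.
search_log = [
    {"source": "catalog", "results": 50, "seconds": 1.0, "error": False},
    {"source": "catalog", "results": 30, "seconds": 3.0, "error": False},
    {"source": "news", "results": 0, "seconds": 10.0, "error": True},
]
report = summarize_by_source(search_log)
```

The resulting dictionary maps directly onto tabular rows, which is one reason such reports are easy to export to a spreadsheet.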
Guidelines for Building Pilots Prospect must have seen demo of Explorit Prospect must have budget and be actively evaluating federated search Pilots must be limited to 6 sources Prospect must provide credentials / access to source before start of pilot development Pilot should run for no more than 60 days A pilot agreement between Swets and the customer is a good way to show customer “buy-in”