WebScan: Implementing QueryServer 2.0
Karl Geiger, Amgen Inc.
kgeiger@amgen.com
BRS NA UG 1999
27 August 1999
WebScan: Objectives
• Provide synoptic access to the Libraries' on-line information resources
  • clinical, research, and business e-journal subscriptions
  • news sites
  • web search sites
• Develop an extensible software framework for resource collections
  • the number of valuable resources is increasing daily
  • maintaining static HTML pages is too time-consuming
  • no staff time to keep up a directory system à la Yahoo!
WebScan: Materials
• Dataware Technologies QueryServer 2.0
  • commercially available from a known vendor
  • fully customizable
  • runs on a web server supporting the Common Gateway Interface
• Amgen standard web servers and clients
  • Netscape Communicator 4.5
  • Netscape Enterprise Server 3.6
  • Amgen Libraries primary server, library, Sun Solaris for RAS
• Libraries internet resource collection
  • Internet Ports O' Call sites
  • E-Journals
  • news and web search engines as selected by the team
WebScan: QueryServer Operation
• Single CGI module
  • permits metering of service, up to 64 simultaneous users
  • permits web statistics gathering on the server
• Multi-threaded operation executes in parallel
  • client submits a query (keywords, selected resources, control parameters)
  • query is analyzed and sent to individual threads, one thread per resource
  • each thread re-formats the query for the target resource and sends it via HTTP
  • threads wait for responses, errors, or timeouts
  • threads parse responses for search hits or "not found"
  • threads send analyzed results to the main routine, which summarizes, deduplicates, and clusters them
  • main routine sends the unified list of results to the user
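
The thread-per-resource flow above can be sketched roughly as follows. This is a minimal illustration, not QueryServer's code: JavaScript promises stand in for QueryServer's threads, clustering is left out, and every name here (`withTimeout`, `mergeResults`, `metasearch`, the `search` function on each resource, the `url` and `title` fields) is hypothetical.

```javascript
// Hypothetical sketch of the fan-out/merge flow; not QueryServer's actual code.

// Reject if a resource does not answer within `ms` milliseconds.
function withTimeout(promise, ms) {
  return new Promise(function (resolve, reject) {
    var timer = setTimeout(function () {
      reject(new Error("timeout"));
    }, ms);
    Promise.resolve(promise).then(
      function (value) { clearTimeout(timer); resolve(value); },
      function (err) { clearTimeout(timer); reject(err); }
    );
  });
}

// Merge per-resource outcomes into one list, dropping duplicate URLs.
// Each outcome is { name, hits: [{ url, title }] } or { name, error }.
function mergeResults(outcomes) {
  var seen = Object.create(null); // URLs already placed in the merged list
  var merged = [];
  var failed = [];
  outcomes.forEach(function (outcome) {
    if (outcome.error) {
      failed.push(outcome.name);
      return;
    }
    outcome.hits.forEach(function (hit) {
      if (!seen[hit.url]) {
        seen[hit.url] = true;
        merged.push({ url: hit.url, title: hit.title, source: outcome.name });
      }
    });
  });
  return { hits: merged, failed: failed };
}

// Send the query to every selected resource at once and wait for all of them.
// `resource.search(keywords)` is assumed to return a promise of parsed hits.
function metasearch(keywords, resources, timeoutMs) {
  var tasks = resources.map(function (resource) {
    return withTimeout(resource.search(keywords), timeoutMs).then(
      function (hits) { return { name: resource.name, hits: hits }; },
      function (err) { return { name: resource.name, error: err.message }; }
    );
  });
  return Promise.all(tasks).then(mergeResults);
}
```

A slow or failing resource ends up in `failed` rather than blocking the others, which mirrors the "responses, errors, or timeouts" step in the list above.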
WebScan: HTML Interface Design
• Flexible, expandable interface
  • categorize resources using Ports O' Call structure
  • handle links to hundreds of web sites, each possibly appearing in two or more categories
  • permit recategorization of resources and addition of new categories
  • avoid editing multiple HTML or other documents
• The list of resources is a simple database
  • resources have names (titles), home pages, categories, formatting options, enable/disable switches
  • resource descriptions are small (< 250 bytes)
  • total number of resources is estimated to be small (< 1000)
  • use embedded JavaScript to describe resources as a list of objects
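
As an illustration of the "simple database" idea above, the list of resources might look something like the sketch below. The resource names, URLs, field names, and the `indexByCategory` helper are all made up for this example; the actual WebScan data file is not shown in these slides.

```javascript
// Hypothetical resource list; names, URLs, and field names are illustrative only.
var resources = [
  { name: "Example Journal", home: "http://journal.example.com/",
    categories: ["E-Journals", "Clinical"], enabled: true },
  { name: "Example News", home: "http://news.example.com/",
    categories: ["News"], enabled: true },
  { name: "Example Search", home: "http://search.example.com/",
    categories: ["Web Search"], enabled: false }
];

// Build a category -> resource-names index from the single list.
// A resource listed under two categories appears under both;
// disabled resources are skipped entirely.
function indexByCategory(list) {
  var index = Object.create(null);
  list.forEach(function (resource) {
    if (!resource.enabled) { return; }
    resource.categories.forEach(function (category) {
      if (!index[category]) { index[category] = []; }
      index[category].push(resource.name);
    });
  });
  return index;
}
```

With this shape, recategorizing a resource or adding a category means editing one entry in one list, which is the point of avoiding multiple hand-edited HTML documents.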
WebScan: HTML Interface Design
• Resource object JavaScript design -- each resource must
  • display its URL as a link
  • generate <FORM> entry data for the CGI request
  • associate itself with a category
  • display itself on a Help page
  • prevent display of itself if disabled
• Embedding the database in WebScan's search form
  • HTML page "pulls" JavaScript objects using the SRC= parameter on the <SCRIPT> tag
  • JavaScript creates the table of resources with a single call
  • tabular layout of resources is controlled by a flexible routine
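
A resource object meeting the requirements above could be sketched along these lines. This is an assumption-laden illustration rather than the WebScan source: the constructor, the method names (`toLink`, `toFormEntry`), the `buildTable` routine, and the form field name `resource` are all hypothetical, and the Help-page display is omitted.

```javascript
// Hypothetical Resource object; constructor, method, and field names are assumptions.
function Resource(name, home, category, enabled) {
  this.name = name;
  this.home = home;
  this.category = category;
  this.enabled = enabled;
}

// Escape text for safe use inside HTML markup and attribute values.
function escapeHtml(text) {
  return String(text)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Display the resource's home page as a link.
Resource.prototype.toLink = function () {
  return '<A HREF="' + escapeHtml(this.home) + '">' + escapeHtml(this.name) + "</A>";
};

// Checkbox telling the CGI request which resource to search.
// The field name "resource" is made up for this sketch.
Resource.prototype.toFormEntry = function () {
  return '<INPUT TYPE="checkbox" NAME="resource" VALUE="' + escapeHtml(this.name) + '">';
};

// Render the enabled resources of one category as a table, `columns` cells per row.
function buildTable(resources, category, columns) {
  var cells = [];
  resources.forEach(function (r) {
    if (r.enabled && r.category === category) {
      cells.push("<TD>" + r.toFormEntry() + " " + r.toLink() + "</TD>");
    }
  });
  var rows = [];
  for (var i = 0; i < cells.length; i += columns) {
    rows.push("<TR>" + cells.slice(i, i + columns).join("") + "</TR>");
  }
  return "<TABLE>" + rows.join("") + "</TABLE>";
}
```

In a page, the search form could load the resource definitions through the SRC= parameter on a <SCRIPT> tag and then emit a whole category with one call such as `document.write(buildTable(list, "News", 3))`. Changing the `columns` argument changes the layout without touching the data.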
WebScan: Results
• HTML interface is easy to maintain
  • easily maintainable in text editors -- no cost for .asp or Cold Fusion tools
  • object design makes it easily modifiable and extensible
• Resources are easy to categorize
• Categories are easy to add
• Pure HTML and JavaScript runs fast
  • downloads quickly to client browsers
  • remains cached in browsers
  • updated HTML pages reconfigure themselves
WebScan: Project Timeline
• QS 1.0 beta in summer 1998
• Purchased in December 1998
• Project team formed in April 1999
• Categorization for 150 resources done by mid-May
• Software and documentation complete by mid-June
  • usability testing
  • quality check: uh-oh moments
• WebScan launched on schedule 1 July
  • third to fifth most popular library page in first week
  • usage patterns and user suggestions to determine new categories and resources
WebScan: Network Impact
• Metasearching is efficient
  • less human time spent searching multiple web sites serially
  • search threads accept only HTML output; no wading through images, advertisements, fetches to double-click.com
  • summarized results list is browsable at leisure, reducing the number of metasearch requests
  • overall good impact on firewall and proxy server
WebScan: More Info
• Code examples
• Interface example
• Questions?