
The Future of Isite - Growing GILS

The Future of Isite - Growing GILS. Archie Warnock, A/WWW Enterprises, http://www.clark.net/pub/warnock/awww.html, warnock@clark.net. What Is Isite? Isite is a standards-based Internet toolkit for information search and retrieval (Z39.50). Isite was developed by MCNC/CNIDR.


Presentation Transcript


  1. The Future of Isite - Growing GILS Archie Warnock A/WWW Enterprises http://www.clark.net/pub/warnock/awww.html warnock@clark.net

  2. What Is Isite? • Isite is a standards-based Internet toolkit for information search and retrieval (Z39.50) • Isite was developed by MCNC/CNIDR • Isite was intended as a replacement for freeWAIS • Funded by a US NSF grant • There are other good Z39.50 toolkits, too

  3. Isite Architecture • Isite is written in C++ to utilize the usual object-oriented advantages • Major components • Isearch - the search and retrieval engine • SAPI - the Z39.50 search engine API • Zdist - the Z39.50 implementation

  4. Isite Architecture - Example Programs • Iindex, Isearch, Iutil - the search engine • Isearch-cgi - the CGI gateway to Isearch • zclient, izclient, zping, zbatch - the Z39.50 clients • zserver, zserverNT - the Z39.50 servers • zcon & zgate - the WWW-to-Z39.50 gateway

  5. Current Status of Isite • MCNC/CNIDR funding from NSF is finished • Successful completion of 3-year grant • Jim Fullton, PI, is now at WIPO in Geneva • No additional support is anticipated • Other projects are supporting customization • FGDC, US Dept. of Commerce, US Patent & Trademark Office, CEO, STScI, World Bank, BSn

  6. Isite Strengths • Powerful and flexible search engine • Community-based development of a reference implementation • Freely distributed and widely available for any use • Source code included • Powerful search engine interface • Ported to Windows NT with threaded Z39.50 server

  7. Isearch Features • Full text search • Search on text fields • Search on numeric fields with appropriate relations (>, <, =) • Search on date fields with appropriate relations (before, during, after) • Search on geospatial bounding box • Boolean searches • Phrase searching • Right truncation • Proximity searching (within N characters) • Case insensitive searching, punctuation ignored • Configurable stopword list • Customizable results presentation • Relevance ranked scores • Term weighting
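Three of the features listed above (right truncation, proximity within N characters, and case-insensitive matching with a stopword list) can be illustrated with a minimal sketch. This is not Isearch's code or its query syntax: the function names, the `*` truncation marker, and the stopword list are all assumptions made for illustration.

```python
import re

# Hypothetical stopword list; in Isearch the list is configurable.
STOPWORDS = {"the", "of", "and", "a", "to", "in"}

def tokenize(text):
    """Return (lowercased word, character offset) pairs, skipping punctuation and stopwords."""
    tokens = []
    for match in re.finditer(r"[A-Za-z0-9]+", text):
        word = match.group().lower()
        if word not in STOPWORDS:
            tokens.append((word, match.start()))
    return tokens

def matches_truncated(word, pattern):
    """Right truncation: 'librar*' matches any word that starts with 'librar'."""
    pattern = pattern.lower()
    if pattern.endswith("*"):
        return word.startswith(pattern[:-1])
    return word == pattern

def find_term(text, pattern):
    """Return the character offsets of every word that matches the pattern."""
    return [pos for word, pos in tokenize(text) if matches_truncated(word, pattern)]

def within(text, left, right, max_chars):
    """Proximity: True if a match of `left` starts within max_chars characters of a match of `right`."""
    return any(
        abs(a - b) <= max_chars
        for a in find_term(text, left)
        for b in find_term(text, right)
    )
```

For example, `within(text, "search*", "gils", 10)` asks whether any word beginning with "search" starts within 10 characters of the word "GILS", regardless of case.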

  8. Isearch Document Types • ASCII text • USMARC records • Electronic mail folders • Usenet news archives • US patents • IAFA templates • BIBTeX • Filenames • First line in file • SGML tagged fields • HTML • GILS templates • FGDC templates • Colon delimited fields • GCMD DIF templates • whois++ templates • Multi-file documents • Medline
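To make the idea of a document type concrete, here is a minimal sketch of how a record made of colon-delimited fields could be split into searchable fields. The slides do not describe the exact layout Isearch accepts, so the `Field: value` lines, the indented continuation lines, and the function name are assumptions.

```python
def parse_colon_delimited(text):
    """Parse 'Field: value' lines into a dict of lowercase field name -> list of values.

    Assumption: a line that starts with whitespace continues the previous value.
    """
    record = {}
    current = None
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        if line[0].isspace() and current is not None:
            # Continuation line: append to the most recent value of the current field.
            record[current][-1] += " " + line.strip()
            continue
        name, sep, value = line.partition(":")
        if not sep:
            continue  # no colon, so this is not a field line
        current = name.strip().lower()
        record.setdefault(current, []).append(value.strip())
    return record
```

Only the first colon on a line separates the field name from its value, so values that themselves contain colons (such as URLs) survive intact.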

  9. Isite Weaknesses • Modest Z39.50 implementation • needs GRS-1 • better USMARC support • data structures • All examples are console applications • No real end-user applications • No GUI interface • Difficult configuration • Requires programming for extensions • Needs optimization & performance enhancement • Needs more documentation

  10. What The Future Holds For Isite • New Projects want (and will get): • Distributed document collections • Distributed searching • Automated information extraction (centroids, templates) • Searching and referrals • Additional Z39.50 support (lots of Z39.50 details are not supported now)
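The slide mentions centroids and referrals without defining them. A minimal sketch, assuming a centroid in the whois++ sense (the set of unique words that occur in each field across a node's records), shows how a summary like that could route a query to the nodes worth searching. The node names, field names, and functions below are hypothetical.

```python
import re

def build_centroid(records):
    """Collapse a node's records into a dict of field -> set of unique lowercase words."""
    centroid = {}
    for record in records:
        for field, value in record.items():
            words = re.findall(r"[a-z0-9]+", value.lower())
            centroid.setdefault(field, set()).update(words)
    return centroid

def refer(centroids, field, term):
    """Return the sorted names of nodes whose centroid holds the term in the given field."""
    term = term.lower()
    return sorted(
        node for node, centroid in centroids.items()
        if term in centroid.get(field, set())
    )
```

A referral server would hold only the centroids, not the documents, and would answer a query with the list of nodes to search next.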

  11. GILS and the Advanced Search Facility • ASF is a US Dept. of Commerce project, to be built by Pilot Research, MCNC and A/WWW Enterprises • “GILSnet” - a network of cooperative, low-impact, distributed nodes • The basic interchange will be GILS templates • Search on full text and GILS records

  12. GILS, Dublin Core and Everyone Else • Dublin Core is a minimal (15 fields) generic metadata scheme for virtually any kind of document • GILS represents a more detailed approach, including most of DC, providing greater interoperability • GILS is less bibliographically oriented than BIB-1 • GILS is lightweight compared to GEO and CIP (which have specific functional requirements)

  13. What GILS Means To Me - 1 • Toward one extreme: fewer fields, more documents, more metadata records, skinnier metadata records, easier abstraction • Toward the other: more fields, fewer documents, fewer metadata records, fatter metadata records, less abstraction • GILS is a good, general compromise

  14. What GILS Means To Me - 2 • Think of the GILS profile as defining a language • At some level, Z39.50 is a detail • Protocols are about communication, profiles are about abstraction and GILS is about content • Z39.50 guarantees that the user’s query can be unambiguously decoded - no guarantees about content • We could implement the profile over any protocol - http, CORBA, etc. • Does GILS have to use Z39.50? • No, but the abstraction is required • Z39.50 already includes the abstraction model
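The claim that a profile could be implemented over any protocol can be sketched as one abstract query with two encodings. The numbers below are Bib-1 Use attribute values (Title = 4, Author = 1003, Any = 1016); the prefix-query string follows the PQF notation used by some Z39.50 clients; the HTTP parameter names and the `example.org` URL are hypothetical. None of this is taken from Isite or from the GILS profile text.

```python
from collections import namedtuple
from urllib.parse import urlencode

# Profile level: abstract field names mapped to Bib-1 Use attribute values.
USE_ATTRIBUTES = {"title": 4, "author": 1003, "any": 1016}

# The query is expressed against the profile, not against any protocol.
ProfileQuery = namedtuple("ProfileQuery", ["field", "term"])

def to_pqf(query):
    """Encode the query as a PQF-style prefix string for a Z39.50 client."""
    return f'@attr 1={USE_ATTRIBUTES[query.field]} "{query.term}"'

def to_http(query, base_url):
    """Encode the same query as an HTTP GET URL (parameter names are hypothetical)."""
    return base_url + "?" + urlencode({"field": query.field, "term": query.term})
```

Both encoders consume the same `ProfileQuery`, which is the point of the slide: the abstraction lives in the profile, and the protocol only carries it.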

  15. Related Documents • Getting Isite • ftp://ftp.cnidr.org/pub/software/Isite • ftp://ftp.clark.net/pub/warnock/Software (pre) • A/WWW Enterprises • warnock@clark.net • http://www.clark.net/pub/warnock/awww.html • US Phone/FAX: 301-854-2987
