LIS618 lecture 6 Thomas Krichel 2003-03-05
structure • DIALOG • basic vs additional index • initial database file selection (files) • Lexis/Nexis
basic vs additional index • the basic index • has information that is relevant to the substantive contents of the data • usually is indexed by word, i.e. connectors are required • the additional index • has data that is not relevant to the substantive matter • usually indexed by phrase, i.e. connectors are not required
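The difference between word indexing and phrase indexing can be sketched in a few lines of Python. This is only an illustration, not how DIALOG is implemented; the function names and the toy records are assumptions made up for the example.

```python
from collections import defaultdict

def build_word_index(records):
    """Word index: every lowercased word points to the records containing it."""
    index = defaultdict(set)
    for rec_id, text in records.items():
        for word in text.lower().split():
            index[word].add(rec_id)
    return index

def build_phrase_index(records):
    """Phrase index: the whole field value is stored as a single entry."""
    index = defaultdict(set)
    for rec_id, value in records.items():
        index[value.lower()].add(rec_id)
    return index

# Toy records (hypothetical): titles are word indexed, authors phrase indexed.
titles = {1: "Mate selection in birds", 2: "Yerba mate tea"}
authors = {1: "Cruz, Jose", 2: "Krichel, Thomas"}

word_index = build_word_index(titles)
phrase_index = build_phrase_index(authors)
```

In the word index, "mate" finds both records, and multi-word queries need connectors to combine the single words. In the phrase index, only the complete value "cruz, jose" is an entry, so no connectors are needed, but a lone "cruz" finds nothing.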
search options: basic index • select without qualifiers searches in all fields in the basic index • the bluesheet lists the field indicators available for a database • it also notes whether a field is indexed by word or by phrase • proximity searching only works with word indices; when phrases are indexed you don't need proximity indicators
search in basic index • a field in the basic index is queried through term/IN, where term is a search term and IN is a field indicator • Thomas calls this an appending indicator • several field indicators can be ORed by giving a comma-separated list • for example mate/ti,de searches for mate in the title or descriptor fields
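As a small sketch of the term/IN pattern, the helper below assembles such a query string. The function name `suffix_query` is hypothetical, and the code only builds the text of the query; it does not talk to DIALOG.

```python
def suffix_query(term, indicators):
    """Build a basic-index query string such as 'mate/ti,de'."""
    if not indicators:
        # No qualifier: the search runs over all basic-index fields.
        return term
    # A comma-separated list of field indicators is ORed together.
    return f"{term}/{','.join(indicators)}"

title_or_descriptor = suffix_query("mate", ["ti", "de"])
all_fields = suffix_query("mate", [])
```

Here `title_or_descriptor` is the string `mate/ti,de` from the slide, and `all_fields` is the bare term.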
limiters and sorting • Some databases allow you to restrict the search using limiters, for example • /ABS requires an abstract to be present • /ENG restricts to English-language publications • Some fields are sortable with the sort command, i.e. records can be sorted by the values in those fields. Example: "sort /ti" • Such features are database specific.
additional indices • additional indices list those terms that can lead a query. Often, these are phrase indexed. • Such fields are queried by the prefix IN=term, where IN is the field abbreviation and term is the search term • Thomas calls this a pre-pending indicator
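To contrast the two notations, here is a minimal sketch that tells a pre-pended query (IN=term) from an appended one (term/IN). The function `parse_field_query` is hypothetical and covers only the two simple forms shown in these slides; the real command language is richer.

```python
def parse_field_query(query):
    """Return (index kind, search term, list of field indicators)."""
    if "=" in query:
        # Pre-pended indicator: additional index, e.g. 'au=cruz, b?'
        field, term = query.split("=", 1)
        return ("additional", term, [field])
    if "/" in query:
        # Appended indicator(s): basic index, e.g. 'mate/ti,de'
        term, fields = query.rsplit("/", 1)
        return ("basic", term, fields.split(","))
    # No indicator at all: all fields of the basic index.
    return ("basic", query, [])

author_query = parse_field_query("au=cruz, b?")
title_query = parse_field_query("mate/ti,de")
```

The comma inside `cruz, b?` is left alone because a phrase-indexed value is treated as one unit, whereas the comma after the slash separates field indicators.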
expanding queries • names have to be entered as they appear in the database. • The "expand" command can be used to see variant spellings of a value • It has to be used in conjunction with a field identifier, for example: expand au=cruz, b? to look for misspellings of José Manuel Barrueco Cruz
expanding queries II • the expand command produces results of the form Ref Items Index-term • Ref is a reference number • Items is the number of items where the index term appears • Index-term is the index term • "s Ref" searches for the index term with that reference number.
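The Ref / Items / Index-term layout is easy to process mechanically. The sketch below parses a few lines in that layout and builds the "s Ref" command for the most frequent spelling. The sample lines, the item counts, and the E1, E2, ... style of reference numbers are assumptions made up for the example, not real output.

```python
def parse_expand_line(line):
    """Split one result line into (ref, items, index_term)."""
    ref, items, index_term = line.strip().split(None, 2)
    return ref, int(items), index_term

# Hypothetical expand output, one 'Ref Items Index-term' row per line.
sample_output = [
    "E1 1 AU=CRUZ, B.",
    "E2 4 AU=CRUZ, BARRUECO",
    "E3 12 AU=CRUZ, JOSE MANUEL BARRUECO",
]

rows = [parse_expand_line(line) for line in sample_output]

# Pick the row with the most items and search by its reference number.
best_ref, best_items, best_term = max(rows, key=lambda row: row[1])
select_command = f"s {best_ref}"
```

Splitting at most twice keeps the index term intact even though it contains spaces and commas.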
add/repeat • add number, number adds databases (by file number) to the last query • example "add 297" to see what the Bible says about it • repeat repeats the previous query with the added databases included
Initial file selection • On the main menu, go to the database menu. • After the principal menu, you get a search box • There you can enter full-text queries for all the databases • You can then select the database you want • And get to the begin databases stage.
database categories • In order to help people find databases (files), DIALOG has grouped databases by categories. • categories are listed at http://library.dialog.com/bluesheets/html/blo.html • 'b category' selects the databases in the named category at the start. • 'sf category' selects the files belonging to the named category at other times.
Lexis/Nexis • Lexis is a specialized legal research service • Nexis is primarily a news service • adds an important temporal component to all its contents • restricts contents as compared to Dialog • potentially bad competition from Google
compilation of Nexis • Uses a number of news sources such as newspapers. • Uses company reports databases • Uses web sites, the URLs of which are found in the news sources • There is a problem with quality control of web sites; some pornographic sites are included
Smart indexing • Nexis keeps a list of terms that are used for indexing. • A computer program will relate synonyms to an official term. • Example: replace "LIU" with "Long Island University" • Queries are not pre-processed.
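The idea of relating synonyms to an official term can be sketched as a simple lookup at indexing time. This is only an illustration of the principle; the table, the function name, and the sample words are assumptions, and the actual Nexis program is not described in these slides.

```python
# Hypothetical synonym table: variant (lowercased) -> official index term.
OFFICIAL_TERMS = {"liu": "Long Island University"}

def to_official_terms(words):
    """Replace known variants by their official term; keep other words."""
    return [OFFICIAL_TERMS.get(word.lower(), word) for word in words]

document_words = ["LIU", "opens", "new", "library"]
indexed = to_official_terms(document_words)

# Per the slide, queries are not pre-processed, so the query stays as typed.
raw_query = "LIU"
```

After indexing, the document carries "Long Island University" rather than "LIU", while the query is still the literal string "LIU".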
Nexis basic search • implicit Boolean "or" between terms • Otherwise double quotes for • in fact searches • Smart index keywords extracted • HLEAD for news • TITLE for legal documents • WEB-SEARCH-TEXT for web pages
relevance ranking • Lexis is based on the vector model • The precise relevance ranking seems a company secret. Ranking depends on • where terms appear within the document • how many occurrences of the search terms appear in the document • how often those search terms appear throughout the document
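Since the actual ranking is not public, the following is only a textbook sketch of the vector model: query and document are turned into term-frequency vectors and compared by cosine similarity. It does not take the position of terms within the document into account, and nothing here is claimed to be what Lexis does. The function name and the sample documents are assumptions.

```python
import math
from collections import Counter

def cosine_score(query, document):
    """Cosine similarity between raw term-frequency vectors."""
    q = Counter(query.lower().split())
    d = Counter(document.lower().split())
    # Dot product over the query terms; missing terms count as zero.
    dot = sum(q[term] * d[term] for term in q)
    norm = (math.sqrt(sum(v * v for v in q.values()))
            * math.sqrt(sum(v * v for v in d.values())))
    return dot / norm if norm else 0.0

# Toy documents (hypothetical), ranked by descending score.
documents = ["mate tea mate", "tea house", "bird song"]
ranked = sorted(documents,
                key=lambda doc: cosine_score("mate tea", doc),
                reverse=True)
```

A document that repeats the search terms and contains little else scores highest, and a document sharing no term with the query scores zero.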