Learn how to construct structured queries using connectors and Boolean operators to retrieve relevant information from databases. Explore the use of proximity and set handling to refine search results. Understand the importance of database structure and the blue sheet in optimizing search queries.
LIS618 lecture 5 Thomas Krichel 2003-02-26
structure • operations on any file (database) • connectors • booleans • set handling • display • file-specifics • bluesheet • Introduction to structured queries
Use of connectors • Connectors are used to put several words together. • One instance where this is useful is when you have words that on their own mean different things. • For example "mate" is a herbal beverage consumed in South America. Looking for mate on the Internet retrieves a lot of singles' pages.
example: terms related to "mate" What other terms could be used? • matear (drink mate) • matero (mate drinker) • cebar (prepare mate) • cebador (mate preparer) • yerba (mate herb) • bombilla (mate straw)
connectors I • '(W)' requires the terms to appear next to each other in the given order, e.g. 'yerba(W)mate?' matches "yerba mate". • '(i W)', where i is an integer, means the first term is followed by the second with at most i words in between, e.g. 'ceba?(3W)mate?' matches "cebar un maravilloso mate" but not "cebador guapo mirando un buen mate".
connectors II • '(N)' requires the terms to be next to each other in either order, e.g. 'yerba(N)mate?' matches "yerba mate" or "mate yerba". • '(i N)', where i is an integer, means the terms appear in either order with at most i words in between, e.g. 'ceba?(3N)mate?' matches "cebar mate" or "matear con la cebadora". • '(S)' requires the connected terms to occur in the same paragraph.
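The behaviour of the (W) and (N) connectors described on these two slides can be imitated with a small Python sketch. It is only an illustration, not how the search system is implemented. It assumes that a trailing '?' truncates a term (so 'mate?' matches any word starting with "mate", as in the slide examples) and that words are separated by whitespace. The names term_matches and proximity are made up for this sketch.

```python
def term_matches(pattern, word):
    """Match one query term against one word.

    Assumption: a trailing '?' means truncation (prefix match).
    """
    pattern, word = pattern.lower(), word.lower()
    if pattern.endswith("?"):
        return word.startswith(pattern[:-1])
    return word == pattern


def proximity(first, second, text, max_between=0, ordered=True):
    """Return True if both terms occur with at most max_between words between them.

    ordered=True imitates (W) and (iW); ordered=False imitates (N) and (iN).
    Punctuation is not handled: the text is split on whitespace only.
    """
    words = text.split()
    first_positions = [i for i, w in enumerate(words) if term_matches(first, w)]
    second_positions = [i for i, w in enumerate(words) if term_matches(second, w)]
    for i in first_positions:
        for j in second_positions:
            # Distance in word positions; adjacent words have a gap of 1.
            gap = j - i if ordered else abs(j - i)
            if 1 <= gap <= max_between + 1:
                return True
    return False


# yerba(W)mate?
print(proximity("yerba", "mate?", "yerba mate"))  # True
# ceba?(3W)mate?
print(proximity("ceba?", "mate?", "cebar un maravilloso mate", 3))  # True
print(proximity("ceba?", "mate?", "cebador guapo mirando un buen mate", 3))  # False
# ceba?(3N)mate?
print(proximity("ceba?", "mate?", "matear con la cebadora", 3, ordered=False))  # True
```

The only difference between the two connector families in this sketch is whether the sign of the distance matters.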
using Boolean operators • In your query, you can combine several expressions with Boolean operators • Example: "S LIBRARY(W)SCHOOL? AND DISTANCE(W)EDUCATION" • But I usually do not issue such fancy queries.
executing several searches • Several searches can be run in sequence, and the system saves each result set. • Each time, the system assigns a set number, Si. • These can be combined in Boolean expressions, e.g. 's S1 or S2 and S3'. • Remember that Boolean operations are set-theoretic!
Boolean operators on sets • When using Booleans, be aware that "and" has higher precedence than "or". • Thus "a or b and c" is not the same as "(a or b) and c"; it is "a or (b and c)". • Use parentheses when in doubt.
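Since Boolean operations on result sets are set-theoretic, the precedence rule can be tried out with ordinary Python sets. Python's set operators happen to follow the same rule: & (intersection, "and") binds tighter than | (union, "or"). The record numbers below are invented for illustration.

```python
# Hypothetical result sets: each holds the record numbers one search retrieved.
s1 = {1, 2, 3}
s2 = {3, 4, 5}
s3 = {5, 6}

# "s1 or s2 and s3" without parentheses: the "and" is evaluated first.
unparenthesized = s1 | s2 & s3
and_first = s1 | (s2 & s3)
or_first = (s1 | s2) & s3

print(unparenthesized == and_first)  # True
print(sorted(unparenthesized))       # [1, 2, 3, 5]
print(sorted(or_first))              # [5]
```

The two groupings retrieve very different sets here, which is why parentheses are worth the extra typing.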
DS (display sets) • This command can be executed any time to review the sets that have been formed since the last B (begin) command. • This can be useful to review your search history.
the target command • "target set", where set is a search result set, creates a subset of the "statistically most relevant results" in the original set. • I have not seen details about how this subset is computed. • A new result set is formed.
display: the type command • 'type set/format/range' • set is a result set • format is a delivery format • range is either 'start – end', where start is the first record number to display and end is the last, or 'all'
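To make the three slots of the type command concrete, here is a small Python sketch that splits a command string into set, format and range. It is only an illustration of the syntax on this slide, not part of the search system. The function name parse_type_command and the sample commands are made up, and the range is assumed to be written with a plain hyphen.

```python
def parse_type_command(command):
    """Split 'type set/format/range' into its three parts.

    Only the two range forms from the slide are handled: 'all' and 'start-end'.
    Returns (set, format, None) for 'all', else (set, format, (start, end)).
    """
    verb, _, arguments = command.strip().partition(" ")
    if verb.lower() != "type":
        raise ValueError("not a type command: " + command)
    parts = [part.strip() for part in arguments.split("/")]
    if len(parts) != 3:
        raise ValueError("expected set/format/range, got: " + arguments)
    result_set, record_format, record_range = parts
    if record_range.lower() == "all":
        return result_set, record_format, None
    start, separator, end = record_range.partition("-")
    if not separator:
        raise ValueError("expected 'all' or 'start-end', got: " + record_range)
    return result_set, record_format, (int(start), int(end))


print(parse_type_command("type s1/3/1-5"))     # ('s1', '3', (1, 5))
print(parse_type_command("type s2/full/all"))  # ('s2', 'full', None)
```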
standard delivery formats • 2 – full record except abstract • 3 or medium – citation • 5 or long – full except full text • 6 or free – title and dialog number • 8 or short – title plus indexing terms (useful to find other indexing terms) • 9 or full – everything • KWIC or K – keywords in context
options for delivery • I once tried to email results to myself, to no avail. • You can save the html of the search results in the browser. • You can print the results within the browser.
Looking at database structure • Up until now, we have looked at commands that take a full-text view of the database. • Such commands can be executed for every database. • If we want to make more precise queries, we have to take account of database structure.
blue sheet • Each database name is linked to a blueish pop-up window that describes the database. • This is called the bluesheet. • It contains the details of the database.
closer look at the bluesheet • file description • subject coverage (free vocabulary) • format options, lists all formats • by number (internal) • by dialog web format (external, i.e. cross-database) • search options • basic index, i.e. subject contents • additional index, i.e. non-subject
basic vs additional index • the basic index: has information that is relevant to the substantive contents of the data; is usually indexed by word, i.e. connectors are required to search for a phrase • the additional index: has data that is not relevant to the substantive matter; is usually indexed by phrase, i.e. connectors are not required
search options: basic index • A select without qualifiers searches all fields in the basic index. • The bluesheet lists the field indicators available for a database. • Also note whether a field is indexed by word or by phrase: proximity searching only works with word indices, and when phrases are indexed you don't need proximity connectors.
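The difference between word indexing and phrase indexing can be sketched in a few lines of Python. This is a simplified model, not the system's actual index structure. The records and the function names build_word_index and build_phrase_index are invented for illustration.

```python
def build_word_index(records):
    """Index each field value by its individual words."""
    index = {}
    for record_id, value in records.items():
        for word in value.lower().split():
            index.setdefault(word, set()).add(record_id)
    return index


def build_phrase_index(records):
    """Index each field value as one unbroken phrase."""
    index = {}
    for record_id, value in records.items():
        index.setdefault(value.lower(), set()).add(record_id)
    return index


# Hypothetical field values keyed by record number.
records = {1: "Library School", 2: "School Library Journal"}

word_index = build_word_index(records)
phrase_index = build_phrase_index(records)

# Word index: the phrase is not an entry, so the words must be looked up
# separately and then tied together with a connector such as (W).
print("library school" in word_index)         # False
print(sorted(word_index["library"]))          # [1, 2]

# Phrase index: the whole phrase is a single entry, no connector needed.
print(sorted(phrase_index["library school"]))  # [1]
```

In the word index both records contain "library" and "school", so only a proximity connector can tell "library school" apart from "school library".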
http://openlib.org/home/krichel Thank you for your attention!