Using Literary Warrant to Define a Version of the DDC for Automated Classification Services
Diane Vizine-Goetz, Research Scientist, OCLC Research
Julianne Beall, Assistant Editor, DDC
ISKO Conference, London, 13-16 July 2004
© 2004 OCLC Online Computer Library Center, Inc.
Exploratory Study
• Defining a version of the DDC
  • To facilitate automatic assignment of DDC numbers to electronic documents
  • Based on literary warrant for topics in electronic resources
DDC for Automated Classification
• Machine classification service
  • A database of concepts used to classify a document
  • Software that generates a prioritized list of concepts that characterize the content of the document (Scorpion)
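The two components named on this slide can be illustrated with a minimal sketch. It is not Scorpion's actual algorithm: the tiny `CONCEPTS` database, the `tokenize` and `rank_concepts` helpers, and the simple term-overlap score are all assumptions made for illustration.

```python
import re
from collections import Counter

# Hypothetical concept database: DDC class number -> descriptive terms.
CONCEPTS = {
    "551.6": "climatology weather forecasting hurricanes rain storms",
    "520": "astronomy celestial bodies outer space",
    "512.7": "number theory prime numbers factorization",
}


def tokenize(text):
    """Lowercase the text and split it into alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())


def rank_concepts(document, concepts=CONCEPTS, limit=3):
    """Return (class number, score) pairs, best match first.

    The score is the number of document words that match a term in the
    concept record (an assumed, deliberately naive measure).
    """
    doc_counts = Counter(tokenize(document))
    scores = {}
    for number, terms in concepts.items():
        score = sum(doc_counts[term] for term in set(tokenize(terms)))
        if score:
            scores[number] = score
    # Sort by descending score, then by class number for stable output.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))[:limit]


ranked = rank_concepts(
    "Forecasting hurricanes: how weather models track storms and rain"
)
print(ranked)
```

Under this scoring, richer terminology in a concept record gives a document more chances to match, which is the motivation for the enrichment steps on the following slides.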
Checking Literary Warrant
• Primary source for checking literary warrant: BUBL
  • Ca. 12,000 Internet resources
• Canadian Information By Subject
  • Ca. 10,000 Internet resources
• KidsClick!
  • Ca. 6,400 Internet resources
Defining a Version of the DDC
• Starting point: classification numbers in Abridged Edition 14
• True abridgment: the truncated number for a topic is always the same as the full number for the topic, except shorter, e.g.:
  • 551.64 Forecasting and forecasts of specific phenomena
  • Cut back to 551.6 Climatology and weather
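The cut-back rule can be sketched as dropping trailing digits until a retained number is reached. The `cut_back` helper and the small `retained` set are hypothetical; an actual database would hold the full set of numbers kept for the version.

```python
def cut_back(number, retained):
    """Shorten a DDC number one digit at a time until it is in `retained`.

    Returns None if no three-digit-or-longer prefix is retained.
    """
    candidate = number
    while len(candidate) >= 3:  # DDC numbers have at least three digits
        if candidate in retained:
            return candidate
        # Drop the last digit, and the decimal point if it is left dangling.
        candidate = candidate[:-1].rstrip(".")
    return None


# Hypothetical set of numbers kept in the reduced version.
retained = {"551", "551.6"}

print(cut_back("551.64", retained))
```

Because a true abridgment only ever shortens a number, the truncated result is always a prefix of the full number, so no lookup table of replacements is needed.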
Database Record
• Class number
• Caption
• Superordinate hierarchy
• Notes that describe what is found in a class
• Relative Index entries
• Mapped terminology
Keywords from 551.64 Added to 551.6; 551.64 Deleted
• Class-here note: methods of forecasting specific phenomena specific areas
• Relative Index entries, e.g.,
  • Acid rain—weather forecasting
  • Hurricanes—weather forecasting
  • Rain—weather forecasting
• Subject Headings for Children / LCSH
  • Storms—Forecasting
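One way to picture this step is a record type with the fields from the Database Record slide, plus a helper that moves a deleted class's terminology into the number it is cut back to. The `ClassRecord` type, its field names, and `fold_into_parent` are assumptions for illustration, not the project's actual schema.

```python
from dataclasses import dataclass, field

SEP = "\u2014"  # dash that separates the parts of a Relative Index entry


@dataclass
class ClassRecord:
    number: str
    caption: str
    hierarchy: list = field(default_factory=list)
    notes: list = field(default_factory=list)
    index_entries: list = field(default_factory=list)
    mapped_terms: list = field(default_factory=list)


def fold_into_parent(db, child_number, parent_number):
    """Delete the child record and add its terminology to the parent."""
    child = db.pop(child_number)
    parent = db[parent_number]
    for name in ("notes", "index_entries", "mapped_terms"):
        target = getattr(parent, name)
        for item in getattr(child, name):
            if item not in target:  # avoid duplicate entries
                target.append(item)
    return parent


db = {
    "551.6": ClassRecord("551.6", "Climatology and weather"),
    "551.64": ClassRecord(
        "551.64",
        "Forecasting and forecasts of specific phenomena",
        index_entries=[
            f"Acid rain{SEP}weather forecasting",
            f"Hurricanes{SEP}weather forecasting",
            f"Rain{SEP}weather forecasting",
        ],
        mapped_terms=[f"Storms{SEP}Forecasting"],
    ),
}

parent = fold_into_parent(db, "551.64", "551.6")
print(parent.index_entries)
```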
Enriching Terminology for Numbers Built from Table 1
• Example: built number 520.6
• 520 Astronomy and allied sciences
• Relative Index terms that approximate the whole of 520:
  • Astronomy
  • Celestial bodies
  • Outer space
  • Space—astronomy
Built Number 520.6
• Relative Index terms from T1—06, e.g.:
  • Associations
  • Organizations
• Combined entries for 520.6, e.g.:
  • Astronomy—associations
  • Astronomy—organisations
  • Astronomy—organizations
  • Celestial bodies—associations
  • Celestial bodies—organisations
  • Celestial bodies—organizations
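The combined entries listed on this slide are the cross product of the base-number terms and the Table 1 terms. A minimal sketch follows, assuming a hypothetical `combined_entries` helper and assuming the UK spelling has already been added to the subdivision terms.

```python
from itertools import product

SEP = "\u2014"  # dash that separates the parts of a Relative Index entry


def combined_entries(base_terms, subdivision_terms):
    """Pair every base-number term with every standard-subdivision term."""
    return [
        f"{base}{SEP}{sub.lower()}"
        for base, sub in product(base_terms, subdivision_terms)
    ]


# Terms that approximate the whole of 520 (two of those on the slides).
base_terms = ["Astronomy", "Celestial bodies"]
# Terms for T1-06, with the UK spelling variant included.
subdivision_terms = ["Associations", "Organisations", "Organizations"]

entries = combined_entries(base_terms, subdivision_terms)
for entry in entries:
    print(entry)
```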
Added UK Spellings for Index Entries
• 512.7 Number theory
  • Factorisation—number theory
  • Factorization—number theory
  • Number theory
  • Prime numbers
• 519.6 Mathematical optimization
  • Mathematical optimisation
  • Mathematical optimization
  • Optimisation—mathematics
  • Optimization—mathematics
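A rule-based way to generate such variants can be sketched as follows. The `uk_variant` and `with_uk_spellings` helpers are hypothetical, and the single "-ization" to "-isation" rule is an assumption; a production list would need more patterns and an exceptions list.

```python
import re

# Matches "ization" or "izations" at the end of a word.
_IZATION = re.compile(r"(?<=[a-z])ization(s?)\b")


def uk_variant(entry):
    """Return the entry with '-ization' respelled as '-isation'."""
    return _IZATION.sub(r"isation\1", entry)


def with_uk_spellings(entries):
    """Add a UK variant for each entry that has one; return a sorted list."""
    result = set(entries)
    for entry in entries:
        result.add(uk_variant(entry))  # a no-op when nothing changes
    return sorted(result)


index_512_7 = [
    "Factorization\u2014number theory",
    "Number theory",
    "Prime numbers",
]

expanded = with_uk_spellings(index_512_7)
for entry in expanded:
    print(entry)
```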
Results: Scorpion & BUBL
• A14.v1: base file + UK spelling
• A14.v2: base file + UK spelling + SS added/enriched
• A14.v3: base file + UK spelling + SS added/enriched + truncation
Next Steps
• Analyze where the truncation and the enriched terminology were useful and where not; revise the v3 database accordingly
• Extend approach to additional classes and projects (ePrints UK)
Links
• Research : Projects : ePrints-UK
  • http://www.oclc.org/research/projects/mswitch/epuk.htm
• Dewey
  • http://www.oclc.org/dewey/