HIVE as a Machine-aided Indexing Tool
• Personal keyword use without vocabulary control
• Machine-aided indexing term extraction
• Participant relevant and not relevant judgments
• Inter-indexing consistency
  • Rolling’s Measure
  • Hooper’s Measure
Organizing Scientific Data Sets
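The two consistency measures named above can be sketched in a few lines. This is a minimal illustration using their commonly cited forms: Hooper's measure as C / (A + B - C) and Rolling's measure as 2C / (A + B), where A and B are the numbers of terms assigned by each indexer and C is the number of terms they share. The function names and the sample terms are hypothetical, and the slides do not say how the study itself computed the measures.

```python
def hooper(terms_a, terms_b):
    """Hooper's measure: shared terms divided by all distinct terms used."""
    a, b = set(terms_a), set(terms_b)
    distinct = len(a | b)
    if distinct == 0:
        # Undefined when neither indexer assigned a term; 0.0 is a convention
        # chosen for this sketch, not part of the measure.
        return 0.0
    return len(a & b) / distinct


def rolling(terms_a, terms_b):
    """Rolling's measure: twice the shared terms divided by total terms assigned."""
    a, b = set(terms_a), set(terms_b)
    total = len(a) + len(b)
    if total == 0:
        # Same convention as above for the undefined case.
        return 0.0
    return 2 * len(a & b) / total


if __name__ == "__main__":
    # Hypothetical keyword sets from two indexers describing the same record.
    indexer_1 = ["Birds", "Migration", "Climate"]
    indexer_2 = ["Birds", "Climate", "Ecology", "Phenology"]
    print("Hooper:", hooper(indexer_1, indexer_2))
    print("Rolling:", rolling(indexer_1, indexer_2))
```

Rolling's measure is never lower than Hooper's for the same pair of term sets, so the two are best reported side by side rather than compared directly.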
HIVE/Dryad Evaluation
Craig Willis, Hollie White, Lee Richardson, Casey Rawson, Jane Greenberg, Bob Losee, Ryan Scherle, Todd Vision
• Questions
  • Given Dryad article metadata (title, abstract, depositor-supplied keywords), what are the best approaches for term suggestion from selected controlled vocabularies (MeSH, ITIS, TGN)?
  • Can one approach be used for subject, taxonomic, and geographic indexing?
• Method
  • Create a “gold standard” of manually indexed records based on mapping of Dryad, MEDLINE, and BIOSIS Previews to MeSH, TGN, and ITIS
  • Evaluate state-of-the-art techniques for automatic subject, taxonomic, and geographic indexing
• Preliminary results
  • For taxonomic name indexing, untrained KEA++ performs almost as well as state-of-the-art taxonomic name extraction (FindIt)
  • For geographic name indexing with TGN, a simple graph-based ranking algorithm outperforms KEA++
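Comparing suggested terms against a gold standard, as the method above describes, is often scored with precision, recall, and F1. The sketch below assumes exact string matching between suggested and gold-standard terms. The slides do not state which metrics the evaluation used, and the function name and sample terms are hypothetical.

```python
def score_suggestions(suggested, gold):
    """Return (precision, recall, f1) for suggested terms against gold-standard terms."""
    suggested_set, gold_set = set(suggested), set(gold)
    hits = len(suggested_set & gold_set)
    # Guard against empty inputs so the ratios stay defined.
    precision = hits / len(suggested_set) if suggested_set else 0.0
    recall = hits / len(gold_set) if gold_set else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


if __name__ == "__main__":
    # Hypothetical taxonomic terms for one record.
    suggested = ["Aves", "Passeriformes", "Corvus"]
    gold = ["Aves", "Corvus", "Corvidae", "Corvus corax"]
    print(score_suggestions(suggested, gold))
```

Averaging these scores over all gold-standard records gives one number per technique and vocabulary, which is one way to compare approaches such as KEA++ and a graph-based ranker on equal terms.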
Thesaurus Walking: Automatic Indexing with Controlled Vocabularies
Craig Willis, Bob Losee, Jane Greenberg
• Questions
  • Starting from the location of terms in a document and moving to the indexer-assigned controlled terms, how do indexers navigate in a thesaurus?
  • How can this knowledge be used to improve techniques for automatic indexing with controlled vocabularies?
  • How can this knowledge be used to improve thesauri?
• Methodology
  • Unsupervised, graph-based approach using random walks on thesauri
• Preliminary results
  • Indexer-assigned controlled terms are identified at a rate much higher than random, but far from perfect
  • Suggests that this method could best be used in combination with other, dissimilar automatic indexing methods
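One way to picture a random walk on a thesaurus is a random walk with restart: the walk starts at the thesaurus terms found in the document, follows thesaurus relations at random, and periodically jumps back to a starting term. Terms that end up with high visit probability become candidate index terms. The sketch below is an assumption about how such a walk could be set up, not the authors' algorithm. The function names, the restart probability, and the toy graph are all hypothetical, and the graph is not taken from any real vocabulary.

```python
def build_graph(edges):
    """Build an undirected adjacency map from (term, related_term) pairs."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph


def random_walk_scores(graph, seeds, restart=0.15, iterations=50):
    """Approximate visit probabilities of a random walk with restart at the seed terms."""
    seed_terms = [s for s in seeds if s in graph]
    if not seed_terms:
        return {node: 0.0 for node in graph}
    # Restart distribution: uniform over the seed terms found in the graph.
    jump = {n: (1.0 / len(seed_terms) if n in seed_terms else 0.0) for n in graph}
    scores = dict(jump)
    for _ in range(iterations):
        nxt = {n: restart * jump[n] for n in graph}
        for node, neighbours in graph.items():
            if not neighbours:
                # Isolated term: send its walking mass back to the seeds.
                for s in seed_terms:
                    nxt[s] += (1 - restart) * scores[node] / len(seed_terms)
                continue
            # Spread the walking mass evenly over related terms.
            share = (1 - restart) * scores[node] / len(neighbours)
            for other in neighbours:
                nxt[other] += share
        scores = nxt
    return scores


if __name__ == "__main__":
    # Toy broader/narrower relations, invented for illustration.
    toy = build_graph([
        ("North America", "United States"),
        ("North America", "Canada"),
        ("United States", "North Carolina"),
        ("North Carolina", "Durham"),
    ])
    ranked = sorted(random_walk_scores(toy, ["Durham", "North Carolina"]).items(),
                    key=lambda item: item[1], reverse=True)
    for term, score in ranked:
        print(f"{term}: {score:.3f}")
```

Because the walk needs no training data, a ranking like this is unsupervised, which matches the methodology bullet above. Its scores could also be combined with those of a different kind of indexer, in line with the last preliminary result.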