Metadata Metamorphosis
The evolution of a website (or is it?!)
Gillian Byrne, Dianne Cmor
Information Services, QEII Library, Memorial University
What is Metadata?
• Descriptive, structured data about data
• Defined by communities
• Digital resource connection
• Public uses (access) and non-public uses (management)
Why Metadata on Websites?
Improved searching
• Internal
• External
Content management
• Content organization (author, date)
• Content groupings (subject, audience)
Why Metadata on Websites?
Future Possibilities
• More-like-this links
• Context-sensitive navigation/relations
• Dynamic streaming
Dublin Core
• Universal, general metadata set
• Ease, simplicity, flexibility, interoperability (more than just MARC), extensibility (growth and local needs)
• 15 elements – optional and repeatable
• Potential for cross-domain discovery
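
As a rough illustration of "15 elements, optional and repeatable", the sketch below renders a record as HTML <meta> tags using the common "DC." naming convention. The function name and the sample record are hypothetical and not part of the Memorial Libraries' system.

```python
from html import escape

# The 15 elements of the Dublin Core Metadata Element Set.
DC_ELEMENTS = (
    "title", "creator", "subject", "description", "publisher",
    "contributor", "date", "type", "format", "identifier",
    "source", "language", "relation", "coverage", "rights",
)


def dublin_core_meta(record):
    """Render a dict of Dublin Core values as HTML <meta> tags.

    Each value may be a string or a list of strings, because every
    element is optional and repeatable. Keys that are not Dublin Core
    elements are ignored.
    """
    lines = ['<link rel="schema.DC" href="http://purl.org/dc/elements/1.1/">']
    for element in DC_ELEMENTS:
        values = record.get(element, [])
        if isinstance(values, str):
            values = [values]
        for value in values:
            content = escape(value, quote=True)
            lines.append(f'<meta name="DC.{element}" content="{content}">')
    return "\n".join(lines)


# Hypothetical record, for illustration only.
example = {
    "title": "Borrowing Information",
    "subject": ["borrowing", "intercampus"],
}
print(dublin_core_meta(example))
```

Omitted elements produce no tag, and a repeated element (here, subject) produces one tag per value.
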
Content Management Systems
• Off-the-shelf systems do it all:
  • Authoring, describing, collaborating, workflow, security, scheduling, templating, personalization
  • Scalability
  • Separation of content and design
• Memorial Libraries’ CMS specific to our needs
Memorial Libraries’ CMS
• Web-based interface
• File management (add, delete, move)
• Metadata management
  • Ownership and timelines (who, when)
  • Descriptive fields (title, subject, keywords)
  • “Extra” fields (show contributor, printer friendly)
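
The three groups of metadata fields listed above could be modelled as a single per-page record. This is only a sketch: the class name, field names, and the review-date check are assumptions made for illustration, not the actual schema of the Memorial Libraries' CMS.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional


@dataclass
class PageMetadata:
    """Hypothetical per-page metadata record; all names are illustrative."""

    path: str
    # Ownership and timelines (who, when)
    owner: str
    created: date
    review_by: Optional[date] = None
    # Descriptive fields
    title: str = ""
    subjects: List[str] = field(default_factory=list)
    keywords: List[str] = field(default_factory=list)
    # "Extra" fields
    show_contributor: bool = False
    printer_friendly: bool = False

    def is_due_for_review(self, today: date) -> bool:
        """True when a review date is set and has been reached."""
        return self.review_by is not None and today >= self.review_by


# Hypothetical values, for illustration only.
page = PageMetadata(
    path="/borrowing/information.php",
    owner="example.owner",
    created=date(2003, 1, 15),
    review_by=date(2003, 7, 15),
    title="Borrowing Information",
    keywords=["borrowing", "intercampus"],
)
print(page.is_due_for_review(date(2003, 8, 1)))
```
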
Memorial Libraries’ CMS
Show and Tell!
• File management
• Metadata management
• Those little extras
Metadata and Search Engines
Internal Searching
• Criteria for selecting an internal search utility:
  • Metadata enabled
  • Cost
  • Search features
  • Control over installation and indexing
• Selected 3 engines for testing: Inktomi, ht://Dig and Webinator
Search Engine Testing
Testing Methodology
• Selected 10 pages from the new site
• Created 5 versions of each page
  • Version 1: text of the page with no metadata
  • Version 2: metadata terms placed in the header of the page as text
  • Version 3: metadata terms in the coding with <meta> tags (using the keyword field)
  • Version 4: metadata terms in the coding using Dublin Core
  • Version 5: text of the page removed; metadata terms in the coding with <meta> tags
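
The five test versions can be sketched as a small page generator. Several details here are assumptions, since the slides do not specify them: version 2 is taken to mean visible text at the top of the page body, version 4 is taken to use the DC.subject element, and the function name and HTML skeleton are invented for illustration.

```python
from html import escape


def build_versions(title, body_text, terms):
    """Return a dict mapping version number (1-5) to an HTML page string."""
    term_text = ", ".join(terms)
    keywords_tag = (
        f'<meta name="keywords" content="{escape(term_text, quote=True)}">'
    )
    # Assumption: Dublin Core terms are carried in repeated DC.subject tags.
    dc_tags = "\n".join(
        f'<meta name="DC.subject" content="{escape(t, quote=True)}">'
        for t in terms
    )

    def page(head_extra="", body=""):
        return (
            "<html>\n<head>\n"
            f"<title>{escape(title)}</title>\n"
            f"{head_extra}\n"
            "</head>\n<body>\n"
            f"{body}\n"
            "</body>\n</html>"
        )

    text = f"<p>{escape(body_text)}</p>"
    return {
        1: page(body=text),                                   # text only
        2: page(body=f"<p>{escape(term_text)}</p>\n{text}"),  # terms as visible text
        3: page(head_extra=keywords_tag, body=text),          # keywords <meta> tag
        4: page(head_extra=dc_tags, body=text),               # Dublin Core tags
        5: page(head_extra=keywords_tag),                     # metadata, no body text
    }


# Hypothetical page content, for illustration only.
versions = build_versions(
    "Borrowing", "Information about borrowing.", ["intercampus", "loans"]
)
print(versions[3])
```
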
Search Engine Testing
• Selected search terms for each set; each term had to be present in both the metadata and the text
• For each of the 5 searches:
  • Compared the page ranking in the three engines to see which one consistently ranked version 3 or version 4 highest
  • Also looked for version 5 ranking above version 2
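
The two checks described above (is version 3 or 4 ranked first, and does version 5 outrank version 2) can be expressed as a short scoring function. This is a sketch only: the function names are hypothetical, and the sample orderings are invented placeholders, not the results of the study.

```python
def rank_of(results, version):
    """Return the 1-based position of a version in a ranked list, or None."""
    return results.index(version) + 1 if version in results else None


def score_engine(results):
    """Apply the two checks to one engine's ranked list of version numbers."""
    top = results[0] if results else None
    r5, r2 = rank_of(results, 5), rank_of(results, 2)
    return {
        # Did a <meta> (3) or Dublin Core (4) version rank first?
        "meta_version_on_top": top in (3, 4),
        # Did the metadata-only page (5) rank above the visible-terms page (2)?
        "v5_above_v2": r5 is not None and (r2 is None or r5 < r2),
    }


# Placeholder orderings invented for illustration; NOT the study's results.
sample = {
    "engine_a": [3, 1, 5, 2, 4],
    "engine_b": [1, 2, 3, 4],
}
for name, results in sample.items():
    print(name, score_engine(results))
```

A version missing from a result list (for example, a page the engine never returned) is treated as unranked.
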
Search Testing - Results
Test Page: Borrowing (www.library.mun.ca/borrowing/information.php)
Term Searched: intercampus
Search Testing - Conclusions
• Inktomi fared the best
  • Consistently ranked version 3 over all other versions
  • Did better at ranking version 5 over version 2 than Webinator & ht://Dig
• None of the search engines read the Dublin Core metadata
Our Search Engine
Show and Tell!
• Better relevancy for poor searchers
• More flexibility in search terms
• Better access to buried pages
• Administrative features
External Search Engines
• <Meta> tags
  • Majority of search engines do not index <meta> tags because of spam implications
  • Inktomi web engine the major exception (Inktomi powers Lycos/HotBot, MSN, Overture and was just purchased by Yahoo)
• Dublin Core
  • No major commercial engine recognises Dublin Core metadata
  • Fertile testing field?
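
To make concrete what "indexing <meta> tags" involves, the sketch below uses Python's standard html.parser to pull name/content pairs out of a page, as a metadata-aware indexer might. It is an illustration only and does not describe how Inktomi or any other engine actually works; the class and function names are hypothetical.

```python
from html.parser import HTMLParser


class MetaExtractor(HTMLParser):
    """Collect <meta name="..." content="..."> pairs from an HTML page."""

    def __init__(self):
        super().__init__()
        self.meta = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        name, content = attr.get("name"), attr.get("content")
        if name and content is not None:
            # Names are lower-cased so "Keywords" and "keywords" match.
            self.meta.setdefault(name.lower(), []).append(content)


def extract_meta(html_text):
    """Return a dict mapping lower-cased meta names to lists of content values."""
    parser = MetaExtractor()
    parser.feed(html_text)
    return parser.meta


# Hypothetical page fragment, for illustration only.
sample_html = (
    "<html><head>"
    '<meta name="keywords" content="borrowing, intercampus">'
    '<meta name="DC.subject" content="intercampus">'
    "</head><body><p>Borrowing information</p></body></html>"
)
print(extract_meta(sample_html))
```

An engine that ignores <meta> tags would see only the body text of this page; one that recognises only the keywords field would miss the Dublin Core entry.
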
Evolution? Or not?
Depends on needs, resources, crystal balls
Benefits
• Who/when information
• Improved internal searching
• Automatic site index
• Future possibilities
Drawbacks
• Initial work
• More work for content providers
Evolution? Or not?
Suggestions
• Small sites
  • why not? manageable enough
  • easier to begin when small
  • more benefits in future?
• Large sites
  • don’t add extensive metadata for improved searching ONLY; top-level pages are enough
  • look for site management benefits as well