Taxonomy Development Knowledge Structures. Tom Reamy Chief Knowledge Architect KAPS Group Knowledge Architecture Professional Services http://www.kapsgroup.com. Agenda. Introduction Knowledge Structures Taxonomy Management Software Exercises Conclusion. Knowledge Structures.
Taxonomy Development: Knowledge Structures Tom Reamy, Chief Knowledge Architect KAPS Group Knowledge Architecture Professional Services http://www.kapsgroup.com
Agenda • Introduction • Knowledge Structures • Taxonomy Management Software • Exercises • Conclusion
Knowledge Structures • List of Keywords (Folksonomies) • Controlled Vocabularies, Glossaries • Thesaurus • Browse Taxonomies (Classification) • Formal Taxonomies • Faceted Classifications • Semantic Networks / Ontologies • Topic Maps • Knowledge Maps
Knowledge Structures: Lists of Keywords (Folksonomies) • Wikipedia: A folksonomy is an Internet-based information retrieval methodology consisting of collaboratively generated, open-ended labels that categorize content such as Web pages, online photographs, and Web links. • No onomy – simple collections of keywords • Key – social mechanism for seeing other tags • Popularity ranking – Tag Clouds • Sample sites – Del.icio.us and Flickr
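The popularity ranking behind a tag cloud can be sketched in a few lines of Python. This is a minimal illustration, not how Del.icio.us or Flickr actually compute their clouds: the `tag_cloud` function, the three display sizes, and the sample (user, tag) pairs are all assumptions made for the example.

```python
from collections import Counter

def tag_cloud(taggings, levels=3):
    """Map each tag to a display size from 1 (rare) to `levels` (most popular)."""
    # Only trivial normalization is applied: trim whitespace and lowercase.
    counts = Counter(tag.strip().lower() for _user, tag in taggings)
    if not counts:
        return {}
    top = max(counts.values())
    return {tag: max(1, round(levels * n / top)) for tag, n in counts.items()}

# Hypothetical (user, tag) pairs, invented for this sketch.
taggings = [
    ("ann", "taxonomy"), ("bob", "Taxonomy"), ("cy", "taxonomy"),
    ("ann", "facets"), ("bob", "facets"), ("cy", "ontolgy"),
]
cloud = tag_cloud(taggings)
```

Because the list is flat, the misspelled tag "ontolgy" simply becomes one more keyword. Nothing in the structure links it to a preferred term.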
Knowledge Structures: Controlled Vocabularies, Glossaries • Controlled Vocabularies, Glossaries • Lists with minimum structure • Easy to develop • Difficult to get value from • Simple Reference resource • Thesaurus • Taxonomy-like • Less formal • Broader Term (BT), Narrower Term (NT) – also Related Term (RT)
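The BT, NT, and RT relationships of a thesaurus can be sketched as three small lookup tables. This is only an illustration under stated assumptions: the `Thesaurus` class, its method names, and the wine terms are hypothetical and do not correspond to any particular thesaurus tool or standard format.

```python
from collections import defaultdict

class Thesaurus:
    """Toy thesaurus holding broader (BT), narrower (NT), and related (RT) links."""

    def __init__(self):
        self.broader = defaultdict(set)   # term -> its BT terms
        self.narrower = defaultdict(set)  # term -> its NT terms
        self.related = defaultdict(set)   # term -> its RT terms

    def add_broader(self, term, broader_term):
        # BT and NT are inverses, so both directions are recorded together.
        self.broader[term].add(broader_term)
        self.narrower[broader_term].add(term)

    def add_related(self, a, b):
        # RT is symmetric and non-hierarchical.
        self.related[a].add(b)
        self.related[b].add(a)

    def all_broader(self, term):
        """Follow BT links upward and return every ancestor of `term`."""
        seen, stack = set(), [term]
        while stack:
            for parent in self.broader.get(stack.pop(), ()):
                if parent not in seen:
                    seen.add(parent)
                    stack.append(parent)
        return seen

# Hypothetical terms, invented for this sketch.
thesaurus = Thesaurus()
thesaurus.add_broader("Merlot", "Red wine")
thesaurus.add_broader("Red wine", "Wine")
thesaurus.add_related("Wine", "Winery")
```

Storing NT as the inverse of BT keeps the hierarchy consistent, while RT stays a loose association that implies no hierarchy.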
Two Types of Taxonomies: Browse and Formal • Browse Taxonomy – Yahoo
Facets and Dynamic Classification • Facets are not categories • Entities or concepts belong to a category • Entities have facets • Facets are metadata - properties or attributes • Entities or concepts fit into one category • All entities have all facets – defined by set of values • Facets are orthogonal – mutually exclusive – dimensions • An event is not a person is not a document is not a place. • Facets – variety – of units, of structure • Date or price – numerical range • Location – big to small (partonomy) • Winery – alphabetical • Hierarchical - taxonomic
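The idea that every entity carries a value for every facet, and that facets of different kinds (a numerical range, an alphabetical list) can be combined freely, can be sketched as follows. The `Wine` record, the facet names, and the sample data are assumptions made for illustration only.

```python
from collections import Counter
from dataclasses import dataclass

@dataclass(frozen=True)
class Wine:
    # Every entity has a value for every facet.
    name: str
    color: str
    price: float
    winery: str

def facet_filter(wines, color=None, price_range=None, winery=None):
    """Keep the wines that match every facet selection that was supplied."""
    selected = []
    for wine in wines:
        if color is not None and wine.color != color:
            continue
        if price_range is not None:
            low, high = price_range  # numerical range facet
            if not (low <= wine.price <= high):
                continue
        if winery is not None and wine.winery != winery:
            continue
        selected.append(wine)
    return selected

def facet_counts(wines, facet):
    """Count how many wines fall under each value of one facet."""
    return Counter(getattr(w, facet) for w in wines)

# Hypothetical catalog, invented for this sketch.
catalog = [
    Wine("Hillside Merlot", "red", 15.0, "Hillside"),
    Wine("Hillside Chardonnay", "white", 12.0, "Hillside"),
    Wine("Oak Ridge Cabernet", "red", 35.0, "Oak Ridge"),
    Wine("Oak Ridge Pinot Noir", "red", 18.0, "Oak Ridge"),
]
affordable_reds = facet_filter(catalog, color="red", price_range=(10, 20))
```

Since the facets are independent dimensions, selections can be applied in any order, and `facet_counts` can be recomputed on the remaining results to show what each further choice would return.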
Knowledge Structures: Semantic Networks / Ontologies • Ontology more formal • XML standards – OWL, DAML • Semantic Web – machine understanding • RDF – Subject – Predicate – Object triples (roughly Noun – Verb – Object) • Vice President is Officer • Build implications – from properties of Officer • Semantic Network – less formal • Represent large ontologies • Synonyms and variety of relationships
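The "Vice President is Officer" example can be sketched with plain Python tuples standing in for triples. This is a toy illustration, not an RDF library or an OWL reasoner: the predicate names (`is_a`, `can`, `has`) and every fact other than "Vice President is Officer" are assumptions invented for the example.

```python
# Each fact is a (subject, predicate, object) triple.
triples = {
    ("VicePresident", "is_a", "Officer"),
    ("Officer", "is_a", "Employee"),        # assumed for illustration
    ("Officer", "can", "SignContracts"),    # assumed for illustration
    ("Employee", "has", "EmployeeID"),      # assumed for illustration
}

def ancestors(cls, triples):
    """Return every class reachable from `cls` by following is_a links."""
    found, frontier = set(), [cls]
    while frontier:
        current = frontier.pop()
        for subject, predicate, obj in triples:
            if subject == current and predicate == "is_a" and obj not in found:
                found.add(obj)
                frontier.append(obj)
    return found

def inherited_properties(cls, triples):
    """Collect the non-is_a properties of `cls` and of all its ancestors."""
    classes = {cls} | ancestors(cls, triples)
    return {(p, o) for s, p, o in triples if s in classes and p != "is_a"}

vp_properties = inherited_properties("VicePresident", triples)
```

Nothing is stated directly about what a Vice President can do. The properties are implied by the chain of `is_a` links, which is the kind of implication a more formal structure makes available to software.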
Knowledge Structures: Ontology [Diagram: the nodes Instruments, Music, Bluegrass, Violins, Musicians, and Violinists, linked by "is a", "create", and "uses" relationships]
Knowledge Structures: Topic Maps • ISO Standard • See www.topicmaps.org • Topic Maps represent subjects (topics) and associations and occurrences • Similar to semantic networks • Ontology defines the types of subjects and types of relationships • Combination of semantic network and other formal structures (taxonomy or ontology)
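The three building blocks named above (topics, associations, and occurrences) can be sketched as a small in-memory model. This is a simplified illustration and not the ISO Topic Maps data model or its interchange syntax: the class names, the association types, and the example.com address are all hypothetical.

```python
from dataclasses import dataclass, field

@dataclass
class Topic:
    name: str
    topic_type: str                                   # type defined by the ontology
    occurrences: list = field(default_factory=list)   # resources about the topic

@dataclass(frozen=True)
class Association:
    assoc_type: str
    members: tuple                                    # names of the linked topics

class TopicMap:
    def __init__(self):
        self.topics = {}
        self.associations = []

    def add_topic(self, name, topic_type):
        self.topics[name] = Topic(name, topic_type)

    def add_occurrence(self, name, resource):
        self.topics[name].occurrences.append(resource)

    def associate(self, assoc_type, *names):
        missing = [n for n in names if n not in self.topics]
        if missing:
            raise KeyError(f"unknown topics: {missing}")
        self.associations.append(Association(assoc_type, names))

    def related(self, name):
        """Return every topic that shares an association with `name`."""
        return {other
                for a in self.associations if name in a.members
                for other in a.members if other != name}

# Hypothetical topics and links, invented for this sketch.
tm = TopicMap()
tm.add_topic("Violinists", "Musician")
tm.add_topic("Violins", "Instrument")
tm.add_topic("Bluegrass", "Music")
tm.associate("uses", "Violinists", "Violins")
tm.associate("create", "Violinists", "Bluegrass")
tm.add_occurrence("Bluegrass", "https://example.com/bluegrass-history")
```

Keeping occurrences separate from associations is what distinguishes this from a plain semantic network: the map describes the subjects and also points to the resources that discuss them.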
Knowledge Structures: Knowledge Maps • No standards – applied at high level • Ontologies plus / applied to specific environment • Map of Groups – Content Stores – Purpose – Technology • Add structure to each element • Facet Structure – filter by group – content – purpose • Strategic resource
Knowledge Structures: Which one to use? • Level 1 – keywords, glossaries, acronym lists, search logs • Resources, inputs into upper levels • Level 2 – Thesaurus, Taxonomies • Semantic Resource – foundation for applications, metadata • Level 3 – Facets, Ontologies, semantic networks, topic maps • Applications • Level 4 – Knowledge maps • Strategic Resource
Advantages of Folksonomies • Simple (no complex structure to learn) • No need to learn difficult formal classification system • Lower cost of categorization • Distributes cost of tagging over large population • Open ended – can respond quickly to changes • Relevance – User’s own terms • Support serendipitous form of browsing • Easy to tag any object – photo, document, bookmark • Better than no tags at all • Getting people excited about metadata!
Disadvantages of Folksonomies - Quality • They don’t work very well for finding • Re-finding is of marginal value • No structure, no conceptual relationships • Flat lists do not an onomy make • Issues of scale – popular tags already showing a million hits • Limited applicability – only useful for non-technical or non-specialist domains • Either personal tags (others can’t find) or popularity tags – lose interesting terms (Power law distribution) • Most people can’t tag very well – learned skill • Errors – misspellings, single words or bad compounds, single use or idiosyncratic use
Better Folksonomies: • Will social networking make tags better? • Not so far – example of Del.icio.us – same tags • Quality and Popularity are very different things • Most people don’t tag, don’t re-tag • Study – folksonomies follow NISO guidelines – nouns, etc – but do they actually work – see analysis • Most tags deal with computers and are created by people that love to do this stuff – not regular users and infrequent users – Beware true believers!
Browse Taxonomies: Strengths and Weaknesses • Strengths: Browse is better than search • Context and discovery • Browse by task, type, etc. • Weaknesses: • Mix of organization • Catalogs, alphabetical listings, inventories • Subject matter, functional, publisher, document type • Vocabulary and nomenclature Issues • Problems with maintenance, new material • Poor granularity and little relationship between parts. • Web site unit of organization • No foundation for standards
Formal Taxonomies: Strengths and Weaknesses • Strengths: • Fixed Resource – little or no maintenance • Communication Platform – share ideas, standards • Infrastructure Resource • Controlled vocabulary and keywords • More depth, finer granularity • Weaknesses: • Difficult to develop and customize • Don’t reflect users’ perspectives • Users have to adapt to language
Faceted Navigation: Strengths and Weaknesses • Strengths: • More intuitive – easy to guess what is behind each door • 20 questions – we know and use • Dynamic selection of categories • Allow multiple perspectives • Trick Users into “using” Advanced Search • wine where color = red, price = x-y, etc. • Weaknesses: • Difficulty of expressing complex relationships • Simplicity of internal organization • Loss of Browse Context • Difficult to grasp scope and relationships • Limited Domain Applicability – type and size • Entities not concepts, documents, web sites
Dynamic Classification / Faceted navigation • Search and browse better than either alone • Categorized search – context • Browse as an advanced search • Dynamic search and browse is best • Can’t predict all the ways people think • Advanced cognitive differences • Panda, Monkey, Banana • Can’t predict all the questions and activities • Intersections of what users are looking for and what documents are often about • China and Biotech • Economics and Regulatory
Knowledge Structures: Taxonomy Management Software • Taxonomy Management • Multi-Tes, Data Harmony, SchemaLogic • Distributed Taxonomy Development • Wordmap, Wikionomy • Text Analytics – Entity Extraction • ClearForest, Inxight, Terragram • Auto-Categorization • ClearForest, Inxight, Terragram • Embedded software – Content Management, Search
Why Taxonomy Software? • If you have to ask, you can’t afford it • Spreadsheets • Good for calculations, days of taxonomy development over • (almost) • Ease of use – more productive • Increase speed of taxonomy development • Better Quality – synonyms, related terms, etc. • Distributed development – lower cost, user input (good and bad)
Decision Points • Dedicated taxonomy management software • Small company, specialized taxonomy • Real issue is how it will be integrated • Text analytics / auto-categorization • Dedicated software or use features of CM and/or enterprise search • Combination of dedicated and embedded • Integration – export and import is critical • Integration with Policy / Procedure • Distributed contributions
Taxonomy – How will it be used? • Browse front end to portal • Search engine indexing • Keyword searching • Hierarchical browsing – formal structure • Faceted navigation • Subject taxonomy and lots of metadata • Controlled vocabulary for entering metadata • Applications – text and data mining, alerts, etc. • Semantic Infrastructure
Evaluating Taxonomy Software: Self Knowledge – Distributed model of taxonomy in action • People • Interdisciplinary Team • Knowledge architects, editors, SME, users • Roles • Select and implement taxonomy software, input into CM, Search • Care and feeding of taxonomies, metadata, vocabularies • Initial filter of user input, monitoring user input, answer questions • Provide input – what works and not, new terms • Technology • Develop taxonomies, vocabularies, facets • Integrate taxonomy into CM, search, applications • Activities • Information needs and behaviors – support with advanced features
Conclusion • Variety of information and knowledge structures • Important to know what will solve what • Taxonomies and Facets are foundation elements • Build higher levels based on lower levels • Glossaries to Taxonomies • Taxonomy to Ontology / faceted navigation • Important to have good taxonomy and text analytics software (spreadsheets are OK for first draft)
Resources • Books • Women, Fire, and Dangerous Things: What Categories Reveal about the Mind – George Lakoff • Knowledge, Concepts, and Categories – Koen Lamberts and David Shanks • Software • Selecting Taxonomy Software (Taxonomy Boot Camp) • Web Sites • Taxonomy Community of Practice: http://finance.groups.yahoo.com/group/TaxoCoP/
Questions? Tom Reamy tomr@kapsgroup.com KAPS Group Knowledge Architecture Professional Services http://www.kapsgroup.com