CH1: Content Analysis knowledge organization
Is the web too big to organize? • One billion pages. • 1.5 million pages added daily.
Why organize on the internet? • Even if you only have a few hundred files, finding them again can take ages. • Media archives have millions of files. • Footage, recordings, and documents that can’t be found have no value. • Free-text search only takes you so far.
Why not just use Google? • Synonyms and misspellings. • Disambiguation. • Imperfect knowledge. • Meaning beyond the words. • Comprehensiveness. • Audio-visual assets.
Professional knowledge organization is a core function of the information professional: • to avoid chaos! How many published items are there? In the US + UK in 2005: 378,000. How many resources are on the web? January 2007: 106,875,138; January 2009: 9 billion? • to present resources in an orderly and predictable manner. • to enable access to specific content. • to aid retrieval of specific items. • to support exchange of information through the use of standard formats.
How do users look for information? Retrieval function of KO (knowledge organization): • users may search for specific items (known-item retrieval). • they may search for items characterized by some particular feature, e.g. books by a certain author, document forms, etc. • they may look for specific information. Browsing function of KO: • they may want to see what is available. • they may not know what terms to use.
How does knowledge organization support these two approaches? The processes of enabling access to knowledge: • labelling resources: classification, indexing, tagging, building vocabularies. • creating formal records to represent resources: cataloguing, bibliographic description, metadata schemes. • creating systematic structures to hold information, i.e. classifications: taxonomies, concept and topic maps, ontologies.
Labelling resources: adding information to a resource about its subject content. • classification: classification schemes and codes. • subject cataloguing: subject heading lists. • indexing: controlled vocabularies, thesauri, keyword lists. • metadata schema. • tagging: usually uncontrolled.
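The difference between controlled indexing and uncontrolled tagging can be sketched in a few lines. This is only an illustration: the vocabulary, the tag list, and the function names `build_lookup` and `index_terms` are invented for the example and do not come from any real thesaurus or library.

```python
# Hypothetical controlled vocabulary: each preferred term maps to the
# variant forms (synonyms, plurals) that should be indexed under it.
CONTROLLED_VOCABULARY = {
    "automobiles": {"car", "cars", "auto", "automobile", "motorcar"},
    "films": {"film", "movie", "movies", "motion picture"},
}


def build_lookup(vocabulary):
    """Map every variant (and the preferred term itself) to the preferred term."""
    lookup = {}
    for preferred, variants in vocabulary.items():
        lookup[preferred] = preferred
        for variant in variants:
            lookup[variant] = preferred
    return lookup


def index_terms(free_tags, lookup):
    """Split free tags into controlled index terms and unmatched leftovers."""
    controlled = set()
    unmatched = set()
    for tag in free_tags:
        key = tag.strip().lower()  # normalise case and stray whitespace
        if key in lookup:
            controlled.add(lookup[key])
        else:
            unmatched.add(key)
    return sorted(controlled), sorted(unmatched)


LOOKUP = build_lookup(CONTROLLED_VOCABULARY)
terms, leftovers = index_terms(["Car", " Movies ", "holiday"], LOOKUP)
print(terms)      # controlled terms the tags were mapped to
print(leftovers)  # tags with no entry in the vocabulary
```

Tags that match a variant are collapsed onto one preferred term, which is what makes retrieval predictable; tags with no match stay uncontrolled and would need a human decision.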
Creating formal records to represent items • listing characteristics of an item that represent it: • what is it called? name, title • who created it? author, creator • who published it? publisher (commercial, institutional, personal) • when and where? date and place of publication, web address • what’s it about? subject descriptors, classification codes • physical attributes? size, dimensions, file type, references, illustrations • representing these as fields in a database or equivalent structure • using rules to ensure conformity of entries
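As a rough sketch of "fields plus rules", the record below stores the listed characteristics as typed fields and applies a few conformity rules when an entry is created. The field names, the allowed file types, and the rules themselves are assumptions chosen for illustration; they do not follow any particular cataloguing standard.

```python
from dataclasses import dataclass, field

# Assumed list of accepted file types, purely for illustration.
ALLOWED_FILE_TYPES = {"pdf", "html", "mp3", "mp4"}


@dataclass
class Record:
    title: str        # what is it called?
    creator: str      # who created it?
    publisher: str    # who published it?
    year: int         # when?
    location: str     # where? place of publication or web address
    subjects: list = field(default_factory=list)  # what's it about?
    file_type: str = "pdf"                        # physical attribute

    def __post_init__(self):
        # Conformity rules: trim whitespace, normalise case, reject bad values.
        self.title = self.title.strip()
        self.creator = self.creator.strip()
        self.file_type = self.file_type.strip().lower()
        if not self.title:
            raise ValueError("title is required")
        if self.file_type not in ALLOWED_FILE_TYPES:
            raise ValueError(f"unknown file type: {self.file_type}")
        # Subject descriptors are lower-cased, de-duplicated, and sorted.
        self.subjects = sorted(
            {s.strip().lower() for s in self.subjects if s.strip()}
        )


# Made-up entry; none of these values describe a real publication.
record = Record(
    title="  Example Title ",
    creator="Doe, Jane",
    publisher="Example Press",
    year=2007,
    location="London",
    subjects=["Indexing", "indexing ", "Cataloguing"],
    file_type="PDF",
)
print(record)
```

Because the rules run for every entry, two cataloguers typing "PDF" and "pdf", or "Indexing" and "indexing ", end up with identical field values.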
Systematic structures for the ordering of knowledge • sometimes there is a need to present information in a structured way. • physical organization: materials in a physical collection. • listing: presentation of items such as a subject bibliography or index. • display: browsing interface of a digital collection.
Systematic structures for the ordering of knowledge • it will be necessary to group items according to subject. • this is often described as classification or categorization. • the structure can be linear (as in a classification). • the structure can be two-dimensional (as in a concept map). • hypertext can be used to represent different levels of a hierarchy (as in taxonomies).
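To make the contrast between a linear sequence and a browsable hierarchy concrete, here is a minimal sketch that holds a small taxonomy as nested dictionaries. The class names and the helper functions `linear_order` and `path_to` are hypothetical and chosen only for the example.

```python
# Hypothetical taxonomy: each key is a class, each value holds its subclasses.
TAXONOMY = {
    "Science": {
        "Biology": {"Botany": {}, "Zoology": {}},
        "Physics": {},
    },
    "Arts": {"Music": {}, "Painting": {}},
}


def linear_order(tree, depth=0):
    """Flatten the hierarchy into one linear sequence of (depth, class name)."""
    rows = []
    for name, children in tree.items():
        rows.append((depth, name))
        rows.extend(linear_order(children, depth + 1))
    return rows


def path_to(tree, target, trail=()):
    """Return the chain of classes from the top level down to target, or None."""
    for name, children in tree.items():
        here = trail + (name,)
        if name == target:
            return here
        found = path_to(children, target, here)
        if found:
            return found
    return None


# Linear display, indented by level, as in a printed schedule.
for depth, name in linear_order(TAXONOMY):
    print("  " * depth + name)

# Hierarchical display, as in a hyperlinked trail of levels.
print(" > ".join(path_to(TAXONOMY, "Zoology")))
```

The same data supports both presentations, which is one reason to keep the structure of the information separate from how it is displayed.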
How do we go about making a KO structure? • don’t muddle the design of the interface with the structure of the information. • data must be well structured to support browsing and retrieval. • the sequence of topics must be logical. • the relationships between topics must be clear. • overall the structure must be understandable and predictable.
Top-down and bottom-up classifications • traditionally classifications were made by repeated subdivision of classes into smaller and smaller units. • this tends to create rather rigid and abstract classifications. • modern methods tend to work by clustering or grouping concepts to form classes. • this method creates more flexible systems, more closely related to reality.
Sorting and grouping • this is the first stage in organizing a collection of objects or concepts. • different attributes may be used as the basis of the classification. • a whole variety of different (but quite valid) classifications can be made by varying the criteria for arrangement.
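The point that different attributes give different but equally valid classifications can be illustrated with a short sketch. The items and their attribute values are made up, and `group_by` is a hypothetical helper written for this example.

```python
from collections import defaultdict

# Made-up collection; each item carries several attributes it could be grouped by.
ITEMS = [
    {"title": "Item A", "form": "book", "subject": "history", "year": 2005},
    {"title": "Item B", "form": "video", "subject": "history", "year": 2007},
    {"title": "Item C", "form": "book", "subject": "music", "year": 2007},
]


def group_by(items, attribute):
    """Group item titles under each distinct value of the chosen attribute."""
    groups = defaultdict(list)
    for item in items:
        groups[item[attribute]].append(item["title"])
    # Sort both the classes and their members so the arrangement is predictable.
    return {key: sorted(titles) for key, titles in sorted(groups.items())}


# The same three items, arranged three different ways.
for criterion in ("form", "subject", "year"):
    print(criterion, group_by(ITEMS, criterion))
```

Changing only the criterion changes which items sit together, so the choice of attribute should follow from how users are expected to look for the material.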