Meccano, molecules, and the organization of knowledge
The continuing contribution of S.R. Ranganathan
Vanda Broughton, School of Library, Archive & Information Studies, UCL
The impact of facet analysis:
• facet analysis has become pervasive over recent years
• today there are few formal knowledge organization systems that do not display some elements of faceted structure
• there is an evident faceted approach to product information in many commercial websites
• the idea of facet analysis may be very differently understood by various communities
Example from DDC:
782
  .6 Women’s voices
    .66 Soprano voices (Treble voices)
    .67 Mezzo-soprano voices
    .68 Contralto voices (Alto voices)
  .7 Children’s voices
    .76 Soprano voices (Treble voices)
    .77 Mezzo-soprano voices
    .78 Contralto voices (Alto voices)
    .79 Changing voices
  .8 Men’s voices
    .86 Treble and alto voices
    .87 Tenor voices
    .88 Baritone voices
    .89 Bass voices
Here the age or gender of the singer is subdivided by the pitch of the voice, and this subdivision of the one by the other is carried out absolutely consistently and predictably. Although there is no number building here, and the numbers are not arrived at synthetically, the citation order is very evidently ‘age/gender – pitch’ and it is applied without exception. Simple ‘faceted’ structures of this type are very common in DDC and in LCC, although we don’t usually think of them as faceted schemes in any theoretical sense.
Example from UDC:
Examples of research projects:
Facet analysis in the e-commerce environment:
• many of these tools fail to employ facet analysis in other than a ‘top-level’ manner
• effectively, it’s used to create a taxonomy based on a variety of attributes of entities
• such a structure is logical, predictable and well modelled
• it provides a good mechanism for searching by successive filtering
• for the most part, concepts other than entities are not involved
• hence, only a partial view of a domain is provided
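Searching by successive filtering can be sketched as follows. This is a minimal illustration, not taken from any particular website: the product records, facet names and the helper functions `filter_by` and `facet_counts` are all hypothetical. Note that every facet is an attribute of an entity, which is exactly the limitation the slide points to.

```python
# Hypothetical product records: each entity is described only by attribute facets.
PRODUCTS = [
    {"name": "Trail boot", "type": "boot", "material": "leather", "colour": "brown"},
    {"name": "City boot", "type": "boot", "material": "suede", "colour": "black"},
    {"name": "Court shoe", "type": "shoe", "material": "leather", "colour": "black"},
    {"name": "Beach sandal", "type": "sandal", "material": "rubber", "colour": "blue"},
]


def filter_by(items, facet, value):
    """Keep only the items whose value in the given facet matches."""
    return [item for item in items if item.get(facet) == value]


def facet_counts(items, facet):
    """Count the remaining items per value of a facet, as a filter panel would."""
    counts = {}
    for item in items:
        value = item.get(facet)
        if value is not None:
            counts[value] = counts.get(value, 0) + 1
    return counts


# Successive filtering: each step narrows the previous result set.
step1 = filter_by(PRODUCTS, "material", "leather")
step2 = filter_by(step1, "colour", "black")
names = [item["name"] for item in step2]
```

After the first filter, `facet_counts(step1, "type")` shows which values of another facet are still available, which is how such sites decide what further filters to offer.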
What is ‘classical’ facet analysis:
• a means of organizing the concepts in a subject domain
• involves grouping concepts on the basis of shared characteristics
• uses standard categories as ‘receptacles’ for concepts
• in ‘classical’ facet analysis these are linguistic/functional categories
Types of categories used:
• earliest set of categories was that of Kaiser (1911)
• these were used in alphabetical subject indexing to generate pre-coordinated headings
  • concretes (= things/entities/substances)
  • processes (= activities/actions)
  • place
Ranganathan’s categories:
• personality (= entities or systems)
• matter (= substances)
• energy (= actions or activities)
• space
• time
CRG categories: an expansion of Ranganathan’s PMEST:
• thing
• kind
• part
• property
• material
• process
• operation
• patient
• product
• by-product
• agent
• space
• time
These categories have been used in all classes in BC2:
• they work well for concepts in most subjects
• they work best with science and technology
• some additional categories are needed in the arts, e.g. form and genre
Modelling a subject domain:
• categorization alone does not make a faceted system
• it must also deal with the relationships between concepts
• attention should also be paid to sequence or order of concepts in combination
• the last is less vital in a digital context, but still important where any sort of linear order or display is needed
Molecular models as a pattern:
• molecular modelling provides a useful modern equivalent to Ranganathan’s Meccano analogy
• molecular systems also try to represent:
  • the nature of the components
  • the relationships between them
  • their relative positions
  • internal and external relationships between particles
  • a syntax or rules for combination
Ontology with concepts and relationships:
Faceted approach with categorized concepts (relationship labels shown on the slide: does, lives in, wears):
Relationships between concepts can be expressed in different ways:
• through facet indicators
• through relationship indicators
• through the sequence of concepts, or citation order
• different faceted languages utilise all of these methods
The Colon Classification:
• uses the fundamental categories (P M E S T)
• uses facet indicators in the form of punctuation symbols to denote the categorical status of a concept
  • .T .S :E ;M ,P
• uses a facet formula to combine and express complex content
  • l [P] [P2] [P3] : [E] 2P : [2E]
  • Y [P] : [E] : [2E] . S . T
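The role of facet indicators can be illustrated with a small sketch. It uses only the indicator symbols given on the slide (comma, semicolon, colon, full stop) and a plain P-M-E-S-T order. The function `build_classmark` and the notations in the example are hypothetical placeholders, not real Colon Classification numbers, and the sketch ignores the class-specific facet formulas and the rounds and levels ([P2], [2E]) that the real scheme uses.

```python
# Facet indicators as listed on the slide: the punctuation mark announces
# the category of the notation that follows it.
FACET_INDICATORS = {"P": ",", "M": ";", "E": ":", "S": ".", "T": "."}
PMEST_ORDER = ["P", "M", "E", "S", "T"]


def build_classmark(main_class, facets):
    """Join notations in P-M-E-S-T order, each prefixed by its indicator.

    `facets` maps a category letter to a (hypothetical) notation string.
    Categories that are absent are simply skipped.
    """
    unknown = set(facets) - set(PMEST_ORDER)
    if unknown:
        raise ValueError(f"unknown categories: {sorted(unknown)}")
    parts = [main_class]
    for category in PMEST_ORDER:
        notation = facets.get(category)
        if notation:
            parts.append(FACET_INDICATORS[category] + notation)
    return "".join(parts)


# The input order of the facets does not matter; the formula fixes the output order.
example = build_classmark("X", {"E": "4", "P": "45", "S": "56"})
```

Because the indicator carries the categorical status, a reader (or a program) can split such a string back into its facets without knowing the vocabulary.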
Rules for joining concepts together in language (syntax):
• sometimes meaning is achieved by inflection:
  • homo mordet canem (man bites dog)
  • canis mordet hominem (dog bites man)
  • ὁ ἄνθρωπος ἐσθίει τὸν κύνα (the man eats the dog)
  • ὁ κύων ἐσθίει τὸν ἄνθρωπον (the dog eats the man)
Sometimes meaning is achieved by word order:
• man bites dog
• dog bites man
• man eating sausage
Indexing languages can function in both of these ways:
• some (like Colon, PRECIS, or UDC) use role operators or facet indicators = symbols which indicate their status
• others (like BC2) rely on order in the schedule to give meaning to the components
Relational operators in indexing systems:
• Farradane’s system
• Clamping of hardened steel plates
  • steel /: plates /- clamping /; hardening
• /: causation or dependence
• /- reaction
• /; association
In many modern faceted systems the means of combining terms is controlled by the citation order:
• citation order is the order of categories with which we’re familiar
• i.e. thing - kind - part - etc.
• this is the so-called ‘standard’ citation order
• facet status determines the combination, but this is implicit in the notation
• it’s a good guide to the best default order of combination, but isn’t immutable
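Combination by citation order can be sketched as a sort on category. The category sequence below is the CRG list given earlier in the deck; the function `cite`, the example concepts and the categories assigned to them are hypothetical. The optional `order` argument reflects the point that the standard order is a default rather than a fixed rule.

```python
# CRG categories in the 'standard' citation order, as listed earlier in the deck.
STANDARD_CITATION_ORDER = [
    "thing", "kind", "part", "property", "material", "process",
    "operation", "patient", "product", "by-product", "agent", "space", "time",
]


def cite(concepts, order=None):
    """Arrange (term, category) pairs according to a citation order.

    `order` defaults to the standard citation order but may be replaced,
    since the standard order is a default and not immutable.
    """
    sequence = STANDARD_CITATION_ORDER if order is None else order
    rank = {category: position for position, category in enumerate(sequence)}
    ordered = sorted(concepts, key=lambda pair: rank[pair[1]])
    return [term for term, _category in ordered]


# Hypothetical concepts, supplied in arbitrary order.
concepts = [
    ("nineteenth century", "time"),
    ("thymus gland", "thing"),
    ("surgery", "operation"),
    ("hyperplasia", "process"),
]
default_string = cite(concepts)
operation_first = cite(concepts, order=["operation", "thing", "process", "time"])
```

The same set of categorized concepts yields a different linear string under a different citation order, while the categorical status of each concept is unchanged.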
Facet analysis as a fundamental theory for structuring subject organization tools:
• facet analysis provides us with a sufficiently rigorous model
• we can convert this to alternative formats
• work at UCL has looked at using modelling in combination with markup to create an all-purpose terminology
• held as a database this can be output as:
  • a conventional classification
  • an alphabetical subject index
  • a thesaurus
Facet analysis as a basis for classificatory structures:
• is a well established methodology
• organizes concepts in a domain into facets, and then into sub-facets (or arrays)
• within a facet, relationships of hierarchy are identified and visually displayed
• synonyms (or near synonyms) are collocated, and controlled by means of the notation
Basic classification structure:
[Foods]
(By physical state)
HKH PO  Essences
HKH PP  Extracts
HKH PS  Pastes
HKH PY  (By operation/process used)
(By utility, etc.)
HKH QD  Convenience foods
HKH QE    Partly prepared foods
HKH QF      Instant foods
HKH QK  Artificial foods, synthetic foods
(By purpose)
(By physiological function)
HKH QS  Roughage
Features labelled on the slide: facet label, array labels, hierarchical relationships, collocation of synonyms
Sometimes content becomes much more complex:
Complex repeating structure can be accurately constructed from syntax rules in a faceted system:
HUQ W  Thymus gland
(Physiology)
HUQ WH  (Pathology)
(Hyperplasia)
HUQ WMD V  Lymphatism, status lymphaticus
(Causal agents)
(Symptoms)
(Treatment)
(Neoplasms)
HUQ WME  Thymomas
(Products)
HUQ X  Thymus hormones
(Molecular structure)
HUQ XS  Thymopoietins
[Compound terms pre-synthesized and added to published schedule]
[Examples of potential synthesized compounds]
Conversion to thesaurus format:
• all of the conceptual elements required to generate a thesaurus are implicit in the schedule
• BT/NT or intra-facet (paradigmatic) relationships (and some RTs) can be determined from the hierarchy
• other RTs can be identified from inter-facet (syntagmatic) relationships
• equivalence relationships are present in the synonym collocations
Facet analysis aids the accurate identification of paradigmatic and syntagmatic relationships in the thesaurus:
Intra-facet (paradigmatic) relationships in a basic schedule:
HKH PY  (By operation/process used)
(By utility, etc.)
HKH QD  Convenience foods
HKH QE    Partly prepared foods
HKH QF      Instant foods
HKH QK  Artificial foods, synthetic foods
Convenience foods
  NT Partly prepared foods
Partly prepared foods
  BT Convenience foods
  NT Instant foods
Convenience foods
  RT Artificial foods
Artificial foods
  UF Synthetic foods
Synthetic foods
  USE Artificial foods
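How the BT/NT and UF/USE entries in the Foods example follow from the schedule can be sketched in a few lines. This is an illustration only and not the BC2 software: the function `thesaurus_entries` and the input format, a list of (depth, caption) pairs, are assumptions, and the depth values are read off the BT/NT entries on the slide. The RT between Convenience foods and Artificial foods is not derived here.

```python
def thesaurus_entries(schedule):
    """Derive BT/NT and UF/USE triples from a schedule.

    `schedule` is a list of (depth, caption) pairs in schedule order; the
    first term in a caption is treated as preferred and any further
    comma-separated terms as its synonyms. Depths are assumed to increase
    by at most one from one line to the next.
    """
    entries = []
    ancestors = []  # preferred terms on the path from the top to the current line
    for depth, caption in schedule:
        preferred, *synonyms = [term.strip() for term in caption.split(",")]
        del ancestors[depth:]  # discard terms at the same or a deeper level
        if ancestors:
            entries.append((preferred, "BT", ancestors[-1]))
            entries.append((ancestors[-1], "NT", preferred))
        ancestors.append(preferred)
        for synonym in synonyms:
            lead_term = synonym[:1].upper() + synonym[1:]
            entries.append((preferred, "UF", lead_term))
            entries.append((lead_term, "USE", preferred))
    return entries


# The Foods array from the slide, with depths assumed from its BT/NT entries.
FOODS = [
    (0, "Convenience foods"),
    (1, "Partly prepared foods"),
    (2, "Instant foods"),
    (0, "Artificial foods, synthetic foods"),
]
entries = thesaurus_entries(FOODS)
```

Hierarchy supplies the paradigmatic relationships and synonym collocation supplies the equivalence relationships, so neither has to be entered separately.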
Automatic conversion from classification to thesaurus:
• BC2 has a suite of programs to generate schedule display and the A/Z index
• these have been extended to allow for automatic thesaurus generation
• each term is marked up to show its hierarchical position and position in the sequence, and its ‘class’ status
• some difficulties occur as a result of the schedule not having been written with the thesaurus in mind
BC2 source file markup for schedule display and indexing:
CLG 06Aluminium, aluminum
CLGLNM 07)Compounds with silicon & oxygen(
CLGLNMIFN 08Aluminium silicate
CLGM 07)Compounds with oxygen(
@ 08)Salts( ]IT
CLGMIFN 09Aluminates
CLGMJHN 08Aluminium oxide, alumina
@ 07)Compounds with oxygen & hydrogen( ]IT
CLGMKJHN 08Aluminium hydroxide, alumina trihydrate, hydrated aluminium oxide
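A reading of this markup can be sketched as a small parser. Everything about the format here is an assumption drawn from the sample alone: that each record is a classmark (or "@" where none is given), a two-digit number indicating hierarchical level, and a caption; that captions wrapped in ")…(" are labels rather than terms; and that commas separate synonyms. The "]IT" codes are left out because the slide does not explain them. The names `Record` and `parse_line` are hypothetical and this is not the BC2 software.

```python
import re
from typing import NamedTuple, Optional


class Record(NamedTuple):
    classmark: Optional[str]  # None where the source shows "@"
    level: int                # two-digit number, read here as hierarchical level
    is_label: bool            # True for captions wrapped in ")...("
    terms: list               # caption split into synonyms (first one preferred)


# Assumed record shape: classmark, whitespace, two digits, caption.
LINE = re.compile(r"^(\S+)\s+(\d{2})(.+)$")


def parse_line(line):
    """Parse one source line under the assumptions stated above."""
    match = LINE.match(line.strip())
    if match is None:
        raise ValueError(f"unrecognised line: {line!r}")
    mark, level, caption = match.groups()
    caption = caption.strip()
    is_label = caption.startswith(")") and caption.endswith("(")
    if is_label:
        terms = [caption[1:-1].strip()]
    else:
        terms = [term.strip() for term in caption.split(",")]
    return Record(None if mark == "@" else mark, int(level), is_label, terms)


SAMPLE = [
    "CLG 06Aluminium, aluminum",
    "CLGM 07)Compounds with oxygen(",
    "@ 08)Salts(",
    "CLGMIFN 09Aluminates",
    "CLGMJHN 08Aluminium oxide, alumina",
]
records = [parse_line(line) for line in SAMPLE]
```

With level, label status and synonyms separated out, the three properties the previous slide mentions (hierarchical position, position in sequence, and ‘class’ status) are available to whatever generates the thesaurus entries.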
BT/NTs inferred from the source file:
RTs derived from source file (but not inferred by software):
Equivalence relationships inferred from source file:
Inter-facet (syntagmatic) relationships in a complex schedule:
HUQ W  Thymus gland
HUQ WH  (Pathology)
(Hyperplasia)
HUQ WMD V  Lymphatism, status lymphaticus
(Neoplasms)
HUQ WME  Thymomas
(Products)
HUQ X  Thymus hormones
HUQ XS  Thymopoietins
Thymus gland
  RT Lymphatism
  RT Thymomas
  RT Thymus hormones
Syntagmatic relationships:
• in theory the relationship between a class and a sub-class created by combination with another facet = RT or associative term
• these can theoretically be identified by the presence of ‘non-classes’
• in the BC2 terminologies these relationships are more precise than in current thesaurus practice
• e.g. entity-process, entity-product, agent-operation, etc.
• currently these cannot be inferred
FATKS: Facet analytical theory in knowledge structures:
• a project carried out at SLAIS
• an attempt to design a classification that could be managed automatically
• a database was built to hold the classification data
• this included the hierarchical position of each class, and its containing category
• this would enable us to create compound classmarks from extracted terms or keywords
FATKS macrostructure:
What FATKS can do:
• represent hierarchical position
• represent categorical status
• support search and navigation of the vocabulary
• allow automatic synthesis through inbuilt syntax
• has potential to do more than this
Conclusion:
• faceted terminologies built on the classical model address:
  • functional status of concepts
  • paradigmatic and syntagmatic relationships
  • ordering
  • rules for combination
• all the structural elements of a faceted thesaurus are implicit in a faceted classification
• many of the elements and relationships can be inferred automatically
• others have the potential to be recognized but need further identification in the source data
• automatic classification can be supported, and the automatic generation of populated structures
• currently it is not possible to represent great complexity of structure even though this is regular and predictable