ISKOD conference, Konstanz, 20-22 February 2008. Freely faceted classification for a Web-based bibliographic archive The BioAcoustic Reference Database Claudio Gnoli, Gabriele Merli, Gianni Pavan, Elisabetta Bernuzzi, Marco Priano (University of Pavia. Dep’t Mathematics & CIBRA).
The León Manifesto • interdisciplinarity • requires some new KOS • based on phenomena • allowing shifts between perspectives • by analytico-synthetic techniques
The heresy Disciplines ! or phenomena... sources: Hajdu Barat, Gnoli
Freely faceted classification: developed within the NATO-funded CRG research for a new general scheme, mainly by Douglas Foskett and Derek Austin; it then partially evolved into the PRECIS verbal system. Source: Vickery
Existing freely faceted verbal indexing systems • relational indexing [Farradane, 1950s] • Syntol [Gardin, 1960s] • PRECIS [Austin, 1970s] • POPSI [Bhattacharyya, 1980s]
Freely faceted classification • Any concept has a constant notation, and • can be combined with any other • by expressing the kind of relationship. • Concepts are not bound to disciplinary classes, but organized in classes of phenomena. [Austin 1969, Prospects for a new general classification, J. librarianship, 1, n. 3, p. 149-169]
How phenomena can be ordered (by increasing organization)
...
e atoms
f molecules
l cells
m organisms
n populations
s communities
t institutions
v cultures
...
FFC: constant notation
mqvtn2a whales in Atlantic ocean
t8mqvtn institutions dealing with whales
wa4mqvtn food consisting of whales
wni60mqvtn vessels damaged by whales
xg8mqvtn painting of whales
FFC searching: FFC suits computer applications, as each concept can be retrieved separately by searching for the corresponding notation, e.g. “whales”: mqvtn
FFC browsing: results can be sorted systematically
mqvtn whales
mqvtn25e whales in estuaries
u8mqvtn whale economy
t8mqvtn institutions dealing with whales
xg8mqvtn paintings of whales
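The retrieval idea on this slide can be sketched in a few lines of Python. This is only an illustration, not the database's actual code: the classmarks and captions are the ones shown on the constant-notation slide, while the function names and the rule that digits act purely as facet indicators separating letter-coded concepts are assumptions made for the example (brackets and the deictic V are not handled).

```python
import re

# Classmarks and captions copied from the constant-notation slide.
RECORDS = {
    "mqvtn2a": "whales in Atlantic ocean",
    "t8mqvtn": "institutions dealing with whales",
    "wa4mqvtn": "food consisting of whales",
    "wni60mqvtn": "vessels damaged by whales",
    "xg8mqvtn": "painting of whales",
}

def concepts(classmark):
    """Split a simple classmark into its letter runs.

    Assumption: digits only separate concepts (facet indicators)."""
    return re.findall(r"[a-z]+", classmark)

def search(notation, records=RECORDS):
    """Return the classmarks that contain `notation` as a whole concept."""
    return [cm for cm in records if notation in concepts(cm)]

if __name__ == "__main__":
    for cm in search("mqvtn"):
        print(cm, RECORDS[cm])
```

Matching whole letter runs rather than raw substrings avoids false hits such as finding "tn" inside "mqvtn".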
FFC-like use of existing KOSs Traditional classifications (DDC, UDC) can be used in this way for retrieval purposes, by assigning multiple classes to a document Example: NEBIS opac [Pika]
FFC: free combinations wni60mq vessels damaged by animals mq60wni animals damaged by vessels
FFC: citation order. Facets of the same relevance are cited in a standard citation order (as in classic FC), but focus facets can be promoted to the leading positions (as in Nuovo Soggettario). wa4mq29q: food consisting of animals in Japan; wa29q4mq: Japanese food, consisting of animals
FFC problems • More freedom requires more skills... • Users want simple notation (a virtue of DDC and BC2) • Austin concluded that FFC was good for IR, while mark-and-park systems were good for shelving: two separate systems?!
Possible solutions • Indexers can be helped by semi-automatic classification, and • assisted by visual interfaces
Possible solutions • Notation can be shortened by extra-defined foci: nyc oceanic environment; 25[ny] in environment; 25c in oceanic environment
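A minimal Python sketch of how extra-defined foci shorten notation, assuming (from the slide's single example) that facet 25 takes its foci from class ny, so that focus c stands for nyc. The table and function names are hypothetical and only this one facet is covered.

```python
# Facet indicator -> class that defines its foci (only the slide's example).
EXTRA_DEFINED = {"25": "ny"}

def expand_focus(facet, focus):
    """Restore the full class notation behind a shortened focus."""
    base = EXTRA_DEFINED.get(facet)
    return base + focus if base is not None else focus

def shorten(facet, notation):
    """Drop the defining class prefix when the facet is extra-defined."""
    base = EXTRA_DEFINED.get(facet)
    if base is not None and notation.startswith(base):
        return notation[len(base):]
    return notation

if __name__ == "__main__":
    print(expand_focus("25", "c"))   # full notation for "oceanic environment"
    print(shorten("25", "nyc"))      # shortened focus used after facet 25
```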
Possible solutions • only using letters, digits, and brackets abcd9e(5fg)8h main class facets subfacets
A property of FFC • Items with more facets are more retrievable (by one facet or another) • Paradoxically, specialized documents tend to be retrieved more often • A balanced cataloguing policy is needed
Research needs: the database is fed with papers actually used by the CIBRA staff in bioacoustic research, in both field recording and signal processing
Indexing interface The indexer can edit the classmark and dynamically see the caption she is producing
Suggested classes: she can be helped by automatic suggestions generated by matching the title with the DB thesaurus. [Screenshot labels: article title, edited notation, automatic caption, suggested classes]
Suggested classes: for each title word, classes are suggested which match caption, or synonyms, or description, or discipline. To improve precision • stopwords ignored: words < 4 letters, “with”, ... To improve recall • -s truncated
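The matching rules on this slide could look roughly like the Python sketch below. It is not the database's own code. The thesaurus entries are hypothetical (notations and captions are borrowed from other slides, the synonyms are invented for illustration), only captions and synonyms are matched, and the stopword list contains just the one word the slide names.

```python
import re

# Hypothetical thesaurus excerpt; synonyms are illustrative assumptions.
THESAURUS = {
    "mqvtn": {"caption": "whales", "synonyms": ["cetaceans"]},
    "wni": {"caption": "vessels", "synonyms": ["ships"]},
    "mq": {"caption": "animals", "synonyms": []},
}
STOPWORDS = {"with"}  # the slide's list continues; only "with" is given

def normalise(word):
    """Lowercase and truncate a final -s (the slide's recall rule)."""
    word = word.lower()
    return word[:-1] if word.endswith("s") else word

def title_terms(title):
    """Keep words of 4+ letters that are not stopwords (precision rule)."""
    words = re.findall(r"[A-Za-z]+", title)
    return [normalise(w) for w in words
            if len(w) >= 4 and w.lower() not in STOPWORDS]

def suggest(title, thesaurus=THESAURUS):
    """Return notations whose caption or synonyms match a title word."""
    terms = set(title_terms(title))
    hits = []
    for notation, entry in thesaurus.items():
        labels = [entry["caption"], *entry["synonyms"]]
        if any(normalise(label) in terms for label in labels):
            hits.append(notation)
    return hits

if __name__ == "__main__":
    print(suggest("Collisions of whales with ships"))
```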
Verbal captions They are synthesized from notation by a PHP script
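The slides say a PHP script does this; the sketch below is a Python illustration of the general idea, not that script. It assumes a classmark alternates letter runs (classes) and digit runs (facet indicators), and that each has a fixed verbal equivalent. The two lookup tables are inferred from captions shown on earlier slides and are far from complete; brackets and the deictic V are not handled.

```python
import re

# Verbal equivalents inferred from the slides' examples (assumptions).
CLASSES = {"mq": "animals", "mqvtn": "whales", "wa": "food", "wni": "vessels"}
FACETS = {"4": "consisting of", "60": "damaged by"}

def caption(classmark):
    """Build a caption by translating each class and facet in order."""
    tokens = re.findall(r"[a-z]+|[0-9]+", classmark)
    words = []
    for tok in tokens:
        table = FACETS if tok.isdigit() else CLASSES
        # Unknown tokens are kept visibly in brackets rather than guessed.
        words.append(table.get(tok, f"[{tok}]"))
    return " ".join(words)

if __name__ == "__main__":
    for cm in ("wni60mq", "mq60wni", "wa4mqvtn"):
        print(cm, "->", caption(cm))
```

Because the translation follows the order of the notation, swapping the two concepts around a facet indicator (wni60mq versus mq60wni) reverses the meaning of the caption, as in the free-combinations slide.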
Indexing interface Interface usability is still to be improved e.g. click-and-select, drag-and-drop, automatic default citation order: gxxx kyy bzzzkyy gxx bzz
Classification by methods: the León Manifesto advocates for classification by phenomena, theories, and methods. Example: birds, according to Darwinism, studied by observation: mqvo04d03b
Classification by methods: in bioacoustics, methods are relevant more often than theories, while in human sciences the opposite seems to be true [Szostak & Gnoli 2008, proc ISKO11 Montréal]
Complexity issues: “Guidelines on the applications of the environment protection and biodiversity conservation act to interactions between offshore operations and larger cetaceans” tn8ve(4qvtn(902o68v(3)25c))4d Much facet nesting becomes problematic even for the PHP script...
Taming complexity: tn8V4d ve4V mqvtn902o68v(3)25c Deictic V refers to the whole subsequent phase, thus avoiding most brackets
Possible solutions: the system can be used at various degrees of complexity, from purely free to fully faceted, according to the needs. Websites: free classification. Specialized literature: freely faceted classification.
User search One or more facets can be selected... (also in combination with author/title/date)
The zero match problem [Tudhope & Binding 2008, Faceted thesauri, Axiomathes, 18, special issue on facet analysis, in prep.] birds x threat x Europe = 0 Possible solution: enable “fuzzy” search by • ignoring one facet at a time (in which order?...) • going one step up in the hierarchy
Search refinement: users should be allowed to refine their search by navigating through facets and hierarchy according to the number of results (average futility point: 30). This has been partially done already in a related archive...
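One way to picture the first relaxation strategy is the Python sketch below. It is a sketch under stated assumptions, not the database's search code: the three records are invented, facets are written as plain words because the slides give no notation for "threat" or "Europe", and the order in which facets are dropped (last one first) is an arbitrary choice, since the slide leaves that question open. Going one step up in the hierarchy is not shown.

```python
# Hypothetical records, each indexed by a set of concepts.
RECORDS = {
    "r1": {"birds", "Europe"},
    "r2": {"birds", "threat", "Asia"},
    "r3": {"whales", "threat", "Europe"},
}

def exact(query, records):
    """Records that carry every concept in the query."""
    return [rid for rid, found in records.items() if set(query) <= found]

def relaxed_search(query, records=RECORDS):
    """Try the full query, then ignore one facet at a time.

    Returns the query actually used and its hits."""
    hits = exact(query, records)
    if hits:
        return query, hits
    # Assumed order: drop the last facet first, then work backwards.
    for i in reversed(range(len(query))):
        reduced = query[:i] + query[i + 1:]
        if reduced:
            hits = exact(reduced, records)
            if hits:
                return reduced, hits
    return query, []

if __name__ == "__main__":
    print(relaxed_search(["birds", "threat", "Europe"]))
```

Returning the reduced query alongside the hits lets an interface tell the user which facet was ignored.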
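As a rough illustration of refinement driven by result counts, the Python sketch below offers narrowing concepts only when a result set exceeds the futility point of 30 mentioned on the slide. The records are generated test data and the function name is hypothetical; how the related archive actually implements this is not described in the slides.

```python
from collections import Counter

FUTILITY_POINT = 30  # average value quoted on the slide

def refinements(query, hits, records):
    """Suggest (concept, count) pairs that would narrow a long result list.

    Nothing is suggested when the list is already short enough."""
    if len(hits) <= FUTILITY_POINT:
        return []
    counts = Counter(
        concept
        for rid in hits
        for concept in records[rid]
        if concept not in query
    )
    # Concepts shared by every hit would not narrow anything.
    return [(c, n) for c, n in counts.most_common() if n < len(hits)]

if __name__ == "__main__":
    # Hypothetical data: 40 records about whales in two environments.
    records = {
        f"r{i}": {"whales", "Atlantic" if i % 2 == 0 else "estuaries"}
        for i in range(40)
    }
    print(refinements(["whales"], list(records), records))
```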
Future developments • Complete fully faceted classmarks for all articles • Derive consistent indexing policies from practice • Fix automatic caption generation for complex cases • Improve the facet selection menu • Allow subject search by typing words • Make the indexing interface evolve into a real assistant tool
Conclusion: freely faceted classification by phenomena, theories, and methods is feasible [Szostak 2007, Proc ISKOE León]
ILC people: Claudio Gnoli, Mela Bosch, Enzo Cesanelli, Viviana Doldi, Hong Mei, Gabriele Merli, Marcella Patania, Roberto Poli, Rick Szostak, Lorena Zuccolo
CIBRA people: Gianni Pavan, Elisabetta Bernuzzi, Claudio Fossati, Amanda May Koltz, Michele Manghi, Marco Priano
Published reports:
Gnoli & Poli 2004, Levels of reality and levels of representation, Knowl org, 31, 3, 151-160
Gnoli & Merli 2005, Notazione e interfaccia di ricerca per una classificazione a livelli, AIDA informazioni, 23, 1-2, 57-72
Hong 2005, A phenomenon approach to faceted classification, 53rd conf Japan Soc LIS
Gnoli 2006, The meaning of facets in nondisciplinary classifications, proc 9th ISKO conf, Vienna, 11-18
Gnoli & Hong 2006, Freely faceted classification for Web-based information retrieval, New rev hypermedia & multimedia, 12, 1, 63-81
Gnoli, Bosch & Mazzocchi 2007, A new relationship for multidisciplinary knowledge organization systems: dependence, proc 8th ISKO Spain conf, León, 399-409
Gnoli 2007, “Classic” vs. “freely” faceted classification, ISKO UK meeting Ranganathan revisited, London
Gnoli, Pavan, Bernuzzi, Merli & Priano 2007, Freely faceted classification for the BioAcoustic Reference Database, poster 21st IBAC conf, Pavia
Szostak & Gnoli 2008, Classifying by phenomena, theories, and methods, proc 10th ISKO conf, Montréal
Website: www.iskoi.org/ilc
...thank you very much!