flexElink
Flexible linking (and formatting) management software

Winter presentation, 26 February 2002
Hector Sanchez
Universitat Jaume I, Ing. Informatica
CERN ETT-DH

Contents
- Introduction
- Project overview: definition, scenarios, architecture, technology
- Main features
- Benefits & results

Hector Sanchez, 26 February 2002 @ CERN

Introduction
- Link, in the scope of FlexElink: a reference to the fulltext version of a certain bibliographic record, or to an Internet resource related to it (not necessarily a URL)
- Stored vs. generated links: generated links considerably reduce maintenance
- Link managers: know when to create a link and build it from bibliographic data
- Link managers at CDS: SetLink, GoDirect, Dynamic Format

Project goals
- New link management tool
- Improvement of the formatting tool
- Integration of the existing LM technologies used at CDS
- Be able to adapt to new situations and needs
- Independent of the formatter
- Work over different types of input
- Cover all the formatting functions needed
- Reduce maintenance
- Avoid 'hardcoded' maintenance
- Make it easy to use for CDS clients

Scenario 1: Brief formats
- Input: bunch of records in OAI MARC XML
- Output: original XML record with its HTML version
- [Diagram labels: ALEPH, Bibliographic DB, 'CERN MARC', OAI MARC XML, flexElink, OAI MARC XML*, SQL, MySQL, Consultation DB]

Input record (OAI MARC XML):

    <oai_marc>
      <varfield id="041" i1="" i2="">
        <subfield label="a">und</subfield>
      </varfield>
      ...
    </oai_marc>

Output record (OAI MARC XML*), carrying the HTML version in field FMT:

    <oai_marc>
      <varfield id="041" i1="" i2="">
        <subfield label="a">und</subfield>
      </varfield>
      ...
      <varfield id="FMT" i1="" i2="">
        <subfield label="f">h</subfield>
        <subfield label="g">HTML</subfield>
      </varfield>
    </oai_marc>

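As a rough illustration of Scenario 1, the sketch below appends an FMT field holding an HTML string to an OAI MARC XML record. It is written in Python for brevity and is not the flexElink code; the function name `add_html_version` and the sample HTML are assumptions made for this example. Only the FMT layout (subfields "f" and "g") is taken from the slide.

```python
import xml.etree.ElementTree as ET

def add_html_version(record_xml, html):
    """Return the record with an extra FMT varfield holding the HTML version.

    Hypothetical helper: the FMT layout (subfield f = "h", subfield g = HTML)
    is copied from the slide, everything else is an assumption.
    """
    root = ET.fromstring(record_xml)
    fmt = ET.SubElement(root, "varfield", {"id": "FMT", "i1": "", "i2": ""})
    ET.SubElement(fmt, "subfield", {"label": "f"}).text = "h"
    # ElementTree escapes the markup when serialising, so the HTML is stored as text.
    ET.SubElement(fmt, "subfield", {"label": "g"}).text = html
    return ET.tostring(root, encoding="unicode")

record = (
    '<oai_marc><varfield id="041" i1="" i2="">'
    '<subfield label="a">und</subfield></varfield></oai_marc>'
)
formatted = add_html_version(record, "<b>Some title</b>")
```

The original fields are left untouched, which matches the "original XML record with its HTML version" output described on the slide.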
Scenario 2: Detailed formats
- Input: record in OAI MARC XML
- Output: HTML version to be displayed, or PHP to be saved to a file
- [Diagram labels: CDS search, OAI MARC XML, flexElink, HTML page, PHP file, MySQL, Consultation DB, setlink output, pre-generated references inclusion, links to fulltext & references]

Architecture overview
- Components: Record Separator, Variable Extractor, Behavior Processor, Link Manager, Web configuration interface
- Repositories: Extraction rules, Behavior repository, Link repository
- Data flow labels: input records, individual record, internal variables, solve links, text output
- Actors: admins

Technology
- OO analysis and design
- Implementation tools: 100% open source & freeware
- Component based: delegation & collaboration lead to more decoupled and reusable software
- Almost any part of the system can be substituted, modified or extended without affecting the rest

Main features: Internal variables
- Maps the values in the input OAI MARC XML records into internal variables
- This mapping can be configured using the Extraction Rules
  - They tell the extraction module which values to extract from the input and which variables to map them to
- Makes the rest of the configuration independent of the input
- Developed for OAI MARC XML, but it can be adapted to other input types (DB) by specialising the extraction module

Main features: Internal Variables
OAI MARC XML extraction rules example

    author: <varfield id="100" i1="" i2="">
      fields:
        name:   <subfield label="a">
        editor: <subfield label="e">

Sample record:

    <oai_marc>
      <varfield id="037" i1="" i2="">
        <subfield label="a">SCAN-0009119</subfield>
      </varfield>
      <varfield id="100" i1="" i2="">
        <subfield label="a">Racah, Giulio</subfield>
      </varfield>
      <varfield id="100" i1="" i2="">
        <subfield label="a">Guignard, G</subfield>
        <subfield label="e">editor</subfield>
      </varfield>
      <varfield id="909" i1="C" i2="0">
        <subfield label="b">11</subfield>
      </varfield>
    </oai_marc>

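The extraction step can be sketched as follows, using the sample record from the slide. This is a minimal Python sketch under stated assumptions, not the actual extraction module: the rule format (a dictionary from variable name to varfield id, indicators and subfield label) and the function name `extract_variables` are hypothetical.

```python
import xml.etree.ElementTree as ET

RECORD = """<oai_marc>
  <varfield id="037" i1="" i2="">
    <subfield label="a">SCAN-0009119</subfield>
  </varfield>
  <varfield id="100" i1="" i2="">
    <subfield label="a">Racah, Giulio</subfield>
  </varfield>
  <varfield id="100" i1="" i2="">
    <subfield label="a">Guignard, G</subfield>
    <subfield label="e">editor</subfield>
  </varfield>
  <varfield id="909" i1="C" i2="0">
    <subfield label="b">11</subfield>
  </varfield>
</oai_marc>"""

# Hypothetical rule format: variable name -> (varfield id, i1, i2, subfield label).
EXTRACTION_RULES = {
    "037.a": ("037", "", "", "a"),
    "100.a": ("100", "", "", "a"),
    "100.e": ("100", "", "", "e"),
    "909C0.b": ("909", "C", "0", "b"),
}

def extract_variables(record_xml, rules):
    """Map subfield values of one record into lists of internal variable values."""
    root = ET.fromstring(record_xml)
    variables = {name: [] for name in rules}
    for varfield in root.iter("varfield"):
        key = (varfield.get("id"), varfield.get("i1"), varfield.get("i2"))
        for name, (tag, i1, i2, label) in rules.items():
            if key != (tag, i1, i2):
                continue
            for subfield in varfield.iter("subfield"):
                if subfield.get("label") == label:
                    variables[name].append(subfield.text or "")
    return variables

variables = extract_variables(RECORD, EXTRACTION_RULES)
```

Every variable holds a list because a field such as 100 can repeat within a record. Only this step reads the XML, so the rest of the configuration works on variable names alone.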
Main features: Behaviours
- Behaviour: describes how the input has to be processed in order to achieve the desired output
  - Support for multiple behaviours
- Condition: expression whose associated actions are applied only if it is TRUE for the current input record data
- Action: set of statements that describes how the output has to be built (e.g. formats) if the corresponding condition is met
- Conditions and actions are expressed using the Evaluation Language
- [Diagram: a Behaviour holds Condition 1 with its Actions and Condition 2 with its Actions]

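One way to picture a behaviour is as an ordered list of condition/action pairs evaluated against the internal variables of a record. The Python sketch below assumes that only the first matching condition is applied, which the slides do not state explicitly. The class names `Rule` and `Behaviour`, the helper `first` and the `BRIEF_DEMO` behaviour are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Variables = Dict[str, List[str]]  # internal variable name -> extracted values

@dataclass
class Rule:
    condition: Callable[[Variables], bool]  # when to apply the action
    action: Callable[[Variables], str]      # how to build the output text

@dataclass
class Behaviour:
    name: str
    rules: List[Rule]

    def process(self, variables: Variables) -> str:
        # Assumption: rules are tried in order and only the first match is applied.
        for rule in self.rules:
            if rule.condition(variables):
                return rule.action(variables)
        return ""

def first(variables: Variables, name: str) -> str:
    """First value of a variable, or an empty string when it is missing."""
    values = variables.get(name, [])
    return values[0] if values else ""

brief = Behaviour("BRIEF_DEMO", [
    Rule(lambda v: bool(v.get("100.a")),
         lambda v: first(v, "245.a") + " / " + first(v, "100.a")),
    Rule(lambda v: True,  # catch-all, always applies
         lambda v: first(v, "245.a")),
])
```

With this sketch, `brief.process({"245.a": ["Title"], "100.a": ["Racah, Giulio"]})` takes the first rule, while a record without authors falls through to the catch-all.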
Main features: Evaluation Language
- Specially designed for FlexElink
- Extensible via User Defined Functions (UDFs)
- Simple Knowledge Base management
- Allows interaction with the Link Manager
- Re-usability of expressions through Formats
- Enables access to internal variables
- Context-free grammar
- Operations defined in PHP

Main features: Behaviours
Simple behaviour example

Internal Variables:

    100.a     author name
    245.a     title
    0248.a    standard ref
    909C0.b   base #

Behaviour: SIMPLE

    $909C0.b="27"
        "<b>" $245.a "</b>"
        forall($0248.a){
            rep_prefix(" – ")
            $0248.a
            separator("; ")
        }
    ""=""
        "<b>" $245.a "</b>"
        forall($100.a){
            rep_prefix("– Authors: ")
            $100.a
            separator("; ")
        }

[Callout on the forall blocks: UDFs]

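A possible reading of the SIMPLE behaviour, written out in Python. It assumes that `rep_prefix` is emitted once before the repeated values (and not at all when there are none) and that `separator` goes between consecutive values. These semantics are inferred from the names, not confirmed by the slides, and the sample values are invented.

```python
def forall(values, rep_prefix="", separator=""):
    """Join repeated values; emit nothing at all when there are no values.

    Assumed semantics: the prefix appears once, the separator between values.
    """
    if not values:
        return ""
    return rep_prefix + separator.join(values)

def first(variables, name):
    """First value of a variable, or an empty string when it is missing."""
    values = variables.get(name, [])
    return values[0] if values else ""

def simple(variables):
    """Python transcription of the SIMPLE behaviour shown on the slide."""
    title = "<b>" + first(variables, "245.a") + "</b>"
    if first(variables, "909C0.b") == "27":
        # Base 27: title followed by the standard references.
        return title + forall(variables.get("0248.a", []), " – ", "; ")
    # The catch-all condition ""="" is always true: title followed by the authors.
    return title + forall(variables.get("100.a", []), "– Authors: ", "; ")

# Invented sample data, for illustration only.
standards_record = {"245.a": ["A title"], "0248.a": ["STD-1", "STD-2"], "909C0.b": ["27"]}
authors_record = {"245.a": ["A title"], "100.a": ["Racah, Giulio", "Guignard, G"], "909C0.b": ["11"]}
```

Under these assumptions, `simple(authors_record)` yields the bold title followed by the author list, and a record with no authors yields the title alone, without a dangling prefix.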
Main features: Link Manager
- Generates links from stored rules
  - These rules are also expressed using the Evaluation Language
- Supports different types of link solving
  - External linking: just generates the link from the rules
  - Internal linking: the link is always a file, so it checks existence, access, formats, etc.
- Independent of the formatter
  - It has no access to Internal Variables; it receives data as parameters
- Can be extended: the LM is just a framework to which new linking logic can be added

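The framework idea can be sketched as a small registry of link solvers, one per link type. The Python below is an illustrative assumption, not the flexElink design: the class `LinkManager`, the two solver factories, the directory layout `base/categ/id.format` and the `example.org` URL template are all hypothetical.

```python
import os
import tempfile

class LinkManager:
    """Tiny registry of link solvers; new linking logic is added by registering one."""

    def __init__(self):
        self._solvers = {}

    def register(self, link_type, solver):
        self._solvers[link_type] = solver

    def solve(self, link_type, **params):
        # The manager never sees the record: it only receives explicit parameters.
        return self._solvers[link_type](**params)

def external_solver(template):
    """External linking: just build the link from the rule (here, a URL template)."""
    def solve(**params):
        return [template.format(**params)]
    return solve

def internal_solver(root_dir, formats=("pdf", "ps")):
    """Internal linking: the link is a file, so check existence and read access."""
    def solve(base, categ, rec_id):
        links = []
        for fmt in formats:
            path = os.path.join(root_dir, base, categ, rec_id + "." + fmt)
            if os.path.isfile(path) and os.access(path, os.R_OK):
                links.append(path)
        return links
    return solve

manager = LinkManager()
manager.register("JOURNAL", external_solver("https://example.org/{journal}/{volume}/{page}"))
external_links = manager.solve("JOURNAL", journal="jhep", volume="02", page="001")

with tempfile.TemporaryDirectory() as root:
    os.makedirs(os.path.join(root, "11", "scan"))
    open(os.path.join(root, "11", "scan", "SCAN-0009119.pdf"), "w").close()
    manager.register("FULLTEXT", internal_solver(root))
    internal_links = manager.solve("FULLTEXT", base="11", categ="scan", rec_id="SCAN-0009119")
```

Passing data only as parameters keeps the manager independent of the formatter, and a new kind of link solving only needs another `register` call.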
Main features: Link Manager
Example: simple link definition and access from EL
- Generation of records with already solved fulltext links
- [Figure: FULLTEXT link definition]

Link manager call:

    "<b>" $245.a "</b><br>"
    link("FULLTEXT", $base, $categ, $id) {
        "<b>Fulltext access:</b>"
        forall($link){
            "<a href=\"" $link "\">[" $link.format_id "]</a>"
        }
    }
    else{
        "No link found"
    }

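The output produced by the link call above can be approximated in Python as follows. This sketch assumes the solved links arrive as (URL, format id) pairs. The function name `render_fulltext` is hypothetical, and the HTML escaping is an addition that does not appear on the slide.

```python
from html import escape

def render_fulltext(title, links):
    """Build the title plus either one anchor per solved link or a fallback text.

    `links` is assumed to be a list of (url, format_id) pairs.
    """
    parts = ["<b>" + escape(title) + "</b><br>"]
    if links:
        parts.append("<b>Fulltext access:</b>")
        for url, format_id in links:
            parts.append(
                '<a href="' + escape(url, quote=True) + '">[' + escape(format_id) + "]</a>"
            )
    else:
        # Mirrors the else branch of the link(...) call.
        parts.append("No link found")
    return "".join(parts)

with_link = render_fulltext("A title", [("http://example.org/a.pdf", "PDF")])
without_link = render_fulltext("A title", [])
```

The two branches correspond to the `link(...) { ... } else { ... }` construct: the first body runs when at least one link was solved, the second otherwise.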
Benefits
- More modular and specialised CDS Search
- The OO approach eases maintenance and allows future extensibility
- Only one way of configuring formats and links
- All the configuration is kept in a DB, separate from the logic
- Possible to generate different configuration views
- The Search Engine doesn't know anything about linking or formatting
- [Diagram labels: users, query, results, Search Engine, flexElink, format/link config, formats, links]

Results
- It's already being successfully used for:
  - Pre-generated CDS Search BRIEF formats
  - On-the-fly creation of CDS Search DETAILED formats
  - HTML pages of the fulltext extracted references
- Speed optimisation (test over 15,000 records):
  - BRIEF format creation (average): 0.05 sec/record
  - DETAILED format creation (average): 0.15 sec/record
- Testing for future replacement of GoDirect and SetLink:
  - GoDirect: 'automatically' migrated 91% of journals
  - SetLink: ready for defining new fulltext rules

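To put the per-record averages in perspective, the short calculation below scales them to the 15,000-record test set. It assumes the averages hold uniformly across the whole set, which the slides do not claim. Under that assumption a full BRIEF run takes about 12.5 minutes and a full DETAILED run about 37.5 minutes.

```python
RECORDS = 15_000                 # size of the test set quoted on the slide
BRIEF_SEC_PER_RECORD = 0.05      # average BRIEF format creation time
DETAILED_SEC_PER_RECORD = 0.15   # average DETAILED format creation time

# Assumption: the averages apply uniformly to every record in the set.
brief_total_min = RECORDS * BRIEF_SEC_PER_RECORD / 60
detailed_total_min = RECORDS * DETAILED_SEC_PER_RECORD / 60

print(f"BRIEF: {brief_total_min:.1f} min, DETAILED: {detailed_total_min:.1f} min")
```

Since DETAILED formats are created on the fly, the per-record figure of 0.15 s is the number a user would actually experience. The batch total mainly matters for the pre-generated BRIEF formats.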