
ESCRIRE: Embedded Structured Content Representation In Repositories




  1. ESCRIRE: Embedded Structured Content Representation In Repositories
     Jérôme Euzenat, INRIA Rhône-Alpes, Jerome.Euzenat@inrialpes.fr

  2. ESCRIRE: Motivations
     Embedding a simplified but formal representation of content in documents:
     • search on structured criteria;
     • document comparison (genericity, similarity…);
     • automatic classification and organisation.

  3. Knowledge based queries
     • (and book (about "Agatha Christie")) vs. book AND "Agatha Christie"
     • (and flat (location "Alps")) …including those in Val d’Isère!
     • (and bookshop (location "London")) …bookstore included.
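The contrast on this slide can be sketched in a few lines of Python. This is only an illustration, not the ESCRIRE implementation: the class hierarchy, the place containment table, the two sample documents and the function names are all hypothetical.

```python
# Hypothetical toy knowledge: class hierarchy (child -> parent) and
# place containment (place -> enclosing region).
SUBCLASS_OF = {"bookstore": "bookshop"}
PART_OF = {"Val d'Isère": "Alps"}

# Hypothetical documents: raw text plus a structured description.
DOCS = {
    "ad-1": {"text": "Flat to rent in Val d'Isère",
             "type": "flat", "location": "Val d'Isère"},
    "ad-2": {"text": "Bookstore in London",
             "type": "bookstore", "location": "London"},
}


def ancestors(item, relation):
    """Return item followed by everything reached by walking `relation` upward."""
    chain = [item]
    while item in relation:
        item = relation[item]
        chain.append(item)
    return chain


def full_text(words):
    """Level 0: keep documents whose text contains every word."""
    return sorted(name for name, doc in DOCS.items()
                  if all(w.lower() in doc["text"].lower() for w in words))


def knowledge_query(cls, location):
    """Level 2: match on the description, using the hierarchy and containment."""
    return sorted(name for name, doc in DOCS.items()
                  if cls in ancestors(doc["type"], SUBCLASS_OF)
                  and location in ancestors(doc["location"], PART_OF))


print(full_text(["flat", "Alps"]))            # keyword search misses the flat
print(knowledge_query("flat", "Alps"))        # structured query finds it
print(knowledge_query("bookshop", "London"))  # bookstore counts as a bookshop
```

The keyword search fails because neither "Alps" nor "bookshop" appears in the text, while the structured query succeeds because it reasons over the description rather than the wording.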

  4. Query languages
     • level 3 Semiotic
     • level 2 Semantic (F-logic, Escrire…)
     • level 1 Structural (SQL, XQL)
     • level 0 Full-text search

  5. ESCRIRE: Goals
     • Comparison of several knowledge representation techniques in order to find the type of situation to which each is best suited (indexing, classifying, filtering…).

  6. ESCRIRE: Consortium
     “Coordinated research action (ARC)” involving:
     • Acacia (Sophia-Antipolis): conceptual graphs
     • Sherpa/Exmo (Rhône-Alpes): object-based representations
     • Orpailleur (Lorraine): terminological logics
     • Usinor: application

  7. ESCRIRE: Acquisition
     [Diagram; labels as extracted: Tr-schema, “Ontology”, Global analysis, XML document, Integration, Description, Tr-object, Individual analysis, Document]

  8. ESCRIRE: Queries
     [Diagram; labels as extracted: Tr-schema, “Ontology”, XML document, Tr-query, Troeps, Query helper, XML document]

  9. ESCRIRE: Problem statement
     Given:
     • a set of (HTML) documents annotated with a description of their content in a pivotal language;
     • an ontology of the domain;
     • a set of queries about the subject.
     Retrieve: the adequate documents.

  10. ESCRIRE: Software variation
      • Knowledge representation + query evaluation
      • Translated from a pivotal language into: conceptual graphs, object-based representation, description logic
      • Translated by hand into CG, OKR, DL

  11. ESCRIRE: Quantitative criteria
      • Precision: rate of correct answers (share of returned documents that are relevant)
      • Recall: rate of complete answers (share of relevant documents that are returned)
      • Accuracy = (precision + recall) / 2
      • Time performance
      • Coverage of the query language
      • Ordering of answers
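A minimal sketch of the first three criteria, assuming the usual information-retrieval reading of precision and recall and the accuracy formula given on the slide. The function name and the sample document identifiers are hypothetical.

```python
def evaluate(returned, relevant):
    """Return (precision, recall, accuracy) for one query.

    precision = correct answers / returned answers
    recall    = correct answers / relevant documents
    accuracy  = (precision + recall) / 2, as defined on the slide
    """
    returned, relevant = set(returned), set(relevant)
    correct = returned & relevant
    # An empty denominator is scored 0.0 here; that convention is an assumption.
    precision = len(correct) / len(returned) if returned else 0.0
    recall = len(correct) / len(relevant) if relevant else 0.0
    return precision, recall, (precision + recall) / 2


# Hypothetical query result: 4 documents returned, 3 documents actually relevant.
p, r, a = evaluate(["d1", "d2", "d3", "d4"], ["d1", "d2", "d5"])
print(f"precision={p:.3f} recall={r:.3f} accuracy={a:.3f}")
```

Two of the four returned documents are relevant, and two of the three relevant documents are returned, so precision is 2/4 and recall is 2/3.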

  12. ESCRIRE: Qualitative criteria
      Given by external users (query designers):
      • Naturalness of queries
      • Adequacy of answers
      • Overall appreciation (aggregation)

  13. ESCRIRE: Scaling
      Multiplying the size by orders of magnitude:
      • Corpus
      • Ontology
      • Queries

  14. ESCRIRE: Reference comparisons
      • Dublin Core metadata
      • Full-text search

  15. ESCRIRE: Ontology elements (1)
      <esc:ontology>
        <esc:defclass name="gene">
          <esc:classref name="adn-part"/>
          <esc:defattribute name="length">
            <esc:typeref name="integer"/>
          </esc:defattribute>
          <esc:defattribute name="protein">
            <esc:classref name="protein"/>
          </esc:defattribute>
        </esc:defclass>
        …

  16. ESCRIRE: Ontology elements (2)
        <esc:descrelation name="interaction">
          <esc:relref name="bio-process"/>
          <esc:defattribute name="effect">
            <esc:typeref name="string"/>
          </esc:defattribute>…
          <esc:defrole name="promoter">
            <esc:classref name="gene"/>
          </esc:defrole>…
        </esc:descrelation>…
      </esc:ontology>
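To show how such an ontology could be read by a program, here is a sketch using Python's standard `xml.etree.ElementTree`. Several things are assumptions: the slides do not declare the `esc:` prefix, so the namespace URI below is a placeholder; the elided parts (…) are simply left out; and reading the `classref` or `relref` directly under a definition as its parent is an interpretation of the slides, not a documented rule.

```python
import xml.etree.ElementTree as ET

# Placeholder namespace URI: the slides never declare the esc: prefix.
NS = {"esc": "http://example.org/escrire"}

ONTOLOGY = """
<esc:ontology xmlns:esc="http://example.org/escrire">
  <esc:defclass name="gene">
    <esc:classref name="adn-part"/>
    <esc:defattribute name="length"><esc:typeref name="integer"/></esc:defattribute>
    <esc:defattribute name="protein"><esc:classref name="protein"/></esc:defattribute>
  </esc:defclass>
  <esc:descrelation name="interaction">
    <esc:relref name="bio-process"/>
    <esc:defattribute name="effect"><esc:typeref name="string"/></esc:defattribute>
    <esc:defrole name="promoter"><esc:classref name="gene"/></esc:defrole>
  </esc:descrelation>
</esc:ontology>
"""


def range_of(node):
    """Name of the type or class referenced directly under `node`."""
    ref = node.find("esc:typeref", NS)
    if ref is None:
        ref = node.find("esc:classref", NS)
    return ref.get("name")


def attributes_of(node):
    """Map each declared attribute name to the name of its type or class."""
    return {a.get("name"): range_of(a) for a in node.findall("esc:defattribute", NS)}


root = ET.fromstring(ONTOLOGY)

# find() only looks at direct children, so the classref nested inside an
# attribute definition is not mistaken for the parent class.
classes = {
    c.get("name"): {"parent": c.find("esc:classref", NS).get("name"),
                    "attributes": attributes_of(c)}
    for c in root.findall("esc:defclass", NS)
}
relations = {
    r.get("name"): {"parent": r.find("esc:relref", NS).get("name"),
                    "attributes": attributes_of(r),
                    "roles": {ro.get("name"): range_of(ro)
                              for ro in r.findall("esc:defrole", NS)}}
    for r in root.findall("esc:descrelation", NS)
}

print(classes)
print(relations)
```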

  17. ESCRIRE: Content descriptions
      <esc:content ontology="biointer.xml" url=".">
        <esc:object type="gene" id="bcd"/>
        <esc:relation type="interaction">
          <esc:attribute name="effect">
            inhibition
          </esc:attribute>
          <esc:role name="promoter">
            <esc:objref id="Bcd"/>
          </esc:role>
        </esc:relation>…
      </esc:content>
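A content description like this one can be loaded and sanity-checked in a similar way. The sketch below assumes a placeholder namespace URI (the slide declares none) and drops the elided part (…). It also checks that every `objref` points at a declared object. Because XML attribute values are case-sensitive, that check reports `Bcd` as unresolved against the declared `bcd`; whether the slide intends this cannot be determined from the transcript.

```python
import xml.etree.ElementTree as ET

# Placeholder namespace URI: the slide never declares the esc: prefix.
NS = {"esc": "http://example.org/escrire"}

CONTENT = """
<esc:content xmlns:esc="http://example.org/escrire" ontology="biointer.xml" url=".">
  <esc:object type="gene" id="bcd"/>
  <esc:relation type="interaction">
    <esc:attribute name="effect">
      inhibition
    </esc:attribute>
    <esc:role name="promoter">
      <esc:objref id="Bcd"/>
    </esc:role>
  </esc:relation>
</esc:content>
"""

root = ET.fromstring(CONTENT)

# Declared objects: id -> type.
objects = {o.get("id"): o.get("type") for o in root.findall("esc:object", NS)}

# Relations with their attribute values and role fillers.
relations = [
    {"type": r.get("type"),
     "attributes": {a.get("name"): a.text.strip()
                    for a in r.findall("esc:attribute", NS)},
     "roles": {ro.get("name"): ro.find("esc:objref", NS).get("id")
               for ro in r.findall("esc:role", NS)}}
    for r in root.findall("esc:relation", NS)
]

# Role fillers that do not match any declared object id (case-sensitive).
dangling = sorted(ref for rel in relations
                  for ref in rel["roles"].values() if ref not in objects)

print(objects)
print(relations)
print(dangling)
```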

  18. ESCRIRE: Knowledge embedding
      <html>… <!-- xhtml -->
        <rdf:RDF>
          <rdf:Description about="/">
            <!-- dublin core -->
            <dc:title>…</dc:title>…
            <!-- pivot language -->
            <esc:content>… </esc:content>
            <!-- conceptual graphs -->
            <gc:graphs>…</gc:graphs>
            …
          </rdf:Description>…
        </rdf:RDF>…
      </html>

  19. ESCRIRE: Queries
      • Stated on objects, but results are documents (concerning these topics)
      • Document similarity by content similarity
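One way to picture "document similarity by content similarity" is to compare the sets of items that two content descriptions mention. The sketch below uses Jaccard overlap, which is a stand-in chosen for illustration and not necessarily the measure used in ESCRIRE. The documents, the identifiers `hb` and `kr`, and the function names are hypothetical.

```python
def jaccard(a, b):
    """Size of the intersection divided by the size of the union."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 1.0


# Hypothetical content descriptions, reduced to (type, identifier) pairs.
DESCRIPTIONS = {
    "doc-a": {("gene", "bcd"), ("gene", "hb"), ("interaction", "inhibition")},
    "doc-b": {("gene", "bcd"), ("interaction", "inhibition")},
    "doc-c": {("gene", "kr")},
}


def alike(doc_id, descriptions):
    """Rank the other documents by decreasing similarity to `doc_id`."""
    reference = descriptions[doc_id]
    scored = [(jaccard(reference, desc), name)
              for name, desc in descriptions.items() if name != doc_id]
    # Highest score first; ties are broken by document name.
    return [name for score, name in sorted(scored, key=lambda p: (-p[0], p[1]))]


print(alike("doc-a", DESCRIPTIONS))
```

Here `doc-b` shares two of the three items described in `doc-a`, while `doc-c` shares none, so `doc-b` ranks first.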

  20. ESCRIRE: Query language
      SELECT / FROM / WHERE / ORDERBY
      +
      • AND / OR / NOT / ALL / EXISTS
      • <path> <relop> <path>|<value>
      • IN <class>
      • ALIKE <document>
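The slide lists the constructs but not the concrete syntax, so the following is only a sketch of how a WHERE condition built from AND / OR / NOT and `<path> <relop> <value>` comparisons might be evaluated against one object. The nested-tuple encoding, the dotted path notation, the set of relational operators and the sample object are all assumptions made for illustration. ALL, EXISTS, IN and ALIKE are not covered.

```python
import operator

# Assumed relational operators; the slide only says <relop>.
RELOPS = {"=": operator.eq, "<": operator.lt, ">": operator.gt}


def follow(obj, path):
    """Walk a dotted path such as 'protein.name' through nested dicts."""
    for step in path.split("."):
        obj = obj[step]
    return obj


def holds(cond, obj):
    """Evaluate a condition encoded as nested tuples against one object."""
    head = cond[0]
    if head == "AND":
        return all(holds(c, obj) for c in cond[1:])
    if head == "OR":
        return any(holds(c, obj) for c in cond[1:])
    if head == "NOT":
        return not holds(cond[1], obj)
    # Otherwise the tuple is a comparison: (path, relop, value).
    path, relop, value = cond
    return RELOPS[relop](follow(obj, path), value)


# Hypothetical object described in a document.
gene = {"type": "gene", "length": 1200, "protein": {"name": "Bcd"}}

long_and_not_hb = ("AND", ("length", ">", 1000),
                   ("NOT", ("protein.name", "=", "Hb")))
short_or_hb = ("OR", ("length", "<", 100), ("protein.name", "=", "Hb"))

print(holds(long_and_not_hb, gene))
print(holds(short_or_hb, gene))
```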

  21. ESCRIRE: Corpus 1
      • Subject: genetic interaction
      • Text source: MedLine abstracts
      • Annotations: manual
      • Ontology: Knife knowledge base + other

  22. ESCRIRE: Corpus 2
      • Subject: psychological stress
      • Text source: MedLine abstracts
      • Annotations: manual
      • Ontology: UMLS/MeSH

  23. ESCRIRE: Where are we?
      • Building translators from pivot to actual formats
      • 1st part of Corpus 1 available (other data shall follow quickly)

  24. ESCRIRE: Calls
      • Other corpora
      • Natural language technology
      • Other representation systems
      Starting from September 2000.

  25. For more information…
      • http://escrire.inrialpes.fr/
      • Jerome.Euzenat@inrialpes.fr
