Metadata towards an e-research cyberinfrastructure The case of French ETDs DC 2006
Summary • Introducing ARTIST and authors of this collective work • Main actors operating on French ETDs • Their roles; Their metadata • 3 case studies • Creating metadata: thinking about reusability • Thematic survey around biodiversity: thinking vocabularies • Institutional survey: thinking ontologies • Conclusion
Authors • Jacques DUCLOY – INIST, ARTIST • Yann Nicolas – ABES • Diane Le Hénaff – INRA • Muriel FOULONNEAU – now CCSD • Luc GRIVEL – Univ. Paris 1 • Jean-Paul DUCASSE – Univ. Lyon 2 • + several minor contributions and comments
ARTIST members • Networked workshop • Appropriation • by Research Communities • of Technologies • of Scientific & Technical Information • Community of: • Researchers & Engineers • Information Science & Information Technologies • working in the research world • http://artist.inist.fr/
ARTIST topics • How to build • Digital Library • or scholarly publishing applications • dealing with e-science? • New approaches to doing research or science in a cyberinfrastructure
ARTIST e-Science experimentations • Scientific forum • Sample: cooperative linguistic discussion • Carl Lagoze paper about DL • What is a Digital Library? • Scientific journal: AMETIST • Appropriation, Mutualisation, Experimentation • Digital writing • Richer on-line version • Experience becomes “reproducible” • Paper view • Scientific focus and evaluation purpose • Digital Library experimentations (metadata -> DL) • Cooperative writing: this article
This article: metadata for e-science Not only for: • (scientific) information retrieval But also for: • + research evaluation • + federative digital libraries • + research policy oriented studies • + scientific surveys
Main entities dealing with French ETDs • Universities • And their national organization • EPST (National Research Institute) • Example: CNRS, 30,000 people • European framework • Each country has its own organization • + European actions (Delos, DRIVER…) • Francophone framework • International framework • Networked Digital Library of Theses and Dissertations (NDLTD) • ePrints application profile
Translation difficulties showing different approaches • Thèse = ETD • Thesis (English context) • Dissertation (US context) • Veille scientifique • Scientific survey (using informetric tools) • In order to discover innovations • Pilotage de la recherche • research policy oriented studies • Strong role of French administration
Actor: Cyberthèses • [Diagram labels: XML / TEI; ETD; Word + style; Jury; Cyberdocs; OpenOffice + XSLT; PDF]
Cyberthèses metadata • Thesis document: XML, TEI-Lite • The XML version must be human-readable • Metadata: • DC • ETD-ms (Electronic Thesis & Dissertation Metadata Standard) • Further related directions: • TEI header • LaTeX -> MathML
Actor: ABES • Ministry of Education Agency • [Diagram labels: Preservation; Dissemination (CINES); Star; Union catalog; University: ETD; Sudoc portal; Persistent Identifier]
ABES / Star metadata • TEF (Thèses Electroniques Françaises) • AFNOR standard (French member of ISO) • Dublin Core Qualified • With several ETD adaptations (jury…) • METS Rights • Using Schematron
Actor: CCSD • Centre pour la Communication Scientifique Directe (Centre for Direct Scientific Communication) • [Diagram labels: Researcher author; TEL; Inserm; Local Archive; …; HAL: National Archive; International Archive (arXiv, Driver…); Thematic Archive; Researcher reader]
CCSD metadata • Initially: a local schema • Strong relationship between: • Author • University, laboratory, research team • DC export / OAI-PMH
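To make the "DC export / OAI-PMH" point concrete, here is a minimal sketch of what a harvester does: build a `ListRecords` request and read the Dublin Core fields from an `oai_dc` record. The endpoint URL, the set name and the sample record are invented for illustration and do not describe the actual CCSD service; only the OAI-PMH verb, the `metadataPrefix` parameter and the two XML namespaces come from the public specifications.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Hypothetical endpoint: a real repository publishes its own base URL.
BASE_URL = "https://repository.example.org/oai"

# Namespace of the 15 simple Dublin Core elements.
DC_NS = "http://purl.org/dc/elements/1.1/"


def list_records_url(base_url, metadata_prefix="oai_dc", set_spec=None):
    """Build an OAI-PMH ListRecords request URL."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec:
        params["set"] = set_spec  # optional selective harvesting
    return base_url + "?" + urlencode(params)


# Invented sample of the <oai_dc:dc> payload found inside one harvested record.
SAMPLE = """<oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
           xmlns:dc="http://purl.org/dc/elements/1.1/">
  <dc:title>An example thesis title</dc:title>
  <dc:creator>Doe, Jane</dc:creator>
  <dc:type>Thesis</dc:type>
</oai_dc:dc>"""


def dc_fields(xml_text):
    """Return {element name: [values]} for the Dublin Core children of a record."""
    root = ET.fromstring(xml_text)
    fields = {}
    for child in root:
        # ElementTree tags look like "{namespace}localname".
        namespace, _, name = child.tag[1:].partition("}")
        if namespace == DC_NS:
            fields.setdefault(name, []).append((child.text or "").strip())
    return fields


if __name__ == "__main__":
    print(list_records_url(BASE_URL, set_spec="theses"))
    print(dc_fields(SAMPLE))
```

Note what is lost in such an export: simple Dublin Core carries the creator as a plain string, so the strong author / laboratory / team relationship of the local schema does not survive the trip unless a richer format is exposed alongside it.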
HAL: institutional repository • A French advantage related to open archives • « Protocole d’accord Universités EPST sur les dépôts par les chercheurs » (memorandum of agreement between universities and EPST on deposits by researchers) • In CNRS each researcher must produce an activity report, which is generated by HAL/CCSD • Some scientific heads can request a CCSD deposit
Actor: INIST/CNRS • Institut de l’Information Scientifique et Technique (Institute for Scientific and Technical Information) • http://www.inist.fr • Pascal & Francis bibliographic databases • 15,000,000 XML records • Scientific portals • Scientific information analysis • Vocabularies: termSciences
INIST - metadata • Bibliographic records • Exodic • Origin: CCF based MARC format • Translated into SGML in 1992 • now with a DCQ approach • Strong links between authors and affiliations • termSciences • ISO 16642 (TMF)
The landscape we would like to have • [Diagram labels: Thesis; Articles; Cyberthèses; CCSD; INIST; STAR; OAI-PMH; Local archive]
Case study 1: sharing theses and their metadata • Thinking about reusability by several actors that interoperate during the ETD life cycle • [Diagram labels: Inra; Inra unit; Univ. Lab.; ETD; École doctorale (doctoral school); University; Abes/star]
Sharing metadata • Administrative metadata are required for a quite complex workflow • Contents must be matched • A given person could have different names… • Different ways of naming units… • Different classification schemes…
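The "different names" problem can be illustrated with a minimal sketch that groups spelling variants of a personal name under one matching key. The function names `name_key` and `group_variants` are hypothetical, and the rule used (drop accents, ignore case, punctuation and word order) is an assumption chosen for brevity, not the matching method of any actor in this deck.

```python
import re
import unicodedata


def name_key(name):
    """Reduce a personal name to a key that ignores accents, case,
    punctuation and word order."""
    decomposed = unicodedata.normalize("NFKD", name)
    # Drop the combining marks left by decomposition (é -> e).
    unaccented = "".join(c for c in decomposed if not unicodedata.combining(c))
    tokens = re.findall(r"[a-z]+", unaccented.casefold())
    return " ".join(sorted(tokens))


def group_variants(names):
    """Group name strings that share the same key."""
    groups = {}
    for name in names:
        groups.setdefault(name_key(name), []).append(name)
    return groups


if __name__ == "__main__":
    variants = [
        "Jacques DUCLOY",
        "Ducloy, Jacques",
        "Diane Le Hénaff",
        "LE HENAFF Diane",
    ]
    for key, members in group_variants(variants).items():
        print(key, "<-", members)
```

A key like this is deliberately crude: it will not match initials ("J. Ducloy") and it will merge two distinct people who share a name. That is why shared authority records with identifiers are needed rather than string matching alone.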
Case study 2: BiodivERsA • BiodivERsA: European research policy network about biodiversity • x * 10 funding agencies • y * 100 research programs • z * 1000 projects • x1 * 10000 results … • distributed network of CRIS • CRIS: Current Research Information System
CRIS… DL • [Diagram labels: Thematic DL; Archive; Vocabulary adaptations; Global DL; Thematic CRIS; Thematic Archive] • Classification schemes must be matched for computation purposes (funding evaluation)
Case study 3 • Affiliations must be managed with ontologies • [Diagram labels: UHP; CNRS; INRIA; CRIN; Loria; Inria Lorraine; Inria Sophia; YT; Orpailleur; Cortex; Oméga]
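Why a flat affiliation string is not enough can be shown with a small sketch: a research team may belong to a laboratory that itself depends on several parent institutions, so counting output per institution means walking a graph. The organisation names and edges below are placeholders invented for illustration, because the actual links between the units named on the slide cannot be recovered from its text. A real ontology would also carry typed and dated relations (renamings, mergers), which this sketch leaves out.

```python
# Hypothetical affiliation graph: each unit maps to its parent organisations.
# A unit may have several parents (joint laboratories), so this is a DAG.
PARENTS = {
    "Team T1": {"Lab L"},
    "Team T2": {"Lab L", "Institute B regional centre"},
    "Lab L": {"University U", "Institute A", "Institute B"},
    "Institute B regional centre": {"Institute B"},
}


def ancestors(unit, parents=PARENTS):
    """Return every organisation reachable upward from `unit`."""
    seen = set()
    stack = [unit]
    while stack:
        current = stack.pop()
        for parent in parents.get(current, ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def count_by_organisation(records, parents=PARENTS):
    """Count each record once for its unit and once for each ancestor."""
    counts = {}
    for _title, unit in records:
        for org in {unit} | ancestors(unit, parents):
            counts[org] = counts.get(org, 0) + 1
    return counts


if __name__ == "__main__":
    theses = [("Thesis 1", "Team T1"), ("Thesis 2", "Team T2")]
    for org, n in sorted(count_by_organisation(theses).items()):
        print(org, n)
```

Using a set of ancestors means a thesis is counted once per institution even when that institution is reachable by two paths, as "Institute B" is from "Team T2".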
A technical conclusion • Metadata (DCQ) is good but not sufficient • We need • vocabulary adaptations, ontologies • Shared repositories (vocabularies, affiliations…) • Management of metadata history • etc.
The very conclusion • We need to help people work together and make compromises • We need researchers to become owners of their scientific information system… • We need librarians to appropriate technologies and to help researchers appropriate the librarian's outlook • We need computer science engineers to appropriate library and publishing issues • That is what we try to help to do with ARTIST
Thank you for listening • Thank you for your questions…