Towards an ontology to ElPub/Sci-X: a proposal
Sely M S Costa, Claudio Gottschalg-Duque
University of Brasilia, Brazil
selmar@unb.br, klauss@unb.br
ELPUB 2007, Vienna, Austria - June 2007

2006 (10 years): the motivation
Quantitative aspects
• Author productivity
• Changes in authorship
• Papers per country
• Works per year
Qualitative aspects
• Most frequently addressed themes
• Most frequently addressed environments
• Recent focus

2006 (10 years): the difficulties
Quantitative aspects
• Author names
• Institution names
• Lack of standard data
• affiliation - institution hierarchy?
• city x state x country
• sessions x tracks
Qualitative aspects
• Lack of data and of standard data (keywords, abstracts)

Ten years of ElPub: standardisation of names (authors and institutions)
However: the standardised names are not yet aggregated to the collection
Moreover: keywords need standardising (yes!), abstracts too (maybe)

• The problem: the ElPub/Sci-X database is a collection of whatever is found in the proceedings
• One of the solutions: a standard ontology language (ElPub/Sci-X Ontology)

The project aim: to create an ontology that supports exploring ElPub/Sci-X content both quantitatively and qualitatively

The work comprises:
• File conversion
• Natural language processing
• Ontology creation and editing

• Visit the Sci-X site and collect the entire collection of ElPub papers
• Transfer the collection into a native database
• Manually extract titles, author and institution names, and keywords
• Replace author and institution names in the native database with the canonical names created by Costa et al. (2006)

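The name-replacement step could be sketched roughly as follows. This is only an illustration, not the procedure used in the project: the helper `name_key`, the `CANONICAL` table and every name variant in it are hypothetical, and the real canonical forms are those compiled by Costa et al. (2006).

```python
import unicodedata


def name_key(name: str) -> str:
    """Build a loose matching key: drop accents, punctuation and case."""
    decomposed = unicodedata.normalize("NFKD", name)
    no_accents = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    cleaned = "".join(ch if ch.isalnum() else " " for ch in no_accents)
    return " ".join(cleaned.lower().split())


# Hypothetical variant -> canonical table, for illustration only.
CANONICAL = {
    name_key("Costa, S. M. S."): "Costa, Sely M. S.",
    name_key("Sely Costa"): "Costa, Sely M. S.",
    name_key("Univ. of Brasilia"): "University of Brasilia",
    name_key("Universidade de Brasília"): "University of Brasilia",
}


def canonicalise(name: str) -> str:
    """Return the canonical form if the variant is known, else the input unchanged."""
    return CANONICAL.get(name_key(name), name)
```

Matching on a normalised key rather than on the raw string lets variants that differ only in case, accents or punctuation map to the same canonical entry. Unknown names pass through untouched so they can be reviewed by hand.
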
• Convert all PDF files into TXT files
• Send the texts (without abstracts and references) to a syntactic analyser (the VISL syntactic parser), which generates a syntactic tree with all syntactic tags
• Send the syntactic tree to GeraOnto (Gottschalg-Duque, 2005), which extracts the concepts
• Insert the concepts into Protégé, where the ontology is edited

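Dropping the abstract and the references before parsing might look something like the sketch below. It rests on assumptions the slides do not state: that the converted text keeps "Abstract", "Introduction" and "References" headings on lines of their own, and that the body starts at the introduction. The function name and the patterns are hypothetical, and real proceedings papers would likely need more heading variants.

```python
import re

# Assumed heading conventions; actual papers may differ.
ABSTRACT = re.compile(r"^\s*abstract\s*:?\s*$", re.IGNORECASE)
BODY_START = re.compile(r"^\s*(?:1\.?\s+)?introduction\s*$", re.IGNORECASE)
REFERENCES = re.compile(
    r"^\s*(?:\d+\.?\s+)?(?:references|bibliography)\s*$", re.IGNORECASE
)


def strip_abstract_and_references(text: str) -> str:
    """Keep the body text, skipping the abstract and the reference list."""
    kept = []
    in_abstract = False
    for line in text.splitlines():
        if REFERENCES.match(line):
            break  # everything from the reference list onward is dropped
        if ABSTRACT.match(line):
            in_abstract = True
            continue
        if in_abstract and BODY_START.match(line):
            in_abstract = False  # the body begins at the introduction heading
        if not in_abstract:
            kept.append(line)
    return "\n".join(kept)
```

The output of such a step would then be what is handed to the syntactic parser. The parser, GeraOnto and Protégé stages are not sketched here because their interfaces are not described in the slides.
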
Thank you for your attention
• No questions, please!
• Suggestions, welcome!!!
• Future work will be presented next year.