

Representing and coding the knowledge embedded in texts of Health Science Web published articles. ElPub2007 Conference, Vienna, Austria, June 2007. Carlos Henrique Marcondes, marcon@vm.uff.br, Marília Alvarenga Rocha Mendonça, Luciana Reis Malheiros, Leonardo Cruz da Costa



  1. Representing and coding the knowledge embedded in texts of Health Science Web published articles. ElPub2007 Conference, Vienna, Austria, June 2007. Carlos Henrique Marcondes, marcon@vm.uff.br, Marília Alvarenga Rocha Mendonça, Luciana Reis Malheiros, Leonardo Cruz da Costa, Tatiana Cristina Paredes Santos, Luciana Guimarães Pereira. UFF - Universidade Federal Fluminense, Rio de Janeiro, Brazil. Keywords: electronic publishing, scientific methodology, scientific communication, knowledge representation, ontologies, Semantic Web

  2. Context • Scholarly electronic journals are still based on the print model and do not take full advantage of the facilities offered by the Web environment • Semantic Web Initiative • Web ontologies are becoming humanity's public knowledge bases, an alternative to collections in libraries

  3. Problem • Knowledge is embedded in the text of scientific articles for human reading, in an unstructured format that is not adequate for program processing • Scientific communication is a slow social process which depends on discourse, text production and reading/interpreting/inquiring until new knowledge is incorporated into the corpus of Science • IT has been applied to bibliographic information systems to improve scientific communication, providing fast notification and access to full-text documents. But IT is not yet used to directly process the knowledge embedded in the text of scientific articles

  4. Question • Is it feasible to develop a Web authoring/self-publishing tool which publishes scientific articles as text and also extracts the knowledge embedded in the text, recording it in a program “understandable” format? • Knowledge extracted and recorded in a program “understandable” format will enable inferences by software agents: • consistency checking and validation of new contributions to Science • rich semantic retrieval, etc. • identification of scientific discoveries

  5. Research objectives • Extract, represent and code in a program “understandable” format the knowledge embedded in texts of Health Science Web published articles • step 1 – develop a model of the scientific reasoning procedure and the knowledge content of a scientific article in a program “understandable” format √ • step 2 – empirically test, validate and enhance the model by analyzing articles in Health Science √ • step 3 – develop a Web authoring/self-publishing tool which enables the extraction, mark-up and recording of knowledge as a by-product of a scholar writing and publishing a scientific article • step 4 – use the model to identify discoveries in Science

  6. Hypotheses • Scientific articles are highly structured pieces of text reflecting reasoning procedures established by the Scientific Method • “The text of observational and experimental articles is usually… divided into sections with the headings IMRAD - Introduction, Methods, Results, and Discussion. This structure is not simply an arbitrary publication format, but rather a direct reflection of the process of scientific discovery”, Uniform Requirements for Manuscripts Submitted to Biomedical Journals (http://www.icmje.org) • Knowledge embedded in the text of scientific articles has the form of relations between phenomena • for example: “smoking causes lung carcinoma” • A hypothesis (from Greek ὑπόθεσις) is a suggested explanation of a phenomenon or reasoned proposal suggesting a possible correlation between multiple phenomena, Wikipedia, http://en.wikipedia.org/wiki/Hypothesis

  7. Methodology • An initial model was proposed, based on the semantic elements of the scientific method, such as Problem, Hypotheses, Methodology, Results and Conclusion • The model was tested with 60 journal articles • 20 from Memorias do Instituto Oswaldo Cruz, http://www.scielo.br/revistas/mioc • 20 from Brazilian Journal of Medical and Biological Research, http://www.scielo.br/revistas/bjmbr • 20 about stem cells in international journals (in progress) • Test results were used to enhance the model

  8. The Proposed Model • Model of an authoring/self-publishing Web environment • Model of the scientific reasoning procedure and knowledge content of a scientific article, as an ontology … and the future development of a tool to mark up/record this knowledge in a program “understandable” format

  9. Authoring/Self-Publishing Web environment [Diagram; labels: Author/scholar; Authoring tool; Scientific article - text; Knowledge represented in program readable format; Scientific literature in a domain, Web published; Semantic citations; Semantic relations; Web ontology (like UMLS); Semantic retrieval, validation and consistency checking tools; Researcher, reader]

  10. Reasoning Procedures in scientific articles • Experimental-inductive articles • Experimental-deductive articles • Theoretical-abductive articles

  11. Reasoning Procedures in scientific articles • Experimental-inductive articles • a PROBLEM is identified, with the following aspects and data; • a possible solution to this PROBLEM can be based on the following new HYPOTHESIS; • on the basis of this original HYPOTHESIS the PROBLEM has the following empirical manifestation; • we developed an EXPERIMENT to test this manifestation and it yields the following RESULTS.

  12. Reasoning Procedures in scientific articles • Experimental-deductive articles • a PROBLEM is identified, with the following aspects and data; • in the literature, previous authors have proposed the following HYPOTHESES; • we choose the following previous HYPOTHESIS and test, enlarge and re-contextualize it with the following EXPERIMENT; • the test shows the following RESULTS in this new CONTEXT.

  13. Reasoning Procedures in scientific articles • Theoretical-abductive articles • a PROBLEM is identified, with the following aspects and data; • the HYPOTHESES of previous authors are not satisfactory to solve the PROBLEM due to the following criticism… ; • so, we propose this original HYPOTHESIS, which we consider a new pathway to solve the PROBLEM.
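The three reasoning procedures on slides 11 to 13 can be read as templates that list which elements an article of each type is expected to state. The following is a minimal sketch under that reading; the field names (`problem`, `previous_hypotheses`, and so on) and the function `missing_elements` are illustrative assumptions, not part of the authors' model.

```python
from enum import Enum


class Reasoning(Enum):
    EXPERIMENTAL_INDUCTIVE = "experimental-inductive"
    EXPERIMENTAL_DEDUCTIVE = "experimental-deductive"
    THEORETICAL_ABDUCTIVE = "theoretical-abductive"


# Elements named in capitals on the slides, in the order they appear.
# The lowercase field names are assumptions made for this sketch.
ELEMENTS = {
    Reasoning.EXPERIMENTAL_INDUCTIVE: [
        "problem", "hypothesis", "experiment", "results"],
    Reasoning.EXPERIMENTAL_DEDUCTIVE: [
        "problem", "previous_hypotheses", "experiment", "results", "context"],
    Reasoning.THEORETICAL_ABDUCTIVE: [
        "problem", "previous_hypotheses", "hypothesis"],
}


def missing_elements(reasoning, article):
    """Return the expected elements that the article record leaves empty."""
    return [name for name in ELEMENTS[reasoning] if not article.get(name)]


# Hypothetical, partially filled record of an experimental-inductive article.
draft = {"problem": "some problem", "hypothesis": "some new hypothesis"}
print(missing_elements(Reasoning.EXPERIMENTAL_INDUCTIVE, draft))
```

An authoring tool could use such a check to prompt the author for the elements still missing before publication.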

  14. Analysis procedure (simulating the authoring/self-publishing tool) CAMARA, Geni NL, CERQUEIRA, Daniela M, OLIVEIRA, Ana PG et al. Prevalence of human papillomavirus types in women with pre-neoplastic and neoplastic cervical lesions in the Federal District of Brazil. Mem. Inst. Oswaldo Cruz. [online]. Oct. 2003, vol.98, no.7. 3 steps: • Step 1 - The type of reasoning is identified: experimental-deductive • Step 2 - “Elements of knowledge” are identified in the text, such as the main hypothesis stated by the author: “HPV causes pre-neoplastic and neoplastic cervical lesions” • Knowledge as a relation: Antecedent: HPV; Type of Relation: causes; Consequent: pre-neoplastic and neoplastic cervical lesions • Step 3 - Each of these elements is mapped to “Public Knowledge”: UMLS, UMLS Semantic Network* • Papillomavirus, Human • “Causes”, UMLS Semantic Network relation T147 • Colonic Neoplasms • Tumor Virus Infections /pathology, Tumor Virus Infections /virology *We used DeCS, the Portuguese version of MeSH (Medical Subject Headings), the main vocabulary in UMLS
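Steps 2 and 3 of the analysis procedure can be sketched as a small record type plus a vocabulary lookup. This is only an illustration: the `Relation` class, the `VOCABULARY` table and the function names are assumptions, the table is a toy stand-in for a controlled vocabulary (it is not a UMLS or DeCS API), and it deliberately covers only two of the terms from the slide.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class Relation:
    """Knowledge as a relation, with the three parts named on the slide."""
    antecedent: str
    type_of_relation: str
    consequent: str


# Toy lookup table (assumption): author wording -> vocabulary entry.
VOCABULARY = {
    "hpv": "Papillomavirus, Human",
    "causes": "causes (UMLS Semantic Network relation T147)",
}


def map_term(term: str) -> Optional[str]:
    """Return the vocabulary entry for a term, or None if it is not covered."""
    return VOCABULARY.get(term.strip().lower())


def map_relation(rel: Relation) -> Dict[str, Optional[str]]:
    """Map each part of the relation to the vocabulary independently."""
    return {
        "antecedent": map_term(rel.antecedent),
        "type_of_relation": map_term(rel.type_of_relation),
        "consequent": map_term(rel.consequent),
    }


hypothesis = Relation(
    antecedent="HPV",
    type_of_relation="causes",
    consequent="pre-neoplastic and neoplastic cervical lesions",
)
print(map_relation(hypothesis))
```

Because the toy table has no entry for the consequent, that part maps to `None`; a real tool would consult the full vocabulary instead.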

  15. Ontology for knowledge in scientific articles [Diagram; labels: Reasoning; Experimental; Theoretical; Inductive; Deductive; Hypothesis; Prev-Hypoth; Problem; Conclusions; Experiment; Results; Measure; Context (Space, Time, Group); References; Title; URN; Type-of-Rel; Consequent; Antecedent]

  16. Results

  17. Model potentialities – semantic retrieval • which other articles have hypotheses suggesting HPV as the cause of cervical neoplasias in women? • which articles have hypotheses suggesting causes other than HPV for cervical neoplasias in women? • which articles have hypotheses suggesting HPV as the cause of cervical neoplasias in groups other than women? • which articles have hypotheses suggesting HPV as the cause of pathologies other than neoplasias? • which articles have hypotheses suggesting HPV as the cause of cervical neoplasias in different contexts (not in women from the Federal District, Brazil)?
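Once hypotheses are recorded as structured relations, the questions above become simple filters. The sketch below assumes a flat in-memory list of records; the `Hypothesis` fields, the article identifiers `A1` to `A3` and all record values are invented for illustration and do not come from the analyzed articles.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hypothesis:
    article: str      # hypothetical article identifier
    antecedent: str
    relation: str
    consequent: str
    group: str        # part of the Context in the model


# Invented sample records, for illustration only.
RECORDS = [
    Hypothesis("A1", "HPV", "causes", "cervical neoplasia", "women"),
    Hypothesis("A2", "smoking", "causes", "cervical neoplasia", "women"),
    Hypothesis("A3", "HPV", "causes", "anal neoplasia", "men"),
]


def find(records, **criteria):
    """Keep the hypotheses whose fields equal every given criterion."""
    return [r for r in records
            if all(getattr(r, key) == value for key, value in criteria.items())]


def find_other_causes(records, antecedent, consequent):
    """Hypotheses naming a cause other than `antecedent` for `consequent`."""
    return [r for r in records
            if r.relation == "causes"
            and r.consequent == consequent
            and r.antecedent != antecedent]


# "HPV as the cause of cervical neoplasias in women"
same_claim = find(RECORDS, antecedent="HPV", relation="causes",
                  consequent="cervical neoplasia", group="women")
# "causes other than HPV for cervical neoplasias"
alternatives = find_other_causes(RECORDS, "HPV", "cervical neoplasia")
print([r.article for r in same_claim], [r.article for r in alternatives])
```

Exact string equality is the simplest possible matching rule; mapping terms to a shared vocabulary first, as in the analysis procedure, is what would make such comparisons meaningful across articles.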

  18. Model potentialities • Software agents can navigate throughout a network of scientific articles published according to the model outlined and make inferences … • To retrieve knowledge semantically • To validate and check the consistency of new contributions to Science • Is the knowledge in an article consistent with knowledge recorded in a public Web ontology? • To identify novelties in Science • A failure to map one or more elements of a “record of knowledge” may be a trace of a scientific discovery
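The last point, that an element which fails to map may signal a novelty, can be illustrated with a few lines of code. This is a sketch under simplifying assumptions: the ontology is reduced to a set of lowercase term strings, and the record, the term set and the function name `unmapped_elements` are all hypothetical.

```python
def unmapped_elements(record, ontology_terms):
    """Return the record fields whose term is absent from the ontology."""
    return [field for field, term in record.items()
            if term.strip().lower() not in ontology_terms]


# Toy stand-in for a public Web ontology (assumption).
ONTOLOGY_TERMS = {"papillomavirus, human", "causes"}

# Hypothetical "record of knowledge" with an unknown consequent.
record = {
    "antecedent": "Papillomavirus, Human",
    "type_of_relation": "causes",
    "consequent": "a newly described lesion",
}

candidates = unmapped_elements(record, ONTOLOGY_TERMS)
if candidates:
    print("possible novelty in:", candidates)
```

A non-empty result is only a hint: the term may be new to Science, or simply worded differently from the ontology, so a human would still need to review it.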

  19. Open questions and future research • The feasibility of the model in scientific areas other than Health Sciences • The need for a taxonomy of relations in Science • Is a Sk-ML – Scientific Knowledge Markup Language – feasible? • Guidelines for the development of an interactive Web authoring/self-publishing scientific editor to implement the model

  20. Comments are welcome! http://www.professores.uff.br/marcondes marcon@vm.uff.br
