
A method to propagate permissions in biomedical data using a semantic web framework

This article discusses a method for propagating permissions in biomedical data using a semantic web framework called S3DB. It explores the evolution of data representation, the use of ontologies, and the management of heterogeneous data in the life sciences. The article also showcases examples of how S3DB's API can be used to explore and manipulate complex relationships in data.


Presentation Transcript


  1. A method to propagate permissions in biomedical data using a semantic web framework Helena F. Deus and Jonas S. Almeida hdeus@mathbiol.org The University of Texas M. D. Anderson Cancer Center

  2. History of the web. Web 1.0: Links -> Documents. Web 2.0: Links -> Data Structures -> Web services. Web 3.0: Links -> Web Services -> Links -> Web Services -> Links -> Web Services ...

  3. Evolution of data representation Nature Biotechnology. 2005 Vol 23 Nr 29

  4. Semantic web of data: a set of best practices

  5. A data pyramid Wisdom Knowledge OWL, OBO RDF Information XML TEXT Data Files

  6. Data management in the life sciences Clinical/Medical data MDAxxxx MDAxxxx MDAxxxx Electronic Health Records RDBMS Life is good!

  7. Heterogeneous data management Core facilities data Clinical/Medical data DNA Sequencing MDAxxxx MDAxxxx MDAxxxx Microarrays RDBMS Protein Arrays Data everywhere! Pulse Field Gel Electrophoresis

  8. S3DB Core Model PLoS ONE. 2008 Aug 13;3(8):e2946

  9. Example: TCGA data structure http://tcga.s3db.org

  10. S3DB Rule http://tcga.s3db.org/R247 Sample ?? Patient blood tumor Sample Patient S3DB Statement http://tcga.s3db.org/S234 Tissue sampleX patientY R427
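
The Rule and Statement on this slide can be read as a triple template and one instance of it. The following is a minimal sketch of that reading, not S3DB's actual data structures: the class and field names are hypothetical, the predicate is kept as the "??" shown on the slide, and the Rule URI is taken from the slide's first mention (R247).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    """A triple template relating two Collections (hypothetical structure)."""
    uri: str
    subject: str    # Collection on the subject side, e.g. "Sample"
    predicate: str  # left as "??" because the slide does not name it
    object: str     # Collection on the object side, e.g. "Patient"


@dataclass(frozen=True)
class Statement:
    """An instance of a Rule: concrete Items filling the template."""
    uri: str
    rule: Rule
    subject_item: str
    value: str

    def as_triple(self):
        # The Statement borrows its predicate from the Rule it instantiates.
        return (self.subject_item, self.rule.predicate, self.value)


rule = Rule("http://tcga.s3db.org/R247", "Sample", "??", "Patient")
stmt = Statement("http://tcga.s3db.org/S234", rule, "sampleX", "patientY")
print(stmt.as_triple())
```

Keeping the predicate on the Rule rather than on each Statement mirrors the slide's separation between the domain of discourse (Rules) and its instantiation (Statements).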

  11. TCGA domain - instance

  12. SPARQL

  13. Snapshots of interfaces using S3DB’s API (Application Programming Interface). These applications illustrate why semantic web designs can be particularly effective at enabling generic tools to help users explore data that document very specific and very complex relationships. Snapshot A was taken from S3DB’s web interface, which is included in the downloadable package. This interface was developed to assist in managing the database model and is therefore centered on the visualization and manipulation of the domain of discourse: its Collections of Items and the Rules that define how their relations are documented. Snapshots B-D depict a document management tool, S3DBdoc, freely available as a Bioinformatics Station module (see Figure 6). Navigation starts from the Project (C), proceeds to the Collection (B), and ends with the editing of the Statements about an Item (D). Snapshot B illustrates an intermediate step in the navigation, where the list of Items (in this case samples assayed by tissue arrays, for which there is clinical information about the donor) is being trimmed according to the properties of a distant entity, Age at Diagnosis, which is a property of the Clinical Information Collection associated with the sample that originated the array results. This interaction would have been difficult and computationally intensive to manage using a relational architecture. The RDF-formatted query result produced by the API was also visualized using a commercial tool, Sentient Knowledge Explorer (IO-Informatics Inc), shown in snapshot E, and by Welkin (F), developed by the digital interoperability SIMILE project at the Massachusetts Institute of Technology. See text for discussion of the graphic representations produced by these tools. To protect patient confidentiality, some values in snapshots B and D are scrambled, and numeric sample and patient identifiers elsewhere are altered.
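
The trimming step described for snapshot B (filtering a list of Items by a property of a distant, linked entity) can be sketched over a plain list of triples. This is an illustrative toy, not the S3DB API: the predicate names `hasClinicalInfo` and `ageAtDiagnosis`, the helper functions, and all values are invented for the example.

```python
# Toy triple store; every identifier and value here is invented.
triples = [
    ("sample1", "hasClinicalInfo", "clin1"),
    ("sample2", "hasClinicalInfo", "clin2"),
    ("clin1", "ageAtDiagnosis", 42),
    ("clin2", "ageAtDiagnosis", 67),
]


def objects(triples, subject, predicate):
    """Return every object attached to (subject, predicate)."""
    return [o for s, p, o in triples if s == subject and p == predicate]


def trim_by_distant_property(triples, items, link, prop, keep):
    """Keep items whose linked entity has a `prop` value accepted by `keep`."""
    kept = []
    for item in items:
        for linked in objects(triples, item, link):
            if any(keep(v) for v in objects(triples, linked, prop)):
                kept.append(item)
                break  # one matching linked entity is enough
    return kept


older = trim_by_distant_property(
    triples,
    ["sample1", "sample2"],
    "hasClinicalInfo",
    "ageAtDiagnosis",
    lambda age: age >= 60,
)
print(older)
```

Because the link between a sample and its clinical record is just another triple, the same helper works for any pair of predicates without a schema change, which is the kind of flexibility the caption contrasts with a relational architecture.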

  14. PLoS ONE. 2008 Dec;3(12):e4076

  15. Code portability and distributed data API API API SPARQL

  16. Permission management Markov Model

  17. Permission propagation
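
The slide itself gives no detail on how propagation works, so the following is only a sketch of one plausible scheme, not the authors' method: it assumes permissions are inherited down the Project, Collection, Item, Statement hierarchy mentioned elsewhere in this deck, and that the nearest explicit assignment wins. All entity names, user names, and permission labels are hypothetical.

```python
from typing import Optional

# Hypothetical containment hierarchy: child entity -> parent entity.
PARENT = {
    "statement:S1": "item:I1",
    "item:I1": "collection:C1",
    "collection:C1": "project:P1",
}

# Hypothetical explicit assignments: (entity, user) -> permission label.
EXPLICIT = {
    ("project:P1", "alice"): "view",
    ("item:I1", "alice"): "edit",
}


def effective_permission(entity: str, user: str,
                         parent=PARENT, explicit=EXPLICIT) -> Optional[str]:
    """Walk up the hierarchy and return the nearest explicit permission."""
    current = entity
    while current is not None:
        if (current, user) in explicit:
            return explicit[(current, user)]
        current = parent.get(current)  # None once the root is passed
    return None  # no assignment anywhere on the path


print(effective_permission("statement:S1", "alice"))   # inherited from item:I1
print(effective_permission("collection:C1", "alice"))  # inherited from project:P1
```

Under these assumptions a data submitter only needs to set a permission once, at the level of granularity that matters, and everything beneath it follows unless overridden.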

  18. Experimental evolving ontologies Upper ontologies Intermediate Ontologies Domain-Specific Ontologies MGED and others Current entry level for computation Experimental, evolving Data Models Proposed entry level for computation Raw data

  19. S3DB.ORG What is S3DB? • It is a web service that manages semantic web content, distinguishing the domain of discourse from its instantiation. It was configured specifically for the needs of Biomedical Informatics projects where: • Those who submit the data keep fine-tuned control over its access and use. • The data model is deployed over a core ontology that allows its editing. • It has a distributed deployment designed to deal with heterogeneous environments. What is S3DB not? • It is not a client application. • It is not a “work in progress”: a SPARQL endpoint ensures that experimental data is not kept outside of the Linked Data Web until it matures.

  20. Getting out of the closet Entry point for BioPortal ontologies (Using NCBO REST web services)

  21. Next Steps – mathbiol.org NCI: AffyMicroarray GeneList stat:justRMA http://bioinformaticstation.org (MatLab) http://mathbiol.org/ (JavaScript) …..CEL EGFR, P10… Plug-and-play semantic web!!!

  22. In Conclusion • Dissolution of boundaries between data structures is a good thing… But doing it without losing the role of each data element is even better. • Some level of explicit granularity in the data is necessary to implement a permission model.

  23. Acknowledgements Jonas S. Almeida Kadir Akdemir Miriã Coelho Cintia Palú Pablo Freire The Integrative Bioinformatics Lab at the University of Texas MD Anderson Cancer Center (Houston, Tx) Instituto de Tecnologia Quimica e Biologica, Universidade Nova de Lisboa (Lisbon, Portugal) http://s3db.org
