Applying Topic Maps to Genome Research
Torsten Semmler
RZPD German Resource Center for Genome Research
www.rzpd.de
Background
• genes in their biological context
• proteins encoded by genes act as enzymes in reactions
• the defined order of such reactions is called a biological pathway
• gene sets represent those genes that play a role in a pathway
• research is interested in:
  • all genes of a pathway
  • other pathways in which a certain gene is involved
  • protein, reaction, disease relation, literature, cross references to public databases
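The two lookups named above (all genes of a pathway, and all pathways involving a given gene) can be sketched over plain gene sets. This is only an illustration: the pathway and gene names are hypothetical placeholders, not RZPD data, and the function names are invented for this sketch.

```python
from collections import defaultdict

# Hypothetical pathway -> gene set mapping (placeholder names, not real data).
gene_sets = {
    "pathway_A": {"gene1", "gene2", "gene3"},
    "pathway_B": {"gene2", "gene4"},
}


def genes_of_pathway(pathway):
    """Return all genes that play a role in the given pathway."""
    return sorted(gene_sets.get(pathway, set()))


def build_gene_index(sets):
    """Invert the mapping: gene -> set of pathways it is involved in."""
    index = defaultdict(set)
    for pathway, genes in sets.items():
        for gene in genes:
            index[gene].add(pathway)
    return index


def pathways_with_gene(gene, index):
    """Return all pathways in which the given gene is involved."""
    return sorted(index.get(gene, set()))


index = build_gene_index(gene_sets)
print(genes_of_pathway("pathway_A"))
print(pathways_with_gene("gene2", index))
```

The inverted index answers the second question without scanning every gene set for each query.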
Methods for data integration
• data (and some metadata) stored in relational databases
• search interfaces and dynamic websites on top for presenting the data
“GenomeMatrix”: a data integration method
• good for representing sets of genes and links to further data resources
• lacks a way to represent connections to other biological objects
What we want to achieve
• find the Topic of interest
• visualize the Topic and its connections to other Topics
• allow easy navigation between connected Topics
• access to related public databases for the focused Topic at any time
• “Represent data in its biological context”
Data sources for the Topic Map
• Biowarehouse with experimental pathway/genome data
• RZPD Primary Database
• MASI: Meta Annotated Sequence Investigation
• flat files with pathway-gene sets
Topic Map - Structure
[Figure: diagram of the topic types Gene, Protein, GO, PubMed, Reaction and Pathway, linked by edges numbered 1 to 6]
Association types:
• gene product
• GO classification
• publication
• enzymatic reaction
• gene-pathway
• pathway reaction
Overall statistics (for 170 Pathways):
• Topics: 25200
• Associations: 16621
• Occurrences: 31940 (13 different links to external databases)
• Total TAOs: 73761
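The structure above can be illustrated with a minimal in-memory model of topics, associations and occurrences (TAOs). This is a sketch under assumptions, not the representation used by RZPD or by the Ontopia Knowledge Suite: the class names, the identifiers "G1", "P1", "PW1" and the occurrence locator are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Occurrence:
    kind: str      # e.g. the name of an external database (hypothetical)
    locator: str   # link or identifier in that database (hypothetical)


@dataclass
class Topic:
    topic_id: str
    topic_type: str  # e.g. "Gene", "Protein", "Pathway"
    occurrences: list = field(default_factory=list)


@dataclass(frozen=True)
class Association:
    assoc_type: str  # e.g. "gene product", "gene-pathway"
    members: tuple   # ids of the topics taking part


class TopicMap:
    def __init__(self):
        self.topics = {}
        self.associations = []

    def add_topic(self, topic_id, topic_type):
        self.topics[topic_id] = Topic(topic_id, topic_type)
        return self.topics[topic_id]

    def associate(self, assoc_type, *member_ids):
        # Only allow associations between topics that exist in the map.
        for member_id in member_ids:
            if member_id not in self.topics:
                raise KeyError(member_id)
        self.associations.append(Association(assoc_type, tuple(member_ids)))

    def neighbours(self, topic_id):
        """Topics connected to topic_id, with the association type."""
        found = []
        for assoc in self.associations:
            if topic_id in assoc.members:
                for other in assoc.members:
                    if other != topic_id:
                        found.append((assoc.assoc_type, other))
        return sorted(found)

    def tao_count(self):
        """Total number of topics, associations and occurrences."""
        n_occ = sum(len(t.occurrences) for t in self.topics.values())
        return len(self.topics) + len(self.associations) + n_occ


tm = TopicMap()
gene = tm.add_topic("G1", "Gene")
tm.add_topic("P1", "Protein")
tm.add_topic("PW1", "Pathway")
tm.associate("gene product", "G1", "P1")
tm.associate("gene-pathway", "G1", "PW1")
gene.occurrences.append(Occurrence("external-db", "placeholder-link"))
print(tm.neighbours("G1"), tm.tao_count())
```

The same counting rule reproduces the total on the slide: 25200 topics + 16621 associations + 31940 occurrences = 73761 TAOs.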
Workflow
• choose a gene of interest in GenomeCube
• the result links to Topic Map-based information for the gene, especially its connections to related Pathways
• easy navigation between related Topics
• visualization of a Topic and its connections available at any time
• gene sets of one or more pathways are passed back together to GenomeCube for ordering related products
Merging Topic Maps
• an individual Topic Map can be created for each source of pathway-genome data
• high standardization of gene identifiers (Entrez Gene Id, HUGO Symbol)
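Why standardized identifiers matter for merging can be shown with a small sketch: topics that share the same identifier collapse into one topic, and their names, occurrences and associations are combined. The dictionary layout, the function name and all identifiers below (for example "entrez:1001") are hypothetical and chosen only for illustration; this is not the merging procedure of any particular Topic Map engine.

```python
def merge_topic_maps(map_a, map_b):
    """Merge two topic maps whose topics are keyed by a shared identifier."""
    merged = {"topics": {}, "associations": set()}
    for topic_map in (map_a, map_b):
        for topic_id, topic in topic_map["topics"].items():
            # Topics with the same identifier are treated as the same subject.
            target = merged["topics"].setdefault(
                topic_id, {"names": set(), "occurrences": set()}
            )
            target["names"] |= topic["names"]
            target["occurrences"] |= topic["occurrences"]
        # Set union removes associations present in both maps.
        merged["associations"] |= topic_map["associations"]
    return merged


# Two hypothetical source maps that describe the same gene.
map_a = {
    "topics": {
        "entrez:1001": {"names": {"GENE_A"}, "occurrences": {"db1:1001"}},
        "pathway:X": {"names": {"Pathway X"}, "occurrences": set()},
    },
    "associations": {("gene-pathway", "entrez:1001", "pathway:X")},
}
map_b = {
    "topics": {
        "entrez:1001": {"names": {"GENE_A"}, "occurrences": {"db2:77"}},
        "pathway:Y": {"names": {"Pathway Y"}, "occurrences": set()},
    },
    "associations": {("gene-pathway", "entrez:1001", "pathway:Y")},
}

merged = merge_topic_maps(map_a, map_b)
print(sorted(merged["topics"]))
print(sorted(merged["topics"]["entrez:1001"]["occurrences"]))
```

After merging, the gene is a single topic that carries the links from both sources and is connected to both pathways.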
Benefits of Topic Maps
• good structuring of data
• qualitative and quantitative representation of the relations between the biological objects
• easy visualization
• easy extension of the ontology without changing the storage schema
• merging of Topic Maps
• reuse of the data for other purposes
• use of the Topic Map query language tolog
• very good software available with the Ontopia Knowledge Suite