Amsterdam, May 28, 2009. Your Brains in My e-Laboratory. Feasting on brains with Taverna and Semantic Web tools. Marco Roos
Amsterdam, May 28, 2009. Your Brains in My e-Laboratory: Feasting on brains with Taverna and Semantic Web tools. Marco Roos, acknowledging the AID team (Scott Marshall, Sophia Katrenko, Willem van Hage, Edgar Meij, Konstantinos Krommydas, Pieter Adriaans), Andrew Gibson, Martijn Schuemie, Piter de Boer, the myGrid team (in particular Katy Wolstencroft, Carole Goble, and Dave de Roure), OMII-UK and NBIC
A biologist in e-Science • Marco Roos • Biologist and bioinformatician • Post-doc e-(bio)science, University of Amsterdam (BioRange/VL-e) • Project or Area Liaison (PAL) OMII-UK • Member UK e-Science All Hands Foundation • Member BioAssist programme committee NBIC
My primary motivation: Structure and function of DNA in the nucleus. Images: Mouse fibroblast (skin) cells; Escherichia coli
/*
 * determines ridges in htm expression table
 */
#include <stdio.h>
#include <stdlib.h>
#include <libpq-fe.h>
#include "ridge.h"   /* assumed to define TRUE/FALSE and declare the functions below */

/* selects all rows of one chromosome, sorted on genstart */
/* the result is handed back through *htmtable; the prototype in ridge.h must match */
int selecthtm(PGconn *conn, char *htmtablename, char *chromname, PGresult **htmtable)
{
    char querystring[256];

    /* write into querystring and quote the chromosome name as an SQL string */
    snprintf(querystring, sizeof querystring,
             "SELECT * FROM %s WHERE chrom = '%s' ORDER BY genstart",
             htmtablename, chromname);
    *htmtable = PQexec(conn, querystring);
    return validquery(*htmtable, querystring);
}

int is_ridge(PGresult *htmtable, int row, double exprthreshold, int mincount)
/* determines if mincount genes in a row are (part of) a ridge */
/* pre: htmtable is valid and sorted on genStart (ascending) */
/* post: */
{
    if (mincount <= 0) return TRUE;
    if (row >= PQntuples(htmtable)) return FALSE;
    /* PQgetvalue returns text, so convert it before comparing; read the current row */
    if (atof(PQgetvalue(htmtable, row, PQfnumber(htmtable, "movmed39expr"))) < exprthreshold) {
        return FALSE;
    }
    return is_ridge(htmtable, row + 1, exprthreshold, mincount - 1);
}

int main()
{
    PGconn *conn;          /* holds database connection */
    char querystring[256]; /* query string */
    PGresult *result;

    conn = PQconnectdb("dbname=htm port=6400 user=mroos password=geheim");
    if (PQstatus(conn) == CONNECTION_BAD) {
        fprintf(stderr, "connection to database failed.\n");
        fprintf(stderr, "%s", PQerrorMessage(conn));
        PQfinish(conn);
        exit(EXIT_FAILURE);
    } else
        printf("Connection ok\n");

    snprintf(querystring, sizeof querystring, "SELECT * FROM chromosomes");
    printf("%s\n", querystring);
    result = PQexec(conn, querystring);
    if (validquery(result, querystring)) {
        printresults(result);
    } else {
        PQclear(result);
        PQfinish(conn);
        return EXIT_FAILURE;
    }
    PQclear(result);
    PQfinish(conn);
    return EXIT_SUCCESS;   /* exit status 0 on success, not TRUE */
}

/* prints the first column of at most the first 10 rows */
int printresults(PGresult *tuples)
{
    int i;

    for (i = 0; i < PQntuples(tuples) && i < 10; i++) {
        printf("%d, ", i);
        printf("%s\n", PQgetvalue(tuples, i, 0));
    }
    return TRUE;
}

/* returns TRUE if the query returned tuples, FALSE otherwise */
int validquery(PGresult *result, char *querystring)
{
    printf(" in validquery\n");
    if (PQresultStatus(result) != PGRES_TUPLES_OK) {
        printf("Query %s failed.\n", querystring);
        fprintf(stderr, "Query %s failed.\n", querystring);
        return FALSE;
    }
    return TRUE;
}
‘Old school’ bioinformatics approach Local Database Local Database
Virtual professor * (diagram labels: My ws, Your ws, My ws, My ws, Your ws) * From P.J. Verschure, Journal of Cellular Biochemistry 2006, vol. 99(1), pp. 23-34
Combining expertise Edgar Meij Information retrieval expert
Combining expertise Sophia Katrenko Machine learning expert
Combining expertise Willem van Hage Semantic web expert (and bass guitar player)
Combining expertise: Towards a knowledge framework. Scott Marshall, Computer scientist and bioinformatician
The AIDA toolbox, Web Services for knowledge extraction and knowledge management
e-Science collaboration AIDA toolbox
“Collaboration through Web Services” Martijn Schuemie Bio-text mining expert BioSemantics group, Erasmus University Rotterdam
“Collaboration through Web Services” Hideaki Sugawara Biological Database expert
“Collaboration through Web Services” e-bioscientist
e-Science leveraging the use of more brains Want this…
e-Science leveraging the use of more brains …need this
Workflow and Semantic Web. Alpha version of Concept Web
Biological model (representing cartoon elements)
<myModel:HDAC1> <rdf:type> <myModel:Protein>
<myModel:Protein> <rdf:type> <owl:Class>
Pseudo RDF query and results
SELECT label(comment), label(query1), label(query2)
FROM {protein_instance} rdf:type {bio:Protein} rdf:type {owl:Class},
     {protein_instance} rdfs:comment {comment};
                        bioModel:isModelComponentOf {model1};
                        bioModel:isModelComponentOf {model2},
     {representation1} mappingModel:partially_represents {model1};
                       methodModel:has_query {query1},
     {representation2} mappingModel:partially_represents {model2};
                       methodModel:has_query {query2}
WHERE model1 != model2
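The chained pattern at the start of the query ({protein_instance} rdf:type {bio:Protein} rdf:type {owl:Class}) can be illustrated with a minimal C sketch. It is not how a triple store evaluates such a query; it only shows the idea of matching statements against patterns with unbound variables. The two statements are taken from the biological model slide, written with rdf:type as in the query. The names Triple, MODEL, find_triple and is_instance_of_owl_class are hypothetical and chosen for this illustration only.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* One RDF-style statement (hypothetical type, for illustration only). */
typedef struct {
    const char *subject;
    const char *predicate;
    const char *object;
} Triple;

/* The two statements from the biological model slide. */
static const Triple MODEL[] = {
    { "myModel:HDAC1",   "rdf:type", "myModel:Protein" },
    { "myModel:Protein", "rdf:type", "owl:Class" },
};
static const size_t MODEL_SIZE = sizeof MODEL / sizeof MODEL[0];

/* A NULL pattern term is a wildcard, like an unbound {variable} in the query. */
static int term_matches(const char *pattern, const char *value)
{
    return pattern == NULL || strcmp(pattern, value) == 0;
}

/* Returns the index of the first triple at or after start that matches
 * the pattern (s, p, o), or -1 if there is none. */
int find_triple(const Triple *triples, size_t count, size_t start,
                const char *s, const char *p, const char *o)
{
    assert(triples != NULL);
    for (size_t i = start; i < count; i++) {
        if (term_matches(s, triples[i].subject) &&
            term_matches(p, triples[i].predicate) &&
            term_matches(o, triples[i].object)) {
            return (int)i;
        }
    }
    return -1;
}

/* Chained pattern {x} rdf:type {cls} rdf:type {owl:Class}:
 * returns 1 if x has a type that is itself typed as owl:Class, else 0. */
int is_instance_of_owl_class(const Triple *triples, size_t count, const char *x)
{
    assert(triples != NULL && x != NULL);
    for (size_t i = 0; i < count; i++) {
        if (strcmp(triples[i].subject, x) != 0 ||
            strcmp(triples[i].predicate, "rdf:type") != 0) {
            continue;
        }
        /* triples[i].object plays the role of the intermediate class */
        if (find_triple(triples, count, 0,
                        triples[i].object, "rdf:type", "owl:Class") >= 0) {
            return 1;
        }
    }
    return 0;
}
```

Calling is_instance_of_owl_class(MODEL, MODEL_SIZE, "myModel:HDAC1") walks both statements of the model, while the same call for "myModel:Protein" finds no second rdf:type step to follow.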
Diagram. Nodes: Protein, Protein name, Discovery process run, Service run, Document, Run date & time, Creator. Relations: references, component of, discovered by, implemented by, has input, run at, creator
Diagram (example). Nodes: UniProt:P19838, NF-KappaB, Conditional Random Fields Protein Name Recognition, AIDA:applyCRF, PMID:17540846, 2008-11-18 03:29:30, Sophia Katrenko (UvA). Relations: references, component of, discovered by, implemented by, has input, run at, creator
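The example above can be held in a flat record, as in the following minimal C sketch. The field names follow the labels on the slide, but the flattening is an assumption: the slide draws a graph, and which node each relation attaches to is not recoverable from the text, so the "component of" relation is left out. ProvenanceRecord, EXAMPLE and describe_provenance are hypothetical names and not part of the AIDA toolbox.

```c
#include <assert.h>
#include <stdio.h>

/* Flat view of one discovery and its provenance (hypothetical layout). */
typedef struct {
    const char *protein;           /* database identifier of the protein */
    const char *protein_name;      /* name as found in the text */
    const char *discovery_process; /* "discovered by" */
    const char *service;           /* "implemented by" */
    const char *input_document;    /* "has input" */
    const char *run_at;            /* "run at": run date and time */
    const char *creator;           /* "creator" */
} ProvenanceRecord;

/* The values shown on the example slide. */
static const ProvenanceRecord EXAMPLE = {
    "UniProt:P19838",
    "NF-KappaB",
    "Conditional Random Fields Protein Name Recognition",
    "AIDA:applyCRF",
    "PMID:17540846",
    "2008-11-18 03:29:30",
    "Sophia Katrenko (UvA)",
};

/* Writes a one-line summary into buf and returns the snprintf result:
 * the length the full summary needs, excluding the terminating NUL. */
int describe_provenance(const ProvenanceRecord *r, char *buf, size_t size)
{
    assert(r != NULL && buf != NULL);
    return snprintf(buf, size,
                    "%s [%s]: discovered by %s, implemented by %s, "
                    "has input %s, run at %s, creator %s",
                    r->protein, r->protein_name, r->discovery_process,
                    r->service, r->input_document, r->run_at, r->creator);
}
```

A caller passes a buffer and its size, for example describe_provenance(&EXAMPLE, buf, sizeof buf), and compares the return value with the buffer size to detect truncation.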
Knowledge mining: my knowledge is mine, your knowledge is mine
Demonstrate Exploiting Brains (2x): Computational brains, Biological brains * (diagram labels: My ws, Your ws, My ws, My ws, Your ws) * From P.J. Verschure, Journal of Cellular Biochemistry 2006, vol. 99(1), pp. 23-34
A typical biologist… • Lots of data to deal with • Tiny brain • Lots of knowledge to deal with • No computational superpowers • Lots of methods and algorithms to try and combine • A needy biologist
An enhanced biologist… • Lots of data to support me • Many brains • Knowledge bases to query • Other people’s computational superpowers • Web Services, Workflows, and their creators available • An enhanced biologist
Publish and share on myExperiment.org Publish & share research objects
End of presentation... Thank you http://adaptivedisclosure.org Are you willing to share your brain?