The EBI search engine: EB-eye
Franck Valentin
External Services group
EMBRACE Workshop
CBS, BioCentrum-DTU, February 6-8, 2008

Summary
• The data at the EBI
• What is the EB-eye?
• A glance at the web interface
• Web services for the EB-eye
Web Services Course, CBS, DK

The data at the EBI
[Figure: the variety of data held at the EBI: flat-file records (ID ... AC ... DT ...), XML documents (<XML> ... </XML>) and records of the form ID : .. PARENT ID : .. RANK : .., with Ligand, Interpro and Array Express shown as labels.]

The data at the EBI
• Searching the data at the EBI
• Diversity and heterogeneity of the data (format, size, content…)
• Most of the data providers have their own search mechanism
• Heterogeneity of the search results (display, content, granularity…)
• Navigation between the different resources (references) not consistent

What is the EB-eye?
• Global search mechanism
• Searches most of the EBI resources in one go
• Not specific to any resource
• Unified searches of the EBI resources
• Free-text search (unified semantic)
• Basic results display (Google-like)
• Simple cross reference navigation
• Available on all the EBI web pages

A glance at the web interface

EB-eye results summary page
• Organized into categories called “domains”
• Number of results per domain
• Refine your search
• Expand/Collapse for more details

EB-eye domain result page
• Results for all the resources in a domain
• A domain can contain several resources
• First 3 entries displayed for each resource
• View more entries for a particular resource
• Hierarchy of domains
• Forward search (smaller set of resources)
• Backward search (wider set of resources)
• Refine your search
• Navigate the results pages

EB-eye domain result page (one resource)
• Basic information: ID, name, description…
• Link to the main resource web site
• Additional links
• EB-eye internal references

EB-eye cross-reference navigation
• Navigate inside the EB-eye
• References context
• Navigation…
• Using the resources' explicit references
• Using the resources' implicit references

EB-eye Advanced Search
• Accessible from all the pages
• Simple search criteria
• Domain specific search
• Domain selection
• Fields selection
• References

Web services for the EB-eye
• Simple experimental API for basic operations
• Basic metadata information
• Basic queries (full-text and entries)
• Limited cross-reference navigation
• Depending on usage, we may implement a more complex API with more functionality

Web services – Listing the domains
List available domains (only the leaves)
    String[] listDomains()

    > listDomains()
    …
    astd
    …
    ensembl
    emblcds
    embldeleted
    emblnew_ann_con
    emblnew_con
    emblnew_standard
    emblnew_wgs
    emblrelease_ann_con
    emblrelease_con
    emblrelease_standard
    emblrelease_wgs
    ensembl
    …

Web services – Number of results
Get number of results for a simple query
    int getNumberOfResults(String domain, String query)

    > getNumberOfResults('medline', 'immunolog* nutrition')
    6954
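
The per-domain counts shown on the results summary page can be approximated by calling getNumberOfResults once per domain. The following is a minimal Python sketch under stated assumptions: `count_per_domain` and `fake_get_number_of_results` are hypothetical names, the remote call is replaced by a local stand-in, and only the medline figure comes from the slide above.

```python
from typing import Callable, Dict, Iterable


def count_per_domain(
    domains: Iterable[str],
    query: str,
    get_number_of_results: Callable[[str, str], int],
) -> Dict[str, int]:
    """Return {domain: hit count} for every domain with at least one hit."""
    counts = {}
    for domain in domains:
        hits = get_number_of_results(domain, query)
        if hits > 0:
            counts[domain] = hits
    return counts


# Stand-in for the remote getNumberOfResults call (an assumption, not the
# real service). Only the medline number is taken from the slide.
_FAKE_COUNTS = {("medline", "immunolog* nutrition"): 6954}


def fake_get_number_of_results(domain: str, query: str) -> int:
    return _FAKE_COUNTS.get((domain, query), 0)


summary = count_per_domain(
    ["medline", "uniprot"], "immunolog* nutrition", fake_get_number_of_results
)
print(summary)  # {'medline': 6954}
```

In real use the callable would be whatever function the chosen SOAP client or wrapper exposes for getNumberOfResults, and the domain list would come from listDomains().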

Web services – Get results ids
List result IDs for a simple query
    String[] getResultsIds(String domain, String query)
    String[] getResultsIds(String domain, String query, int start, int size)

    > getResultsIds('uniprot', 'polymerase', 0, 5)
    A2VB99_9VIRU
    Q86777_9CALI
    Q779J8_9VIRU
    Q8I944_9STIC
    Q8I945_9STIC
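
The start/size variant of getResultsIds suggests fetching large result sets page by page. Below is a rough Python sketch of that idea. It assumes `start` is a 0-based offset (as the `0, 5` call above suggests) and that the last page may be shorter than `size`. `iter_result_ids` and the two `fake_*` functions are hypothetical stand-ins for the remote calls.

```python
from typing import Callable, Iterator, List


def iter_result_ids(
    domain: str,
    query: str,
    get_number_of_results: Callable[[str, str], int],
    get_results_ids: Callable[[str, str, int, int], List[str]],
    page_size: int = 100,
) -> Iterator[str]:
    """Yield every result ID for a query, fetching page_size IDs per call."""
    if page_size <= 0:
        raise ValueError("page_size must be positive")
    total = get_number_of_results(domain, query)
    for start in range(0, total, page_size):
        yield from get_results_ids(domain, query, start, page_size)


# The five IDs shown on the slide, served by local stand-ins (assumption:
# the real service behaves like a simple offset/size slice).
SLIDE_IDS = [
    "A2VB99_9VIRU", "Q86777_9CALI", "Q779J8_9VIRU",
    "Q8I944_9STIC", "Q8I945_9STIC",
]


def fake_count(domain: str, query: str) -> int:
    return len(SLIDE_IDS)


def fake_ids(domain: str, query: str, start: int, size: int) -> List[str]:
    return SLIDE_IDS[start:start + size]


all_ids = list(
    iter_result_ids("uniprot", "polymerase", fake_count, fake_ids, page_size=2)
)
print(all_ids == SLIDE_IDS)  # True
```

Passing the two service calls in as functions keeps the paging logic independent of whichever SOAP client is used.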

Web services – Get referenced domains
Get referenced domains in a domain or an entry
    String[] getDomainsReferencedInEntry(String domain, String entryId)
    String[] getDomainsReferencedInDomain(String domain)

    > getDomainsReferencedInEntry('ensembl', 'cg2102')
    embldeleted
    emblnew_ann_con
    emblnew_con
    emblnew_standard
    emblnew_wgs
    emblrelease_ann_con
    emblrelease_con
    emblrelease_standard
    emblrelease_wgs
    go
    taxonomy
    uniprot

Web services – Get referenced entries
Get referenced entries for a domain in a particular entry
    String[] getReferencedEntries(String domain, String entryId, String referencedDomain)

    > getReferencedEntries('ensembl', 'cg2102', 'go')
    GO:0005634
    GO:0046872
    GO:0008270
    GO:0016319
    GO:0003676
    GO:0003677
    GO:0045892
    GO:0006350
    GO:0006355
    GO:0007275
    GO:0007399
    GO:0007402
    GO:0007417
    GO:0007419
    GO:0003700
    GO:0009791
    GO:0030154
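
Combining getDomainsReferencedInEntry with getReferencedEntries gives all cross-references of one entry, grouped by target domain. The Python sketch below illustrates that combination only. `collect_references` and the `fake_*` stand-ins are hypothetical, the GO IDs are the first three from the slide, and the taxonomy value is a labeled placeholder rather than real data.

```python
from typing import Callable, Dict, List


def collect_references(
    domain: str,
    entry_id: str,
    get_domains_referenced_in_entry: Callable[[str, str], List[str]],
    get_referenced_entries: Callable[[str, str, str], List[str]],
) -> Dict[str, List[str]]:
    """Return {referenced domain: [referenced entry IDs]} for one entry."""
    references = {}
    for ref_domain in get_domains_referenced_in_entry(domain, entry_id):
        references[ref_domain] = list(
            get_referenced_entries(domain, entry_id, ref_domain)
        )
    return references


# Local stand-in data (assumption): GO IDs copied from the slide,
# "TAXON_PLACEHOLDER" is NOT a real identifier.
_FAKE_REFS = {
    ("ensembl", "cg2102"): {
        "go": ["GO:0005634", "GO:0046872", "GO:0008270"],
        "taxonomy": ["TAXON_PLACEHOLDER"],
    }
}


def fake_domains(domain: str, entry_id: str) -> List[str]:
    return sorted(_FAKE_REFS.get((domain, entry_id), {}))


def fake_entries(domain: str, entry_id: str, referenced_domain: str) -> List[str]:
    return _FAKE_REFS[(domain, entry_id)][referenced_domain]


refs = collect_references("ensembl", "cg2102", fake_domains, fake_entries)
print(refs["go"])  # ['GO:0005634', 'GO:0046872', 'GO:0008270']
```

This costs one remote call per referenced domain, so for many entries the set-based operations of the API may be the better fit.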

Web services – External cross-references
List non EB-eye domains referenced in a domain
    String[] listAdditionalReferenceFields(String domain)

    > listAdditionalReferenceFields('msdpdb')
    CATH
    PFAM
    SCOP

Web services – The fields
[Figure: the three kinds of source data that are indexed: XML files (a MedlineCitationSet record, PMID 10997935), flat files (EMBL entry AF030562) and a database dump file in XML (IntAct.Experiment, release 1.0, release date 2007-Feb-16, 5697 entries, first entry id="EBI-77680").]

[Figure: EMBL flat-file entry AF030562 (Fusarium venenatum clone VEN-A, 852 BP), annotated with fields:]
    id (value stored)
    acc (value stored)
    creation_date / last_modification_date (values not stored)
    description (value stored)
    organism_species (value not stored)
    organism_classification (value not stored)
    references (not stored)

[Figure: Medline citation (PMID 14216186, Clin. Chim. Acta), annotated with fields:]
    id (value stored)
    creation_date (value not stored)
    last_modification_date (value not stored)
    issn (value not stored)
    volume (value stored)
    name (value not stored)

Web services – The fields
List available (stored) fields in a domain
    String[] listFields(String domain)

    > listFields('uniprot')
    acc_number
    description
    id
    name

Web services – Get results with fields
List result fields values for a simple query
    String[][] getResults(String domain, String query, String[] fields, int start, int size)

    > getResults('uniprot', 'polymerase', ['acc', 'id', 'description'], 0, 5)
    acc             description                     id
    ------------------------------------------------------------
    A2VB99          Polymerase.                     A2VB99_9VIRU
    Q86777          RNA polymerase (Fragment).      Q86777_9CALI
    Q779J8 Q0E5A0   DNA polymerase (EC 2.7.7.7).    Q779J8_9VIRU
    Q8I944          DNA polymerase (EC 2.7.7.7).    Q8I944_9STIC
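
Because getResults returns a plain String[][], it can help to pair each value with its field name on the client side. The Python sketch below shows one way to do that. It assumes each row lists its values in the same order as the requested `fields` array, which the slide does not confirm (its printed table is ordered differently). `rows_to_records` is a hypothetical helper, and the sample rows reuse two entries from the slide.

```python
from typing import Dict, List, Sequence


def rows_to_records(
    fields: Sequence[str], rows: Sequence[Sequence[str]]
) -> List[Dict[str, str]]:
    """Turn a String[][] style result into a list of {field: value} dicts."""
    records = []
    for row in rows:
        if len(row) != len(fields):
            raise ValueError(
                f"expected {len(fields)} values per row, got {len(row)}"
            )
        records.append(dict(zip(fields, row)))
    return records


# Sample rows built from the slide, arranged in the requested field order
# (assumption: the service keeps that order).
fields = ["acc", "id", "description"]
rows = [
    ["A2VB99", "A2VB99_9VIRU", "Polymerase."],
    ["Q86777", "Q86777_9CALI", "RNA polymerase (Fragment)."],
]

records = rows_to_records(fields, rows)
print(records[0]["description"])  # Polymerase.
```

The length check turns a mismatch between requested fields and returned values into an error instead of a silently shifted record.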

Web services – Get result fields values for entries
Get result fields values for one or several entries
    String[] getEntry(String domain, String entryId, String[] fields)
    String[][] getEntries(String domain, String[] entryIds, String[] fields)

    > getEntry('medline', '7605758', ['description', 'publication_date', 'authors'])
    description : BACKGROUND AND OBJECTIVES: Intraspinally administered alpha 2-adrenergic agonists produce analgesia in part by causing spinal acetylcholine and nitric oxide (NO) release. Clonidine-induced analgesia is enhanced by subarachnoid neostigmine and inhibited by N-methyl-L-arginine (NMLA), a blocker of NO synthesis. The authors tested whether dexmedetomidine, an alpha [...]
    publication_date : 1995 Mar-Apr
    authors : Bouaziz H. Hewitt C. Eisenach J.C.

Web services – Get the urls
Return the URLs configured for the fields of an entry
    String[] getEntryFieldUrls(String domain, String entry, String[] fields)
    String[][] getEntriesFieldUrls(String domain, String[] entries, String[] fields)

    > getEntryFieldUrls('uniprot', 'Q9QUZ9_9MURI', ['id'])
    http://srs.ebi.ac.uk/srsbin/cgi-bin/wgetz?-e+[UNIPROT:Q9QUZ9_9MURI]+-newId

Web services – Referenced entries from a domain
List of referenced entries from a domain referenced in a set of entries
    String[][] getReferencedEntriesFlatSet(String domain, String[] entries, String referencedDomain, String[] fields)
    dict(String[][]) getReferencedEntriesSet(String domain, String[] entries, String referencedDomain, String[] fields)

    > getReferencedEntriesSet('ensembl', ['AAEL005345', 'CG2102'], 'go', ['id', 'name'])
    'AAEL005345' ->
        [GO:0016319, 'mushroom body development'],
        [GO:0045892, 'negative regulation of transcription,DNA-dependent'],
        [GO:0007417, 'central nervous system development'],
        [GO:0009791, 'post-embryonic development']
    'CG2102' ->
        [GO:0005634, 'nucleus'],
        [GO:0046872, 'metal ion binding'],
        [GO:0008270, 'zinc ion binding'],
        [GO:0016319, 'mushroom body development'],
        [GO:0003676, 'nucleic acid binding'],
        [GO:0003677, 'DNA binding, ...
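
A result shaped like the getReferencedEntriesSet output can be inverted on the client to see which referenced entries are shared by several source entries. The Python sketch below is only an illustration. It assumes the dict maps each entry ID to a list of [id, name] pairs, as the slide output suggests, `index_by_reference` is a hypothetical helper, and the sample data is a subset of the slide output.

```python
from collections import defaultdict
from typing import Dict, List, Sequence


def index_by_reference(
    refs_by_entry: Dict[str, Sequence[Sequence[str]]]
) -> Dict[str, List[str]]:
    """Map each referenced ID (first value of a row) to the entries citing it."""
    index = defaultdict(set)
    for entry_id, refs in refs_by_entry.items():
        for ref in refs:
            index[ref[0]].add(entry_id)
    return {ref_id: sorted(entries) for ref_id, entries in index.items()}


# Subset of the slide output (the CG2102 list is truncated on the slide).
example = {
    "AAEL005345": [
        ["GO:0016319", "mushroom body development"],
        ["GO:0007417", "central nervous system development"],
    ],
    "CG2102": [
        ["GO:0005634", "nucleus"],
        ["GO:0016319", "mushroom body development"],
    ],
}

index = index_by_reference(example)
shared = {ref: entries for ref, entries in index.items() if len(entries) > 1}
print(shared)  # {'GO:0016319': ['AAEL005345', 'CG2102']}
```

Here GO:0016319 is the only term referenced by both entries in the sample.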

Web services – Links
• WSDL:
  • http://www.ebi.ac.uk/ebisearch/service.ebi?wsdl
• Documentation:
  • http://www.ebi.ac.uk/Tools/webservices/services/eb-eye
• Feedback!
  • http://www.ebi.ac.uk/support/

Web services – Let's play!
• 2 wrappers to hide the SOAP hassle
  • EBeyeWSWrapper.pm
  • EBeyeWSWrapper.py
• Test files to play with
  • testEBeyeWSWrapper.pl
  • testEBeyeWSWrapper.py