WP13.4: ELIXIR EB-eye Feasibility Study. Silvano Squizzato, External Services Team (EB-eye)

Presentation Transcript


  1. WP13.4: ELIXIR EB-eye Feasibility Study
     Silvano Squizzato, External Services Team (EB-eye)
     ELIXIR Work Package Meeting, May 20th 2008

  2. Challenges in searching biological data
     • Diversity of the data sets (format, size, content…).
     • Many data providers already have their own search mechanism in place.
     • Heterogeneity of the search results (display, granularity…).
     • Navigation between different resources (cross-references) is not always consistent.

  3. EB-eye - The search engine at EBI
     A fast, efficient, scalable search engine: www.ebi.ac.uk/ebisearch
     • A single access point to all the main resources hosted at the EBI.
     • Based on Apache Lucene technology.
     • Exposes both a web and a web services interface.
     • Displays search results as Google-like lists of entries.
     • Acts as a gateway to more than 40 distinct datasets (260 million entries).
     • Presents results that are up to date with the data resources.
     • Allows users to navigate the network of cross-references.
     • Searches most of the EBI resources in one go.

  4. EB-eye - Summary overview

  5. EB-eye - Data view

  6. EB-eye - Web services
     http://www.ebi.ac.uk/Tools/webservices/services/eb-eye

  7. EB-eye - Web services clients

  8. WP13.4 ELIXIR EB-eye Feasibility Study
     • Investigate the adoption of the EB-eye technology to search third-party data resources.
     • Identify the viable methods of using the EB-eye engine in different contexts.
     • Integrate new data repositories into the EB-eye with designated data provider partners.
     • Verify the coherence of search results coming from diverse sources.

  9. Partners involved
     • MEROPS (Sanger, UK)
        • http://merops.sanger.ac.uk
        • An information resource for peptidases and the proteins that inhibit them.
        • Neil D. Rawlings at the Sanger Institute.
     • GPCRDB (Vriend, Nijmegen, NL)
        • http://www.gpcr.org/7tm
        • Information system for G protein-coupled receptors.
        • G. Vriend, B. Vroling at the CMBI, Nijmegen, The Netherlands.
     • Ensembl
        • http://www.ensembl.org
        • Genome databases for vertebrates and other eukaryotic species.
     • Ensembl Genomes
        • http://www.ensemblgenomes.org
        • Extends Ensembl across the taxonomic space.
     • Sanger Institute
        • Is working to replace its current search engine using the EB-eye technology.

  10. Approach I - Import third-party data into the EB-eye
     • Data integrated into the existing EB-eye architecture
        • Requires only schema-based dumps appropriate for the EB-eye.
        • Least expensive option.
     • Minimal effort:
        • Data providers completely delegate the indexing / searching to the EB-eye.
     • The data becomes integrated with other data sets:
        • Quality assurance.
        • Navigation through the cross-references.
     • MEROPS and GPCRDB fully integrated (Nov 08)
        • XML data dumps are automatically generated via Perl scripts and are publicly accessible.
        • The data providers agreed on which fields to index and which cross-references to make available.
        • Only a few revision cycles were necessary to obtain good data dumps.
     • Ensembl Genomes added to the EB-eye (Apr 09)
        • The EB-eye Web services are also used by the Ensembl Genomes web site.
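
The dump-based route in Approach I can be pictured with a small sketch. The slide states that the real dumps are produced by Perl scripts against an EB-eye schema; the version below uses Python instead, and every element name, attribute name, identifier and cross-reference in it is an illustrative assumption rather than the official EB-eye format. It only shows the general idea: one XML document per data set, one entry per record, with the agreed fields and cross-references inside each entry.

```python
import xml.etree.ElementTree as ET


def build_dump(db_name, records):
    """Serialise records into a single XML document for indexing.

    Element and attribute names are illustrative assumptions,
    not the official EB-eye schema.
    """
    root = ET.Element("database")
    ET.SubElement(root, "name").text = db_name
    ET.SubElement(root, "entry_count").text = str(len(records))
    entries = ET.SubElement(root, "entries")
    for rec in records:
        entry = ET.SubElement(entries, "entry", id=rec["id"])
        ET.SubElement(entry, "name").text = rec["name"]
        ET.SubElement(entry, "description").text = rec["description"]
        # Cross-references let the search engine link this entry to others.
        xrefs = ET.SubElement(entry, "cross_references")
        for target_db, target_id in rec.get("xrefs", []):
            ET.SubElement(xrefs, "ref", dbname=target_db, dbkey=target_id)
    return ET.tostring(root, encoding="unicode")


# Made-up record: identifiers and cross-reference are placeholders.
records = [
    {
        "id": "X00.001",
        "name": "example peptidase",
        "description": "Placeholder record used only to show the layout.",
        "xrefs": [("UNIPROT", "P00000")],
    }
]
dump_xml = build_dump("EXAMPLE_DB", records)
print(dump_xml)
```

A provider would regenerate such a file on each release and publish it where the indexer can fetch it, which is what keeps the provider's effort low in this approach.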

  11. Approach II - Full export of the EB-eye technology
     • Data integrated in the EB-eye existing architecture
        • Hardware requirements might be expensive.
        • Expertise and a learning curve are needed to run and administer a full production system.
        • Local dependencies for the EB-eye installations: know-how and manpower are required to maintain the local infrastructure.
     • EB-eye team is collaborating with the Sanger Institute
        • Support for third-party customisations.
        • Additional documentation for third parties:
           • Software architecture.
           • User manuals (admin / end-user).

  12. Approach III - Export of part of the EB-eye
     • Partial export of the EB-eye: indexing engine
        • Hybrid integration model.
        • Searching infrastructure runs centrally at EBI.
        • Data providers have full control of the indexing and re-distribution of locally produced indices.
        • Too abstract for most users.
     • An attempt towards a federated search mechanism
        • Not necessary, since users can consume the EB-eye Web Services to integrate data into their own portals (see Ensembl Genomes).
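
The hybrid model in Approach III (indexing at the provider, searching centrally) can be illustrated with a toy inverted index. This is a sketch under stated assumptions: the EB-eye is based on Apache Lucene, whereas the code below uses plain Python dictionaries, and the function names, provider names and entries are all hypothetical.

```python
from collections import defaultdict


def build_local_index(dataset, entries):
    """Provider side: map each lower-cased token to (dataset, entry_id) pairs."""
    index = defaultdict(set)
    for entry_id, text in entries.items():
        for token in text.lower().split():
            index[token].add((dataset, entry_id))
    return index


def merge_indices(indices):
    """Central side: combine the indices shipped by the providers."""
    merged = defaultdict(set)
    for index in indices:
        for token, postings in index.items():
            merged[token] |= postings
    return merged


def search(merged, query):
    """Return entries matching every query token, grouped by dataset."""
    tokens = query.lower().split()
    if not tokens:
        return {}
    hits = set.intersection(*(merged.get(t, set()) for t in tokens))
    grouped = defaultdict(list)
    for dataset, entry_id in sorted(hits):
        grouped[dataset].append(entry_id)
    return dict(grouped)


# Hypothetical providers and entries, for illustration only.
index_a = build_local_index(
    "provider_a",
    {"A1": "serine peptidase inhibitor", "A2": "metallo peptidase"},
)
index_b = build_local_index("provider_b", {"B1": "peptidase receptor"})
merged = merge_indices([index_a, index_b])

print(search(merged, "peptidase"))
print(search(merged, "serine peptidase"))
```

The split mirrors the slide: `build_local_index` is the part a provider would control, while `merge_indices` and `search` stand in for the infrastructure that stays at the centre.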

  13. Conclusions
     • The EB-eye is a flexible and scalable solution for new third-party data sources.
     • Quickest and most effective mechanism of integration:
        • Exporting content data into the EB-eye system.
        • The new data sets added to the EB-eye become part of a coherent chain of cross-references.
     • Limitations to the distribution of the EB-eye
        • Not available at the moment as a downloadable software package.
        • Limited human resources to support tier installations.

  14. Future directions for the EB-eye interoperability
     • Attention to the quality of data provided
        • Data used by the EB-eye should be up to date with mothership portals.
        • Cross-references need to be consistent between different resources to avoid:
           • The display of broken links to non-existing entries.
           • Discrepancies between different data sets.
     • Proposed features
        • External references.
        • Export of search results.
        • Running tools from the results (i.e. using Web Services).
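
The cross-reference consistency concern raised above lends itself to an automated check. The following is a minimal sketch, assuming each data set can be reduced to a set of entry identifiers and each cross-reference to a (source, target) pair; the function name, data-set names and identifiers are hypothetical and do not describe how the EB-eye performs its own quality assurance.

```python
def find_broken_xrefs(datasets, xrefs):
    """Return cross-references whose target entry does not exist.

    datasets: {dataset_name: set of entry ids}
    xrefs: iterable of (source_db, source_id, target_db, target_id)
    """
    broken = []
    for source_db, source_id, target_db, target_id in xrefs:
        # An unknown target data set counts as broken too.
        if target_id not in datasets.get(target_db, set()):
            broken.append((source_db, source_id, target_db, target_id))
    return broken


# Hypothetical data sets and links, for illustration only.
datasets = {
    "db_a": {"A1", "A2"},
    "db_b": {"B1"},
}
xrefs = [
    ("db_a", "A1", "db_b", "B1"),  # valid link
    ("db_a", "A2", "db_b", "B9"),  # target entry missing
    ("db_b", "B1", "db_c", "C1"),  # target data set missing
]
broken = find_broken_xrefs(datasets, xrefs)
for link in broken:
    print("broken cross-reference:", link)
```

Running a check of this kind whenever a dump is refreshed would flag links to non-existing entries before they are displayed.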

  15. Acknowledgements
     • Rodrigo Lopez, Mickael Goujon, Franck Valentin, Silvano Squizzato
     • Sanger, CMBI
