IntAct Molecular Interaction Database
Samuel Kerrien (skerrien@ebi.ac.uk)
APO-SYS 2008, 25th June 2008
Wellcome Trust Genome Campus, Hinxton, Cambridge, UK

Outline
• Lecture: The IntAct Database (45’)
  • History of the project
  • The data we are dealing with
  • Additional resources (UniProt, Newt, OLS)
  • Data standards for molecular interactions
  • IntAct applications allowing you to:
    • Browse: Search
    • Visualize: hierarchView
• Practical session (60’)
! = things to remember for the practical session

What data are we dealing with?
[Figure: Genomics and Proteomics. Labels: DNA, RNA, Protein, Small Molecules; Genomics, Transcriptomics, Proteomics, Metabolomics; Functional Genomics/Proteomics]

Why are we interested in interactions?
• As a means of precisely understanding a protein's role inside a specific cell type
• Guilt by association: it may be the only means of predicting a protein's function
• As building blocks for systems biology

IntAct goals & achievements
Goals:
• Define a standard for the representation and annotation of protein interaction data
• Provide a public repository
• Populate the repository with experimental data from project partners and curated literature data
• Provide modular analysis tools
• Provide portable versions of the software to allow installation of local IntAct nodes
Achievements:
• Curation manual available from the home page
• Member of the International Molecular Interaction Exchange consortium (IMEx)
• Public repository: http://www.ebi.ac.uk/intact and ftp://ftp.ebi.ac.uk/pub/databases/intact
• 3,300+ distinct publications, 169,000+ binary interactions, 63,000+ proteins imported from UniProt
• Tools: search & advanced search, hierarchView, pay-as-you-go, MiNe…
• Known installations: AstraZeneca, GSK, MERCK, MINT, Proteome Center of Shanghai

IntAct at a glance
Statistics

Interactome coverage (Christian Kohler)
• Only a fraction of all published interactions is captured in interaction databases
• No end is in sight: the interaction space is still vastly under-sampled

IntAct at a glance
Public data
• All data is manually curated by expert curators
• The curation manual is rigorously followed
• All curated data is reviewed by a senior curator
• Topic-centric datasets are available (e.g. apoptosis)
• All data is made available on the FTP site:
  • (!) Data updated every week
  • (!) Formats available
Data: ftp://ftp.ebi.ac.uk/pub/databases/intact

How to model an interaction
[Diagram labels: Publication; Experiment1, Experiment2; Interaction1 to Interaction4; Participant1 to Participant3; Protein1, Protein2. Participant: roles, features, preparations]

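The labels on this slide suggest a nested object model. The following is a minimal sketch only, assuming that a publication groups experiments, an experiment groups interactions, and each interaction has participants that point to a protein. All class and field names are hypothetical and are not taken from the IntAct code base.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Protein:
    accession: str  # e.g. a UniProtKB accession


@dataclass
class Participant:
    # One protein taking part in one interaction (hypothetical fields).
    protein: Protein
    roles: List[str] = field(default_factory=list)
    features: List[str] = field(default_factory=list)
    preparations: List[str] = field(default_factory=list)


@dataclass
class Interaction:
    name: str
    participants: List[Participant] = field(default_factory=list)


@dataclass
class Experiment:
    name: str
    interactions: List[Interaction] = field(default_factory=list)


@dataclass
class Publication:
    identifier: str
    experiments: List[Experiment] = field(default_factory=list)


def proteins_in(publication: Publication) -> List[str]:
    """Walk publication -> experiment -> interaction -> participant -> protein."""
    seen: List[str] = []
    for experiment in publication.experiments:
        for interaction in experiment.interactions:
            for participant in interaction.participants:
                accession = participant.protein.accession
                if accession not in seen:
                    seen.append(accession)
    return seen


# Placeholder names borrowed from the slide labels; the roles are assumptions.
p1, p2 = Protein("Protein1"), Protein("Protein2")
interaction1 = Interaction(
    "Interaction1",
    [Participant(p1, roles=["bait"]), Participant(p2, roles=["prey"])],
)
experiment1 = Experiment("Experiment1", [interaction1])
publication = Publication("hypothetical-publication-id", [experiment1])
print(proteins_in(publication))
```

Keeping the participant separate from the protein lets the same protein appear in several interactions with different roles, features and preparations.
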
Data model
• Support for detailed features, i.e. definition of the interacting interface
[Figure: Interacting domains. Overlay of ranges on sequence]

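A feature range overlaid on a sequence could be represented roughly as follows. This is a sketch under stated assumptions: positions are taken to be 1-based and inclusive, the `Range` class and `residues` helper are hypothetical, and the sequence is made up.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Range:
    start: int  # assumed 1-based, inclusive
    end: int    # assumed 1-based, inclusive


def residues(sequence: str, rng: Range) -> str:
    """Return the part of the sequence covered by the range."""
    if not (1 <= rng.start <= rng.end <= len(sequence)):
        raise ValueError(f"range {rng} does not fit a sequence of length {len(sequence)}")
    return sequence[rng.start - 1:rng.end]


sequence = "MKTAYIAKQRQISFVKSHFSRQ"  # made-up sequence, for illustration only
binding_site = Range(start=5, end=10)
print(residues(sequence, binding_site))
```
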
Controlled vocabularies
• Why do we use them?
  • e.g. there are more than 20 ways to write "yeast two hybrid": Y2H, 2H, two-hybrid, …
• Full integration of the PSI-MI ontology
  • Over 1,200 terms, fully defined and cross-referenced

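The benefit of a controlled vocabulary can be illustrated by mapping free-text spellings onto a single term. The sketch below is illustrative only: the synonym table is hand-written from the spellings quoted on the slide, the canonical label "two hybrid" is an assumption, and `normalise` is a hypothetical helper rather than an IntAct function.

```python
from typing import Optional

# Hand-written synonym table; a real one would be derived from the ontology itself.
SYNONYMS = {
    "yeast two hybrid": "two hybrid",
    "y2h": "two hybrid",
    "2h": "two hybrid",
    "two-hybrid": "two hybrid",
    "two hybrid": "two hybrid",
}


def normalise(free_text: str) -> Optional[str]:
    """Map a free-text method name to its controlled term, or None if unknown."""
    key = " ".join(free_text.lower().split())  # lower-case and collapse whitespace
    return SYNONYMS.get(key)


for spelling in ["Yeast  Two Hybrid", "Y2H", "two-hybrid", "2H"]:
    print(spelling, "->", normalise(spelling))
```

Once every record carries the same term, a search for one method finds all of its data regardless of how the original authors wrote it.
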
Controlled vocabularies
• These controlled vocabularies are hierarchical and vary in size and complexity.
[Figures: Interaction detection methods; Interactor types]

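Because the vocabularies are hierarchical, one can ask whether a term is a kind of a more general term. The following sketch uses a toy parent table whose structure is invented for illustration and does not reproduce the actual PSI-MI hierarchy; `ancestors` and `is_a` are hypothetical helpers.

```python
from typing import Dict, List, Set

# Toy hierarchy (child -> parents). Illustrative only, not the real PSI-MI tree.
PARENTS: Dict[str, List[str]] = {
    "experimental interaction detection": ["interaction detection method"],
    "affinity technology": ["experimental interaction detection"],
    "tandem affinity purification": ["affinity technology"],
    "two hybrid": ["experimental interaction detection"],
}


def ancestors(term: str) -> Set[str]:
    """Collect every term reachable by following parent links."""
    found: Set[str] = set()
    stack = list(PARENTS.get(term, []))
    while stack:
        current = stack.pop()
        if current not in found:
            found.add(current)
            stack.extend(PARENTS.get(current, []))
    return found


def is_a(term: str, general_term: str) -> bool:
    """True if term equals general_term or descends from it."""
    return term == general_term or general_term in ancestors(term)


print(is_a("tandem affinity purification", "experimental interaction detection"))
print(is_a("two hybrid", "affinity technology"))
```
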
How to deal with complexes
• Some experimental protocols generate complex data:
  • e.g. tandem affinity purification (TAP)
• One may want to convert these complexes into sets of binary interactions. 2 algorithms are available.

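The text does not name the two algorithms. The sketch below assumes they are the two commonly used expansions: spoke (the bait is paired with every prey) and matrix (every member is paired with every other member). Function names and the example complex are hypothetical.

```python
from itertools import combinations
from typing import List, Sequence, Tuple

Pair = Tuple[str, str]


def spoke_expansion(bait: str, preys: Sequence[str]) -> List[Pair]:
    """Pair the bait with each prey: n - 1 pairs for a complex of n members."""
    return [(bait, prey) for prey in preys]


def matrix_expansion(members: Sequence[str]) -> List[Pair]:
    """Pair every member with every other member: n * (n - 1) / 2 pairs."""
    return list(combinations(members, 2))


# A made-up complex with one bait (A) and three preys.
print(spoke_expansion("A", ["B", "C", "D"]))
print(matrix_expansion(["A", "B", "C", "D"]))
```

Spoke expansion yields fewer pairs and relies on knowing the bait; matrix expansion needs no bait but produces more pairs, some of which may not reflect direct contacts.
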
Other useful databases
• You will need to know a little more about the following databases for the practical part of this session:
  • Ontology Lookup Service (OLS)
  • UniProtKB (Universal Protein Resource)
  • Newt (NCBI taxonomy)

Ontology Lookup Service
• Makes OBO controlled vocabularies available
• The web site allows searching and browsing their hierarchy
• Each term has a definition as well as literature references
http://www.ebi.ac.uk/ontology-lookup

UniProt Knowledge Base
• Swiss-Prot: manual annotation (~300,000 proteins)
• TrEMBL: automatic annotation (~3,300,000 proteins)
• Interactions in IntAct use splice variants
http://www.ebi.uniprot.org/

UniProt Knowledge Base
• Summary:
  • Master protein: P60953
  • (!) Splice variants / isoforms: P60953-1, P60953-2
http://www.ebi.uniprot.org/

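The accessions above follow a "master accession, dash, isoform number" pattern. A minimal sketch for separating the two parts is shown here; the regular expression is deliberately loose and is not a full validation of UniProt accession syntax, and `split_accession` is a hypothetical helper.

```python
import re
from typing import Optional, Tuple

# Loose pattern: an alphanumeric master accession, optionally followed by -<number>.
_ACCESSION = re.compile(r"^(?P<master>[A-Z0-9]+)(?:-(?P<isoform>\d+))?$")


def split_accession(accession: str) -> Tuple[str, Optional[int]]:
    """Return (master accession, isoform number or None)."""
    match = _ACCESSION.match(accession.strip())
    if match is None:
        raise ValueError(f"unrecognised accession: {accession!r}")
    isoform = match.group("isoform")
    return (match.group("master"), int(isoform) if isoform else None)


for accession in ["P60953", "P60953-1", "P60953-2"]:
    print(accession, "->", split_accession(accession))
```

Grouping interactions by the master accession brings together data reported for the different splice variants of the same protein.
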
UniProt Knowledge Base
• IntAct exports interaction data to UniProt.
• (!) Only interactions detected by specific methods are exported: mostly physical, hence higher-quality interactions.
http://www.ebi.uniprot.org/

Newt
• Web interface to the NCBI taxonomy

IntAct Applications
• Now we are going to see:
  • The IntAct home page
  • How to search data
  • How to build complex queries
  • How to navigate from protein -> interaction -> experiments -> publication
  • How to visualize interaction networks using:
    • IntAct tools
    • A third-party application: Cytoscape

IntAct – Home Page
http://www.ebi.ac.uk/intact

Software demonstration
• Web application: binary search
  • Simple yet powerful search engine
  • Binary-interaction centric
  • Advanced search: how to build complex queries
  • Entry point to other applications

Browsing – binary search
First search from the home page…
[Screenshot callouts: Details of interaction, PubMed, Newt, UniProtKB, OLS]

PSIMITAB columns
Standard columns (15):
• ID(s) interactor A & B
• Alt. ID(s) interactor A & B
• Alias(es) interactor A & B
• Interaction detection method(s)
• Publication 1st author(s)
• Publication identifier(s)
• Taxid interactor A & B
• Interaction type(s)
• Source database(s)
• Interaction identifier(s)
• Confidence value(s)
IntAct-specific columns (+11):
• Experimental role(s) of interactors
• Biological role(s) of interactors
• Properties (CrossReference) of interactors
• Type(s) of interactors
• HostOrganism(s)
• Expansion method(s)
• Dataset name(s)

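The 15 standard columns can be read from a tab-separated line along these lines. This is a sketch, assuming the columns appear in the order listed above with interactor A before B, that "-" marks an empty column and that "|" separates multiple values. The field names and the sample line are made up, and the naive split on "|" ignores quoting.

```python
from typing import Dict, List, Optional

# Hypothetical field names for the 15 standard columns, in the assumed order.
STANDARD_COLUMNS = [
    "id_a", "id_b", "alt_id_a", "alt_id_b", "alias_a", "alias_b",
    "detection_method", "first_author", "publication_id",
    "taxid_a", "taxid_b", "interaction_type", "source_db",
    "interaction_id", "confidence",
]


def parse_line(line: str) -> Dict[str, Optional[List[str]]]:
    """Split one tab-separated line into the 15 standard columns."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) < len(STANDARD_COLUMNS):
        raise ValueError(f"expected at least 15 columns, got {len(fields)}")
    record: Dict[str, Optional[List[str]]] = {}
    for name, value in zip(STANDARD_COLUMNS, fields):
        # "-" is treated as empty; "|" as a separator between multiple values.
        record[name] = None if value == "-" else value.split("|")
    return record


# Made-up line for illustration; it does not describe a real interaction.
sample = "\t".join([
    "uniprotkb:P00001", "uniprotkb:P00002", "-", "-", "-", "-",
    'psi-mi:"MI:0018"(two hybrid)', "Doe et al. (2008)", "pubmed:00000000",
    "taxid:9606", "taxid:9606", "-", "-", "-", "-",
])
record = parse_line(sample)
print(record["id_a"], record["id_b"], record["detection_method"])
```

Any IntAct-specific columns after the first 15 are simply ignored by this sketch.
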
Browsing – binary search
First search from the home page…
• Using the IntAct query language, one can also build complex queries
• List of terms one can query on [screenshot]

Browsing – binary search
How to build complex queries…
• Advanced search gives access to more options…

Software demonstration
• Web application: detailed search
  • IntAct's original search interface
  • More detailed information about experiment, interaction, interactor…
  • Entry point to other applications

Browsing – binary search
First search from the home page…
[Screenshot callout: Details of interaction]

Browsing – binary search
Viewing details of an interaction…

Browsing – search
Search result for ‘RAD1’

Browsing – search
Binary view of o60671_human
[Screenshot callouts: Protein selected; Proteins known to interact with o60671_human]

Browsing – search
Details of a protein

Browsing – search
Binary view of rad1_yeast

Browsing – search
Experiment view
[Screenshot callout: Interaction between rad1_yeast and sahh_yeast]

Browsing – search
Details of an interaction type
• All CVs can be clicked, giving access to:
  • Comprehensive definition
  • Cross-references

Browsing – search
(!) An interaction involving a feature

Software demonstration
• Web application: hierarchView
  • 2D visualization of molecular interaction networks
  • Interactive expansion of the network
  • Highlighting of proteins in the context of their GO/InterPro annotations
  • Download of the network in PSI-MI XML
  • Can be combined with third-party software (e.g. Cytoscape)

Visualizing – hierarchView
(!) From search to hierarchView…

Visualizing – hierarchView
Description of the user interface
• 2D interaction network
• Search box: supports a list of interactors
• Add interactions to the current network
• Network expansion around all selected interactors
• Mouse click behaviour
• Protein annotations: count of proteins sharing a term, selection for highlight, display of the GO hierarchy
• Download current network: currently PSI-MI 1.0 & 2.5
• 1..n selected proteins

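Network expansion around selected interactors can be pictured as adding every known interaction that touches a selected protein. The sketch below is not hierarchView's implementation: the `expand` function, the edge representation and the toy interaction list are all assumptions made for illustration.

```python
from typing import FrozenSet, Iterable, Set, Tuple

Edge = FrozenSet[str]


def expand(network: Set[Edge],
           selected: Iterable[str],
           known_interactions: Iterable[Tuple[str, str]]) -> Set[Edge]:
    """Return the network plus every known interaction touching a selected protein."""
    chosen = set(selected)
    expanded = set(network)
    for a, b in known_interactions:
        if a in chosen or b in chosen:
            expanded.add(frozenset((a, b)))  # undirected edge
    return expanded


# Toy data: placeholder protein names, not real interactions.
known = [("A", "B"), ("B", "C"), ("C", "D"), ("A", "E")]
network = {frozenset(("A", "B"))}
bigger = expand(network, ["B"], known)
print(sorted(sorted(edge) for edge in bigger))
```

Calling `expand` again with newly added proteins selected grows the network one step further each time.
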
Visualizing – hierarchView
Expansion of the existing network

Visualizing – hierarchView
Highlight of GO annotation
• GO term highlight:
  • Select a single term
  • Select a term and its children

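The difference between selecting a single term and selecting a term with its children can be sketched as follows. The term names, the hierarchy and the protein annotations are placeholders invented for this example, and `highlight` is a hypothetical helper, not part of hierarchView.

```python
from typing import Dict, List, Set

# Placeholder hierarchy (parent -> children) and annotations (protein -> terms).
CHILDREN: Dict[str, List[str]] = {
    "term_root": ["term_a", "term_b"],
    "term_a": ["term_a1"],
}
ANNOTATIONS: Dict[str, Set[str]] = {
    "protein1": {"term_a1"},
    "protein2": {"term_b"},
    "protein3": {"term_root"},
}


def term_and_children(term: str) -> Set[str]:
    """Return the term together with all of its descendants."""
    collected: Set[str] = set()
    stack = [term]
    while stack:
        current = stack.pop()
        if current not in collected:
            collected.add(current)
            stack.extend(CHILDREN.get(current, []))
    return collected


def highlight(term: str, include_children: bool) -> List[str]:
    """List the proteins annotated with the term (and optionally its children)."""
    wanted = term_and_children(term) if include_children else {term}
    return sorted(p for p, terms in ANNOTATIONS.items() if terms & wanted)


print(highlight("term_a", include_children=False))
print(highlight("term_a", include_children=True))
```
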