Molecular Interactions and Pathways
Sandra Orchard, EMBL-EBI
19.09.2014
EBI is an Outstation of the European Molecular Biology Laboratory.
Why is it useful to study PPI interactions, networks and pathways?
• Proteins are the workhorses of the cell, and all their activities are controlled through interactions with other molecules.
• To understand the biology of a single protein, you have to study its interacting partners.
• Network/pathway analysis is increasingly used as a tool to annotate large data sets – proteins involved in a common process tend to cluster and be present in the same pathway.
Why are there so many issues with interaction data?
• Wide variety of methods for demonstrating molecular interactions – all have their strengths and weaknesses
• No single method accurately defines an interaction as being a true binary interaction observed under physiological conditions
Why do we need interaction databases?
• Issues with all interaction data – true picture can only be built up by combining data derived using multiple techniques, multiple laboratories
• Problematic for any bench researcher to do – issues with data formats, molecular identifiers, sheer volume of data
• Molecular interaction databases are publicly funded to collect this data and annotate it in a format most useful to researchers
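The idea of combining evidence from multiple techniques and laboratories can be sketched in a few lines of Python. This is only an illustration: the record layout, identifiers, method names and publication labels below are made up, and real databases store far richer annotation.

```python
from collections import defaultdict

# Hypothetical records: (protein A, protein B, detection method, publication).
# All identifiers and values are invented for illustration only.
records = [
    ("P1", "P2", "two hybrid", "pub-1"),
    ("P2", "P1", "pull down", "pub-2"),
    ("P1", "P2", "two hybrid", "pub-3"),
    ("P3", "P4", "two hybrid", "pub-1"),
]


def combine_evidence(records):
    """Group records by unordered protein pair and collect the
    distinct detection methods and publications reporting each pair."""
    evidence = defaultdict(lambda: {"methods": set(), "publications": set()})
    for a, b, method, publication in records:
        pair = tuple(sorted((a, b)))  # A-B and B-A are the same interaction
        evidence[pair]["methods"].add(method)
        evidence[pair]["publications"].add(publication)
    return dict(evidence)


summary = combine_evidence(records)
for pair, ev in sorted(summary.items()):
    print(pair, len(ev["methods"]), "methods,", len(ev["publications"]), "publications")
```

A pair supported by several independent methods and publications would usually be trusted more than one seen once; how to weight that evidence is left open here.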
Why are data standards essential?
• Prior to 2003, many databases = many formats
• User must reformat when merging data
• File conversion inevitably leads to data loss
• Many formats compromised tool development – each tool developed tended to be database specific
PSI-MI XML format
• Community standard for Molecular Interactions
• XML schema and detailed controlled vocabularies
• Jointly developed by major data providers: BIND, CellZome, DIP, GSK, HPRD, Hybrigenics, IntAct, MINT, MIPS, Serono, U. Bielefeld, U. Bordeaux, U. Cambridge, and others
• Version 1.0 published in February 2004: The HUPO PSI Molecular Interaction Format - A community standard for the representation of protein interaction data. Henning Hermjakob et al., Nature Biotechnology 2004, 22, 176-183.
• Version 2.5 published in October 2007: Broadening the Horizon – Level 2.5 of the HUPO-PSI Format for Molecular Interactions. Samuel Kerrien et al., BioMed Central, 2007.
PSI-MI XML benefits
• Collecting and combining data from different sources has become easier
• Standardized annotation through PSI-MI ontologies
• Tools from different organizations can be chained, e.g. analysis of IntAct data in Cytoscape
Home page: http://www.psidev.info/MI
Controlled vocabularies www.ebi.ac.uk/ols
IMEx
• Consortium of 9 molecular interaction databases dedicated to producing high quality, annotated data, curated to the same standards
• Data is curated once at a single centre, then exchanged between partners
• Users need only go to a single site to obtain all data
• www.imexconsortium.org
IntAct goals & achievements
• Publicly available repository of molecular interactions (mainly PPIs): ~305K binary interactions taken from >6,200 publications (December 2012)
• Data is standards-compliant and available via our website, for download at our ftp site, or via PSICQUIC
• Provide open-access versions of the software to allow installation of local IntAct nodes
http://www.ebi.ac.uk/intact
ftp://ftp.ebi.ac.uk/pub/databases/intact
www.ebi.ac.uk/Tools/webservices/psicquic/view/main.xhtml
IntAct Curation: “Lifecycle of an Interaction”
[Figure: curation workflow diagram. Labels: publication (full text); curator; curation manual; annotate; CVs; check; super curator; accept / reject; sanity checks (nightly); reports; public web site; FTP site; IMEx; Mint, DIP, MatrixDB.]
UniProt Knowledge Base
Interactions can be mapped to the canonical sequence, to splice variants, or to post-processed chains.
http://www.ebi.uniprot.org/
Interacting domains
Overlay of ranges on sequence. Data model:
• Support for detailed features, i.e. definition of the interacting interface
How to deal with Complexes
• Some experimental protocols generate complex data, e.g. tandem affinity purification (TAP)
• One may want to convert these complexes into sets of binary interactions; 2 algorithms are available:
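The slide text does not name the two algorithms. The sketch below assumes they are the two commonly described expansion models, "spoke" (bait paired with each prey) and "matrix" (every member paired with every other member); the function names and the example complex are hypothetical.

```python
from itertools import combinations


def spoke_expansion(bait, preys):
    """Pair the bait with each prey; preys are not paired with each other."""
    return [(bait, prey) for prey in preys]


def matrix_expansion(members):
    """Pair every member of the complex with every other member."""
    return list(combinations(members, 2))


# Hypothetical TAP result: bait A co-purifies with B, C and D.
bait, preys = "A", ["B", "C", "D"]
spoke = spoke_expansion(bait, preys)
matrix = matrix_expansion([bait] + preys)

print(len(spoke), "spoke pairs:", spoke)
print(len(matrix), "matrix pairs:", matrix)
```

For a complex of n members, spoke yields n - 1 pairs and matrix yields n(n - 1)/2, so matrix expansion adds many more pairs that were never directly tested. Either way, the resulting pairs are inferred, not observed, binary interactions.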
IntAct – Home Page http://www.ebi.ac.uk/intact
Interaction detail
[Screenshot labels: choice of UniProtKB or Dasty view; details of interaction; PubMed/IMEx ID]
Viewing Interaction Details Additional information
Visualization
Applying a better graph layout…
Reactome is…
• Extensively cross-referenced
• Tools for data analysis – Pathway Analysis, Expression Overlay, Species Comparison, Biomart…
• Used to infer orthologous events in 20 other species
Using model organism data to build pathways – Inferred pathway events
[Figure: pathway events supported by direct evidence (human) and indirect evidence (mouse, cow), each linked to a literature reference; example labels: PMID:5555, PMID:4444, PMID:8976, PMID:1234.]
Theory - Reactions
Pathway steps = the “units” of Reactome = events in biology
[Figure: example reaction types: binding, dissociation, degradation, phosphorylation, dephosphorylation, transport, classic biochemical.]
Reactions Connect into Pathways
[Figure: a chain of reactions, each with an input, an output and a catalyst.]
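Data Expansion – Projecting to Other Species
[Figure: a human reaction involving proteins A and B (A + ATP → A-P + ADP) is projected to other species. Mouse: the same reaction is inferred. Drosophila: no orthologue, so the protein is not inferred and the reaction is not inferred.]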
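The projection shown on this slide can be sketched as a lookup against an orthology table. This is a simplified illustration, not Reactome's actual inference procedure: it assumes every protein in the reaction needs an orthologue, treats small molecules such as ATP as species-independent, and uses invented names throughout.

```python
# Hypothetical orthology table: human protein -> {species: orthologue}.
orthologues = {
    "A": {"mouse": "A_mouse"},
    "B": {"mouse": "B_mouse", "drosophila": "B_fly"},
}

# Hypothetical human reaction; small molecules are left out of the mapping
# because they are assumed to be the same in every species.
human_reaction = {"name": "phosphorylation of A", "proteins": ["A", "B"]}


def project_reaction(reaction, species, orthologues):
    """Return the reaction mapped onto `species`, or None when any
    protein participant has no orthologue in that species."""
    mapped = {}
    for protein in reaction["proteins"]:
        ortho = orthologues.get(protein, {}).get(species)
        if ortho is None:
            return None  # protein not inferred, so reaction not inferred
        mapped[protein] = ortho
    return {"name": reaction["name"], "species": species, "proteins": mapped}


for species in ("mouse", "drosophila"):
    print(species, project_reaction(human_reaction, species, orthologues))
```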
The Pathway Browser
[Screenshot labels: zoom/move toolbar; thumbnail; species selector; diagram key; sidebar; pathway diagram panel; details panel (hidden)]
Pathway Analysis – Overrepresentation
[Screenshot labels: ‘top-level’ pathways; reveal next level; P-val]
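The slide does not say how the P-value is calculated. One common choice for overrepresentation analysis is the hypergeometric test, sketched below as an assumption and not as a description of Reactome's implementation; the function name and the numbers in the example are hypothetical.

```python
from math import comb


def hypergeom_pvalue(population, pathway_size, sample_size, hits):
    """Probability of seeing at least `hits` pathway members when
    `sample_size` identifiers are drawn without replacement from
    `population` identifiers, of which `pathway_size` are in the pathway."""
    upper = min(pathway_size, sample_size)
    total = comb(population, sample_size)
    tail = sum(
        comb(pathway_size, k) * comb(population - pathway_size, sample_size - k)
        for k in range(hits, upper + 1)
    )
    return tail / total


# Hypothetical numbers: 20,000 annotated proteins, a pathway of 100,
# a submitted list of 200 identifiers, 8 of which fall in the pathway.
p = hypergeom_pvalue(20000, 100, 200, 8)
print(f"p = {p:.3g}")
```

In practice many pathways are tested at once, so the raw P-values would normally be corrected for multiple testing before interpretation.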
Species Comparison II
Colour key: yellow = human/rat; blue = human only; grey = not relevant; black = complex
Expression Analysis II
Colour key: ‘hot’ = high; ‘cold’ = low
Step through data columns
Summary
Network and pathway analysis enable the researcher to:
• Identify clusters of proteins – these may share the same function (stable complex), process or subcellular location
• Identify proteins involved in the same pathway, i.e. in the same process (only works for those proteins which can be placed in pathways)
• Add biological meaning to a list of gene/transcript/protein identifiers
Interactions, Pathways and Networks: http://www.ebi.ac.uk/training/online/
Koh GC, Porras P, Aranda B, Hermjakob H, Orchard SE. Analyzing protein-protein interaction networks. J Proteome Res 2012; 11: 2014-31. PMID:22385417
Current IntAct support: European Commission grants PSIMEx (FP7-HEALTH-2007-223411), APO-SYS (FP7-HEALTH-2007-200767) and Affinomics (241481).
The development of Reactome is supported by a grant from the US National Institutes of Health (P41 HG003751), EU grant LSHG-CT-2005-518254 "ENFIN", Ontario Research Fund, and the EBI Industry Programme.