Modeling Functional Genomics Datasets CVM8890-101

Modeling Functional Genomics Datasets CVM8890-101. Lesson 6 11 July 2007 Bindu Nanduri. Lesson 6: Functional genomics modeling II: a pathway analysis example. Introduction to protein interaction networks. Cancer. Programmed Cell Death. Proliferation. Cell. Quiescence. Differentiation.

Presentation Transcript


  1. Modeling Functional Genomics Datasets CVM8890-101. Lesson 6, 11 July 2007. Bindu Nanduri

  2. Lesson 6: Functional genomics modeling II: a pathway analysis example.

  3. Introduction to protein interaction networks

  4. Cell: cancer, programmed cell death, proliferation, quiescence, differentiation

  5. CD4+ T "helper" lymphocyte: lymphoma, anergy, programmed cell death, activation, quiescence, proliferation, differentiation

  6. AgBase protein annotation process: protein identifiers or FASTA format; GORetriever; annotated proteins; proteins with no annotations; GOanna; GOSlimViewer

  7. [Figure: potential CD4+ T lymphocyte biological processes (angiogenesis, anergy, apoptosis, senescence, proliferation, migration, differentiation, quiescence, activation, cell cycle), each labeled with a pair of percentages]

  8. AP-1 dependent gene expression Tumor invasion Metastasis Integrin Signaling Pathway AP-1

  9. Hypothesis-driven data analysis. Exploration of data to identify pathways of interacting proteins. Protein-protein interaction (PPI) networks.

  10. Why study PPIs? Proteins do not function alone! PPIs are inherent to the function of multiprotein complexes. PPIs can help infer function where functional information is available for one partner. Changes in normal PPIs can result in disease.

  11. Types of PPI

  12. PPI categories based on composition, affinity and timescale of interaction. Homo- and hetero-oligomeric complexes: interactions between identical or non-identical chains. Obligate PPI: the protomers do not exist as stable structures in vivo on their own; these complexes are functionally obligate. Non-obligate PPI: the protomers can exist as stable structures and may co-localize for function or are already co-localized. Examples: Arc repressor dimer (necessary for DNA binding); sperm lysin (non-obligate homodimer).

  13. PPI based on the lifetime of the complex: transient or permanent. Permanent interactions are stable and exist only as a complex. Transient interactions are marked by association/dissociation cycles in vivo. Weak transient interactions (sperm lysin) associate and dissociate continuously. Strong transient interactions require a molecular trigger: the heterotrimeric G protein dissociates into G-alpha and G-beta/G-gamma subunits when it binds GTP; the GDP-bound form is a trimer.

  14. Control of protein oligomerization. PPIs form a continuum between obligate and non-obligate states. Whether a complex forms is driven by the concentration of its components and by the free energy of the complex relative to alternative states.
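
The role of concentration and free energy can be illustrated with the simplest case, a monomer-dimer equilibrium 2A <-> A2. The sketch below is not from the lecture: it assumes an ideal two-state model, a 1 M standard state, and a made-up binding free energy of -40 kJ/mol; the function names are illustrative only.

```python
import math

R = 8.314  # gas constant, J/(mol*K)

def kd_from_delta_g(delta_g, temperature=298.15):
    """Dissociation constant (M) from a binding free energy in J/mol.

    Assumes delta_g is negative for favorable binding and a 1 M standard state.
    """
    return math.exp(delta_g / (R * temperature))

def fraction_in_dimer(total_conc, kd):
    """Fraction of protein chains found in the dimer for 2A <-> A2.

    Solves total = [A] + 2*[A]^2/kd for the free monomer concentration [A].
    """
    free_monomer = kd * (math.sqrt(1.0 + 8.0 * total_conc / kd) - 1.0) / 4.0
    return 1.0 - free_monomer / total_conc

kd = kd_from_delta_g(-40_000.0)  # hypothetical value, for illustration only
for conc in (1e-9, 1e-6, 1e-3):
    print(f"{conc:.0e} M total protein -> {fraction_in_dimer(conc, kd):.3f} of chains dimeric")
```

In this model the same pair of protomers looks mostly monomeric or mostly dimeric depending only on total concentration, which is one way to read the "continuum" statement on the slide.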

  15. Take-home message of PPI types. PPIs form a continuum between obligate and non-obligate states. Whether a complex forms is driven by the concentration of its components and by the free energy of the complex relative to alternative states.

  16. How to identify PPI. Computational: phylogenetic profile, gene cluster, sequence coevolution, Rosetta Stone method, text mining. Experimental: yeast two-hybrid (Y2H), TAP assays, gene coexpression, protein arrays.

  17. Y2H assay. Eukaryotic transcription factors have a DNA-binding domain (BD) and an activation domain (AD). Physical association of these domains activates transcription. Create chimeric proteins with either BD or AD and transfect yeast. Gal4/LexA-based reporters. In vivo method that can detect transient PPIs. PLoS Computational Biology March 2007, Volume 3 e42

  18. TAP assay. The TAP tag consists of two IgG-binding domains of Staphylococcus protein A and a calmodulin-binding peptide, separated by a tobacco etch virus protease cleavage site. TAP provides direct information on protein complexes. O. Puig et al., Methods, 2001

  19. Gene coexpression. Expression profile similarity is measured as the correlation coefficient between the relative expression levels of two genes/proteins, or as the normalized difference between their absolute expression levels. The distribution for target proteins is compared with the distributions for random noninteracting protein pairs. Expression levels of physically interacting proteins coevolve; coevolution of gene expression is a better predictor of protein interactions than coevolution of amino acid sequences. Good for studying permanent complexes: ribosome, proteasome. PLoS Computational Biology March 2007, Volume 3 e42
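
As a rough illustration of the correlation-based similarity on the slide above, the sketch below computes a Pearson correlation coefficient between expression profiles. The gene names and expression values are invented, and Pearson correlation is only one possible choice of coefficient.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length profiles."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("profiles must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if norm_x == 0 or norm_y == 0:
        raise ValueError("a constant profile has no defined correlation")
    return cov / (norm_x * norm_y)

# Hypothetical relative expression levels across four conditions.
gene_a = [1.0, 2.0, 3.0, 4.0]
gene_b = [2.0, 4.0, 6.0, 8.0]   # rises and falls together with gene_a
gene_c = [4.0, 3.0, 2.0, 1.0]   # moves opposite to gene_a

print("a vs b:", round(pearson(gene_a, gene_b), 3))
print("a vs c:", round(pearson(gene_a, gene_c), 3))
```

In practice the coefficient for a candidate pair would be judged against the distribution obtained for random noninteracting pairs, as the slide describes.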

  20. Protein microarrays/chips. Protein chips are disposable arrays of microwells in silicone elastomer sheets placed on top of microscope slides. Target proteins are overexpressed, immobilized, and probed with fluorescently labeled proteins. Can detect PPIs between actual proteins. H. Zhu et al. (2000) "Analysis of yeast protein kinases using protein chips" Nature Genetics 26: 283-289. PLoS Computational Biology March 2007, Volume 3 e42

  21. PLoS Computational Biology March 2007, Volume 3 e42

      Database | URL/FTP | Type
      DIP | http://dip.doe-mbi.ucla.edu | E,S
      BIND | http://bind.ca | E,C,S
      MPact/MIPS | http://mips.gsf.de/services/ppi | E,C,F
      STRING | http://string.embl.de | E,P,F
      MINT | http://mint.bio.uniroma2.it/mint | E,C
      IntAct | http://www.ebi.ac.uk/intact | E,C
      BioGRID | http://www.thebiogrid.org | E,C
      HPRD | http://www.hprd.org | E,C
      ProtCom | http://www.ces.clemson.edu/compbio/ProtCom | S,H
      3did, Interprets | http://gatealoy.pcb.ub.es/3did/ | S,H
      Pibase, Modbase | http://alto.compbio.ucsf.edu/pibase | S,H
      CBM | ftp://ftp.ncbi.nlm.nih.gov/pub/cbm | S
      SCOPPI | http://www.scoppi.org/ | S
      iPfam | http://www.sanger.ac.uk/Software/Pfam/iPfam | S
      InterDom | http://interdom.lit.org.sg | P
      DIMA | http://mips.gsf.de/genre/proj/dima/index.html | F,S
      Prolinks | http://prolinks.doe-mbi.ucla.edu/cgibin/functionator/pronav/ | F
      Predictome | http://predictome.bu.edu/ | F

  22. Type of data: high-throughput experimental data (E), structural data (S), manual curation (C), functional predictions (F), and interface homology modeling (H). Unit of interaction: P is protein. PLoS Computational Biology March 2007, Volume 3 e42

      Database | URL/FTP | Type
      DIP | http://dip.doe-mbi.ucla.edu | E,S
      BIND | http://bind.ca | E,C,S
      MPact/MIPS | http://mips.gsf.de/services/ppi | E,C,F
      STRING | http://string.embl.de | E,P,F
      IntAct | http://www.ebi.ac.uk/intact | E,C
      BioGRID | http://www.thebiogrid.org | E,C
      HPRD | http://www.hprd.org | E,C
      ProtCom | http://www.ces.clemson.edu/compbio/ProtCom | S,H
      3did, Interprets | http://gatealoy.pcb.ub.es/3did/ | S,H
      Pibase, Modbase | http://alto.compbio.ucsf.edu/pibase | S,H
      CBM | ftp://ftp.ncbi.nlm.nih.gov/pub/cbm | S

  23. PPI database comparisons. Proteins: Structure, Function and Bioinformatics 63:490-500 (2006)

  24. Overlap between experimental PPI datasets is small. High false-positive (FP) rate in high-throughput experiments; interactions are difficult to confirm from multiple sources.
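
One simple way to quantify the overlap mentioned above is a Jaccard index over unordered protein pairs. This is a sketch with invented interaction lists, not the comparison method used in the cited study.

```python
def as_pair_set(interactions):
    """Treat each interaction as an unordered pair, so (A, B) equals (B, A)."""
    return {frozenset(pair) for pair in interactions}

def jaccard_overlap(dataset_1, dataset_2):
    """Shared interactions divided by all distinct interactions in either dataset."""
    pairs_1, pairs_2 = as_pair_set(dataset_1), as_pair_set(dataset_2)
    union = pairs_1 | pairs_2
    return len(pairs_1 & pairs_2) / len(union) if union else 0.0

# Hypothetical interaction lists from two screens.
screen_1 = [("A", "B"), ("B", "C"), ("C", "D")]
screen_2 = [("B", "A"), ("D", "E")]

print("overlap:", jaccard_overlap(screen_1, screen_2))
```

Normalizing pair order first matters because different databases may list the same interaction with the partners in opposite order.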

  25. How to identify PPI. Computational: phylogenetic profile, gene cluster/neighborhood, sequence coevolution, Rosetta Stone method, text mining. Experimental: yeast two-hybrid (Y2H), TAP assays, gene coexpression, protein arrays.

  26. Phylogenetic profile (PP). Hypothesis: functionally linked and potentially interacting nonhomologous proteins co-evolve and have orthologs in the same subset of fully sequenced organisms. PLoS Computational Biology March 2007, Volume 3 e43

  27. Gene cluster, gene neighborhood. Genes in a gene cluster/operon are co-regulated and participate in the same biological function. PLoS Computational Biology March 2007, Volume 3 e43
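
The phylogenetic profile idea can be sketched as a comparison of presence/absence vectors. The example assumes ortholog presence calls are already available; the protein names, the five genomes and the profiles are made up, and Hamming distance is just one possible similarity measure.

```python
from itertools import combinations

# Hypothetical profiles: 1 = ortholog present in that genome, 0 = absent.
profiles = {
    "P1": (1, 0, 1, 1, 0),
    "P2": (1, 0, 1, 1, 0),
    "P3": (0, 1, 0, 0, 1),
    "P4": (1, 0, 1, 0, 0),
}

def hamming(profile_a, profile_b):
    """Number of genomes in which the two presence/absence calls differ."""
    return sum(a != b for a, b in zip(profile_a, profile_b))

def linked_pairs(profiles, max_distance=0):
    """Protein pairs whose profiles differ in at most max_distance genomes."""
    return [
        (a, b)
        for a, b in combinations(sorted(profiles), 2)
        if hamming(profiles[a], profiles[b]) <= max_distance
    ]

print(linked_pairs(profiles))                  # identical profiles only
print(linked_pairs(profiles, max_distance=1))  # allow one mismatch
```

Proteins present in every genome, or in none, carry no signal under this scheme, so real analyses usually exclude such uninformative profiles.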

  28. Sequence co-evolution. Interacting proteins very often co-evolve: changes in one protein (loss of function or interaction) are compensated by correlated changes in the other protein. The orthologs of co-evolving proteins tend to interact, making it possible to infer unknown interactions in other genomes. Co-evolution can be measured as the similarity between the phylogenetic trees of two non-homologous interacting protein families. PLoS Computational Biology March 2007, Volume 3 e43
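
Tree similarity is often approximated by correlating the pairwise evolutionary distances of the two families over the same set of species (a "mirror tree" style comparison). The sketch below assumes that approach; the species labels and distances are invented.

```python
import math

def tree_similarity(dist_x, dist_y):
    """Correlation between two families' pairwise species distances.

    Both arguments map the same (species_1, species_2) keys to a distance.
    """
    if set(dist_x) != set(dist_y):
        raise ValueError("both families must cover the same species pairs")
    keys = sorted(dist_x)
    x = [dist_x[k] for k in keys]
    y = [dist_y[k] for k in keys]
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm = math.sqrt(
        sum((a - mean_x) ** 2 for a in x) * sum((b - mean_y) ** 2 for b in y)
    )
    if norm == 0:
        raise ValueError("distances must not all be identical")
    return cov / norm

# Hypothetical distances between orthologs of family X and family Y.
family_x = {("s1", "s2"): 0.1, ("s1", "s3"): 0.4, ("s1", "s4"): 0.5,
            ("s2", "s3"): 0.3, ("s2", "s4"): 0.6, ("s3", "s4"): 0.2}
family_y = {pair: 2.0 * d for pair, d in family_x.items()}  # same shape, faster rate

print("similarity:", round(tree_similarity(family_x, family_y), 3))
```

Because both families also share the underlying species tree, some correlation is expected even for non-interacting proteins, so a score is only meaningful relative to a background of unrelated family pairs.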

  29. Rosetta Stone method. Interacting proteins/domains have homologs in other genomes that are fused into one protein chain, a Rosetta Stone protein. Gene fusion occurs to optimize co-expression of genes encoding interacting proteins. PLoS Computational Biology March 2007, Volume 3 e43
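
A toy version of the Rosetta Stone search is sketched below. It assumes each protein has already been assigned to domain families by homology; the protein names, domain labels and the single-domain simplification for the query genome are hypothetical.

```python
from itertools import combinations

# Hypothetical query genome: each protein carries one domain family.
query_proteins = {"protA": "dom1", "protB": "dom2", "protC": "dom3"}

# Hypothetical reference genome: proteins with their domain families.
reference_proteins = {"fusedX": {"dom1", "dom2"}, "refY": {"dom3"}}

def rosetta_stone_links(query, reference):
    """Predict links between query proteins whose domains are fused elsewhere."""
    links = []
    for (name_1, dom_1), (name_2, dom_2) in combinations(sorted(query.items()), 2):
        if dom_1 == dom_2:
            continue  # same family: a fusion would not be informative
        stones = sorted(
            ref for ref, domains in reference.items()
            if dom_1 in domains and dom_2 in domains
        )
        if stones:
            links.append((name_1, name_2, stones))
    return links

print(rosetta_stone_links(query_proteins, reference_proteins))
```

Each reported link names the fused reference protein that supports it, so a prediction can be traced back to its evidence.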

  30. Text mining. Utilizes the wealth of publicly available data: search Medline or PubMed for words or word combinations. Co-occurrence of words is a simple metric, but it is prone to high false-positive rates. Natural language processing (NLP) methods are more specific, looking for statements such as "A binds to B", "A interacts with B" or "A associates with B"; these are difficult to detect, so NLP has a higher false-negative rate. Normally requires a list of known gene names or protein names for a given organism.
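
The trade-off between co-occurrence and pattern matching can be seen in a small sketch. The protein names and sentences are invented, and the regular expression is a crude stand-in for a real NLP pipeline.

```python
import re
from itertools import combinations

# Hypothetical name list and sentences, for illustration only.
PROTEINS = ["ProtA", "ProtB", "ProtC"]
SENTENCES = [
    "ProtA binds to ProtB in stimulated cells.",
    "ProtB and ProtC were both detected in the lysate.",
    "ProtA was shown to bind ProtC.",
]
VERBS = r"(?:binds to|interacts with|associates with)"

def cooccurring_pairs(sentences, names):
    """Pairs of known names that appear in the same sentence."""
    pairs = set()
    for sentence in sentences:
        present = [n for n in names if re.search(rf"\b{re.escape(n)}\b", sentence)]
        pairs.update(frozenset(p) for p in combinations(present, 2))
    return pairs

def pattern_pairs(sentences, names):
    """Pairs matched by an explicit 'A <interaction verb> B' pattern."""
    alternatives = "|".join(re.escape(n) for n in names)
    pattern = re.compile(rf"\b({alternatives})\s+{VERBS}\s+({alternatives})\b")
    return {
        frozenset(match.groups())
        for sentence in sentences
        for match in pattern.finditer(sentence)
    }

print("co-occurrence:", sorted(sorted(p) for p in cooccurring_pairs(SENTENCES, PROTEINS)))
print("pattern:", sorted(sorted(p) for p in pattern_pairs(SENTENCES, PROTEINS)))
```

Here co-occurrence reports the ProtB/ProtC pair although no interaction is stated, while the pattern misses the ProtA/ProtC statement because its wording falls outside the three templates.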

  31. GO ToolBox Genome Biol. 2004;5(12):R101.

  32. ProtQuant tool
