
Advantages of SPC


Kelvin_Ajay

Presentation Transcript


  1. Basic methodologies
     BASIC METHODOLOGIES OF ANALYSIS:
     • UNSUPERVISED: EXPLORATORY ANALYSIS. NO PRIOR KNOWLEDGE IS USED. EXPLORE THE STRUCTURE OF THE DATA ON THE BASIS OF CORRELATIONS AND SIMILARITIES.
     • SUPERVISED ANALYSIS: HYPOTHESIS TESTING USING CLINICAL INFORMATION (MLL VS NO TRANS.). IDENTIFY DIFFERENTIATING GENES. SUPERVISED METHODS CAN ONLY VALIDATE OR REJECT HYPOTHESES; THEY CANNOT LEAD TO DISCOVERY OF UNEXPECTED PARTITIONS.

  2. Advantages of SPC • Scans all resolutions (T) • Robust against noise and initialization: calculates collective correlations • Identifies “natural” () and stable clusters (T) • No need to pre-specify number of clusters • Clusters can be any shape • Can use distance matrix as input (vs coordinates)

  3. stability T larger T - tighter, more stable cluster

  4. P53 p53 IS A CENTRAL PLAYER IN APOPTOSIS AND IN CELL CYCLE CONTROL. IT IS A TRANSCRIPTION FACTOR.

  5. PRIMARY TARGETS OF P53
     K. Kannan, D. Givol, G. Rechavi,... G. Getz, I. Kela, Oncogene 2001
     • TEMPERATURE-SENSITIVE MUTANT P53; ACTIVATE AT 32 °C (t=0)
     • MEASURE EXPRESSION AT t=0,2,6,12,24 h (USE t=0 AS CONTROL)
     • REPEAT IN PRESENCE OF CYCLOHEXIMIDE (CHX), t=0,2,4,6,9,12 (CHX INHIBITS PROTEIN SYNTHESIS)
     • IDENTIFY UPREGULATED GENES USING FILTER: AT LEAST 2.5-FOLD INCREASE AT 3 OR MORE TIME POINTS (SEPARATELY IN EACH OF THE TWO EXPERIMENTS, -CHX AND +CHX)
     • 38 CANDIDATE PRIMARIES
     • EFFECT OF FILTERING??? RELEASE FILTER FROM +CHX CLUSTERING: 3847 (31)
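The fold-change filter on this slide can be sketched in a few lines of Python. This is an illustration only, not the authors' code: the function names and the toy expression values are hypothetical, and it assumes that fold change means the level at each later time point divided by the t=0 control, and that a candidate must pass the filter in both the -CHX and the +CHX experiment.

```python
def fold_changes(series, control):
    """Expression at each later time point divided by the t=0 control."""
    return [value / control for value in series]


def is_upregulated(series, control, min_fold=2.5, min_points=3):
    """True if at least `min_points` time points show >= `min_fold` increase."""
    hits = sum(1 for fc in fold_changes(series, control) if fc >= min_fold)
    return hits >= min_points


def candidate_primaries(minus_chx, plus_chx):
    """Genes that pass the filter separately in both experiments.

    Each argument maps gene name -> (t=0 level, [levels at later time points]).
    """
    return sorted(
        gene
        for gene in minus_chx
        if gene in plus_chx
        and is_upregulated(minus_chx[gene][1], minus_chx[gene][0])
        and is_upregulated(plus_chx[gene][1], plus_chx[gene][0])
    )


# Hypothetical toy data (not from the study).
minus_chx = {
    "geneA": (100.0, [300.0, 280.0, 400.0, 260.0]),
    "geneB": (100.0, [120.0, 300.0, 110.0, 90.0]),
}
plus_chx = {
    "geneA": (100.0, [260.0, 270.0, 300.0, 150.0, 120.0]),
    "geneB": (100.0, [300.0, 300.0, 300.0, 300.0, 300.0]),
}
primaries = candidate_primaries(minus_chx, plus_chx)
print(primaries)
```

In this toy example geneB passes in +CHX but has only one time point above 2.5-fold in -CHX, so it is rejected. A hard threshold of this kind can drop borderline genes, which is the "effect of filtering" concern raised on the slide.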

  6. REDUCE EFFECT OF FILTERING BY CLUSTERING. K. Kannan et al., Oncogene. [Figure labels: c, a; % candidate primary targets; X = 38 candidate primary targets]

  7. INHIBITION OF P53-INDUCED APOPTOSIS BY IL-6. Lotem…Rechavi, D. Givol, L. Sachs, PNAS 2003. BY REDUCING TEMPERATURE TO 32 DEGREES, P53 ASSUMES WILD-TYPE CONFORMATION, IS ACTIVATED AND INDUCES APOPTOSIS. ADDING THE CYTOKINE IL-6 INHIBITS THE APOPTOTIC PROCESS. QUESTION: WHERE DOES IL-6 INTERFERE IN THE CASCADE INITIATED BY P53? AT TOP? AT BOTTOM?

  8. QUESTION: WHERE DOES IL-6 INTERFERE IN THE CASCADE INITIATED BY P53? AT TOP? AT BOTTOM? [Diagram labels: Activated p53; Transactivation; Other activities (C terminal = TFIIH binding?) (N terminal = SH3 binding?); p21/Waf1; Bax, IGF-BP3, Fas, killer/DR5, Noxa, PIG3, p53AIP1, PIDD, Puma; Other genes, etc.; Caspase cascade; Growth arrest; Apoptosis; IL-6 ??]

  9. 333 GENES UPREGULATED BY P53: NOT AFFECTED BY IL-6. 309 GENES DOWNREGULATED BY P53: ALSO NOT AFFECTED.

  10. QUESTION: WHERE DOES IL-6 INTERFERE IN THE CASCADE INITIATED BY P53? AT TOP? AT BOTTOM? ANSWER: AT BOTTOM!! [Diagram labels: Activated p53; Transactivation; Other activities (C terminal = TFIIH binding?) (N terminal = SH3 binding?); p21/Waf1; Bax, IGF-BP3, Fas, killer/DR5, Noxa, PIG3, p53AIP1, PIDD, Puma; Other genes, etc.; Caspase cascade; Growth arrest; Apoptosis; IL-6 ??]

  11. COLON CANCER DATA. Alon, Barkai, Notterman, Gish, Ybarra, Mack, Levine: PNAS 96, 6745 (1999). AFFYMETRIX; 40 TUMOR, 22 NORMAL TISSUES. 2000 (OUT OF 6500) GENES OF HIGHEST INTENSITY. Aij = EXPRESSION LEVEL OF GENE i IN TISSUE j.

  12. COLON CANCER DATA

  13. TWO-WAY CLUSTERING: S1(G1), G1(S1)

  14. TWO-WAY CLUSTERING, ORDERED: S1(G1), G1(S1)

  15. TWO-WAY CLUSTERING - TISSUES: 1. IDENTIFY TISSUE CLASSES (TUMOR/NORMAL)

  16. TWO-WAY CLUSTERING - GENES, G1(S1): 2. FIND DIFFERENTIATING AND CORRELATED GENES. EACH GENE = POINT IN 62-DIMENSIONAL SPACE. [Figure labels: Erel; Ribosomal proteins; Cytochrome C metabolism; HLA2]

  17. TWO-WAY CLUSTERING: Can one improve?

  18. football

  19. C2WC - Motivation
      COUPLED TWO-WAY CLUSTERING, MOTIVATION:
      • ONLY A SMALL SUBSET OF GENES PLAYS A ROLE IN A PARTICULAR BIOLOGICAL PROCESS; THE OTHER GENES INTRODUCE NOISE, WHICH MAY MASK THE SIGNAL OF THE IMPORTANT PLAYERS.
      • ONLY A SUBSET OF SAMPLES EXHIBITS THE EXPRESSION PATTERNS OF INTEREST.
      • SHOULD USE A SUBSET OF GENES TO STUDY A SUBSET OF THE SAMPLES (AND VICE VERSA).
      • PROBLEM: ENORMOUS NUMBER OF SUBMATRICES

  20. C2WC - method
      COUPLED TWO-WAY CLUSTERING:
      • PICK ONE STABLE GENE CLUSTER. REPRESENT TISSUES BY THE EXPRESSION LEVELS OF THESE GENES ONLY.
      • ANALYZE ALL TISSUE CLUSTERS BY USING ALL GENE CLUSTERS, ONE AT A TIME. LOOK FOR INTERNAL STRUCTURE, SUB-CLUSTERS.
      • USE ALL STABLE TISSUE CLUSTERS TO CLASSIFY GENES; IDENTIFY GENE CLUSTERS THAT GOVERN BIOLOGICAL PROCESSES.
      • ITERATE THE PROCEDURE UNTIL NO NEW STABLE CLUSTERS EMERGE.
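The iteration described on this slide can be sketched as follows. This is a toy illustration under stated assumptions, not the authors' implementation: the `cluster` function is a simple threshold-based connected-components routine standing in for SPC (it is not SPC, and "stable" here just means "at least two members"), and the names `ctwc`, `cluster`, `rms_distance` and the toy matrix are hypothetical.

```python
from itertools import combinations


def rms_distance(u, v):
    """Root-mean-square difference, so the threshold does not grow with dimension."""
    return (sum((a - b) ** 2 for a, b in zip(u, v)) / len(u)) ** 0.5


def cluster(points, threshold, min_size=2):
    """Stand-in for SPC (NOT SPC itself): connected components of the graph
    linking points closer than `threshold`; keeps groups of >= min_size."""
    parent = {name: name for name in points}

    def find(name):
        while parent[name] != name:
            name = parent[name]
        return name

    for a, b in combinations(sorted(points), 2):
        if rms_distance(points[a], points[b]) < threshold:
            parent[find(a)] = find(b)
    groups = {}
    for name in points:
        groups.setdefault(find(name), set()).add(name)
    return [frozenset(g) for g in groups.values() if len(g) >= min_size]


def ctwc(matrix, threshold, max_rounds=10):
    """matrix[gene][sample] = expression level. Returns (gene, sample) cluster sets."""
    genes = frozenset(matrix)
    samples = frozenset(next(iter(matrix.values())))
    gene_clusters, sample_clusters = {genes}, {samples}
    visited = set()
    for _ in range(max_rounds):
        pairs = [(g, s) for g in gene_clusters for s in sample_clusters
                 if (g, s) not in visited]
        if not pairs:  # no new clusters emerged in the previous round
            break
        for g_set, s_set in pairs:
            visited.add((g_set, s_set))
            # Samples of s_set, represented by the genes of g_set only.
            by_sample = {s: [matrix[g][s] for g in sorted(g_set)] for s in s_set}
            # Genes of g_set, represented by the samples of s_set only.
            by_gene = {g: [matrix[g][s] for s in sorted(s_set)] for g in g_set}
            sample_clusters.update(cluster(by_sample, threshold))
            gene_clusters.update(cluster(by_gene, threshold))
    return gene_clusters, sample_clusters


# Hypothetical toy data: g1, g2 split samples one way; g3, g4 split them another way.
matrix = {
    "g1": {"s1": 0, "s2": 0, "s3": 10, "s4": 10},
    "g2": {"s1": 0, "s2": 0, "s3": 10, "s4": 10},
    "g3": {"s1": 0, "s2": 10, "s3": 0, "s4": 10},
    "g4": {"s1": 0, "s2": 10, "s3": 0, "s4": 10},
}
gene_clusters, sample_clusters = ctwc(matrix, threshold=3.0)
print(sorted(sorted(c) for c in sample_clusters))
```

With all four genes, no two samples in the toy matrix are close, so the first pass finds no sample clusters. Restricting to the gene cluster {g1, g2} exposes the partition {s1, s2} / {s3, s4}, and {g3, g4} exposes {s1, s3} / {s2, s4}, which is the masking effect the motivation slide describes.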

  21. COUPLED TWO-WAY CLUSTERING OF COLON CANCER: TISSUES. [Figure panels: S1(G4), S1(G12); gene clusters G4, G12]

  22. COUPLED TWO-WAY CLUSTERING OF COLON CANCER: TISSUES. [Figure panels: S1(G4), S1(G12); labels: Tumor, Normal, S17, Protocol A, Protocol B]

  23. genes1 G1(S17) S17

  24. COUPLED TWO-WAY CLUSTERING OF COLON CANCER - GENES. USING ONLY THE TUMOR TISSUES TO CLUSTER GENES REVEALS A CORRELATION BETWEEN TWO GENE CLUSTERS: CELL GROWTH AND EPITHELIAL. COLON CANCER: ASSOCIATED WITH EPITHELIAL CELLS. [Figure panels: G1(S17), G1(S1)]

  25. GLIOBLASTOMA: CLONTECH ARRAYS; 1185 Genes, 36 Samples: 12 Astrocytoma (II), 4 Secondary GBM, 17 Primary GlioBlastoMa, 3 Cell Lines. 174 genes separate (at FDR of 5%) PrGBM from LGA + ScGBM. S Godard, G Getz, H Kobayashi, P Farmer, M Delorenzi, M Nozaki, A-C Diserens, M-F Hamou, P-Y Dietrich, J-G Villemure, R C. Janzer, P Bucher, R Stupp, N de Tribolet, E Domany, M E. Hegi
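Selecting genes "at FDR of 5%" means choosing a p-value cutoff that controls the expected fraction of false positives among the selected genes. The slide does not say which procedure was used; the sketch below assumes the Benjamini-Hochberg step-up procedure, one common choice, and uses hypothetical p-values.

```python
def benjamini_hochberg(p_values, fdr=0.05):
    """Indices of hypotheses rejected by the Benjamini-Hochberg procedure.

    Sort the m p-values, find the largest rank k with p_(k) <= k * fdr / m,
    and reject the k smallest p-values.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank * fdr / m:
            cutoff_rank = rank  # keep the largest rank that passes
    return sorted(order[:cutoff_rank])


# Hypothetical per-gene p-values (e.g. from a two-group test).
p_values = [0.001, 0.008, 0.039, 0.041, 0.27, 0.6]
selected = benjamini_hochberg(p_values, fdr=0.05)
print(selected)
```

The procedure is "step-up": a p-value that misses its own rank threshold is still rejected if a larger p-value further down the sorted list passes its threshold.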

  26. GLIOBLASTOMA: FILTERING: 358 HIGHLY VARYING GENES. Coupled Two-Way Clustering (CTWC) of 358 Genes and 36 Samples. [Figure labels: S1(G1), S2, S3; G1(S1), G5, G12; T; GENES; legend: Astrocytoma (II), Secondary GBM, Primary GlioBlastoMa, Cell Lines]
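A filter for "highly varying genes" can be sketched as ranking genes by their variance across samples and keeping the top k. The slide does not state the exact criterion, so the use of population variance, the function name `most_variable`, and the toy data are assumptions.

```python
from statistics import pvariance


def most_variable(expression, k):
    """Return the k gene names with the largest variance across samples.

    `expression` maps gene name -> list of expression levels, one per sample.
    """
    ranked = sorted(expression, key=lambda g: pvariance(expression[g]), reverse=True)
    return ranked[:k]


# Hypothetical toy data: four genes measured in four samples.
expression = {
    "g1": [1.0, 1.0, 1.0, 1.0],    # variance 0
    "g2": [0.0, 10.0, 0.0, 10.0],  # variance 25
    "g3": [4.0, 5.0, 6.0, 5.0],    # variance 0.5
    "g4": [0.0, 2.0, 0.0, 2.0],    # variance 1
}
kept = most_variable(expression, k=2)
print(kept)
```

In practice expression values are often log-transformed before such a filter, so that highly expressed genes do not dominate simply because of their scale.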

  27. S1(G5): Super-Paramagnetic Clustering of All Samples Using Stable Gene Cluster G5. [Figure labels: S10, S11, S12, S13, S14; Fig. 2B]

  28. validation G5Ver

  29. THE GENES OF G5:
      • AB004904: STAT-induced STAT inhibitor 3
      • M32977: VEGF
      • M35410: IGFBP2
      • X51602: VEGFR1
      • M96322: gravin
      • AB004903: STAT-induced STAT inhibitor 2
      • X52946: PTN
      • J04111: c-jun
      • X79067: TIS11B
      VEGF AND ITS RECEPTORS: INSTRUMENTAL IN ANGIOGENESIS; INDUCED GROWTH OF BLOOD VESSELS, ESSENTIAL FOR GROWTH BEYOND A CRITICAL SIZE. THE COEXPRESSION OF IGFBP2 WAS INDEPENDENTLY VERIFIED; 1ST EVIDENCE FOR POSSIBLE ROLE IN ANGIOGENESIS.

  30. Fig 6

  31. Analysis of cervical cancer data
      C. Rosty, F. Radvanyi, N. Stransky …M. Sheffer, D. Tsafrir, I. Tsafrir …X. Sastre, Oncogene (2005)
      Sample labels (example: S 02 - 1 e g):
      • ‘S’ - sample, ‘C’ - cell line
      • Sample number
      • Batch #1,2,3
      • ‘a’ - adeno, ‘e’ - epidermal, ‘n’ - normal
      • ‘g’ - good, ‘b’ - bad, ‘o’ - other
      Total of 45 samples/chips:
      • 5 Cell lines.
      • 5 Normal samples.
      • 35 tumor samples, 5 of which are repeats.
      • 10 adenocarcinoma tumors: 4 are HPV-16 and 6 are HPV-18.
      • 20 epidermal carcinoma: 12 HPV-16, 6 HPV-18, 1 HPV-33 and 1 HPV-99.
      MAIN AIM: PREDICT OUTCOME AT DISCOVERY

  32. AIM: IDENTIFY GENES WHOSE EXPRESSION LEVEL, MEASURED AT THE TIME OF DISCOVERY OF THE MALIGNANCY, IS INDICATIVE OF OUTCOME

  33. WE USED STANDARD STATISTICAL TESTS, LOOKING FOR GENES WHOSE EXPRESSION LEVELS SEPARATE PATIENTS WITH GOOD OUTCOME FROM PATIENTS WITH BAD OUTCOME. NO SUCH GENES WERE FOUND. PERHAPS TRY UNSUPERVISED METHODS (E.G. CLUSTERING)???

  34. Two-way Clustering of cervical data. Two clustering operations: • 35 samples based on the expression of 5000 probes: S1(G1) • 5000 probes in 35-dimensional space: G1(S1)
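The two clustering operations work on the same expression matrix, read once by rows and once by columns. A minimal sketch of preparing both inputs as distance matrices follows; the choice of 1 minus the Pearson correlation as the distance is an assumption (the slides mention only "correlations and similarities"), and the function names and toy values are hypothetical.

```python
from math import sqrt


def pearson(u, v):
    """Pearson correlation of two equal-length vectors (undefined if either is constant)."""
    n = len(u)
    mean_u, mean_v = sum(u) / n, sum(v) / n
    cov = sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))
    norm_u = sqrt(sum((a - mean_u) ** 2 for a in u))
    norm_v = sqrt(sum((b - mean_v) ** 2 for b in v))
    return cov / (norm_u * norm_v)


def distance_matrix(rows):
    """Pairwise distances d = 1 - correlation between the given row vectors."""
    return [[1.0 - pearson(u, v) for v in rows] for u in rows]


def transpose(matrix):
    """Swap rows and columns, so that samples become the row vectors."""
    return [list(column) for column in zip(*matrix)]


# Hypothetical toy matrix: rows = probes (genes), columns = samples.
expression = [
    [1.0, 2.0, 3.0],
    [2.0, 4.0, 6.5],
    [3.0, 1.0, 0.5],
    [0.5, 2.5, 1.0],
]
gene_distances = distance_matrix(expression)               # input for G1(S1)
sample_distances = distance_matrix(transpose(expression))  # input for S1(G1)
print(len(gene_distances), len(sample_distances))
```

Either matrix can then be passed to any clustering method that accepts distances rather than coordinates, which is one of the advantages listed for SPC earlier in the deck.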

  35. Coupled Two-Way Clustering of Cervical Cancer. 35 SAMPLES (REMOVE CELL LINES AND REPLICATES); 5000 GENES (PASSED VARIANCE FILTER). FOCUS ON G3: CLUSTER OF 148 GENES (163 probe sets). [Figure panels: S1(G3), S1(G7), S1(G10); gene clusters G3, G7, G10]

  36. Coupled Two-Way Clustering of Cervical Cancer (Getz et al., PNAS 2000). FOCUS ON G3 (PROLIFERATION CLUSTER, GO): 1. Cluster samples using 163 probe sets; 2. SORT (using SPIN). [Figure panels: S1(G3), S1(G7), S1(G10); gene clusters G3, G7, G10; sample groups: “good”, normal, cell lines]

  37. ‘Good outcome’ sample cluster (AACR 2004). Low expression level of the “Proliferation Cluster” indicates good outcome. High expression: no prediction. Validated by RT-PCR of 20 genes over 70 samples. [Figure: 163 probes; sample groups: Normal samples, Good outcome, Cell-line samples; sample order: S19-1noo S28-1noo S07-1noo S35-1noo S02-1noo S29-3a6g S26-2a8+ S20-2e8g S03-1e8g S34-2e8g S23-1a8g S13-1a8b S31-3a6g S08-1e6g S23-2a8g S10-1e6b S18-1e8b S04-1a8b S12-1e8b S05-1a8b S11-1e3b S25-3a6g S22-1e6g S27-1e6b S32-2e6g S17-3a6o S33-1e6b S15-2e6g S09-1e6b S18-2e8b S06-1e6+ S14-1e6b S33-2e6b S15-1e6g S21-2a8o S24-1e6b S01-1e6g S14-3e6b S30-2e8b C01-3c8o S16-1e9o C06-3c8o C07-3c6o C03-3c6o C05-3c8o]

  38. • P53 and Rb control (restrain) proliferation (inactivating E2F).
      • Activity of P53 and Rb is controlled by E6/E7 viral protein content.
      • E6/E7 protein concentration is controlled by E6/E7 RNA expression level.
      • E6/E7 RNA level is controlled by E6/E7 DNA copy number.
      • Use TF binding site sequence information to derive network.
      [Figure: Ordered Expression Matrix of 20 proliferation Genes; HPV16/HPV18; E7 RNA: Corr=0.54,0.62; E7 DNA: Corr=0.34,0.55]

  39. AIM: IDENTIFY GENES WHOSE EXPRESSION LEVEL, MEASURED AT THE TIME OF DISCOVERY OF THE MALIGNANCY, IS INDICATIVE OF OUTCOME.
      FINDING: A CLUSTER OF 150 GENES, ASSOCIATED WITH CELL PROLIFERATION, HAS RELATIVELY LOW EXPRESSION LEVELS IN A SUBSET OF THE “GOOD OUTCOME” PATIENTS. VALIDATION (PCR).
      FINDING: CELL PROLIFERATION EXPRESSION LEVEL IS CONTROLLED BY AMOUNT OF VIRAL PROTEINS E6, E7, WHICH IS GOVERNED BY NUMBER OF DNA COPIES THAT WERE INSERTED BY THE VIRUS.
      Rosty et al, Oncogene 2005

  40. signature algorithm. J. Ihmels, G. Friedlander, S. Bergmann, O. Sarig, Y. Ziv, N. Barkai

  41. recurrence
      yeast genome: 6400 genes, 1000 “conditions” (chips)
      • (a) Ncore = 37, 73, 145 genes for ribosomal proteins; 132 genes for biosynthesis
      • Each used as input GIref, returns (nearly same) gene signature Sref
      • Add Nrand randomly picked genes
      • GI: input set of Ncore + Nrand genes, returns gene signature SI
      • Recurrence of Sref is measured by Overlap = fraction of genes shared by Sref and SI
      • (b) Use as GIref sets of genes with shared regulatory sequences.
      • Only the truly coregulated ones are returned in Sref; recurrent.
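The overlap measure on this slide can be written as a one-line set computation. The slide says only "fraction of shared genes", so normalizing by the size of Sref is an assumption here, and the gene identifiers are hypothetical placeholders.

```python
def overlap(s_ref, s_input):
    """Fraction of the reference signature Sref that also appears in SI.

    Assumption: the shared-gene count is normalized by the size of Sref.
    """
    s_ref, s_input = set(s_ref), set(s_input)
    if not s_ref:
        raise ValueError("reference signature must not be empty")
    return len(s_ref & s_input) / len(s_ref)


# Hypothetical signatures: SI was obtained after adding random genes to the input.
s_ref = {"g1", "g2", "g3", "g4"}
s_noisy = {"g1", "g2", "g3", "g9"}
score = overlap(s_ref, s_noisy)
print(score)
```

An overlap that stays close to 1 as more random genes are added to the input set indicates that the signature is recurrent in the sense described above.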

  42. pathways • Tricarboxylic acid (TCA) cycle: known genes in E. coli; find (34) homologues in yeast, used as GI; produce SI, which excludes the wrong genes and misses only a few correct ones • (b,c) Identify two autonomous subparts of the cycle
