Identification of compounds to affect radiosensitivity of cells Pellegrini Lab—UCLA SoCalBSI 2007 Joshua Smith Bazyl Nettles
Outline
• Biological Significance
• Overall Objectives
• Basic Methodology
• Tools
• Background
• Experimental Approach
Biological Significance
• Results from our project could be used to develop drugs that alter cells’ radiosensitivity
  • Decreased radiosensitivity could benefit people who have been exposed to radiation
  • Increased radiosensitivity could improve the effectiveness of radiotherapy (cancer treatment)
Project Objectives
• Using gene expression data from cells exposed to 167 bioactive compounds:
  • Identify transcription factors that are activated in response to drugs
  • Identify which compounds activate the same factors as those activated by exposure to radiation
Basic Methodology
• Changes in gene expression are regulated by the binding of transcription factors to promoters
• The activity of a transcription factor often depends on co-factors and post-translational modification, so it cannot be reliably estimated from the factor's mRNA level
• Transcriptional regulation is inherently combinatorial
• We therefore use multivariate regression to estimate transcription factor activities
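As a rough illustration of the regression idea above, the sketch below fits an ordinary least-squares model in Python with NumPy. It is not the Matlab workflow used in the project. The names `binding` (promoters × TFs), `expression`, and `estimate_tf_activities` are hypothetical, and the toy numbers are made up; the fitted coefficients play the role of per-factor activities.

```python
import numpy as np

def estimate_tf_activities(binding, expression):
    """Least-squares fit of expression ~ intercept + binding @ activities."""
    binding = np.asarray(binding, dtype=float)
    expression = np.asarray(expression, dtype=float)
    # Prepend a column of ones so the model has an intercept term.
    design = np.column_stack([np.ones(len(expression)), binding])
    coef, *_ = np.linalg.lstsq(design, expression, rcond=None)
    return coef[0], coef[1:]  # intercept, one activity per TF

# Toy data (made up): 6 promoters, 2 TFs, noise-free response.
rng = np.random.default_rng(0)
binding = rng.random((6, 2))
expression = 0.5 + binding @ np.array([2.0, -1.0])
intercept, activities = estimate_tf_activities(binding, expression)
```

Because the toy response is noise-free, the fit should recover the coefficients used to generate it. Real expression data would leave a residual, which is what the variance-based measure in the Results slides quantifies.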
Major Tools
• Matlab 2007
• Bioinformatics Toolbox
• MS Excel
• Perl
www.mathworks.com www.microsoft.com www.perl.org
Background
• Data taken from “The Connectivity Map: Using Gene-Expression Signatures to Connect Small Molecules, Genes, and Disease” by Lamb et al.
• “…we have created the first installment of a reference collection of gene-expression profiles from cultured human cells treated with bioactive small molecules…”
• The “Connectivity Map” can be used to find connections among small molecules, gene expression, genes, etc.
Experimental Approach
• Five basic steps:
  • Ordering and Gathering Data
  • Probe, Gene and Promoter Identification
  • Transcription Factor Data
  • Generate Models
  • Compare Models
Ordering and Gathering Data
• “Connectivity Map” data (GSE5258, 167 compounds) retrieved from NCBI’s Gene Expression Omnibus (http://www.ncbi.nlm.nih.gov/geo/)
• Using Matlab’s Bioinformatics Toolbox, imported 564 expression profiles
• Using Matlab, MS Excel, and Perl, divided the data into 453 “experiments” (drug/control ratios)
• Using SQL, detected and averaged duplicate experiments, leaving us with 314 unique experiments
Example GEO sample record:
^SAMPLE = GSM119282
!Sample_title = 5202764005789148112904.A10
!Sample_geo_accession = GSM119282
!Sample_status = Public on Sep 27 2006
…
ID_REF      VALUE    ABS_CALL
1007_s_at   495.3    P
1053_at     278.2    P
117_at      3713.4   P
121_at      44.7     P
1255_g_at   2.6      A
1294_at     16       A
1316_at     5.2      A
1320_at     4.4      A
1405_i_at   16       A
1431_at     21.2     A
1438_at     7.6      A
…
[Figure: The Connectivity Map (GSE5258). 564 samples (microarrays), GSM118720 through GSM119282, each with 22280 probes; 453 drug/control ratios]
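The duplicate-averaging step above was done in SQL. A minimal Python sketch of the same idea follows. The slides do not say how duplicates were identified, so the grouping key used here (a compound and cell-line pair), the function name `average_duplicates`, and the sample values are all assumptions made for illustration.

```python
from collections import defaultdict

def average_duplicates(experiments):
    """Average ratio profiles that share the same key.

    experiments: iterable of (key, profile) pairs, where profile is a
    list of drug/control ratios, one per probe.
    """
    groups = defaultdict(list)
    for key, profile in experiments:
        groups[key].append(profile)
    averaged = {}
    for key, profiles in groups.items():
        n = len(profiles)
        # zip(*profiles) walks the replicate profiles probe by probe.
        averaged[key] = [sum(values) / n for values in zip(*profiles)]
    return averaged

# Made-up example: two replicate profiles for one key, one for another.
experiments = [
    (("compound_a", "cell_x"), [1.0, 2.0, 4.0]),
    (("compound_a", "cell_x"), [3.0, 2.0, 0.0]),
    (("compound_b", "cell_x"), [0.5, 1.5, 2.5]),
]
unique = average_duplicates(experiments)
```

Here three input profiles collapse to two unique experiments, in the same way the project's 453 experiments collapsed to 314.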
Probe, Gene and Promoter Identification
• Retrieved human promoters from the UCSC Genome Browser
• Retrieved microarray and probe information for our data from GEO
• Computed the variance of each probe across the 314 unique experiments
• Of the top 2000 probes by variance, 1704 had promoter data, giving 1704 probe/gene/promoter sets
[Figure: Probe, Gene and Promoter Identification. Normalized expression ratios for 22280 probes across 314 unique experiments; variance computed per probe; top 2000 kept; those with promoter data yield 1704 genes & promoters]
Probe    Variance
3        5423.535
12799    3647.582
17745    550.7743
5991     253.0915
3192     250.4694
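The variance filter described for this step can be sketched in a few lines of Python with NumPy. The function name `top_variance_probes` and the toy matrix are hypothetical. The slides do not say whether sample or population variance was used, so sample variance (`ddof=1`) is an assumption.

```python
import numpy as np

def top_variance_probes(ratios, k):
    """Return indices of the k probes with the largest variance.

    ratios: array of shape (n_probes, n_experiments).
    """
    ratios = np.asarray(ratios, dtype=float)
    variances = ratios.var(axis=1, ddof=1)  # sample variance per probe
    order = np.argsort(variances)[::-1]     # highest variance first
    return order[:k], variances

# Made-up matrix: 3 probes measured in 4 experiments.
ratios = np.array([
    [1.0, 1.0, 1.0, 1.0],    # constant probe, zero variance
    [0.0, 10.0, 0.0, 10.0],  # highly variable probe
    [1.0, 2.0, 1.0, 2.0],    # mildly variable probe
])
top, variances = top_variance_probes(ratios, k=2)
```

In the project's setting, `ratios` would be 22280 × 314 and `k` would be 2000, followed by a join against the promoter table to reach the 1704 usable probes.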
Transcription Factor Data
• TRANSFAC® and JASPAR® are databases of transcription factors, their genomic binding sites, and their DNA-binding profiles
• For each TF position weight matrix (PWM), we move along the promoter sequence, calculating the probability of binding in each window
• The maximum binding probability found along the sliding window is kept for each promoter
• The maximum scores are then compiled into a matrix (1704 gene promoters × ~940 TFs from TRANSFAC and JASPAR) whose entries give the probability that a TF will bind to a particular promoter
[Figure: example promoter sequence ATGCCCTTGCTATCTGCATGCTATCTGCACTGGACGT…]
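The sliding-window scan and the resulting promoter × TF matrix can be sketched as follows. The slides do not give the exact scoring formula, so this example assumes the window score is the product of per-position base probabilities from the PWM, and it scans the forward strand only. The names `max_window_score` and `score_matrix` and the two-position motif are made up for illustration.

```python
import numpy as np

BASES = "ACGT"

def max_window_score(pwm, promoter):
    """Slide the PWM along the promoter and keep the best window score.

    pwm: array of shape (motif_length, 4); each row holds the
    probabilities of A, C, G, T at that motif position.
    """
    pwm = np.asarray(pwm, dtype=float)
    width = pwm.shape[0]
    best = 0.0
    for start in range(len(promoter) - width + 1):
        window = promoter[start:start + width]
        if any(base not in BASES for base in window):
            continue  # skip windows with ambiguous bases such as N
        score = 1.0
        for pos, base in enumerate(window):
            score *= pwm[pos, BASES.index(base)]
        best = max(best, score)
    return best

def score_matrix(pwms, promoters):
    """Rows are promoters, columns are TFs."""
    return np.array([[max_window_score(p, s) for p in pwms]
                     for s in promoters])

# Made-up two-position motif that prefers "TG".
pwm_tg = [[0.1, 0.1, 0.1, 0.7],
          [0.1, 0.1, 0.7, 0.1]]
scores = score_matrix([pwm_tg], ["ATGCCC", "AAAAAA"])
```

The first toy promoter contains "TG" and so scores much higher than the second. A production scan would typically also check the reverse complement and work in log space to avoid underflow on long motifs.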
Model Generation
• Generate models using Multivariate Adaptive Regression Splines (MARS) to relate the following to gene expression levels:
  • occurrences of TF binding motifs in the promoter DNA
  • their interactions
• A “model” is a set of transcription factors that explains a high percentage of the variance in expression
• With the full data set, the predicted modeling time was 20 hrs for each of our 314 experiments (about 6,280 hours in total)
• We therefore reduced the number of TFs used for computation from all 940 to ~40 “relevant” TFs
• The reduced models could then be used to identify promising experiments to rerun with all TFs
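MARS builds its models from hinge functions of the form max(0, x − t) and products of them, which is how it captures interactions between factors. The sketch below shows only that building block, with knots fixed by hand; it does not implement the adaptive forward and backward search that MARS performs, and it is not the code used in the project. All names and data are hypothetical.

```python
import numpy as np

def hinge(x, knot):
    """MARS-style hinge basis function max(0, x - knot)."""
    return np.maximum(0.0, np.asarray(x, dtype=float) - knot)

def fit_hinge_model(score_a, score_b, expression, knot_a, knot_b):
    """Fit expression ~ b0 + b1*h(a) + b2*h(b) + b3*h(a)*h(b)."""
    h_a = hinge(score_a, knot_a)
    h_b = hinge(score_b, knot_b)
    # The last column is the interaction term between the two factors.
    design = np.column_stack([np.ones(len(h_a)), h_a, h_b, h_a * h_b])
    target = np.asarray(expression, dtype=float)
    coef, *_ = np.linalg.lstsq(design, target, rcond=None)
    return coef, design @ coef

# Made-up data: expression rises only when both TF scores exceed 0.5.
rng = np.random.default_rng(1)
score_a = rng.random(50)
score_b = rng.random(50)
expression = 1.0 + 3.0 * hinge(score_a, 0.5) * hinge(score_b, 0.5)
coef, fitted = fit_hinge_model(score_a, score_b, expression, 0.5, 0.5)
```

The interaction column lets the fit represent a response that neither factor produces alone, which is the combinatorial behaviour the methodology slides point to.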
Relevant Factors
• Ataxia telangiectasia mutated (ATM)
  • Protein kinase that plays a critical role in the response to certain types of DNA damage
  • Produced in all cells; it is activated once DNA damage has occurred (Hawley and Friend, 1996; Banin et al., 1998; Canman et al., 1998)
• Used a list of ATM-dependent factors that are activated in response to radiation damage (prior work)
• Compare with models
  • Attempt to find experiments (compounds) that activate the same factors as those activated by exposure to radiation
Results
• We generated models for these relevant factors and found several experiments with a high reduction in variance (RIV)
• RIV
  • The percentage of variance in expression accounted for by the factors in a model
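The RIV definition above can be written as a short function. The slides do not give the exact formula, so this sketch assumes the usual form, 100 × (1 − residual sum of squares / total sum of squares); the function name and sample values are hypothetical.

```python
import numpy as np

def reduction_in_variance(observed, predicted):
    """Percentage of the variance in observed explained by predicted."""
    observed = np.asarray(observed, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    residual = np.sum((observed - predicted) ** 2)
    total = np.sum((observed - observed.mean()) ** 2)
    return 100.0 * (1.0 - residual / total)

observed = [1.0, 2.0, 3.0, 4.0]
# A model that reproduces the data exactly explains all of the variance.
perfect = reduction_in_variance(observed, observed)
# A model that only predicts the mean explains none of it.
mean_only = reduction_in_variance(observed, [2.5] * 4)
```

Under this definition a perfect model gives 100 and a mean-only model gives 0, so ranking experiments by RIV ranks them by how well the chosen factors explain the expression changes.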
[Figure: RIV for our 314 experiments]
Results
• The top 10 drugs by RIV were:
  1. HC toxin
  2. Pirinixic acid
  3. Ionomycin
  4. Phenanthridinone
  5. Tioguanine
  6. Fasudil
  7. Valproic acid [1]
  8. Prochlorperazine
  9. Amitriptyline
  10. Colchicine [2]
[1] Valproic acid enhances brain tumor cell radiosensitivity. Immunotherapy Weekly (2005-06-01)
[2] Modification of Radiation Response of Tissue by Colchicine. International Congress of Radiology (1965-09-27)
References
• Debopriya Das, Nilanjana Banerjee, and Michael Q. Zhang. Interacting models of cooperative gene regulation. PNAS, 2004.
• Justin Lamb, et al. The Connectivity Map: Using Gene-Expression Signatures to Connect Small Molecules, Genes, and Disease. Science, 2006.
• Debopriya Das, Zaher Nahle, and Michael Q. Zhang. Adaptively inferring human transcriptional subnetworks. Molecular Systems Biology, 2006.
• Shawn Cokus, et al. Modelling the network of cell cycle transcription factors in the yeast Saccharomyces cerevisiae. BMC Bioinformatics, 2006.
Acknowledgements
• UCLA and the Pellegrini Lab
  • Dr. Matteo Pellegrini
  • Dr. David Casero Díaz-Cano
• SoCalBSI Instructors and Fellow Students
• National Institutes of Health
• National Science Foundation
• LA / Orange County Biotechnology Center
www.ucla.edu