Regulated process or chance observation? Divergent transcription
Joern Toedling, Huber group, November 2007
Data Collection
• S90 samples, treated with Actinomycin D (ActD+), grown on:
  • 3 × YPD
  • 3 × YP+Galactose
  • 3 × YPE (YP+Ethanol)
• S96 cell cycle, ActD+: 45 samples, one taken every 5 min for 220 min (~ 2 iterations of the cell cycle)
Exponential growth data
• 9 samples, normalized by use of a genomic DNA sample, grown under 3 different growth conditions
• one common segmentation and post-processing
• results in 5510 transcripts:
  • 1360 pairs of divergent transcripts
  • 672 pairs of convergent transcripts
  • 781 pairs of tandem transcripts
Cell-cycle data
• 45 samples, normalized by use of a genomic DNA sample
• one common segmentation and post-processing
• results in 5596 transcripts:
  • 1266 pairs of divergent transcripts
  • 829 pairs of convergent transcripts
  • 847 pairs of tandem transcripts
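The slides do not state how transcript pairs were assigned to the three classes. As a minimal sketch, assuming pairs are formed from neighbouring transcripts on the same chromosome and classified by strand alone, the counting could look like this. The `Transcript` record, `classify_pair`, and `count_pair_types` are hypothetical names introduced for illustration, and any additional filtering used in the actual analysis (overlap, distance cut-offs) is not reproduced here.

```python
from collections import Counter
from typing import NamedTuple


class Transcript(NamedTuple):
    """Hypothetical record for one segmented transcript."""
    chrom: str
    start: int   # leftmost genome coordinate
    end: int     # rightmost genome coordinate
    strand: str  # "+" or "-"


def classify_pair(left: Transcript, right: Transcript) -> str:
    """Classify two neighbours; `left` starts before `right` on the chromosome."""
    if left.strand == right.strand:
        return "tandem"        # -> ->  or  <- <-
    if left.strand == "-" and right.strand == "+":
        return "divergent"     # <- ->  (transcribed away from each other)
    return "convergent"        # -> <-  (transcribed towards each other)


def count_pair_types(transcripts):
    """Count orientation classes over all neighbouring pairs per chromosome."""
    ordered = sorted(transcripts, key=lambda t: (t.chrom, t.start))
    counts = Counter()
    for left, right in zip(ordered, ordered[1:]):
        if left.chrom == right.chrom:  # never pair across chromosomes
            counts[classify_pair(left, right)] += 1
    return counts


# Toy input (made-up coordinates, not taken from the data sets above).
toy = [
    Transcript("chrI", 100, 500, "-"),
    Transcript("chrI", 700, 1200, "+"),
    Transcript("chrI", 1300, 1800, "-"),
    Transcript("chrI", 2000, 2500, "-"),
]
pair_counts = count_pair_types(toy)
print(dict(pair_counts))
```

With this rule every transcript can take part in up to two pairs, so the pair totals need not add up to the transcript count.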
Are divergent transcripts' levels correlated across the cell cycle, and does this correlation depend on the closeness of the transcript boundaries?
Which correlation coefficient?
• Pearson CC: $\rho_{X,Y} = \frac{\operatorname{cov}(X,Y)}{\operatorname{sd}(X)\,\operatorname{sd}(Y)}$
• Spearman CC: based on ranks(X) and ranks(Y) instead of X and Y
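To make the difference between the two coefficients concrete, here is a small dependency-free sketch. The function names `pearson`, `ranks`, and `spearman` are chosen for illustration; Spearman is computed as the Pearson coefficient of the ranks, with tied values sharing their average rank, and the example vectors are made up.

```python
from math import sqrt


def pearson(x, y):
    """Pearson CC: cov(X, Y) / (sd(X) * sd(Y)).

    The 1/(n-1) factors of covariance and standard deviations cancel,
    so plain sums of centred products are sufficient.
    """
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def ranks(values):
    """1-based ranks; tied values receive the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman CC: Pearson CC applied to the ranks of X and Y."""
    return pearson(ranks(x), ranks(y))


# Monotone but non-linear relationship with one large value.
x = [1, 2, 3, 4, 5]
y = [1, 4, 9, 16, 100]
print("Pearson :", pearson(x, y))
print("Spearman:", spearman(x, y))
```

Because `y` increases strictly with `x`, the ranks agree perfectly and the Spearman coefficient is 1, while the single large value pulls the Pearson coefficient well below 1. This robustness to outliers and non-linearity is one reason to prefer ranks for expression levels.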
Interlude: Interpreting Pearson CC = 0.8 Anscombe, Francis J. (1973) Graphs in statistical analysis. American Statistician, 27, 17–21.
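The point of the Anscombe reference can be reproduced in a few lines: the four data sets from the 1973 paper look completely different when plotted, yet their Pearson coefficients are nearly identical. The sketch below only recomputes the coefficients; the `pearson` helper is a plain re-implementation written for this example, and plotting is left out.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


# Anscombe's quartet (Anscombe 1973); data sets I-III share the same x values.
x123 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
x4 = [8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8]
quartet = {
    "I": (x123, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    "II": (x123, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    "III": (x123, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    "IV": (x4, [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
}

coefficients = {name: pearson(x, y) for name, (x, y) in quartet.items()}
for name, r in coefficients.items():
    print(f"data set {name}: Pearson CC = {r:.3f}")
```

All four coefficients come out at roughly 0.82, although only data set I resembles a noisy linear relationship. Data set II is a smooth curve, III is a perfect line with one outlier, and in IV a single point creates the entire correlation.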
Spearman correlation of divergent transcripts' levels during cell cycle
Spearman correlation of divergent transcripts' levels across 3 conditions w/ 3 samples each
Does transcript orientation have a meaning according to Gene Ontology?
• Only 18% of divergently transcribed ORFs share common GO terms (22% for convergently transcribed ORFs)
• mostly very common ones (nucleus, cytoplasm)
• divergently transcribed ORFs share “nucleus” a bit less often than convergently transcribed ones
Interlude: GO evidence codes
• IMP: inferred from mutant phenotype
• IGI: inferred from genetic interaction
• IPI: inferred from physical interaction
• ISS: inferred from sequence similarity
• IDA: inferred from direct assay
• IEP: inferred from expression pattern
• IEA: inferred from electronic annotation
• TAS: traceable author statement
• NAS: non-traceable author statement
• ND: no biological data available
• IC: inferred by curator
Is transcript “closeness” meaningful?
• 12,162,996 bp: total size of the yeast strain S288c genome
• 9,043,789 bp: summed length of transcribed regions
• 558 bp: average space between transcripts
• the yeast genome is very compact!
• maybe restrict to very close divergent transcripts (< 300 bp?)
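The figures above can be cross-checked with simple arithmetic. The slide does not say how the 558 bp average was derived, so the sketch below is only a rough plausibility check under one assumption: the untranscribed sequence is spread evenly over as many gaps as there are transcripts in the cell-cycle segmentation (5596). The variable names are illustrative.

```python
GENOME_BP = 12_162_996      # total size of the S288c genome (from the slide)
TRANSCRIBED_BP = 9_043_789  # summed length of transcribed regions (from the slide)
N_TRANSCRIPTS = 5596        # assumption: transcript count of the cell-cycle data

untranscribed_bp = GENOME_BP - TRANSCRIBED_BP
transcribed_fraction = TRANSCRIBED_BP / GENOME_BP
mean_gap_bp = untranscribed_bp / N_TRANSCRIPTS  # one gap per transcript (assumption)

print(f"untranscribed sequence: {untranscribed_bp} bp")
print(f"transcribed fraction:   {transcribed_fraction:.1%}")
print(f"mean gap per transcript: {mean_gap_bp:.1f} bp")
```

Under this assumption about three quarters of the genome is transcribed and the mean gap comes out just above 557 bp, in line with the 558 bp on the slide. With typical gaps this short, two neighbouring transcripts on opposite strands can easily appear divergent by chance, which is the motivation for the stricter 300 bp cut-off suggested above.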
To do...
• investigate ActD+ Cdc28 cell-cycle data
• TF binding motif enrichment in the bidirectional promoters found:
  • binding of known motifs (annotated in the Transfac and SCPD databases)
  • discovery of new motifs, using tools such as MEME
• consider restricting to those pairs of divergent transcripts whose expression is correlated across the cell cycle, to distinguish functional divergent pairs from those that merely appear divergent because the yeast genome is so compact
• describe functional bidirectional promoters in yeast
• describe filtered novel transcripts from bidirectional promoters
Leisure activities... look for new and known TF binding motifs upstream of periodic anti-sense transcripts: