Genome-wide Regulatory Complexity in Yeast Promoters. Zhu YANG 15th Mar, 2006. Reference. C. S. Chin, J. H. Chuang, & H. Li. 2005. Genome-wide regulatory complexity in yeast promoters: Separation of functionally conserved and neutral sequence. Genome Research. 15(2):205-13. Outline.
Genome-wide Regulatory Complexity in Yeast Promoters Zhu YANG 15th Mar, 2006
Reference • C. S. Chin, J. H. Chuang, & H. Li. 2005. Genome-wide regulatory complexity in yeast promoters: Separation of functionally conserved and neutral sequence. Genome Research. 15(2):205-13.
Outline • Purposes • Methods • Results • Discussion
Purposes • To separate functionally conserved and neutral sequence. • To know how much promoter sequence is functional.
Methods • Determine the local neutral mutation rates by measuring the degree of sequence conservation across the genome. • Determine which parts of yeast promoters evolve neutrally. • Estimate the total amount of promoter sequence under selection. • Roughly estimate how much regulation acts on each gene by analyzing the length of sequence in high conservation regions of each promoter.
Algorithms • Calculation of substitution rates from fourfold sites • Mutational uniformity • Separation of high and low conserved regions with a hidden Markov model • Genome-wide percentage of promoter sites under selection • z-score in Gene Ontology analysis
Neutral mutation rates are uniform genome-wide • Mutation rates are uncorrelated along the yeast genome • In contrast, mouse-human conservation rates are significantly correlated along the human genome at separations up to several megabases
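One way to check whether rates are correlated along a chromosome is a lagged correlation between genes in chromosomal order. The sketch below is illustrative only: the function name `lagged_correlation` and the idea of using a plain Pearson correlation at a fixed gene separation are assumptions, not the procedure of the cited paper.

```python
import math

def lagged_correlation(rates, lag):
    """Pearson correlation between rates[i] and rates[i + lag].

    `rates` is a list of per-gene substitution rates in chromosomal order
    (hypothetical input); `lag` is the separation in genes.
    """
    if lag < 1 or lag >= len(rates) - 1:
        raise ValueError("lag must leave at least two pairs of genes")
    xs, ys = rates[:-lag], rates[lag:]
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("rates must vary for a correlation to be defined")
    return cov / math.sqrt(var_x * var_y)
```

A value near zero at every lag would be consistent with uncorrelated rates; values that stay positive over long separations would indicate regional variation of the kind reported for mouse-human comparisons.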
Neutral mutation rates are uniform genome-wide (Cont’d) • A subset of genes was biased toward high conservation by a secondary effect. • 92% of the genes mutate neutrally at fourfold degenerate sites. The high conservation values of the remaining 8% of the genes are explained by codon usage selection. • The correlation of the normalized substitution rate with the codon adaptation index (CAI) was 0.67.
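To make the fourfold-site measurement concrete, here is a minimal sketch that counts conserved fourfold degenerate sites in two aligned, in-frame coding sequences. The function name and the restriction to a pairwise comparison are assumptions, and it reports raw conservation without any correction for multiple substitutions, so it should not be read as the paper's rate estimator.

```python
# First two codon bases for which any third base encodes the same amino acid
# (standard genetic code): Leu, Val, Ser, Pro, Thr, Ala, Arg, Gly.
FOURFOLD_PREFIXES = {"CT", "GT", "TC", "CC", "AC", "GC", "CG", "GG"}

def fourfold_conservation(seq_a, seq_b):
    """Return (conserved, total) over fourfold degenerate third positions."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    conserved = total = 0
    for i in range(0, len(seq_a) - 2, 3):
        codon_a = seq_a[i:i + 3].upper()
        codon_b = seq_b[i:i + 3].upper()
        if "-" in codon_a or "-" in codon_b:
            continue  # skip codons that contain alignment gaps
        # Only count sites where both species share the same fourfold prefix.
        if codon_a[:2] == codon_b[:2] and codon_a[:2] in FOURFOLD_PREFIXES:
            total += 1
            if codon_a[2] == codon_b[2]:
                conserved += 1
    return conserved, total
```

For example, `fourfold_conservation("GCTCTAATGGGA", "GCCCTAATGGGA")` sees three fourfold sites (in GCT/GCC, CTA/CTA and GGA/GGA; ATG is not fourfold degenerate), two of which are conserved.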
Neutral conservation rates in promoters • Functional elements should be separated from the neutral background, since conservation can be due to shared ancestry rather than selection. • A hidden Markov model (HMM) breaks the promoters into high conservation regions (HCRs) and low conservation regions (LCRs). • The HCRs and LCRs gave a good approximation to functional and neutral regions.
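The HCR/LCR split can be illustrated with a two-state HMM decoded by the Viterbi algorithm. This is a sketch under stated assumptions: the input is a per-position conserved/not-conserved flag, and the emission probabilities (0.5 and 0.9) and the stay probability (0.9) are made-up illustrative values, not parameters from the paper.

```python
import math

STATES = ("LCR", "HCR")
# Assumed probability that a position is conserved in each state (illustrative).
P_CONSERVED = {"LCR": 0.5, "HCR": 0.9}

def segment(conserved, p_stay=0.9):
    """Viterbi path of LCR/HCR labels for a sequence of 0/1 conservation flags."""
    if not conserved:
        return []

    def log_emit(state, flag):
        p = P_CONSERVED[state]
        return math.log(p if flag else 1.0 - p)

    def log_trans(prev, cur):
        return math.log(p_stay if prev == cur else 1.0 - p_stay)

    # Uniform start distribution over the two states.
    score = {s: math.log(0.5) + log_emit(s, conserved[0]) for s in STATES}
    back = []
    for flag in conserved[1:]:
        new_score, pointers = {}, {}
        for s in STATES:
            best_prev = max(STATES, key=lambda r: score[r] + log_trans(r, s))
            pointers[s] = best_prev
            new_score[s] = score[best_prev] + log_trans(best_prev, s) + log_emit(s, flag)
        score = new_score
        back.append(pointers)

    # Trace back from the best final state.
    state = max(STATES, key=lambda s: score[s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path
```

With these illustrative parameters a long run of conserved positions is labeled HCR, while a single conserved base surrounded by mismatches stays in the LCR, because one conserved site does not outweigh the cost of switching states twice.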
Neutral conservation rates in promoters (Cont’d) • The HCRs contained an excess of functional elements. • Although the HCRs covered only 34.3% of the promoter regions, they contained 71.6% of the motifs in the promoters. • The neutral rates in the LCRs were consistent with the neutral rates obtained from the fourfold site analysis.
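The two percentages above imply a motif density ratio, which the short calculation below works out. It assumes that every promoter position and every motif is assigned to exactly one of the two region types, so the LCR shares are the complements of the HCR shares; the variable names are illustrative.

```python
def density_ratio(fraction_of_motifs, fraction_of_sequence):
    """Motif density relative to the promoter-wide average."""
    return fraction_of_motifs / fraction_of_sequence

hcr_coverage = 0.343   # share of promoter sequence in HCRs (from the slide)
hcr_motifs = 0.716     # share of motifs located in HCRs (from the slide)

hcr_enrichment = density_ratio(hcr_motifs, hcr_coverage)
# Assumption: the remainder of sequence and motifs all lies in LCRs.
lcr_enrichment = density_ratio(1 - hcr_motifs, 1 - hcr_coverage)
hcr_vs_lcr = hcr_enrichment / lcr_enrichment

print(f"HCR density vs. average: {hcr_enrichment:.2f}")
print(f"LCR density vs. average: {lcr_enrichment:.2f}")
print(f"HCR density vs. LCR:     {hcr_vs_lcr:.2f}")
```

Under that assumption, motifs are about twice as dense in HCRs as in promoters overall, and nearly five times as dense as in LCRs.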
Distribution of the conservation rate for promoter sequences
Genome-wide amount of promoter sequence under selection • The Frequency of Conserved Blocks (FCB) method was more robust than the HMM for inferring the amount of selectively conserved sequence. • It counts the blocks of n consecutive conserved bases in the promoter sequences and compares these counts to neutral expectations.
Requirements • The frequency distribution of conserved blocks in neutral sequence is known • This neutral component can be extracted from the real frequency distribution.
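The block-counting idea can be sketched as follows. Two assumptions are made here that the slides do not confirm: a "block" is taken to be a maximal run of conserved positions, and the neutral expectation treats each site as independently conserved with probability `p`, which gives roughly L(1 − p)²pⁿ runs of length n in L sites (edge effects ignored). Both function names are hypothetical.

```python
from collections import Counter
from itertools import groupby

def conserved_block_counts(conserved):
    """Count maximal runs of conserved positions, keyed by run length.

    `conserved` is a sequence of booleans or 0/1 integers, one per aligned site.
    """
    counts = Counter()
    for is_conserved, run in groupby(conserved, key=bool):
        if is_conserved:
            counts[len(list(run))] += 1
    return counts

def neutral_expected_count(n, length, p):
    """Expected number of maximal conserved runs of length n in `length`
    neutral sites, assuming independent sites conserved with probability p."""
    return length * (1.0 - p) ** 2 * p ** n

def excess_blocks(conserved, p):
    """Observed minus neutral-expected counts for each observed run length."""
    observed = conserved_block_counts(conserved)
    total = len(conserved)
    return {n: observed[n] - neutral_expected_count(n, total, p)
            for n in sorted(observed)}
```

An excess of long blocks over the neutral expectation is the signal attributed to selection; short blocks are expected to be dominated by the neutral component.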
Distribution of the counts of blocks of n consecutive conserved bases
Estimate of the percentage of sites evolving neutrally among various species
Gene-specific selection in promoters • The HCRs provide a rough characterization of the transcriptional regulation in each promoter. • Most genes have 15%–25% of their promoter sequence in HCRs. • Protein sequence conservation was correlated on a gene-by-gene basis with HCR length.
Gene Ontology terms • The terms with the largest HCR length biases were those involved in the energy generation and steroid synthesis pathways, suggesting that these types of genes have unusually complex regulation. • The genes with the strongest protein sequence conservation were not always those with the longest HCRs; catalysis, basic biosynthesis, and ribosomal genes are examples.
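A z-score for a Gene Ontology term can be sketched as the deviation of the term's mean HCR length from the genome-wide mean, in units of the standard error for a group of that size. This particular formula is an assumption chosen for illustration; the slides do not give the exact definition used in the paper, and `term_z_score` is a hypothetical name.

```python
import math
import statistics

def term_z_score(term_values, all_values):
    """z-score of the mean of `term_values` against the population `all_values`.

    Both arguments are lists of per-gene HCR lengths (hypothetical inputs).
    """
    n = len(term_values)
    if n == 0:
        raise ValueError("the term must contain at least one gene")
    mu = statistics.mean(all_values)
    sigma = statistics.pstdev(all_values)  # population standard deviation
    if sigma == 0:
        raise ValueError("all_values must vary")
    return (statistics.mean(term_values) - mu) / (sigma / math.sqrt(n))
```

A large positive score marks a term whose genes have longer HCRs than expected for a random gene set of the same size; a large negative score marks unusually short HCRs.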
Discussion • The neutral conservation rate is uniform across the yeast genome. One nonselective explanation is that yeast chromosomes are too short to have heterogeneity in their mutational environment. • A significant fraction of promoter sequence is under purifying selection. • A typical functional block may contain one or two protein-binding sites; the upper bound is ∼10 transcription-factor-binding sites per promoter. • Genes involved in energy generation and steroid synthesis may be subject to complex transcriptional regulation.