Comparative genomics analysis of NtcA regulons in cyanobacteria: Regulation of nitrogen assimilation and its coupling to photosynthesis. Wen-Ting Huang, Jau-Chi Huang. Zhengchang Su, Victor Olman, Fenglou Mao and Ying Xu.
Outline • Introduction • Method • Result • Conclusion
Introduction • DNA transcription • mRNA [Figure: DNA strands labeled 5’ and 3’, with coding regions and an intron]
[Figure: regulatory region upstream of a gene, showing regulatory elements, the RNA polymerase binding site, and the transcription direction from upstream to downstream]
cis-regulation • The regulatory regions of genes lie on the same chromosome as the genes they regulate • Phylogenetic footprinting • Identifies regulatory elements by finding conserved regions in a set of orthologous non-coding DNA sequences from multiple species.
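As a rough illustration of the footprinting idea, the sketch below looks for k-mers shared by every sequence in a set of orthologous upstream regions. It is a deliberate simplification: real footprinting tools tolerate mismatches and use alignments or motif models, and the function name `shared_kmers` and the toy sequences are made up for this example.

```python
def shared_kmers(sequences, k):
    """Return the k-mers that occur (exactly) in every input sequence."""
    kmer_sets = [
        {seq[i:i + k] for i in range(len(seq) - k + 1)}
        for seq in sequences
    ]
    return sorted(set.intersection(*kmer_sets))


# Toy upstream regions (hypothetical, not taken from any genome).
orthologous_upstream = [
    "AAGTAACGTTGCATACCC",
    "TTGTAACGTTGCATACGG",
    "CGGTAACGTTGCATACAT",
]
conserved = shared_kmers(orthologous_upstream, k=14)
print(conserved)
```

Exact matching keeps the example short; a conserved but slightly diverged site would be missed by this version.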
Cyanobacteria • Bacteria that live in water • Gram-negative, oxygenic phototrophs • Nitrogen control in cyanobacteria is mediated by NtcA http://www.ucmp.berkeley.edu/bacteria/cyanointro.html
NtcA • A protein that regulates the assimilation of nitrogen. • NtcA binding site • Base motif “GTAN8TAC” • ~14 bp • Found in upstream intergenic regions • Nitrogen fixation related genes • A σ70-like –10 box “TAN3T” often lies within the 31 bp downstream of the binding site
High false positive rate • The motif is too short to identify reliably on its own • 3 methods: • Coding region • –10 like box • Orthologous genes
Materials • Nine sequenced cyanobacterial genomes were downloaded from GenBank. • ftp.ncbi.nih.gov/genomes/Bacteria/
Method • Step 1: • Prepare training sets • Get the profiles (GTAN8TAC, TAN3T) • Step 2: • Scan genomic sequences and score each motif. • Step 3: • Decide the cutoff.
Known • Possible NtcA binding sites (GTAN8TAC) • Appear in the upstream intergenic regions • In many cases, there is a –10 like box (TAN3T) within the 31 bp region downstream of the NtcA binding site [Figure: NtcA binding site in the upstream region between two transcription units, with the 31 bp downstream region marked]
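The two consensus patterns can be tried out with regular expressions. The sketch below reports every exact GTAN8TAC match that has a TAN3T match within the next 31 bp. It only handles exact consensus matches, which is stricter than a profile-based search, and the function name and test sequence are invented for illustration.

```python
import re

# Consensus patterns from the slides: GTA-N8-TAC and TA-N3-T.
# The lookahead lets overlapping candidate sites be reported.
NTCA_SITE = re.compile(r"(?=(GTA[ACGT]{8}TAC))")
MINUS10_BOX = re.compile(r"TA[ACGT]{3}T")


def find_candidate_promoters(upstream, window=31):
    """Return (site_start, site, box_start, box) for each exact NtcA-like
    site with a -10 like box inside the `window` bp that follow it."""
    hits = []
    for match in NTCA_SITE.finditer(upstream):
        site = match.group(1)
        site_end = match.start() + len(site)
        box = MINUS10_BOX.search(upstream[site_end:site_end + window])
        if box:
            hits.append((match.start(), site, site_end + box.start(), box.group()))
    return hits


# Hypothetical upstream sequence built for this example.
seq = "CC" + "GTAACGTTGCATAC" + "GGCCGGCCGGCCGGCCGGCCGG" + "TATAAT" + "GG"
print(find_candidate_promoters(seq))
```

Coordinates are 0-based offsets into the input string, on the given strand only; scanning the reverse complement would be a separate call.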
Prepare training sets • They chose 11 genes known to be regulated by NtcA from the nine cyanobacterial genomes. • Using phylogenetic footprinting, they identified 51 putative NtcA binding sites. • These 51 sites constitute training set A1 for the NtcA binding site. • The 31 bp regions downstream of these sites were then searched for a –10 like box; the boxes found form training set B1.
A2 & B2 • They collected 12 experimentally verified NtcA binding sites and their downstream regions from seven other cyanobacteria. • They also included the sites that phylogenetic footprinting failed to find.
Profiles • They combined A1 and A2 to construct the profile of NtcA binding sites.
Profiles • They combined B1 and B2 to construct the profile of –10 like boxes.
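One common way to turn a set of aligned training sites into a profile is a log-odds position weight matrix. The sketch below assumes that representation; the slides do not state the exact profile or scoring function used, so the pseudocount, the uniform background, and the three toy sites are all assumptions made for the example.

```python
import math

ALPHABET = "ACGT"


def build_profile(sites, pseudocount=1.0, background=0.25):
    """Build a log-odds matrix (one dict per column) from aligned sites."""
    length = len(sites[0])
    if any(len(site) != length for site in sites):
        raise ValueError("all sites must have the same length")
    total = len(sites) + pseudocount * len(ALPHABET)
    profile = []
    for i in range(length):
        column = [site[i] for site in sites]
        profile.append({
            base: math.log2(((column.count(base) + pseudocount) / total) / background)
            for base in ALPHABET
        })
    return profile


def score(profile, window):
    """Sum the per-position log-odds for a window of the profile's length."""
    return sum(column[base] for column, base in zip(profile, window))


# Toy -10 like boxes (hypothetical training set, all matching TAN3T).
toy_boxes = ["TATAAT", "TACAAT", "TAGGCT"]
box_profile = build_profile(toy_boxes)
print(score(box_profile, "TATAAT"), score(box_profile, "GGGGGG"))
```

The pseudocount keeps unseen bases from scoring negative infinity, which matters with training sets as small as the ones described here.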
Scan genomic sequences [Figure: upstream region between two transcription units, with example sequences] GTAAAGTTAAGTTCCTTCAAAGCATTCGTGG TTAAAGTTAAGTTCTTTTAAAGCTTTCGTGG
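Scanning amounts to sliding a fixed-width window along each upstream sequence and scoring every position. The sketch below uses a toy score, the number of matches at the six fixed positions of GTAN8TAC, in place of a real profile score; both function names and the cutoff value are illustrative assumptions.

```python
FIXED_POSITIONS = ((0, "G"), (1, "T"), (2, "A"), (11, "T"), (12, "A"), (13, "C"))


def consensus_score(window):
    """Toy score: how many fixed positions of GTA-N8-TAC the window matches."""
    return sum(window[i] == base for i, base in FIXED_POSITIONS)


def scan(sequence, score_fn, width, cutoff):
    """Return (start, window, score) for every window scoring >= cutoff."""
    hits = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        window_score = score_fn(window)
        if window_score >= cutoff:
            hits.append((start, window, window_score))
    return hits


# The two example sequences shown on the slide.
example_1 = "GTAAAGTTAAGTTCCTTCAAAGCATTCGTGG"
example_2 = "TTAAAGTTAAGTTCTTTTAAAGCTTTCGTGG"
hits = scan(example_1, consensus_score, width=14, cutoff=5)
print(hits)
```

Passing the scoring function as a parameter means the same loop works with a profile-based score once one is available.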
Orthologous genes • The presence of similar motifs in the regulatory regions of orthologous genes can increase the prediction accuracy. • Two genes in two genomes were predicted to be orthologous if they are a pair of reciprocal best hits in BLASTP searches.
Orthologous genes [Figure: upstream region and transcription unit of orthologous genes]
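The reciprocal-best-hit rule can be sketched over already-parsed BLASTP results. The example assumes each hit is a `(query, subject, bit score)` tuple; the gene identifiers and scores are made up, and running or parsing BLASTP itself is left out.

```python
def best_hits(hits):
    """Map each query to its highest-scoring subject.

    `hits` is an iterable of (query, subject, bit_score) tuples.
    """
    best = {}
    for query, subject, bit_score in hits:
        if query not in best or bit_score > best[query][1]:
            best[query] = (subject, bit_score)
    return {query: subject for query, (subject, _) in best.items()}


def reciprocal_best_hits(a_vs_b, b_vs_a):
    """Return sorted (gene_a, gene_b) pairs that are each other's best hit."""
    forward = best_hits(a_vs_b)
    reverse = best_hits(b_vs_a)
    return sorted((a, b) for a, b in forward.items() if reverse.get(b) == a)


# Hypothetical hits between genome A and genome B.
a_vs_b = [("a1", "b1", 200), ("a1", "b2", 150), ("a2", "b2", 90), ("a3", "b1", 120)]
b_vs_a = [("b1", "a1", 210), ("b2", "a1", 140), ("b2", "a2", 95), ("b1", "a3", 100)]
print(reciprocal_best_hits(a_vs_b, b_vs_a))
```

Here `a2` and `a3` each have a best hit in genome B, but that hit prefers `a1`, so neither pair is reciprocal.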
Cutoff • For each genome, the cutoff is the largest score that still includes all the binding sites from that genome in the training sets. • P-value • p[S(CU)>sc]<0.01 or 0.05
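Read literally, the cutoff rule picks the lowest score among a genome's training sites, since that is the largest threshold that still keeps all of them. The sketch below implements that reading and adds an empirical estimate of the probability that a background score exceeds the cutoff. The empirical estimate is an assumption for illustration; the slides do not say how the p-value was computed, and all scores are invented.

```python
def genome_cutoff(training_scores):
    """Largest threshold that still keeps every training site (score >= cutoff)."""
    return min(training_scores)


def empirical_p_value(background_scores, cutoff):
    """Fraction of background scores strictly greater than the cutoff."""
    exceeding = sum(1 for s in background_scores if s > cutoff)
    return exceeding / len(background_scores)


# Hypothetical scores of one genome's training sites and of background windows.
training_scores = [7.2, 5.1, 6.4]
background_scores = [1, 2, 3, 4, 6, 0, 1, 2, 3, 2]

cutoff = genome_cutoff(training_scores)
p_value = empirical_p_value(background_scores, cutoff)
print(cutoff, p_value)
```

A cutoff whose estimated p-value is above the chosen level (0.01 or 0.05 on the slide) would let through too many background hits.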
Analysis “GTA________TAC” “TA___T”
Niche of NtcA in cyanobacteria … ? • Some genes bearing NtcA promoters might coordinate photosynthesis and nitrogen fixation. • An RNA polymerase σ-factor in cyanobacteria might bear an NtcA promoter and be regulated by NtcA.
Conclusion • The false positive rate is reduced by 8.2- to 90.9-fold. • Some binding sites might be missed because of the lack of orthologues in the other genomes. • NtcA promoters are found for many genes involved in the various stages of the photosynthetic process.