Exploiting transcription factor binding site clustering to identify cis-regulatory modules involved in pattern formation in the Drosophila genome. ECS289A Presentation By Hua Chen 2003-3-3.
Background Knowledge • A notable characteristic of cis-regulatory sites: binding sites for several different transcription factors tend to cluster together in one region near the gene, forming a cis-regulatory module (CRM). • Searching for individual cis-regulatory sites yields too many candidate positions, which makes it difficult to tell the true sites from the false ones; • The clustering characteristic of CRMs provides a feasible way to identify cis-regulatory sites in the genome.
Targets: • Use the clustering of binding sites within cis-regulatory modules as a criterion for identifying functional motifs; • Test the method on known, experimentally verified CRM regions; • Search the genome to discover new CRMs and confirm the results by experiment.
The System Investigated: • The early Drosophila embryo; • Five transcription factors are investigated: Bcd, Cad, Hb, Kr and Kni.
Methods: • Collect transcription factor binding sequences from previous laboratory work and align them; • Construct Position Weight Matrices (PWMs) for the conserved motifs; • Test the method on the known CRMs; • Search genome-wide for unknown regulatory regions; • Use RNA in situ hybridization and microarray hybridization to test whether the predicted regions lie near genes regulated by the transcription factors; • One special case: the giant gene, investigated further with transgenic and mutant embryos.
Step 1: Collection and Alignment of TF Binding Sites • Bcd, Cad, Hb, Kr and Kni binding sequences were determined by in vitro DNase protection assays; • The sequences are aligned with MEME.
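To make the step from aligned sites to a weight matrix concrete, here is a minimal Python sketch of how a log-odds PWM could be derived from a gap-free alignment. It is not the procedure used by MEME or Patser: the function names `build_pwm` and `score_site`, the uniform background, the pseudocount and the toy sequences are all assumptions made for illustration.

```python
import math

BASES = "ACGT"

def build_pwm(aligned_sites, background=None, pseudocount=1.0):
    """Return a log-odds PWM as a list of dicts, one dict per alignment column."""
    if background is None:
        # Assumption: uniform base composition (a real genome is not uniform).
        background = {b: 0.25 for b in BASES}
    length = len(aligned_sites[0])
    if any(len(site) != length for site in aligned_sites):
        raise ValueError("aligned sites must all have the same length")
    n = len(aligned_sites)
    pwm = []
    for col in range(length):
        counts = {b: 0 for b in BASES}
        for site in aligned_sites:
            counts[site[col].upper()] += 1
        column = {}
        for b in BASES:
            # Pseudocount keeps unseen bases from producing log(0).
            freq = (counts[b] + pseudocount * background[b]) / (n + pseudocount)
            column[b] = math.log2(freq / background[b])
        pwm.append(column)
    return pwm

def score_site(pwm, seq):
    """Sum the per-column log-odds scores for a candidate site of PWM length."""
    return sum(column[base] for column, base in zip(pwm, seq.upper()))

# Made-up toy sites for illustration only; these are not real footprint data.
toy_sites = ["TAATCC", "TAATCC", "TAATCT", "GAATCC"]
pwm = build_pwm(toy_sites)
consensus_score = score_site(pwm, "TAATCC")
random_score = score_site(pwm, "GGGGGG")
```

A candidate that matches the consensus scores higher than an unrelated sequence, and that score difference is what a cutoff on predicted site strength acts on.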
Step 2: Construction of PWMs and Searching: • Patser is used to construct the Position Weight Matrices; • Cis-Analyst is used to identify potential binding sites in the Drosophila genome that match the PWMs; • A user-defined cutoff parameter (site_p) eliminates predicted low-affinity sites; • The sequence is scanned with a window of specified length; • Windows that contain at least min_sites binding sites are retained; • All overlapping windows are merged into a “cluster”.
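The window-and-merge procedure listed above can be sketched in Python as follows. This is an illustrative simplification and not Cis-Analyst itself: the function name `find_clusters` is hypothetical, sites are treated as single coordinates that have already passed the site_p cutoff, sites for all factors are pooled, and a window is anchored at each site rather than slid base by base.

```python
def find_clusters(site_positions, window=700, min_sites=13):
    """Return merged (start, end) intervals of windows holding >= min_sites sites.

    site_positions: coordinates of predicted sites that already passed the
    affinity cutoff (assumption: one coordinate per site, all factors pooled).
    """
    positions = sorted(site_positions)
    windows = []
    right = 0
    for left, start in enumerate(positions):
        # Advance 'right' to the first site that falls outside [start, start + window).
        while right < len(positions) and positions[right] < start + window:
            right += 1
        if right - left >= min_sites:
            windows.append((start, start + window))

    # Merge overlapping windows into clusters.
    clusters = []
    for w_start, w_end in windows:
        if clusters and w_start <= clusters[-1][1]:
            clusters[-1] = (clusters[-1][0], max(clusters[-1][1], w_end))
        else:
            clusters.append((w_start, w_end))
    return clusters

# Toy data: 13 sites spaced 50 bp apart, plus two isolated sites.
dense = list(range(1000, 1000 + 13 * 50, 50))
toy_clusters = find_clusters(dense + [5000, 9000])
small_clusters = find_clusters([0, 10, 20, 30, 500], window=100, min_sites=3)
```

In this sketch a cluster spans the full extent of the merged windows; trimming each cluster to its outermost sites would be an equally reasonable design choice.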
Step 3: Test with the Known CRMs: • Successful result: 14/19 known CRMs recovered with the search criteria window size = 700 bp and number of predicted sites >= 13.
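One way to arrive at a figure such as 14/19 is to count how many known CRM intervals are overlapped by at least one predicted cluster. The Python sketch below assumes that interpretation; the helper names and the toy coordinates are hypothetical and do not come from the presentation.

```python
def overlaps(a, b):
    """True if half-open intervals a = (start, end) and b = (start, end) intersect."""
    return a[0] < b[1] and b[0] < a[1]

def recovered_fraction(known_crms, clusters):
    """Return (hits, total, fraction) of known CRMs overlapped by any cluster."""
    hits = sum(1 for crm in known_crms if any(overlaps(crm, c) for c in clusters))
    return hits, len(known_crms), hits / len(known_crms)

# Toy coordinates for illustration only.
toy_known = [(100, 900), (5000, 5600), (12000, 12800)]
toy_predicted = [(700, 1400), (12500, 13200)]
hits, total, fraction = recovered_fraction(toy_known, toy_predicted)

# The reported result corresponds to a recovery rate of about 74%.
reported_rate = 14 / 19
```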
Step 4: Genome-wide Searching: • 28 clusters identified; • 23 of the 28 fall in intergenic regions; • 5 fall in introns; • 49 genes lie near the clusters.
Step 5: Examine the expression pattern of the 49 genes by RNA in situ hybridization and microarray hybridization: • The 49 genes are examined by hybridization to see whether their expression patterns indicate regulation by the TFs; • 10 of the 28 clusters are near at least one gene that shows an anterior-posterior expression pattern (i.e. is regulated by the five TFs).
Step 6: The special case: giant gene • The posterior expression of giant is regulated by Cad, Hb and Kr; • Its cis-regulatory sites were still unknown; • The predicted CRM nearest to the giant gene is cloned upstream of a lacZ reporter gene; • The lacZ reporter shows an expression pattern similar to that of giant mRNA. • [Figure panels: +/+ (wild type) and Kr/Kr (Kr mutant) embryos]
Conclusions: • Binding site clustering is an effective method for identifying cis-regulatory modules; • A major obstacle is the paucity of binding data for most transcription factors, which calls for systematic work; • Real CRM structures are more complex, so the method needs to incorporate more complex rules.
Reference • Berman, B.P., Nibu, Y. et al. 2002. Exploiting transcription factor binding site clustering to identify cis-regulatory modules involved in pattern formation in the Drosophila genome. Proc. Natl. Acad. Sci. USA 99:757-762