cis-regulatory element study in transcriptome Jin Chen CSE891-001 Fall 2012
What is a cis-element? • The Latin word “cis” means "on the same side as" • A cis-regulatory element, or cis-element, is a region of DNA or RNA that regulates the expression of genes located on that same molecule • Courey and Jia (2001)
Cis-element properties • Typically found in the 5’ untranscribed region of the gene (promoter region) • Can be specific sites for binding of activators or repressors • Position and orientation of a cis-element relative to the transcriptional start site are usually fixed
Cis-element properties • Short sequences • Recurring patterns • Sequence-specific binding sites
Cis-element Representations
Sequence 1: A G T A T A
Sequence 2: A G A T T A
Sequence 3: C G A C T C
Sequence 4: A G T G T A
Sequence 5: A G T G T G
Consensus sequence: A G W N T A
Probability Matrix & sequence logo:
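One way the probability matrix for the five aligned sequences on this slide could be computed is sketched below. The names `SITES` and `probability_matrix` are illustrative and not from the slides, and no pseudocounts are applied.

```python
from collections import Counter

# The five aligned example sites from the slide.
SITES = ["AGTATA", "AGATTA", "CGACTC", "AGTGTA", "AGTGTG"]

def probability_matrix(sites):
    """Return one dict per alignment column mapping each base to its frequency."""
    length = len(sites[0])
    if any(len(site) != length for site in sites):
        raise ValueError("all sites must have the same length")
    matrix = []
    for column in zip(*sites):
        counts = Counter(column)
        # Counter returns 0 for bases that never occur in the column.
        matrix.append({base: counts[base] / len(sites) for base in "ACGT"})
    return matrix

matrix = probability_matrix(SITES)
for position, column in enumerate(matrix, start=1):
    print(position, column)
```

Columns 2 and 5 are fully conserved (G and T), which is why the consensus carries a plain letter there, while columns 3 and 4 are mixed and need the ambiguity codes W and N.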
Cis-element Representation 1 • Consensus-based method • Refers to a sequence that matches all examples of the binding site closely but not exactly • Trade-off between ambiguity and sensitivity IUPAC codes
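A minimal sketch of consensus matching with IUPAC codes, assuming a plain regular-expression search on the forward strand only. The helper names `consensus_to_regex` and `find_sites` are hypothetical. The consensus A G W N T A and the five example sites are taken from the representation example earlier in the deck.

```python
import re

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}

def consensus_to_regex(consensus):
    """Translate an IUPAC consensus string into a regex character-class pattern."""
    return "".join(f"[{IUPAC[code]}]" for code in consensus.upper())

def find_sites(consensus, sequence):
    """Return (start, matched_text) for every match, including overlapping ones."""
    # A lookahead lets the regex engine report overlapping occurrences.
    pattern = re.compile(f"(?=({consensus_to_regex(consensus)}))")
    return [(m.start(), m.group(1)) for m in pattern.finditer(sequence.upper())]

SITES = ["AGTATA", "AGATTA", "CGACTC", "AGTGTA", "AGTGTG"]
matched = [site for site in SITES if find_sites("AGWNTA", site)]
print(matched)
```

Only three of the five example sites match this consensus, which illustrates the trade-off: adding more ambiguity codes would recover the remaining sites but would also match more background sequence.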
Cis-element Representation 2 • Sequence logos • A visual representation of the probability matrix • The total height of each column is proportional to its information content http://www-lmmb.ncifcrf.gov/~toms/sequencelogo.html
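The column heights of a sequence logo can be sketched as follows, assuming the common definition of information content for DNA (2 bits minus the Shannon entropy of the column) with no small-sample correction. The function name `column_information` is illustrative, and the input is the five-sequence example from the representation slide.

```python
import math
from collections import Counter

def column_information(sites):
    """Information content in bits for each column of aligned DNA sites."""
    bits = []
    for column in zip(*sites):
        total = len(column)
        counts = Counter(column)
        # Shannon entropy of the observed base frequencies in this column.
        entropy = -sum((n / total) * math.log2(n / total) for n in counts.values())
        # The maximum entropy for four bases is log2(4) = 2 bits.
        bits.append(2.0 - entropy)
    return bits

SITES = ["AGTATA", "AGATTA", "CGACTC", "AGTGTA", "AGTGTG"]
info = column_information(SITES)
print([round(value, 3) for value in info])
```

A fully conserved column reaches the maximum of 2 bits, and a column in which all four bases are equally frequent contributes 0 bits.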
Cis-element matching/discovery • Pattern Matching • Discover patterns in sequences from co-regulated genes using known matrices from JASPAR and TRANSFAC • Pscan • Pattern Discovery • Discover patterns in sequences from co-regulated genes without using known patterns • MEME, hmmbuild
Pattern Matching http://www.slideshare.net/Stewbacca/dna-motif-finding-2010
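Matrix-based pattern matching can be illustrated with a small log-odds scan. This is a sketch under stated assumptions, not the procedure used by Pscan or any specific tool: the matrix is built from the five example sites shown earlier rather than from JASPAR or TRANSFAC, the background is uniform (0.25 per base), a pseudocount of 0.5 is added to each count, and only the forward strand is scanned. The names `log_odds_matrix` and `scan` are hypothetical.

```python
import math

def log_odds_matrix(sites, pseudocount=0.5, background=0.25):
    """Build a log2 odds matrix (one dict per column) from aligned sites."""
    n = len(sites)
    matrix = []
    for column in zip(*sites):
        matrix.append({
            base: math.log2(
                (column.count(base) + pseudocount) / (n + 4 * pseudocount) / background
            )
            for base in "ACGT"
        })
    return matrix

def scan(sequence, matrix, threshold):
    """Score every window of the sequence and keep those at or above threshold."""
    width = len(matrix)
    hits = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        score = sum(column[base] for column, base in zip(matrix, window))
        if score >= threshold:
            hits.append((start, window, score))
    return hits

SITES = ["AGTATA", "AGATTA", "CGACTC", "AGTGTA", "AGTGTG"]
pwm = log_odds_matrix(SITES)
hits = scan("CCAGTATACC", pwm, threshold=5.0)
print(hits)
```

Unlike a consensus string, the matrix score degrades gradually with each mismatch, so the threshold controls how far from the training sites a reported hit may be.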
Cis-element evolution • Composition • Location • Modules chicken aA mouse aA mouse d1 Gene control regions for eye lens crystallins Molecular Biology of the Cell, Alberts et al., 4th ed.
Large Scale Analysis • Identify 264 co-regulated gene groups in S. cerevisiae • Putative cis-regulatory elements • 80 known consensus binding sites • 597 elements by motif discovery with MEME • Score enrichment of genes containing each putative element: 42 cis-elements in 35 unique groups • Orthologous modules in other species • Enrichment of orthologous modules A. P. Gasch et al., PLoS Biol., 2004
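The slide does not state how enrichment was scored. A common choice for this kind of question is a one-sided hypergeometric test, sketched below as an assumption rather than as the method of Gasch et al. The function name, its parameters, and the example numbers are all hypothetical.

```python
from math import comb

def hypergeom_enrichment(genome_size, genome_hits, group_size, group_hits):
    """P(X >= group_hits) when group_size genes are drawn without replacement
    from genome_size genes, of which genome_hits contain the element."""
    upper = min(group_size, genome_hits)
    total = comb(genome_size, group_size)
    tail = sum(
        comb(genome_hits, k) * comb(genome_size - genome_hits, group_size - k)
        for k in range(group_hits, upper + 1)
    )
    return tail / total

# Hypothetical numbers: 6000 genes, 300 carry the element,
# and 15 of the 40 genes in a co-regulated group carry it.
p_value = hypergeom_enrichment(6000, 300, 40, 15)
print(p_value)
```

A small p-value indicates that the element occurs in the group more often than expected if group membership were unrelated to the presence of the element.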
Conservation of S. cerevisiae motifs • Proteasome: GGTGGCAAA (Rpn4p) • G1 phase cell cycle: ACGCG (MCB) • Amino acid biosynthesis: TGACTM (Gcn4p) • Nitrogen source: GATAA (GATA factors)
Positions of binding sites • Non-random distribution • Distribution is similar across species • No correlation in exact locations across species
Spacing between binding sites in Methionine Biosynthesis genes • Small distance between Cbf1p and Met31/32p • Conserved across species • Independent of exact positions
Control of iron metabolism in Mycobacterium tuberculosis. Rodriguez, Marcela. Trends in Microbiology, 2006.
Poisson Method for module discovery • Look for matches to consensus sequences • Mcm1: DCCYWWWNNRG • Ste12: TGAAACA • Random DNA sequence: • Exponential distribution • “Pearson type III distribution” • Wagner A (1999) Bioinformatics 15(10): 776-784
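The idea behind a Poisson test for site clusters can be sketched as follows. This is a simplified illustration, not Wagner's exact statistic. It assumes uniform base frequencies, treats matches as independent events with a constant per-position rate, and ignores the reverse strand. The function names and the 1000 bp window are hypothetical, while the two consensus sequences come from the slide.

```python
import math

# Number of bases allowed by each IUPAC code.
DEGENERACY = {
    "A": 1, "C": 1, "G": 1, "T": 1,
    "R": 2, "Y": 2, "S": 2, "W": 2, "K": 2, "M": 2,
    "B": 3, "D": 3, "H": 3, "V": 3, "N": 4,
}

def match_probability(consensus):
    """Chance that a random position matches, assuming uniform base frequencies."""
    probability = 1.0
    for code in consensus.upper():
        probability *= DEGENERACY[code] / 4
    return probability

def poisson_tail(k, rate, window):
    """P(at least k sites in a window) when site counts are Poisson(rate * window)."""
    mean = rate * window
    below = sum(math.exp(-mean) * mean**i / math.factorial(i) for i in range(k))
    return 1.0 - below

mcm1_rate = match_probability("DCCYWWWNNRG")
ste12_rate = match_probability("TGAAACA")
# Chance of seeing 3 or more Mcm1-like matches in a 1000 bp window of random DNA.
print(poisson_tail(3, mcm1_rate, 1000))
```

Under this model the gaps between adjacent random matches are exponentially distributed, so a window that holds many more matches than `poisson_tail` makes plausible is a candidate regulatory module.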
Cister & Comet • Cluster model of a DNA sequence segment: Poisson-distributed cis-elements, embedded in random DNA • Frith MC, Hansen U, Weng Z (2001) Bioinformatics 17(10): 878-889 • Frith MC, Spouge JL, Hansen U, Weng Z (2002) Nucleic Acids Research