Searching for transcription factor binding sites with TRANSFAC. George Bell, Ph.D. Bioinformatics and Research Computing Hot Topics – October 2009. Outline. What is known about your favorite TFs? In what regulatory DNA should we search?
Searching for TFBSs with TRANSFAC - Hot topics in Bioinformatics
Searching for transcription factor binding sites with TRANSFAC
George Bell, Ph.D.
Bioinformatics and Research Computing
Hot Topics – October 2009
Outline
• What is known about your favorite TFs?
• In what regulatory DNA should we search?
• How can we search for an inexact sequence motif like a TFBS?
• What related resources are available?
Transcription control is complex
• Figure: Model for cooperative assembly of an activated transcription-initiation complex at the TTR promoter in hepatocytes (Lodish et al., Molecular Cell Biology)
• Figure: Complete RNA Polymerase II elongation complex, 12 subunits (Kettenberger et al., 2004; PDB 1y1w)
TRANSFAC at Biobase
• Connect from Whitehead network
TRANSFAC introduction
• Created in 1988
• Contains information about transcription factors that have been experimentally determined to bind DNA
• Includes eukaryotic cis-acting regulatory DNA elements and trans-acting factors, in organisms ranging from yeast to humans
• The majority of information has been manually curated from the primary literature
Browsing transcription factors Select species Detailed info
Types of TRANSFAC data
• Gene – curated info
• Promoter – TSS coordinates from Ensembl, FANTOM, etc.
• Functional Region – describes published regulatory regions
• Composite Element – two or more nearby binding sites
• Site – describes published TFBSs
• ChIP-chip – shows data by target
• Matrix – contains published aligned binding sites and positional probabilities
Transcription factor matrix
Example: V$MYOD_01 (vertebrate MyoD matrix 1)
Matrix identifiers
• Examples: V$MYOD_01, V$AP1_Q4_01
• V$ = vertebrate (I$ = insects; P$ = plants; F$ = fungi; N$ = nematodes; B$ = bacteria)
• MYOD = factor or family name
• 01 = matrix number 1 for MYOD
• Q* = matrix reliability/quality (1 – 6)
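The naming scheme above can be sketched as a small parser. This is only an illustration: `parse_matrix_id` is a hypothetical helper, not part of TRANSFAC, and it assumes the factor name itself contains no underscore.

```python
import re

# Taxonomic prefixes exactly as listed on the slide.
GROUPS = {
    "V": "vertebrate", "I": "insects", "P": "plants",
    "F": "fungi", "N": "nematodes", "B": "bacteria",
}

def parse_matrix_id(identifier):
    """Split an identifier such as 'V$AP1_Q4_01' into its parts.

    Hypothetical helper; assumes the factor name has no underscore.
    """
    prefix, sep, rest = identifier.partition("$")
    if not sep or prefix not in GROUPS:
        raise ValueError(f"unrecognised matrix identifier: {identifier!r}")
    parts = rest.split("_")
    info = {"group": GROUPS[prefix], "factor": parts[0],
            "quality": None, "number": None}
    for token in parts[1:]:
        if re.fullmatch(r"Q\d", token):      # quality tag, e.g. Q4
            info["quality"] = int(token[1:])
        elif re.fullmatch(r"\d+", token):    # matrix number, e.g. 01
            info["number"] = int(token)
    return info

print(parse_matrix_id("V$AP1_Q4_01"))
```

Identifiers without a quality tag (V$MYOD_01) or without a number (V$MYOD_Q6) simply leave the missing field as `None`.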
Matrices are redundant
• V$MYOD_01
• V$MYOD_Q6
• V$MYOD_Q6_01
Extracting regulatory regions
• One, many or all genes?
• Promoters or all potential regions (introns, intergenic)?
• Sources of genomic sequence:
  • UCSC genome browser (click on “DNA”)
  • Ensembl BioMart (“Sequences” for output)
  • Published datasets
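When promoters are the target, the region to request from a sequence source is usually defined relative to the TSS and the gene's strand. A minimal sketch follows; `promoter_window` is a hypothetical helper, and the default sizes (1000 bp upstream, 200 bp downstream) are arbitrary assumptions rather than values from this talk.

```python
def promoter_window(tss, strand, upstream=1000, downstream=200):
    """Return (start, end) in 1-based inclusive coordinates around a TSS.

    'Upstream' follows the direction of transcription, so the window
    is mirrored for genes on the minus strand.
    """
    if strand == "+":
        start, end = tss - upstream, tss + downstream
    elif strand == "-":
        start, end = tss - downstream, tss + upstream
    else:
        raise ValueError("strand must be '+' or '-'")
    return max(1, start), end  # never run off the start of the chromosome

print(promoter_window(5000, "+"))
print(promoter_window(5000, "-"))
```

For minus-strand genes the extracted sequence also has to be reverse-complemented before it reads 5' to 3' along the gene.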
Starting MATCH
MATCH profiles (sets of matrices)
MATCH output
• Core = the first five most conserved consecutive positions of the matrix
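To give a feel for what a matrix similarity score means, here is a simplified scan of a sequence with a count matrix, scored on a 0 to 1 scale as (current − min) / (max − min). It is a sketch under stated assumptions, not MATCH itself: the matrix numbers are made up, only the forward strand is scanned, and the per-position information-content weighting that MATCH is reported to use is omitted.

```python
# Toy count matrix (made-up numbers, NOT a TRANSFAC entry):
# one dict of base counts per motif position.
MATRIX = [
    {"A": 0, "C": 8, "G": 0, "T": 0},
    {"A": 8, "C": 0, "G": 0, "T": 0},
    {"A": 1, "C": 3, "G": 4, "T": 0},
    {"A": 0, "C": 4, "G": 3, "T": 1},
    {"A": 0, "C": 0, "G": 0, "T": 8},
    {"A": 0, "C": 0, "G": 8, "T": 0},
]

def similarity(window, matrix):
    """Score one window, scaled so the worst site is 0 and the best is 1."""
    current = sum(col[base] for col, base in zip(matrix, window))
    lowest = sum(min(col.values()) for col in matrix)
    highest = sum(max(col.values()) for col in matrix)
    return (current - lowest) / (highest - lowest)

def scan(sequence, matrix, cutoff=0.8):
    """Return (start, site, score) for forward-strand windows above the cutoff."""
    width = len(matrix)
    hits = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        if set(window) <= set("ACGT"):  # skip windows containing N etc.
            score = similarity(window, matrix)
            if score >= cutoff:
                hits.append((start, window, round(score, 3)))
    return hits

print(scan("TTCAGCTGAA", MATRIX))
```

Start positions are 0-based here. Lowering the cutoff returns more candidate sites at the cost of more false positives, which is the same trade-off the MATCH profile cutoffs control.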
Creating a custom matrix: input
Creating a custom matrix: output
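The idea behind a custom matrix, aligned binding sites in and per-position counts out, can be sketched in a few lines. The sites below are made-up E-box-like examples, and `count_matrix` and `consensus` are hypothetical helpers, not the TRANSFAC tool.

```python
from collections import Counter

def count_matrix(sites):
    """Build a per-position base-count matrix from equal-length aligned sites."""
    width = len(sites[0])
    if any(len(site) != width for site in sites):
        raise ValueError("all sites must have the same length")
    matrix = []
    for column in zip(*sites):  # walk the alignment column by column
        counts = Counter(base.upper() for base in column)
        matrix.append({base: counts.get(base, 0) for base in "ACGT"})
    return matrix

def consensus(matrix):
    """Most frequent base per position (ties resolved in A, C, G, T order)."""
    return "".join(max("ACGT", key=lambda b: col[b]) for col in matrix)

# Made-up aligned sites, for illustration only.
sites = ["CAGCTG", "CACCTG", "CAGGTG", "CAGCTG"]
matrix = count_matrix(sites)
for position, col in enumerate(matrix, start=1):
    print(position, col)
print(consensus(matrix))
```

Dividing each count by the number of sites would give the positional probabilities mentioned for the Matrix data type.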
MATCH Profiler - input
MATCH Profiler - output
MATCH with our custom profile
Related resources
• UCSC Genome Browser (hg18):
  • “TFBS Conserved” track (human/mouse/rat)
• JASPAR (public database of transcription factor binding profiles):
  • http://jaspar.genereg.net/
• Create a sequence logo: http://weblogo.berkeley.edu
• Command-line tools:
  • TRANSFAC; tffind; HMMER1; MAST (MEME Suite)
• Search for “patterns” (e.g., CAxxTGx[TC]):
  • EMBOSS: fuzznuc; dreg
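A pattern such as CAxxTGx[TC] can also be searched with an ordinary regular expression. The sketch below assumes that "x" means any nucleotide and that square brackets list alternatives. `pattern_to_regex` and `find_pattern` are hypothetical helpers, and the EMBOSS tools use their own pattern syntax.

```python
import re

def pattern_to_regex(pattern):
    """Translate the slide's notation into a regex ('x' = any of A, C, G, T)."""
    body = pattern.upper().replace("X", "[ACGT]")
    # The lookahead lets overlapping matches be reported as well.
    return re.compile("(?=(" + body + "))")

def find_pattern(sequence, pattern):
    """Return (0-based start, matched site) for every forward-strand match."""
    regex = pattern_to_regex(pattern)
    return [(m.start(), m.group(1)) for m in regex.finditer(sequence.upper())]

print(find_pattern("GGCAGCTGATCC", "CAxxTGx[TC]"))
```

Unlike a matrix search, a pattern search is all-or-nothing: a site either matches or it does not, with no score to rank candidates.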