Tools for inferring protein structure and function • Phobius • General motif searches • PFAMS
First, a little vocabulary: Scale and protein structure • Listed small to large • Motif ~5 – 25 aa • Domain ‘neighborhood’ • Subunit single peptide chain • Holoenzyme single or multiple peptide chains plus cofactors
Objective(s) • To introduce some easy-to-use web tools to infer gene function • To use these tools on ‘mystery genes’ to uncover their function
Mystery genes • 3 genes are available from my page on the Wiki • Download the word document titled ‘Mystery Genes’ • We will use our tools to uncover their functions
What do the mystery genes encode? • Subunits of ABC transporter • Periplasmic solute-binding protein • Transmembrane ‘channel’ protein • ATP-hydrolyzing cytoplasmic component http://www.biologie.uni-regensburg.de/Mikrobio/Thomm/Buttons/reg_fig1.gif
Transmembrane prediction: Hydrophobicity ‘motif’ • Is this protein a transmembrane protein? • Hydrophobic alpha helices? • Typically 10-25 aa long http://www.bio.davidson.edu/Courses/Molbio/MolStudents/spring2003/Bennett/protein1.htm
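A hydrophobic stretch of this kind can be spotted with a simple sliding-window average. The sketch below uses the Kyte-Doolittle hydropathy scale; the 19-residue window and the 1.6 cutoff are commonly quoted values and should be treated as assumptions here, and the function names are made up for illustration. This is not how Phobius works (Phobius uses a hidden Markov model); it only shows the idea of a hydrophobicity ‘motif’.

```python
# Kyte-Doolittle hydropathy values (higher = more hydrophobic).
KD = {
    "I": 4.5, "V": 4.2, "L": 3.8, "F": 2.8, "C": 2.5,
    "M": 1.9, "A": 1.8, "G": -0.4, "T": -0.7, "S": -0.8,
    "W": -0.9, "Y": -1.3, "P": -1.6, "H": -3.2, "E": -3.5,
    "Q": -3.5, "D": -3.5, "N": -3.5, "K": -3.9, "R": -4.5,
}


def hydropathy_profile(seq, window=19):
    """Return the mean hydropathy of every full window along the sequence."""
    # Unknown residues (e.g. X) are scored as 0.0 so they do not raise an error.
    scores = [KD.get(aa, 0.0) for aa in seq.upper()]
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(scores) - window + 1)
    ]


def candidate_tm_windows(seq, window=19, threshold=1.6):
    """Return start positions (0-based) of windows at or above the cutoff."""
    profile = hydropathy_profile(seq, window)
    return [i for i, score in enumerate(profile) if score >= threshold]


if __name__ == "__main__":
    # Invented test sequence: a leucine-rich stretch flanked by charged residues.
    toy = "MKK" + "L" * 20 + "DDEEKK"
    print(candidate_tm_windows(toy))
```

A sequence with no window above the cutoff returns an empty list, which is a hint (not proof) that it has no membrane-spanning helix.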
http://phobius.sbc.su.se/ • Transmembrane helix and signal peptide prediction
Signal (leader) peptides • Basic amino acids (+ charge) at amino terminus • Followed by hydrophobic amino acids • Trigger factor or SRP bind these
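The two features listed above (a basic residue near the amino terminus, then a hydrophobic run) can be turned into a very crude check. The sketch below is a toy heuristic only: the function name, the residue sets, and every cutoff (8, 7, 35) are assumptions chosen for illustration, and real predictors such as Phobius are far more careful.

```python
# Residues treated as hydrophobic in this toy check (an assumption).
HYDROPHOBIC = set("AVLIMFWC")
BASIC = set("KR")


def looks_like_signal_peptide(seq, n_region=8, h_min=7, search_limit=35):
    """Crude test: basic residue in the first n_region residues,
    followed by a run of at least h_min hydrophobic residues
    somewhere before position search_limit."""
    seq = seq.upper()
    # Find the first basic (+ charge) residue near the amino terminus.
    first_basic = next(
        (i for i, aa in enumerate(seq[:n_region]) if aa in BASIC), None
    )
    if first_basic is None:
        return False
    # Look for an unbroken hydrophobic run after that residue.
    run = 0
    for aa in seq[first_basic + 1:search_limit]:
        run = run + 1 if aa in HYDROPHOBIC else 0
        if run >= h_min:
            return True
    return False


if __name__ == "__main__":
    # Invented sequences, not real proteins.
    print(looks_like_signal_peptide("MKRSLLLALAVLLGASAQADE"))
    print(looks_like_signal_peptide("MSDEEQTGNSPDDEKRSTNQ"))
```

Use a check like this only to build intuition for what the real tool is looking for.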
http://www.ebi.ac.uk/Tools/pfa/iprscan/ • InterProScan • Search a library of motifs from multiple databases
Types of motifs and domains from InterProScan • PFAM (a sampling) • RubisCO C terminus • RubisCO N terminus • Various domains of unknown function (DUFs) • Prosite (a sampling) • Helix-turn-helix • Zinc finger • RubisCO active site • Ferredoxin-type iron-sulfur binding domain • Et cetera
At this point, using these tools, you should have some idea of: • Protein location in the cell • Phobius • Perhaps some tantalizing clues on function • InterProScan
PFAM • http://pfam.sanger.ac.uk/ • Database of protein families • Typically these are families of protein domains • Homologs
Sometimes these domains cover the whole protein, sometimes they do not (Image from Pevsner, 2003. Bioinformatics and Functional Genomics)
BLASTing PFAM • If you click on ‘Sequence Search’, a BLAST window opens
Searching PFAM with keywords • If you click on ‘Keyword Search’, a keyword search box appears
Results of a keyword search • PFAMS with that word appear in a list • To learn more about a PFAM, click on a link
A PFAM ‘home’ page • Lots of good readin’ and good stuff on the sidebar • Click alignments • Click HMM logo
PFAM alignments • Select HTML viewer • You’ll see the alignment that forms the basis for this PFAM • Often includes structural info • Sometimes includes active site info • (scroll down on alignment to find it)
PFAM HMM logos • A visual of the trends apparent in the alignment • Conserved residues are HUGE • Less conserved residues are small
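Why conserved residues come out HUGE can be seen with a small calculation. The sketch below scores each alignment column by its information content in bits (log2 of 20 minus the Shannon entropy of the column), which is the measure used in a classic sequence logo. Treat it as a simplified stand-in: the PFAM HMM logos are drawn from the HMM itself, and the function names and toy alignment here are invented.

```python
import math
from collections import Counter


def column_information(column, alphabet_size=20):
    """Information content (bits) of one alignment column, gaps ignored."""
    residues = [c for c in column.upper() if c != "-"]
    if not residues:
        return 0.0
    total = len(residues)
    counts = Counter(residues)
    # Shannon entropy of the residue frequencies in this column.
    entropy = -sum(
        (n / total) * math.log2(n / total) for n in counts.values()
    )
    # A fully conserved column has entropy 0 and so the maximum height.
    return math.log2(alphabet_size) - entropy


def alignment_information(aligned_seqs):
    """Score every column of an alignment (all rows must be equal length)."""
    return [column_information("".join(col)) for col in zip(*aligned_seqs)]


if __name__ == "__main__":
    # Invented alignment: column 1 is conserved, column 3 is variable.
    toy_alignment = ["GKS", "GKT", "GRA", "GKL"]
    for position, bits in enumerate(alignment_information(toy_alignment), 1):
        print(position, round(bits, 2))
```

Columns with a high score correspond to the tall letters in a logo; columns near zero correspond to the small ones.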
What is an HMM? • States • Varies by application • E.g., exon, splice site, intron • Position in alignment • Emission probabilities • Probability of each residue within a particular state • Transition probabilities • Probability of moving on to the next state • (S. Eddy, Nature Biotechnology 22: 1315-6)
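To see how states, emission probabilities, and transition probabilities fit together, the sketch below multiplies them along one chosen path of states (working in logs to avoid tiny numbers). The two states and every probability are invented for illustration; they are not taken from the Eddy article or from any PFAM model, and a real HMM search would also sum or maximize over all possible paths.

```python
import math

# Toy model: all numbers below are made up for illustration.
START = {"exon": 0.9, "intron": 0.1}
TRANSITIONS = {
    "exon": {"exon": 0.9, "intron": 0.1},
    "intron": {"exon": 0.1, "intron": 0.9},
}
EMISSIONS = {
    "exon": {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25},
    "intron": {"A": 0.4, "C": 0.1, "G": 0.1, "T": 0.4},
}


def path_log_probability(observed, path, start, transitions, emissions):
    """Log of the joint probability of a sequence and one state path."""
    if len(observed) != len(path) or not path:
        raise ValueError("observed and path must be non-empty and equal length")
    # First position: start probability times emission probability.
    logp = math.log(start[path[0]]) + math.log(emissions[path[0]][observed[0]])
    # Each later position: one transition, then one emission.
    for i in range(1, len(path)):
        logp += math.log(transitions[path[i - 1]][path[i]])
        logp += math.log(emissions[path[i]][observed[i]])
    return logp


if __name__ == "__main__":
    seq = "ATTA"
    all_exon = ["exon"] * 4
    exon_then_intron = ["exon", "intron", "intron", "intron"]
    for path in (all_exon, exon_then_intron):
        print(path, path_log_probability(seq, path, START, TRANSITIONS, EMISSIONS))
```

Comparing the two printed values shows which labeling of the sequence the toy model finds more probable.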
Using the tools, which gene encodes which ABC transporter subunit? • Phobius • InterProScan • PFAM • ProSite