PolyPhen and SIFT: Tools for predicting functional effects of SNPs Epi 244 Spring 2009 Sam S. Oh
Human genome variation
• 3.2 billion base pairs (bp)
• 99.9% similarity across individuals
• 3.2 million bp dissimilar
• ~11 million SNPs
• Coding vs. non-coding (intron and intergenic regions)
• Most are synonymous
Frazer et al. Nat Rev Genet, 2009;10:241-251
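The first three figures on this slide are linked by simple arithmetic: 0.1% of 3.2 billion bp is 3.2 million bp. A minimal check in Python, using only the numbers given on the slide:

```python
# Figures taken from the slide; integer arithmetic avoids rounding noise.
GENOME_BP = 3_200_000_000      # 3.2 billion base pairs
DIFFERENCE_PER_MILLE = 1       # 99.9% similar -> 0.1% (1 per 1000) dissimilar

dissimilar_bp = GENOME_BP * DIFFERENCE_PER_MILLE // 1000
print(f"{dissimilar_bp:,} bp differ between two individuals")
```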
Example: sickle-cell anemia • A to T SNP of beta-globin gene results in glutamate (hydrophilic) to valine (hydrophobic) substitution
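The substitution can be traced at the codon level. The sketch below assumes the textbook codons for this variant (GAG for glutamate, GTG for valine, from the standard genetic code) and uses Kyte-Doolittle hydropathy values to illustrate the hydrophilic-to-hydrophobic change. Neither the codons nor the hydropathy scale appears on the slide, so treat both as illustrative assumptions.

```python
# Minimal codon table: only the two codons needed for this example
# (standard genetic code; the codons themselves are an assumption,
# since the slide only names the A-to-T change).
CODON_TO_AA = {"GAG": "Glu", "GTG": "Val"}

# Kyte-Doolittle hydropathy values (assumption: not from the slides).
# Negative = hydrophilic, positive = hydrophobic.
HYDROPATHY = {"Glu": -3.5, "Val": 4.2}


def substitute(codon: str, index: int, new_base: str) -> str:
    """Return the codon with the base at `index` (0-based) replaced."""
    return codon[:index] + new_base + codon[index + 1:]


normal_codon = "GAG"
sickle_codon = substitute(normal_codon, 1, "T")  # A -> T at the middle base

normal_aa = CODON_TO_AA[normal_codon]
sickle_aa = CODON_TO_AA[sickle_codon]

print(f"{normal_codon} ({normal_aa}) -> {sickle_codon} ({sickle_aa})")
print(f"hydropathy: {HYDROPATHY[normal_aa]} -> {HYDROPATHY[sickle_aa]}")
```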
Example: MTHFR • Folate metabolism
Note the Build number (currently Build 130). Highlight all refSNP numbers (use the scroll bar) and copy them.
SIFT
• Sorting Intolerant From Tolerant
• Predicts tolerability of AA substitution effects (i.e., non-synonymous SNPs) based on
  • Sequence homology
  • Physical properties of amino acids
• Can be applied to naturally occurring nonsynonymous polymorphisms and laboratory-induced missense mutations
Compare Build numbers. Copy all SNP IDs and paste them into SIFT. Choose “Submit Query”.
Getting more info for rs2274974: enter “rs2274974”.
Position of SNP in mRNA, protein, and contig. Note AA1, AA2, and position. Labeled on the page: flanking sequence, IUPAC code, flanking sequence; allele info; Build number; mRNA name; protein name; contig name. Scroll down and select the protein.
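The IUPAC code shown between the two flanking sequences is a single letter that stands for both alleles of the SNP. A minimal sketch of that encoding for biallelic SNPs follows. The letter assignments are the standard IUPAC nucleotide ambiguity codes, while the function names and the toy flanking sequences are hypothetical and chosen only for illustration.

```python
# IUPAC ambiguity codes for two-allele combinations (standard nomenclature).
IUPAC_BIALLELIC = {
    frozenset("AG"): "R",
    frozenset("CT"): "Y",
    frozenset("CG"): "S",
    frozenset("AT"): "W",
    frozenset("GT"): "K",
    frozenset("AC"): "M",
}


def iupac_code(allele1: str, allele2: str) -> str:
    """Return the IUPAC ambiguity code for a biallelic SNP."""
    key = frozenset((allele1.upper(), allele2.upper()))
    if key not in IUPAC_BIALLELIC:
        raise ValueError(f"not a valid biallelic SNP: {allele1}/{allele2}")
    return IUPAC_BIALLELIC[key]


def mark_snp(left_flank: str, allele1: str, allele2: str, right_flank: str) -> str:
    """Join the flanks around the IUPAC code (hypothetical helper)."""
    return left_flank + iupac_code(allele1, allele2) + right_flank


# Toy flanking sequences, not taken from any real refSNP record.
print(mark_snp("ACGTT", "A", "T", "GGCA"))
```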
Paste the FASTA-formatted protein sequence. Enter the AA substitution as [Letter1-position-Letter2].
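Before submitting, it can help to confirm that Letter1 really is the residue at the stated position of the pasted sequence. The sketch below assumes the substitution is written compactly as letter, position, letter (for example G566E) and uses a short made-up FASTA record. The function names and the toy sequence are hypothetical and are not part of SIFT.

```python
import re


def read_fasta_sequence(fasta_text: str) -> str:
    """Join the sequence lines of a single-record FASTA string."""
    lines = [line.strip() for line in fasta_text.strip().splitlines()]
    return "".join(line for line in lines if line and not line.startswith(">"))


def parse_substitution(text: str) -> tuple:
    """Split 'G566E' into ('G', 566, 'E'); position is 1-based."""
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", text.strip().upper())
    if match is None:
        raise ValueError(f"expected Letter1-position-Letter2, got {text!r}")
    ref, pos, alt = match.groups()
    return ref, int(pos), alt


def check_substitution(sequence: str, substitution: str) -> bool:
    """True if Letter1 matches the residue at the given position."""
    ref, pos, _alt = parse_substitution(substitution)
    return 1 <= pos <= len(sequence) and sequence[pos - 1] == ref


# Toy record for illustration only; not a real protein.
TOY_FASTA = """>toy_protein hypothetical example
MKTGA
LLEGW
"""

sequence = read_fasta_sequence(TOY_FASTA)
print(parse_substitution("G566E"))
print(check_substitution(sequence, "G4E"))   # residue 4 of the toy sequence is G
print(check_substitution(sequence, "A4E"))   # mismatch: residue 4 is not A
```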
Substitution occurs at AA 566. Scroll down.
Check tolerance of AA substitutions. Scroll down.
Tolerance of specified substitution: “Substitution at pos 566 from G to E is predicted to AFFECT PROTEIN FUNCTION with a score of 0.01.”
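The slide does not say how the score maps to the call. As a minimal sketch, assume the cutoff commonly cited in SIFT's documentation: scores run from 0 to 1, and a substitution scoring below 0.05 is predicted to affect protein function. The cutoff value and the function name are assumptions for illustration, and the SIFT site should be checked for the exact rule it applies.

```python
# Assumption: 0.05 is the commonly cited SIFT cutoff; it is not on the slide.
SIFT_CUTOFF = 0.05


def sift_call(score: float, cutoff: float = SIFT_CUTOFF) -> str:
    """Map a SIFT score (0 to 1) to a prediction label."""
    if not 0.0 <= score <= 1.0:
        raise ValueError(f"SIFT scores lie between 0 and 1, got {score}")
    return "AFFECT PROTEIN FUNCTION" if score < cutoff else "TOLERATED"


# The G566E example from the slide scored 0.01.
print(sift_call(0.01))
```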
PolyPhen (Polymorphism Phenotyping)
• Tool for predicting the possible impact of an amino acid substitution (i.e., non-synonymous SNPs) on protein structure and function, based on:
• Amino acid sequence
  • In what part of the protein did the SNP occur? (E.g., active site, binding site, transmembrane region)
• Multiple alignments with homologous proteins and mammalian orthologues
  • How compatible is the substitution, judged against proteins of comparable sequence?
• 3D structural properties with the substituted amino acid
  • What is the substitution’s effect on the protein’s physicochemistry? (E.g., hydrophobicity, electrostatic interactions, ligand binding)
Four potential predictions
• Probably damaging
  • Predicted with high confidence to affect protein function or structure
• Possibly damaging
  • Predicted to affect protein function or structure
• Benign
  • Most likely lacking any phenotypic effect
• Unknown
  • Lack of data does not allow PolyPhen to make a prediction
Copy the FASTA-formatted protein sequence. Enter the AA position, ancestral AA, and substituted AA.
In dbSNP Build 129, the SNP corresponds to protein NP_005948.3. Enter the SNP rs#.
References
• NCBI dbSNP: http://www.ncbi.nlm.nih.gov/sites/entrez
• SIFT: http://sift.jcvi.org/
• PolyPhen: http://genetics.bwh.harvard.edu/pph/index.html