CBI Tech. Workshop - NGS Special Session Lesson 5: Genetic Variant Annotation Linlin Yan (颜林林), Center for Bioinformatics, Peking University Jun 13, 2011
Outline • Review & Overview • Thoughts & Methods • Variant Browsing • Variant Annotation • Association Study • More Beyond • Demos & Exercises
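NGS Analysis Workflow • Sequencer => Short Reads • Short Reads => Mapping => Alignments • Short Reads => Assembling => Contigs / Scaffolds • Alignments => Call Variants => SNV / CNV / SV • Alignments => Call Peaks => Peaks / Regions • Alignments => Calculate Expression => Expression Profile • Downstream: Annotation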
Genetic Variant Analysis Workflow • Solexa Pipeline (Lesson 2) • File Formats (Lesson 1) • FASTQ / Quality / SAM / ... • Reads Mapping (Lesson 1) • Maq / Bowtie / BWA • Alignment File Manipulation (Lesson 3) • Samtools / BedTools / FASTX-Toolkit • Genetic Variant Caller (Lesson 4) • GATK • Genetic Variant Annotation (Lesson 5) • PolyPhen / SIFT / ANNOVAR / PLINK / ... Flow: Sequencer => Short Reads => Mapping => Alignments => Call Variants => SNV / CNV / SV => Annotation
What Could Be Inferred from Variants • What is at these positions? => Genome Browser • How do the variants affect function? => Variant Annotation • Which variants are related to the phenotype? => Association Study • More beyond ... => Disease: CDCV vs. CDRV Diagram: SNV / CNV / SV (Genetic Variants) => Genome Annotation => Mutation Effects => Disease Phenotype
Genome Browser Online Browsers: • UCSC Genome Browser • http://genome.ucsc.edu/ • Ensembl Genome Browser • http://www.ensembl.org/ • DNAnexus • https://dnanexus.com/genomes/hg18/public_browse Local Browsers: • IGV (Integrative Genomics Viewer) • http://www.broadinstitute.org/igv/
UCSC Genome Browser (http://genome.ucsc.edu/cgi-bin/hgTracks?clade=mammal&org=Human&db=hg19)
UCSC Genome Browser (cont.) Supported Formats: • BED / bigBed • bedGraph • GFF • GTF • WIG / bigWig • MAF • BAM • BED detail • Personal Genome SNP • PSL (http://genome.ucsc.edu/)
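As a small illustration of the BED format listed above, the sketch below writes single-base variants as BED lines. BED uses a 0-based start and an exclusive end, so a 1-based position p becomes the interval [p-1, p). The function names, the track name, and the example variants are hypothetical and only serve the illustration.

```python
def snv_to_bed_line(chrom, pos_1based, name):
    """Return one BED line for a single-base variant given a 1-based position."""
    start = pos_1based - 1  # BED start coordinate is 0-based
    end = pos_1based        # BED end coordinate is exclusive
    return f"{chrom}\t{start}\t{end}\t{name}"


def variants_to_bed(variants, track_name="my_variants"):
    """Build custom-track text: one track line followed by one BED line per variant."""
    lines = [f'track name="{track_name}"']
    lines.extend(snv_to_bed_line(c, p, n) for c, p, n in variants)
    return "\n".join(lines) + "\n"


# Hypothetical example variants: (chromosome, 1-based position, label)
example_variants = [("chr1", 1000, "snv_a"), ("chr2", 2500, "snv_b")]
bed_text = variants_to_bed(example_variants)
print(bed_text, end="")
```

The resulting text could be saved to a file and uploaded as a custom track; the off-by-one conversion is the step that most often goes wrong when building BED files by hand.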
IGV (Integrative Genomics Viewer) (http://www.broadinstitute.org/igv/)
UCSC: Table Browser & Public DB • Retrieve track data in batch • Retrieve sequences in specific regions • Combine regions and/or annotations • Query track data in public MySQL database (http://genome.ucsc.edu/cgi-bin/hgTables)
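Querying track data in the public MySQL database comes down to selecting the rows of a track table that overlap a region. The sketch below only builds such a query and does not connect to any server. It assumes a track table with the columns chrom, chromStart, and chromEnd (0-based, half-open coordinates); the table name snp131 and the region are example values, and build_region_query is a hypothetical helper.

```python
def build_region_query(table, chrom, start, end):
    """Return (sql, params) selecting rows of `table` that overlap [start, end).

    Coordinates are assumed to be 0-based and half-open. The %s placeholders
    follow the parameter style used by several Python MySQL drivers.
    """
    # Table names cannot be passed as query parameters, so validate the name.
    if not table.isidentifier():
        raise ValueError(f"unsafe table name: {table!r}")
    sql = (
        f"SELECT chrom, chromStart, chromEnd, name FROM {table} "
        "WHERE chrom = %s AND chromStart < %s AND chromEnd > %s"
    )
    # Two half-open intervals overlap when each one starts before the other ends.
    return sql, (chrom, end, start)


# Example values only: table name and region are assumptions for illustration.
sql, params = build_region_query("snp131", "chr1", 100000, 200000)
print(sql)
print(params)
```

Running the query requires a MySQL client or driver and the connection details published on the UCSC site; zero-length features such as insertions need a slightly different overlap condition.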
These are KNOWN variants. How about UNKNOWN variants?
Mutation Effects Prediction • SIFT (Sorting Intolerant From Tolerant) • http://sift.jcvi.org/ • PolyPhen (Polymorphism Phenotyping) • http://genetics.bwh.harvard.edu/pph/ • MAPP(Multivariate Analysis of Protein Polymorphism) • http://mendel.stanford.edu/SidowLab/downloads/MAPP/index.html • SNPs3D • http://www.snps3d.org/
Automatic Variant Annotation ANNOVAR (ANNOtate VARiation) • http://www.openbioinformatics.org/annovar/ • Gene-based annotation • SNPs/CNVs that affect protein coding • Region-based annotation • Variants in specific regions • Filter-based annotation • Variants reported in dbSNP, 1000 Genomes • Filter by SIFT score • Others • Retrieve sequences or candidate gene lists in batch
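The idea behind filter-based annotation can be sketched in a few lines: drop variants that are already reported in a known-variant set, then drop those predicted to be tolerated. This is not ANNOVAR's implementation or command line. The variant records, the known set, and the function name are hypothetical, and the cutoff of 0.05 is the threshold commonly used with SIFT scores (lower scores are predicted deleterious).

```python
SIFT_DELETERIOUS_CUTOFF = 0.05  # assumed cutoff; scores below it count as deleterious


def filter_candidates(variants, known_keys, sift_cutoff=SIFT_DELETERIOUS_CUTOFF):
    """Keep variants that are not in `known_keys` and are not predicted tolerated.

    Each variant is a dict with chrom, pos, ref, alt and an optional `sift` score.
    Variants without a SIFT score (e.g. non-coding) are kept for manual review.
    """
    kept = []
    for v in variants:
        key = (v["chrom"], v["pos"], v["ref"], v["alt"])
        if key in known_keys:
            continue  # already reported in the known-variant set
        score = v.get("sift")
        if score is not None and score >= sift_cutoff:
            continue  # predicted tolerated
        kept.append(v)
    return kept


# Hypothetical toy data for illustration only.
known = {("chr1", 1000, "A", "G")}
calls = [
    {"chrom": "chr1", "pos": 1000, "ref": "A", "alt": "G", "sift": 0.01},
    {"chrom": "chr2", "pos": 2500, "ref": "C", "alt": "T", "sift": 0.60},
    {"chrom": "chr3", "pos": 3000, "ref": "G", "alt": "A", "sift": 0.02},
    {"chrom": "chr4", "pos": 4000, "ref": "T", "alt": "C"},
]
candidates = filter_candidates(calls, known)
print([(v["chrom"], v["pos"]) for v in candidates])
```

Whether unscored variants should be kept or dropped is a design choice; keeping them is the more conservative option when the goal is not to lose candidates.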
Between Patients and Normals • Too many variants are detected • Most variants are unrelated to the target disease • Comparing MAF (Minor Allele Frequency) between patients and normal controls can point to disease-related variants
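The comparison described above can be sketched for a single variant: compute the minor allele frequency in each group and a basic 2x2 allelic chi-square statistic. The genotype coding (0, 1, or 2 copies of the alternate allele per individual), the function names, and the toy data are assumptions for illustration; tools such as PLINK provide complete, corrected tests.

```python
def allele_counts(genotypes):
    """Return (ref_count, alt_count) from per-individual alt-allele counts (0/1/2)."""
    alt = sum(genotypes)
    total = 2 * len(genotypes)  # diploid: two alleles per individual
    return total - alt, alt


def minor_allele_frequency(genotypes):
    """Frequency of the less common allele within one group."""
    ref, alt = allele_counts(genotypes)
    return min(ref, alt) / (ref + alt)


def allelic_chi_square(cases, controls):
    """Pearson chi-square (1 degree of freedom) on the 2x2 table of allele counts."""
    a, b = allele_counts(cases)     # case ref, case alt
    c, d = allele_counts(controls)  # control ref, control alt
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        return 0.0  # a row or column is empty, so there is nothing to compare
    return n * (a * d - b * c) ** 2 / denom


# Hypothetical toy genotypes for one variant.
patients = [1, 1, 0, 2, 1, 0, 1, 1]
normals = [0, 0, 0, 1, 0, 0, 0, 0]
print(minor_allele_frequency(patients))  # 0.4375
print(minor_allele_frequency(normals))   # 0.0625
print(allelic_chi_square(patients, normals))  # 6.0
```

With one degree of freedom, a statistic above about 3.84 corresponds to p < 0.05 before any multiple-testing correction, which matters a great deal when thousands of variants are tested.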
Association Study Tools • PLINK • http://pngu.mgh.harvard.edu/~purcell/plink/ • gPLINK • http://pngu.mgh.harvard.edu/~purcell/plink/gplink.shtml • Haploview • http://www.broadinstitute.org/scientific-community/science/programs/medical-and-population-genetics/haploview/haploview
More Beyond: Finding the Causal Gene • Two Disease Hypothesis Models: • CDCV: Common Disease, Common Variant • CDRV: Common Disease, Rare Variant • To Find Rare Variants • Move from GWAS (Microarray) to Sequencing • Use More Samples • Use Pooling Analysis Methods
Rare Variant Analysis • Gene-Based Method (PMID:17660818)
Pooling the Rare Variants • Fixed-Threshold Method (Li et al., 2008) • Weighted Approach (Madsen et al., 2009) • Variable-Threshold Method (VT-Test) (Price et al., 2010) • http://genetics.bwh.harvard.edu/rare_variants/
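The fixed-threshold idea can be illustrated with a simplified sketch: within one gene, select the variants whose minor allele frequency falls below a threshold, then sum their alleles per individual into a single burden score. This is not the published test statistic of any of the methods above. The genotype matrix, the threshold of 0.1 (chosen so the tiny toy data has rare variants), and the function names are hypothetical.

```python
def alt_allele_frequency(column):
    """Alt-allele frequency for one variant across diploid individuals."""
    return sum(column) / (2 * len(column))


def burden_scores(genotype_matrix, maf_threshold):
    """Return (rare_variant_indices, per_individual_scores).

    genotype_matrix: rows are individuals, columns are variants in one gene,
    entries are alt-allele counts (0, 1, or 2).
    """
    n_variants = len(genotype_matrix[0])
    rare = []
    for j in range(n_variants):
        column = [row[j] for row in genotype_matrix]
        freq = alt_allele_frequency(column)
        maf = min(freq, 1 - freq)
        if 0 < maf < maf_threshold:  # skip monomorphic and common variants
            rare.append(j)
    # Pool the selected variants: one summed score per individual.
    scores = [sum(row[j] for j in rare) for row in genotype_matrix]
    return rare, scores


# Hypothetical toy data: first 3 rows are patients, last 3 are normals.
genotypes = [
    [1, 0, 1],
    [0, 1, 1],
    [0, 0, 2],
    [0, 0, 1],
    [0, 0, 0],
    [0, 0, 1],
]
rare_idx, scores = burden_scores(genotypes, maf_threshold=0.1)
print(rare_idx)  # [0, 1]
print(scores)    # [1, 1, 0, 0, 0, 0]
```

The pooled scores would then be compared between patients and normals; the weighted and variable-threshold methods differ in how variants are weighted and in how the threshold is chosen.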
Demos • Data Preparation • Reads Mapping • Variant Calling • BED/Wig Generation
Demos (cont.) • UCSC Genome Browser • Uploading BAM/BED/Wig • IGV Genome Browser • Loading BAM/BED/Wig • UCSC Table Browser • Retrieve track data • Retrieve coding sequences • UCSC Public Database
Demos (cont.) • SIFT & PolyPhen • ANNOVAR • PLINK • VT-Test