
Variant Analysis Introduction

Deanna M. Church, Staff Scientist, NCBI. Short Course in Medical Genetics 2013. @deannachurch



  1. Variant Analysis Introduction Deanna M. Church Staff Scientist, NCBI Short Course in Medical Genetics 2013 @deannachurch

  2. [Figure: FASTQ, BAM, and VCF files. Credit: Steve Sherry, NCBI]

  3. http://www.bioplanet.com/gcat

  4. http://www.ncbi.nlm.nih.gov/variation/tools/1000genomes

  5. Variation Databases • Collection of small nucleotide variation (SNVs) • Typically <50 bp • Some are polymorphic • Some are rare • Some are errors • Submissions clustered to make reference variants (rsIDs) http://www.ncbi.nlm.nih.gov/snp

  6. Variation Databases • Blue variants are all T insertions • Submitters report the insertion at different positions in the poly-T tract • Additional analysis is needed to cluster these
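One common way to make equivalent insertions in a repeat comparable is to shift each one to its leftmost equivalent position before clustering. The sketch below illustrates that idea only; it is not a description of how dbSNP clusters submissions, and the function name, the 0-based coordinate convention, and the toy reference sequence are assumptions made for the example.

```python
def left_align_insertion(ref: str, pos: int, inserted: str) -> tuple[int, str]:
    """Shift an insertion to its leftmost equivalent position.

    `pos` is the 0-based index in `ref` before which `inserted` is placed.
    Returns the new position and the (possibly rotated) inserted sequence.
    """
    # The insertion can move one base left whenever the reference base just
    # before it equals the last inserted base; the inserted sequence is
    # rotated by one so the resulting full sequence stays identical.
    while pos > 0 and ref[pos - 1] == inserted[-1]:
        inserted = inserted[-1] + inserted[:-1]
        pos -= 1
    return pos, inserted


# Hypothetical reference with a poly-T tract at indices 2..7.
reference = "GATTTTTTC"

# A single T inserted anywhere in or next to the tract yields the same allele,
# so every submission normalizes to the same (position, sequence) key.
keys = {left_align_insertion(reference, p, "T") for p in range(2, 9)}
print(keys)
```

After normalization the seven differently placed submissions collapse to a single key, which is what a clustering step would need.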

  7. Variation Databases • Collection of large-scale variation • Breakpoint ambiguity • Complex variants (chromothripsis) • Challenging to compare variants from different methods • No reference variants (yet) http://www.ncbi.nlm.nih.gov/dbvar

  8. Variant Call Ambiguity [Figure: array probes with decreased versus expected signal intensity between a start and a stop. Each breakpoint is bracketed by an outer and an inner coordinate: outer start, inner start, inner stop, outer stop.]
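When each breakpoint is only known to lie within a range, two calls can be compared by asking whether their start ranges overlap and their stop ranges overlap. The following is a minimal sketch of that check; the `Call` type, the function names, and the rule itself are illustrative assumptions, not a method stated in the slides or used by dbVar.

```python
from typing import NamedTuple


class Call(NamedTuple):
    """A structural variant call with uncertain breakpoints (hypothetical type)."""
    outer_start: int
    inner_start: int
    inner_stop: int
    outer_stop: int


def ranges_overlap(a_lo: int, a_hi: int, b_lo: int, b_hi: int) -> bool:
    """True if the closed intervals [a_lo, a_hi] and [b_lo, b_hi] intersect."""
    return a_lo <= b_hi and b_lo <= a_hi


def could_be_same_event(a: Call, b: Call) -> bool:
    """Two calls are compatible if both breakpoint ranges intersect."""
    starts = ranges_overlap(a.outer_start, a.inner_start,
                            b.outer_start, b.inner_start)
    stops = ranges_overlap(a.inner_stop, a.outer_stop,
                           b.inner_stop, b.outer_stop)
    return starts and stops


# Made-up coordinates: a low-resolution call and two higher-resolution calls.
array_call = Call(1000, 1500, 9000, 9600)
sequence_call = Call(1400, 1450, 9500, 9550)
other_call = Call(3000, 3200, 9000, 9600)

print(could_be_same_event(array_call, sequence_call))  # True
print(could_be_same_event(array_call, other_call))     # False
```

Compatibility here only means the two calls are not ruled out as the same event; it does not show that they are.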

  9. Variant Call Ambiguity Fosmid clone (40 Kb +/- 1 Kb) • Clone ends map 20 Kb apart: clone has an insertion relative to the genome • Clone ends map 60 Kb apart: clone has a deletion relative to the genome • Only the outer start and outer stop are known
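The fosmid logic can be written as a small rule: compare the distance between the mapped clone ends with the expected insert size. This is a sketch under stated assumptions; the function name is hypothetical, and the cutoff simply reuses the 40 Kb +/- 1 Kb figure from the slide, whereas a real analysis would choose a threshold from the observed insert-size distribution.

```python
def classify_clone(mapped_span_kb: float,
                   insert_kb: float = 40.0,
                   tolerance_kb: float = 1.0) -> str:
    """Classify a clone by the distance between its mapped end sequences."""
    if mapped_span_kb < insert_kb - tolerance_kb:
        # Ends map closer together than the clone is long, so the clone
        # carries sequence that the reference genome lacks.
        return "insertion relative to the genome"
    if mapped_span_kb > insert_kb + tolerance_kb:
        # Ends map farther apart than the clone is long, so the clone
        # lacks sequence that the reference genome has.
        return "deletion relative to the genome"
    return "concordant"


for span in (20.0, 40.5, 60.0):
    print(span, classify_clone(span))
```

The rule reports that a variant lies somewhere between the mapped ends, which is why only outer coordinates are available for this kind of call.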

  10. Variation Databases http://www.ncbi.nlm.nih.gov/clinvar

  11. How confident am I that my variant call is correct?

  12. http://www.bioplanet.com/gcat

  13. Available NGS Aligners [Figure: list of aligners, already out of date] Fonseca et al., 2012

  14. Alignment Test: align back to the source • Simulated reads • Good: know where the reads go • Not so good: hard to simulate real data http://www.bioplanet.com/gcat
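Because simulated reads come with a known origin, scoring an aligner reduces to comparing each reported position with the true one. Below is a minimal sketch of that comparison; the dictionary layout, the function name, and the position tolerance are assumptions for illustration and do not describe how GCAT scores submissions.

```python
def alignment_accuracy(truth: dict, aligned: dict, tolerance: int = 0) -> dict:
    """Count correct, incorrect, and unmapped reads.

    `truth` maps read name -> (chrom, pos) where the read was simulated from.
    `aligned` maps read name -> (chrom, pos) reported by the aligner; reads
    the aligner did not place are simply absent.
    """
    counts = {"correct": 0, "incorrect": 0, "unmapped": 0}
    for name, (chrom, pos) in truth.items():
        hit = aligned.get(name)
        if hit is None:
            counts["unmapped"] += 1
        elif hit[0] == chrom and abs(hit[1] - pos) <= tolerance:
            counts["correct"] += 1
        else:
            counts["incorrect"] += 1
    return counts


# Made-up example: four simulated reads and one aligner's output.
truth = {"r1": ("chr1", 100), "r2": ("chr1", 500),
         "r3": ("chr2", 900), "r4": ("chr2", 50)}
aligned = {"r1": ("chr1", 100), "r2": ("chr1", 503), "r3": ("chr1", 900)}

print(alignment_accuracy(truth, aligned, tolerance=5))
```

With a tolerance of 5 bases, r1 and r2 count as correct, r3 as incorrect (wrong chromosome), and r4 as unmapped.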

  15. http://www.bioplanet.com/gcat

  16. Variant Calling Test

  17. Variant Calling Test Transition/Transversion ratio (Ti/Tv) • Transitions: A<->G, C<->T • Transversions: A<->C, A<->T, C<->G, G<->T • Random: 0.5 • Whole Genome: 2.0-2.1 • Exome: 3-3.5
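The Ti/Tv ratio is simple to compute from a list of single-base substitutions, and enumerating all twelve possible substitutions shows where the "random" value of 0.5 comes from (four transitions, eight transversions). The code below is a sketch; the function names and the (ref, alt) tuple input format are assumptions for the example.

```python
from itertools import permutations

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def is_transition(ref: str, alt: str) -> bool:
    """True for purine<->purine or pyrimidine<->pyrimidine substitutions."""
    ref, alt = ref.upper(), alt.upper()
    bases = PURINES | PYRIMIDINES
    if ref not in bases or alt not in bases or ref == alt:
        raise ValueError(f"not a single-base substitution: {ref}>{alt}")
    return {ref, alt} <= PURINES or {ref, alt} <= PYRIMIDINES


def ti_tv(snvs: list) -> tuple:
    """Return (transitions, transversions, ratio); ratio is None if Tv is 0."""
    ti = sum(1 for ref, alt in snvs if is_transition(ref, alt))
    tv = len(snvs) - ti
    return ti, tv, (ti / tv if tv else None)


# Every possible substitution, each counted once: the "random" expectation.
all_substitutions = list(permutations("ACGT", 2))
print(ti_tv(all_substitutions))  # (4, 8, 0.5)
```

A call set whose ratio falls well below the expected range for its target (whole genome or exome) is a hint that it contains many false positives, since errors tend toward the random value.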

  18. Variant Calling Test Note: it is difficult to test variant calling independently from the aligner, as the two are often coupled.

  19. Variant Calling Test Benchmarking on known samples NA12878 NA19240
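Benchmarking against a well-characterized sample comes down to comparing a call set with a trusted set of variants for the same sample. The sketch below assumes variants are already normalized and keyed as (chrom, pos, ref, alt) tuples; the function name and the toy data are illustrative, and no real NA12878 or NA19240 calls are shown.

```python
def compare_calls(truth: set, calls: set) -> dict:
    """Summarize agreement between a call set and a trusted reference set."""
    true_pos = truth & calls      # called and present in the reference set
    false_pos = calls - truth     # called but absent from the reference set
    false_neg = truth - calls     # in the reference set but not called
    return {
        "true_positives": len(true_pos),
        "false_positives": len(false_pos),
        "false_negatives": len(false_neg),
        "precision": len(true_pos) / len(calls) if calls else 0.0,
        "recall": len(true_pos) / len(truth) if truth else 0.0,
    }


# Made-up variants keyed as (chrom, pos, ref, alt).
truth = {("1", 100, "A", "G"), ("1", 200, "C", "T"),
         ("2", 300, "G", "A"), ("2", 400, "T", "C")}
calls = {("1", 100, "A", "G"), ("1", 200, "C", "T"), ("3", 500, "A", "C")}

print(compare_calls(truth, calls))
```

Exact tuple matching is the simplest possible comparison; it will undercount agreement whenever the same indel is represented at different positions in the two sets.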

  20. • Target audience: Clinical testing labs • Submissions from: Clinical and Research labs [Figure labels: Concordant, NA, Discordant, Calls, Tests, cSRA] http://www.ncbi.nlm.nih.gov/variation/tools/get-rm

  21. Variant Analysis Pipelines: Galaxy https://main.g2.bx.psu.edu/

  22. Variant Analysis Pipelines: Galaxy • Workflows • Save them • Share them • Can run on Amazon Cloud • Large community Reproducibility

  23. Annotating Variants NC_000001.10:g.170508561T>A NC_000001.10:g.170508573T>C NC_000001.10:g.170508656G>T NC_000001.10:g.170508724T>C
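Expressions like these are HGVS genomic substitutions: a versioned reference sequence accession, the `g.` coordinate type, a position, and the reference and alternate bases. A small regular expression is enough to split this one form into fields. The sketch below handles only simple `g.` substitutions and is not a general HGVS parser; the pattern and function name are assumptions for the example.

```python
import re

# Matches only simple genomic substitutions such as NC_000001.10:g.123A>G.
HGVS_G_SUBSTITUTION = re.compile(
    r"^(?P<accession>[A-Z]{2}_\d+\.\d+)"   # versioned RefSeq accession
    r":g\.(?P<position>\d+)"               # 1-based genomic position
    r"(?P<ref>[ACGT])>(?P<alt>[ACGT])$"    # reference base > alternate base
)


def parse_hgvs_g_substitution(expression: str) -> tuple:
    """Return (accession, position, ref, alt) for a g. substitution."""
    match = HGVS_G_SUBSTITUTION.match(expression)
    if match is None:
        raise ValueError(f"unsupported HGVS expression: {expression!r}")
    return (match["accession"], int(match["position"]),
            match["ref"], match["alt"])


expressions = [
    "NC_000001.10:g.170508561T>A",
    "NC_000001.10:g.170508573T>C",
    "NC_000001.10:g.170508656G>T",
    "NC_000001.10:g.170508724T>C",
]
for expression in expressions:
    print(parse_hgvs_g_substitution(expression))
```

The accession version matters: NC_000001.10 is chromosome 1 in GRCh37, so the same position on a different version refers to a different assembly.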

  24. Annotating Variants Molecular Consequences (often predicted): • Damaging amino acid change • Affects a splice site • Changes a regulatory feature Functional Consequences (typically asserted): • Experiments show the change affects expression • Allele associated with a disorder • Allele shown to affect some function

  25. Annotating Variants MAPKAPK2 DYRK3

  26. Annotating Variants Upload your list of variants, get back: • Is the variant known? • Is the variant predicted to be deleterious to a protein (SIFT, PolyPhen)? • Overlap with predicted regulatory regions • HGVS expressions http://www.ensembl.org/info/docs/variation/vep/index.html

  27. Annotating Variants Upload your list of variants, get back: • Is the variant known? • Does the allele have a molecular consequence (amino acid change, nonsynonymous)? • HGVS expressions • ClinVar information • Available genetic tests • Publications http://www.ncbi.nlm.nih.gov/variation/tools/reporter

  28. Take home messages • Lots of methods for sequence alignment • Lots of methods for variant calling • Typically developed to use a particular aligner • Different data sources can affect your annotation
