Exome sequencing and complex disease: practical aspects of rare variant association studies


  1. Exome sequencing and complex disease: practical aspects of rare variant association studies. Alice Bouchoms, Amaury Vanvinckenroye, Maxime Legrand

  2. What is exome sequencing? • Exon: coding sequence of the DNA • Exome sequencing: • Aim: to sequence the coding part of the DNA, i.e. the exons

  3. Introduction • GWAS: helped discover common coding variants • Exome sequencing • Also rare coding variants • Faster, better • Large samples (> 10 000 individuals) • Before 2010: only a few publications on PubMed • Now: more than 2000 publications on PubMed [Chart: years 2011, 2012, 2013]

  4. Key questions to ask yourself

  5. Study design • State objectives • Focus on extreme outcomes • Unusual phenotypes or traits • But be careful: de novo mutations • Geographical restrictions?

  6. Study design • Sequencing strategy? • Quality of the sample: 20x or greater level of coverage • Depth of sequencing per person: 60x or greater • Non-coding regions can still be useful: • Determine ancestries or estimate genotypes • 0.2x to 2x
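
The coverage figures on this slide can be related to sequencing output with the usual back-of-the-envelope estimate, mean depth = number of reads × read length / target size. The following is a minimal sketch: the function names and the example numbers (a 50 Mb target, 100 bp reads) are illustrative assumptions, not values from the slides, and the estimate ignores off-target and duplicate reads.

```python
import math

def mean_depth(n_reads: int, read_length: int, target_size: int) -> float:
    """Estimate mean depth of coverage as total sequenced bases / target size."""
    return n_reads * read_length / target_size

def reads_needed(depth: float, read_length: int, target_size: int) -> int:
    """Number of reads needed to reach a given mean depth (rounded up)."""
    return math.ceil(depth * target_size / read_length)

# Hypothetical example: 30 million 100 bp reads over a 50 Mb target.
example_depth = mean_depth(30_000_000, 100, 50_000_000)
# Reads needed to reach the 20x level mentioned on the slide.
example_reads = reads_needed(20, 100, 50_000_000)
```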

  7. Variant calling • Goal: obtain high-quality genotypes • Several steps: • DNA contamination, DNA fingerprints, good follow-up? • Alignment with the reference genome, calibration of base quality scores, removal of duplicate reads.

  8. Variant calling • After read mapping: • Sample quality metrics (spotting of outlier properties) • Variant calling: • Look for differences where overlaps appear in the alignment with the reference genome
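
The idea of looking for differences where aligned reads overlap the reference can be sketched as a naive pileup. This is a toy illustration only: it assumes reads are already aligned without gaps, and the function name `call_variants` and the thresholds `min_depth` and `min_alt_fraction` are hypothetical choices, not something stated on the slides. Production callers use base qualities and statistical genotype models.

```python
from collections import Counter, defaultdict

def call_variants(reference, reads, min_depth=3, min_alt_fraction=0.3):
    """Report positions where overlapping reads disagree with the reference.

    reads: list of (start_position, sequence) pairs, 0-based, gap-free.
    Returns tuples (position, ref_base, alt_base, alt_count, depth).
    """
    # Build the pileup: base counts observed at each reference position.
    pileup = defaultdict(Counter)
    for start, seq in reads:
        for offset, base in enumerate(seq):
            pileup[start + offset][base] += 1

    calls = []
    for pos in sorted(pileup):
        counts = pileup[pos]
        depth = sum(counts.values())
        if depth < min_depth:
            continue  # too few overlapping reads to trust a call
        ref_base = reference[pos]
        alt_counts = {b: c for b, c in counts.items() if b != ref_base}
        if not alt_counts:
            continue  # every read agrees with the reference
        alt_base, alt_count = max(alt_counts.items(), key=lambda kv: kv[1])
        if alt_count / depth >= min_alt_fraction:
            calls.append((pos, ref_base, alt_base, alt_count, depth))
    return calls

# Hypothetical example: three of the four reads covering position 3 carry A instead of T.
example_reads = [(0, "ACGTA"), (1, "CGAAC"), (2, "GAACG"), (3, "AACGT")]
example_calls = call_variants("ACGTACGT", example_reads)
```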

  9. Variant calling • Machine-learning-based classifier: • Polymorphic variants / artifacts • Evaluate metrics : true / false positives • Quality metrics on samples • Recommendation: min depth of coverage 20X • Development of standards for storing sequence data and variant calls
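
Evaluating a call set in terms of true and false positives can be sketched as a comparison against a trusted set of variants. The function name `evaluate_calls` and the idea of a "truth set" are assumptions made for illustration; the slides do not say which metrics or reference data were used.

```python
def evaluate_calls(called, truth):
    """Compare called variant sites with a trusted set of true sites."""
    called, truth = set(called), set(truth)
    tp = len(called & truth)   # true positives: called and real
    fp = len(called - truth)   # false positives: called but not real (artifacts)
    fn = len(truth - called)   # false negatives: real but missed
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    return {"tp": tp, "fp": fp, "fn": fn,
            "sensitivity": sensitivity, "precision": precision}

# Hypothetical example: sites are identified by position only.
example_metrics = evaluate_calls(called={1, 2, 3, 4}, truth={2, 3, 4, 5, 6})
```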

  10. Association analysis • Goal: find functional effects of variants • Score: indicates the effect on protein function • Separation between variants with high damage and the others • If multiple annotations, 3 ways: • Focus on the longest transcript • Focus on the most deleterious effect • Focus on the canonical transcript
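
The three ways of resolving multiple annotations for one variant can be sketched as follows. The record layout, the strategy names, and the severity ranking in `SEVERITY` are all hypothetical choices made for this example; real annotation tools define their own consequence terms and orderings.

```python
# Assumed severity ranking, for illustration only (higher = more damaging).
SEVERITY = {"synonymous": 0, "missense": 1, "splice": 2, "stop_gained": 3}

def pick_annotation(annotations, strategy):
    """Choose one annotation per variant according to the given strategy."""
    if strategy == "longest":
        return max(annotations, key=lambda a: a["length"])
    if strategy == "most_deleterious":
        return max(annotations, key=lambda a: SEVERITY[a["consequence"]])
    if strategy == "canonical":
        return next(a for a in annotations if a["canonical"])
    raise ValueError(f"unknown strategy: {strategy}")

# Hypothetical variant annotated against three transcripts of the same gene.
example_annotations = [
    {"transcript": "T1", "length": 2400, "consequence": "missense", "canonical": True},
    {"transcript": "T2", "length": 3100, "consequence": "synonymous", "canonical": False},
    {"transcript": "T3", "length": 1800, "consequence": "stop_gained", "canonical": False},
]
```

The three strategies can disagree on the same variant, as they do here (T2, T3 and T1 respectively), which is one reason to fix the choice before running the association tests.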

  11. Association analysis • Single-variant association test • Check of data quality • Usual way of processing rare variants: group the variants that act on the same gene and analyse each group as a whole

  12. Association analysis • 2 methods for processing groups: • Comparison of the number of variants between cases and controls • Comparison with chance expectations • Recommendation: at least one test of each category, with different thresholds • If no threshold, a variety of frequency cut-offs
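
The first method, comparing the number of rare variants between cases and controls, can be sketched by collapsing each person to "carrier" or "non-carrier" for a gene and testing the resulting 2×2 table. Using Fisher's exact test here is an assumption made for the example; the slides do not name a specific test, and the second category (comparison with chance expectations) is not shown.

```python
from math import comb

def carrier_table(cases, controls):
    """Collapse per-person rare-allele counts for one gene into a 2x2 table.

    Each person is a list of rare-allele counts, one per variant in the gene.
    Returns (case carriers, case non-carriers, control carriers, control non-carriers).
    """
    a = sum(1 for person in cases if any(person))
    c = sum(1 for person in controls if any(person))
    return a, len(cases) - a, c, len(controls) - c

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test p-value for the table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    total = row1 + row2

    def prob(x):
        # Hypergeometric probability of x carriers among the cases.
        return comb(row1, x) * comb(row2, col1 - x) / comb(total, col1)

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum the probabilities of all tables at least as extreme as the observed one.
    return sum(p for p in (prob(x) for x in range(lo, hi + 1))
               if p <= p_obs * (1 + 1e-9))

# Hypothetical toy data: 4 cases and 4 controls, 2 rare variants in the gene.
example_cases = [[0, 1], [1, 0], [0, 0], [2, 0]]
example_controls = [[0, 0], [0, 0], [0, 1], [0, 0]]
example_table = carrier_table(example_cases, example_controls)
example_p = fisher_exact_two_sided(*example_table)
```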

  13. Association analysis • Packages are available to perform the tests with subsets of the data • Example: • 1. missense, splice, stop-altering variants • 2. subset of deleterious variants • 3. splice, stop-altering variants
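
Running the tests on subsets of the data amounts to filtering the variant list before grouping. The sketch below mirrors the three example subsets and adds a frequency cut-off; the record fields (`consequence`, `predicted_deleterious`, `maf`), the mask names, and the exact definition of each subset are assumptions for illustration, not the definitions used by any particular package.

```python
# Hypothetical masks mirroring the three example subsets on the slide.
MASKS = {
    "missense_splice_stop": lambda v: v["consequence"] in {"missense", "splice", "stop"},
    "deleterious": lambda v: v["predicted_deleterious"],
    "splice_stop": lambda v: v["consequence"] in {"splice", "stop"},
}

def select_variants(variants, mask, max_maf):
    """Return the ids of variants that pass the mask and the frequency cut-off."""
    keep = MASKS[mask]
    return [v["id"] for v in variants if keep(v) and v["maf"] <= max_maf]

# Hypothetical variants in one gene (maf = minor allele frequency).
example_variants = [
    {"id": "v1", "consequence": "missense", "predicted_deleterious": True, "maf": 0.004},
    {"id": "v2", "consequence": "missense", "predicted_deleterious": False, "maf": 0.020},
    {"id": "v3", "consequence": "stop", "predicted_deleterious": True, "maf": 0.001},
    {"id": "v4", "consequence": "synonymous", "predicted_deleterious": False, "maf": 0.003},
]
```

Calling `select_variants` with several values of `max_maf` gives the variety of frequency cut-offs recommended when no single threshold is justified.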

  14. Association analysis • No optimal choice for the analysis, because variants and their characteristics vary between genes. • Permutation-based approaches to assess statistical significance • If no permutation-based threshold, p values ≤ 5 × 10⁻⁷ • QQ plots to summarize the results
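
A permutation-based p-value and the points of a QQ plot can be sketched as follows. The test statistic (difference in carrier counts), the number of permutations, and the function names are assumptions chosen for the example; with only 999 permutations the smallest attainable p-value is 0.001, so reaching thresholds such as 5 × 10⁻⁷ would need far more permutations or an adaptive scheme.

```python
import math
import random

def carrier_difference(carriers, labels):
    """Carriers among cases (label 1) minus carriers among controls (label 0)."""
    in_cases = sum(c for c, l in zip(carriers, labels) if l == 1)
    in_controls = sum(c for c, l in zip(carriers, labels) if l == 0)
    return in_cases - in_controls

def permutation_p_value(carriers, labels, n_perm=999, seed=0):
    """Empirical two-sided p-value obtained by shuffling case/control labels."""
    rng = random.Random(seed)
    observed = abs(carrier_difference(carriers, labels))
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if abs(carrier_difference(carriers, shuffled)) >= observed:
            hits += 1
    # The +1 counts the observed labelling itself, so p is never exactly 0.
    return (hits + 1) / (n_perm + 1)

def qq_points(p_values):
    """(expected, observed) -log10 p pairs for a QQ plot, smallest p first."""
    n = len(p_values)
    ordered = sorted(p_values)
    return [(-math.log10(i / (n + 1)), -math.log10(p))
            for i, p in enumerate(ordered, start=1)]

# Hypothetical gene where all 10 carriers are cases and no control is a carrier.
example_carriers = [1] * 10 + [0] * 10
example_labels = [1] * 10 + [0] * 10
example_p = permutation_p_value(example_carriers, example_labels)
```

In a QQ plot, points lying along the diagonal indicate p-values consistent with the null hypothesis; a departure only at the smallest p-values is what a few true associations would produce.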

  15. Approaches for follow-up • To demonstrate an association found in the analysed samples, additional samples are needed.

  16. Approaches for follow-up • Exome chip experiments examine most of the variants, but are not very sensitive for non-European populations.

  17. Approaches for follow-up • Statistical imputation • Take the base which has the highest correlation with the missing one, and assume it is the same allele as T (i.e. minor or major). • But again, often not possible for mixed populations
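
The imputation idea described on this slide (copy the allele from the best-correlated typed marker) can be sketched with a small reference panel of haplotypes. This is a deliberately simplified illustration under stated assumptions: alleles are coded 0 (major) and 1 (minor), the function names are hypothetical, and the allele is flipped when the correlation is negative. Real imputation software models whole haplotypes rather than a single tag marker.

```python
def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / (sxx * syy) ** 0.5

def impute(panel, target, observed):
    """Impute the allele at marker `target` from the best-correlated typed marker.

    panel: reference haplotypes, one row per haplotype, alleles coded 0/1.
    observed: {marker_index: allele} for the markers typed in the study sample.
    """
    columns = list(zip(*panel))
    # Correlation of every typed marker with the missing one, in the panel.
    scores = {j: pearson(columns[j], columns[target]) for j in observed}
    tag = max(scores, key=lambda j: abs(scores[j]))
    allele = observed[tag]
    # Positive correlation: same allele; negative correlation: opposite allele.
    return allele if scores[tag] >= 0 else 1 - allele

# Hypothetical panel: marker 0 is perfectly correlated with marker 1, marker 2 is not.
example_panel = [
    [0, 0, 1],
    [1, 1, 0],
    [0, 0, 0],
    [1, 1, 1],
]
example_imputed = impute(example_panel, target=1, observed={0: 1, 2: 0})
```

The correlations come from the reference panel, which is why the approach degrades when the study sample's ancestry is poorly represented there, as noted for mixed populations.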

  18. Role of functional assays • Study the changes in the proteins due to coding variants • Study why these changes result in diverse diseases.

  19. Forward genetics • Another approach to study functional variants • First look at which proteins show changes • Then search the DNA sequence for the variant(s)

  20. Discussion • In other articles: • more care about sample quality • gain in sensitivity of variant calls if calling is done across several samples • indels in variant calls are the major source of false positives; an alignment algorithm that allows gapped alignment is needed • Check association results against databases

  21. Discussion • Because of costs, exome sequencing studies focus on the coding part of the genome. They are thus not suitable for non-exonic sequence (structural variants, chromosomal rearrangements) • These problems will be partially solved by the falling cost of sequencing
