
ENCODE variation analysis


holt

Presentation Transcript


  1. ENCODE variation analysis

  2. Analysis goals • Quantify genetic variation in ENCODE regions • Detect selective constraint in ENCODE features • Develop rules for interpretation of functional variation • Motivate experiments to test functional variation

  3. Data • ENCODE SNPs (HapMap resequencing) • 5 kb HapMap SNPs • DIPs • Gene expression variation

  4. Metrics of variation • Derived allele frequency spectrum (Manolis) • Diversity/heterozygosity (Ewan) • SNP density (Ewan, others) • DIP density (Jim, Taane) • LD/recombination (Daryl/Oxford) • Regions of contiguous DNA without variation (Manolis) • Accelerated (positively selected?) regions (Manolis) • Standard tests of neutrality: McDonald-Kreitman, Tajima’s D, etc. (Mike, others) • Other non-parametric tests of selection (Andy) • Tagging (Paul)

  5. Analysis plans
     Analysis with respect to genomic features
     • Calculate variability in a large number of genomic features with all metrics
     • Correlate variability metrics with the “intensity” of each feature (e.g. level of conservation versus level of variability)
     • Variation, alternative splicing and expression
     • Distance effects from genomic features
     • Association of gene expression with SNPs (some data are in UCSC and some will be provided by Manolis at the workshop)
     Analysis independent of genomic features (in principle)
     • Tag SNPs and comparison of resequencing data to the 5 kb map: how well does the 5 kb map capture variation within genomic elements? If the aim is mainly to capture variation in functional genomic elements (e.g. known regulatory regions or nonsynonymous SNPs), how can the tagging algorithms be modified?
     • General description of levels of variation with respect to the functional content of the 44 ENCODE regions

  6. Diversity in features (Ewan Birney)
     Feature           av2pq/SNP   av2pq/pos   #SNPs
     Promoters         0.15        0.00045       856
     Region Rnd2       0.16        0.00041       737
     Completely Rnd    0.16        0.00045      1584
     Exons             0.14        0.00039       635
     RRnd Exons        0.15        0.00040       636
     Overall           0.16        0.00042     16609
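The diversity columns in slide 6 can be illustrated with a minimal sketch. It assumes that "av2pq" denotes the average expected heterozygosity 2pq at biallelic sites, reported once per SNP and once per base position of the feature; the slide does not spell this out. The function names and the example frequencies are hypothetical.

```python
def heterozygosity(p):
    """Expected heterozygosity 2pq at a biallelic site with allele frequency p."""
    return 2.0 * p * (1.0 - p)


def summarize(freqs, n_positions):
    """Return (average 2pq per SNP, average 2pq per position, SNP count).

    freqs       -- allele frequency of each SNP found in the feature
    n_positions -- total number of bases covered by the feature
    """
    total = sum(heterozygosity(p) for p in freqs)
    n_snps = len(freqs)
    per_snp = total / n_snps if n_snps else 0.0
    per_pos = total / n_positions if n_positions else 0.0
    return per_snp, per_pos, n_snps


# Hypothetical allele frequencies for four SNPs in a 1,000 bp feature.
example_freqs = [0.5, 0.1, 0.25, 0.05]
per_snp, per_pos, n_snps = summarize(example_freqs, n_positions=1000)
print(per_snp, per_pos, n_snps)
```

Under this reading, the per-SNP value reflects how common the variants are, while the per-position value also depends on how many SNPs the feature contains.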

  7. Derived allele frequency spectrum CNS intersection P = 0.003

  8. Derived allele frequency spectrum Transfrags union P = 0.204
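Slides 7 and 8 compare derived allele frequency spectra, but the transcript does not say how the spectra were built or which test produced the P values. As a rough sketch of the first step only, the code below bins SNPs by derived allele frequency; the function name, the bin count, and the allele counts are all hypothetical.

```python
def daf_spectrum(derived_counts, n_chromosomes, n_bins=10):
    """Fraction of SNPs falling in each derived-allele-frequency bin.

    derived_counts -- number of chromosomes carrying the derived allele, per SNP
    n_chromosomes  -- number of chromosomes sampled
    """
    bins = [0] * n_bins
    for count in derived_counts:
        # Integer arithmetic avoids floating-point edge effects at bin borders.
        index = min(count * n_bins // n_chromosomes, n_bins - 1)
        bins[index] += 1
    total = len(derived_counts)
    return [b / total for b in bins] if total else [0.0] * n_bins


# Hypothetical derived allele counts out of 20 sampled chromosomes.
inside_feature = [1, 1, 2, 1, 3, 10, 1, 2]
outside_feature = [1, 5, 9, 12, 15, 18, 3, 7]
spec_in = daf_spectrum(inside_feature, 20)
spec_out = daf_spectrum(outside_feature, 20)
print(spec_in)
print(spec_out)
```

A spectrum shifted toward rare derived alleles inside a feature, relative to a control set, is the pattern usually read as a sign of purifying selection.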

  9. Heterozygosity (Taane Clark)

  10. Indels

  11. Regions accelerated in humans

  12. Selective constraints differ for genes expressed in different tissues (Nuria Lopez)

  13. Genes expressed in more tissues are under stronger selective constraint (lower dN)

  14. Tagging (Paul de Bakker) • ENCODE is a near-complete inventory of common (MAF ≥ 5%) sites • How well do tag SNPs picked from thinned versions of ENCODE (to mimic the ascertainment of Phase I and II) capture: • all common variants • functional sites

  15. Coverage of common variants by tags picked from simulated Phase I and II HapMap
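The tagging question in slides 14 and 15 can be sketched as follows. This is a simplified greedy pairwise tagger, not necessarily the algorithm used for the slides: tags are chosen only from a thinned subset of SNPs, and coverage is the fraction of all SNPs with r² at or above a threshold to at least one tag. The 0.8 threshold, the function names, and the toy haplotypes are assumptions for illustration.

```python
def r_squared(a, b):
    """Squared correlation between two biallelic SNPs coded 0/1 per haplotype."""
    n = len(a)
    pa = sum(a) / n
    pb = sum(b) / n
    pab = sum(x & y for x, y in zip(a, b)) / n
    denom = pa * (1 - pa) * pb * (1 - pb)
    if denom == 0:
        return 0.0  # at least one site is monomorphic
    d = pab - pa * pb
    return d * d / denom


def pick_tags(snps, candidates, threshold=0.8):
    """Greedy pairwise tagging restricted to the thinned candidate set."""
    untagged = set(candidates)
    tags = []
    while untagged:
        # Choose the candidate that captures the most still-untagged candidates.
        best = max(
            sorted(untagged),
            key=lambda c: sum(
                r_squared(snps[c], snps[u]) >= threshold for u in untagged
            ),
        )
        tags.append(best)
        captured = {u for u in untagged if r_squared(snps[best], snps[u]) >= threshold}
        untagged -= captured
        untagged.discard(best)  # guarantees progress even for monomorphic sites
    return tags


def coverage(snps, tags, threshold=0.8):
    """Fraction of all SNPs captured by at least one tag."""
    covered = sum(
        any(r_squared(snps[t], snps[s]) >= threshold for t in tags) for s in snps
    )
    return covered / len(snps)


# Hypothetical haplotypes (8 chromosomes); B is in perfect LD with A.
snps = {
    "A": [0, 0, 0, 0, 1, 1, 1, 1],
    "B": [0, 0, 0, 0, 1, 1, 1, 1],
    "C": [0, 1, 0, 1, 0, 1, 0, 1],
    "D": [0, 0, 1, 1, 0, 0, 1, 1],
}
thinned = ["A", "C"]  # SNPs assumed to survive thinning
tags = pick_tags(snps, thinned)
cov = coverage(snps, tags)
print(tags, cov)
```

In this toy case B is captured through A even though it was never ascertained, while D is missed because no tag is correlated with it; that gap is what a coverage curve measures.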
