Genome representation and variant identification. Deanna M. Church, NCBI. The Reference Assembly is NOT Static. NCBI35 (hg17). NCBI36 (hg18). GRCh37 (hg19). GRCh37.p9. Image credit: http://www.tohlejokes.com. http://genomereference.org. Resolved: 716 Open: 697.
Genome representation and variant identification
Deanna M. Church, NCBI
The Reference Assembly is NOT Static: NCBI35 (hg17), NCBI36 (hg18), GRCh37 (hg19), GRCh37.p9
Resolved: 716 Open: 697
Studies, Methods, Analysis, Publications, Samples, Submitted assembly
Variant Regions
Variant Region nsv531833, type: CNV
Variant Calls
Variant Call nssv577112, type: copy number gain. Method: Oligo aCGH. Analysis: Probe signal intensity. Phenotype: Autism; etc. Clinical: Pathogenic. Copy Number: 3
Variant Call nssv580124, type: copy number loss. Method: Oligo aCGH. Analysis: Probe signal intensity. Phenotype: Autism. Clinical: Pathogenic. Copy Number: 1
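The region and call records on this slide can be written down as a small data structure. This is a minimal sketch only: the class and field names (`VariantRegion`, `VariantCall`, `call_type`, and so on) are hypothetical and are not dbVar's schema; the values are copied from the slide.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class VariantCall:
    """One submitted call (nssv accession). Field names are illustrative."""
    accession: str
    call_type: str
    method: str
    analysis: str
    phenotype: str
    clinical: str
    copy_number: Optional[int] = None


@dataclass
class VariantRegion:
    """A region (nsv accession) that groups the calls supporting it."""
    accession: str
    region_type: str
    calls: List[VariantCall] = field(default_factory=list)


# Values taken from the slide.
region = VariantRegion(
    accession="nsv531833",
    region_type="CNV",
    calls=[
        VariantCall("nssv577112", "copy number gain", "Oligo aCGH",
                    "Probe signal intensity", "Autism; etc.", "Pathogenic", 3),
        VariantCall("nssv580124", "copy number loss", "Oligo aCGH",
                    "Probe signal intensity", "Autism", "Pathogenic", 1),
    ],
)

# A single region can group calls of different types.
call_types = sorted({call.call_type for call in region.calls})
print(region.accession, region.region_type, call_types)
```

The point of the nesting is that one CNV region can be supported by both a gain call and a loss call.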
Variant Call Ambiguity
Start and stop are not known exactly.
Probes with decreased signal intensity; probes with expected signal intensity
One breakpoint lies between outer start and inner start; the other lies between inner stop and outer stop.
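For array data, the inner and outer coordinates can be derived from probe positions. The following is a minimal sketch, assuming a single run of altered probes and a hypothetical input of `(position, is_altered)` pairs; the function and type names are illustrative, not part of any dbVar tool.

```python
from typing import NamedTuple, Optional, Sequence, Tuple


class CallBounds(NamedTuple):
    outer_start: Optional[int]  # last unaffected probe before the call
    inner_start: int            # first affected probe
    inner_stop: int             # last affected probe
    outer_stop: Optional[int]   # first unaffected probe after the call


def bounds_from_probes(probes: Sequence[Tuple[int, bool]]) -> CallBounds:
    """Derive inner/outer coordinates from (position, is_altered) probes.

    Assumes all altered probes belong to one call.
    """
    ordered = sorted(probes)
    altered = [pos for pos, is_altered in ordered if is_altered]
    if not altered:
        raise ValueError("no altered probes, so there is no call to bound")
    inner_start, inner_stop = altered[0], altered[-1]
    left = [pos for pos, is_altered in ordered
            if not is_altered and pos < inner_start]
    right = [pos for pos, is_altered in ordered
             if not is_altered and pos > inner_stop]
    return CallBounds(
        outer_start=left[-1] if left else None,
        inner_start=inner_start,
        inner_stop=inner_stop,
        outer_stop=right[0] if right else None,
    )


# Toy probe positions, invented for illustration.
probes = [(1000, False), (2000, False), (3000, True),
          (4000, True), (5000, True), (6000, False)]
bounds = bounds_from_probes(probes)
print(bounds)
```

With these toy probes the left breakpoint is somewhere between 2000 and 3000 and the right breakpoint between 5000 and 6000; the array cannot say more than that.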
Variant Call Ambiguity
Fosmid clone (40 Kb +/- 1 Kb)
Clone ends map 20 Kb apart: the clone has an insertion relative to the genome
Clone ends map 60 Kb apart: the clone has a deletion relative to the genome
Only outer start and outer stop are known
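The clone-size reasoning can be sketched as a comparison between the expected insert size and the distance separating the mapped clone ends. This is an illustration under assumptions: the function name is hypothetical, and using the slide's +/- 1 Kb as a hard cutoff is a simplification (real pipelines typically use a statistical threshold on the insert-size distribution).

```python
def classify_clone(mapped_span: int,
                   expected_size: int = 40_000,
                   tolerance: int = 1_000) -> str:
    """Classify a clone by how far apart its ends map on the reference.

    mapped_span is the reference distance between the two mapped clone ends.
    """
    if mapped_span < expected_size - tolerance:
        # The clone holds more sequence than the reference span it covers.
        return "insertion relative to the genome"
    if mapped_span > expected_size + tolerance:
        # The clone holds less sequence than the reference span it covers.
        return "deletion relative to the genome"
    return "concordant"


for span in (20_000, 40_500, 60_000):
    print(span, classify_clone(span))
```

The variant is known to lie between the mapped ends, but not where, which is why such calls carry only outer coordinates.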
Shotgun sequence, then assemble. GAPS: “finishers” go in to manually fill the gaps, often by PCR. BAC insert, BAC vector.
GRCh37 (hg19) NCBI36 (hg18)
NCBI35 (hg17): AL139246.20. GRCh37 (hg19): AL139246.21.
Build sequence contigs based on contigs defined in the TPF (Tiling Path File).
Check for orientation consistency.
Select switch points.
Instantiate sequence for further analysis.
Figure labels: switch point, consensus sequence
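The steps above can be sketched as follows: each component on the tiling path contributes only the bases between its switch points, in the orientation the path requires. This is a toy illustration, not the GRC pipeline; the `Component` class, its fields, and the clone names are hypothetical.

```python
from dataclasses import dataclass
from typing import List

_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


@dataclass
class Component:
    name: str       # hypothetical clone name, not a real accession
    sequence: str   # full clone sequence as submitted
    start: int      # 1-based first base contributed (switch point in)
    stop: int       # 1-based last base contributed (switch point out)
    strand: str = "+"


def build_contig(path: List[Component]) -> str:
    """Concatenate the contributed slice of each component, in path order."""
    pieces = []
    for comp in path:
        piece = comp.sequence[comp.start - 1:comp.stop]
        if comp.strand == "-":
            piece = reverse_complement(piece)
        pieces.append(piece)
    return "".join(pieces)


# Two toy clones overlapping by "CCCC"; the switch point sits mid-overlap.
path = [
    Component("cloneA", "AAAACCCC", start=1, stop=6),
    Component("cloneB", "CCCCGGGG", start=3, stop=8),
]
contig = build_contig(path)
print(contig)
```

Choosing the switch point inside the overlap means every base in the contig comes from exactly one clone, so the overlap is not counted twice.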
nsv832911 (nstd68) Submitted on NCBI35 (hg17)
Moved approximately 2 Mb distal on chr15. NCBI35 (hg17) Tiling Path, NC_000015.8 (chr15). Gap Inserted. Removed from assembly. GRCh37 (hg19) Tiling Path. Added to assembly. NC_000015.9 (chr15). HG-24.
Sequences from haplotype 1; sequences from haplotype 2.
Old Assembly model: compress into a consensus.
New Assembly model: represent both haplotypes.
nsv532126 (nstd37) NCBI36 NC_000004.10 (chr4) Tiling Path TMPRSS11E2 TMPRSS11E2 TMPRSS11E TMPRSS11E GRCh37 NC_000004.11 (chr4) Tiling Path AC147055.2 AC079749.5 AC021146.7 AC134921.1 AC074378.4 AC093720.2 AC079749.5 AC147055.2 AC019173.4 AC021146.7 AC134921.2 AC140484.1 AC093720.2 AC074378.4 GRCh37: NT_167250.1 (UGT2B17 alternate locus) AC021146.7 AC019173.4 AC074378.4 AC226496.2 AC140484.1 Xue Y et al., 2008
GRCh37.p9: 81 FIX Patches, 71 NOVEL Patches
1q32 1q21 1p21 1p21 patch alignment to chromosome 1 Dennis et al., 2012
How dbVar* manages data: Search Term
*and most other NCBI databases too
Variant submitted on NCBI35 (hg17). Failed to remap to NCBI36 (hg18). Successful remap to GRCh37 (hg19).
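One way a remap can fail on one assembly and succeed on another is that the variant's coordinates fall outside the aligned sequence in the first target but inside it in the second. The sketch below illustrates that idea with ungapped alignment blocks. It is an assumption-laden toy, not NCBI's remapping implementation: the `Block` type, the function names, and all coordinates are invented for illustration.

```python
from typing import NamedTuple, Optional, Sequence, Tuple


class Block(NamedTuple):
    """An ungapped alignment block between a source and a target assembly."""
    src_start: int
    src_stop: int
    dst_start: int


def remap_position(pos: int, blocks: Sequence[Block]) -> Optional[int]:
    """Project one source coordinate through the blocks; None if unaligned."""
    for block in blocks:
        if block.src_start <= pos <= block.src_stop:
            return block.dst_start + (pos - block.src_start)
    return None


def remap_interval(start: int, stop: int,
                   blocks: Sequence[Block]) -> Optional[Tuple[int, int]]:
    """Remap an interval; fail if either end is unaligned or order flips."""
    new_start = remap_position(start, blocks)
    new_stop = remap_position(stop, blocks)
    if new_start is None or new_stop is None or new_start > new_stop:
        return None
    return new_start, new_stop


# Toy alignments (invented): the first target lacks the variant's sequence.
hg17_to_hg18 = [Block(1, 1000, 1)]
hg17_to_hg19 = [Block(1, 1000, 1), Block(1001, 5000, 3001)]

variant = (1200, 1800)
on_hg18 = remap_interval(*variant, hg17_to_hg18)
on_hg19 = remap_interval(*variant, hg17_to_hg19)
print("hg18:", on_hg18)
print("hg19:", on_hg19)
```

Here the toy variant has no placement on the first target and lands at shifted coordinates on the second, which mirrors the outcome reported on the slide without claiming to explain that specific case.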
No results in ‘normal’ dbVar search. Genome Sensor predicts that the search term is a location and points to the dbVar Genome Browser.
Acknowledgements
dbVar and GRC
NCBI: John Lopez, Tim Hefferon, John Garner, Chao Chen, George Zhou, Victor Ananiev, Valerie Schneider, Nathan Bouk, Hsiu-Chuan Chen
Collaborators: DGVa, DGV, TGI-WU, WTSI, EBI, ISCA
NCBI Genomes, Viewers and Variation groups