
Questions to be addressed


dougal

Presentation Transcript


  1. Questions to be addressed • Can multiple D genes be inserted? • Violation of the 12/23 rule • Can D genes be inserted backwards? • Is there a D gene preference? • Is there a reading frame preference for D genes? • If yes, is it part of the gene rearrangement? • What is responsible for the end trimming?

  2. Data sets • 6329 clonally unrelated rearrangements. • 1968 un-mutated functional • 3707 mutated functional • 274 un-mutated non-functional • 380 mutated non-functional

  3. P nucleotides

  4. How many types of D genes? • Conventional D genes • Identified in 81% of unmutated sequences and 64% of mutated sequences • Inverted D genes • Long inverted D genes cannot be excluded • Two D genes • D genes with irregular RSS (DIR) • Chromosome 15 OR

  5. D gene usage • 27 conventional D genes, 34 known alleles

  6. D gene usage and JH gene • JH-proximal D genes recombine more often with JH4 than with JH6, whereas JH-distal D genes recombine more often with JH6

  7. Inverted D genes are not used (or are used extremely infrequently) • Inverted (palindromic) D genes

  8. D genes with irregular RSS (recombination signal sequence) (DIR) • Very long, >180 bp • Contain a family 1 D gene
     >DIR1 (in between D6-6 and D1-7)
     GGTGTTCCGCTAGCTGGGGCTCACAGTGCTCACCCCACACC
     TAAAACGAGCCACAGCCTCCGGAGCCCCTGAAGGAGACCCC
     GCCCACAAGCCCAGCCCCCACCCAGGAGGCCCCAGAGCACA
     GGGCGCCCCGTCGGATTCTGAACAGCCCCGAGTCACAGTG
     GGTATAACTGGAACTAC
     >IGHD1-7-01|X13972|IGHD1-7-01|Homo sapiens|F|D-REGION
     GGTATAACTGGAACTAC
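The claim that DIR1 contains a family 1 D gene can be checked directly on the two sequences shown on this slide. The following is a minimal sketch; the variable and function names are illustrative, and the DIR1 string is only the excerpt printed above, not necessarily the full DIR1 sequence.

```python
# DIR1 excerpt and IGHD1-7*01 copied from the slide; names are illustrative.
DIR1_EXCERPT = (
    "GGTGTTCCGCTAGCTGGGGCTCACAGTGCTCACCCCACACC"
    "TAAAACGAGCCACAGCCTCCGGAGCCCCTGAAGGAGACCCC"
    "GCCCACAAGCCCAGCCCCCACCCAGGAGGCCCCAGAGCACA"
    "GGGCGCCCCGTCGGATTCTGAACAGCCCCGAGTCACAGTG"
    "GGTATAACTGGAACTAC"
)
IGHD1_7 = "GGTATAACTGGAACTAC"


def find_all(haystack: str, needle: str) -> list[int]:
    """Return the 0-based start index of every occurrence of needle."""
    positions = []
    start = haystack.find(needle)
    while start != -1:
        positions.append(start)
        start = haystack.find(needle, start + 1)
    return positions


positions = find_all(DIR1_EXCERPT, IGHD1_7)
print("IGHD1-7 found at 0-based offsets:", positions)
print("DIR1 excerpt ends with IGHD1-7:", DIR1_EXCERPT.endswith(IGHD1_7))
```

Because the conventional D gene sits at the very end of the excerpt, a hit against DIR1 can be indistinguishable from a hit against IGHD1-7 itself.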

  9. D genes with irregular RSS (DIR) • Very long, >180 bp • Contain a family 1 D gene • Found in 1% of sequences, inverted in 1.2% • Some are explained as a family 1 gene plus N additions • Median length of the remaining hits does not differ from that in permuted sequences • => No evidence for use of DIR

  10. Two D genes • Two D genes found in 1% of sequences • Frequency does not differ from that in permuted sequences • Some are explained as one long D gene with a deletion • Some are not possible given the locations of the D genes • Median length of the longest gene resembles that of normal D genes; median length of the shortest resembles that of hits in permuted sequences

  11. Multiple D genes • 65 sequences with two D genes • Average length of shortest D genes: 11.6 bp • Average length of longest D genes: 18.8 bp • Average length of D genes in permuted sequences: 11.3 bp • Average length of D genes in normal sequences: 17.8 bp • => Multiple D genes are not present • Schematic: V-gene, Longest-D, Shortest-D, J-gene
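The comparison against permuted sequences can be sketched as follows. This is not the authors' pipeline: the matching criterion (longest exact shared substring), the shuffling procedure, and all names are assumptions made for illustration. The idea is that a D-gene match in a real junction only counts as evidence if it is longer than matches obtained by chance in shuffled versions of the same junction.

```python
import random


def longest_match(junction: str, d_gene: str) -> int:
    """Length of the longest exact substring shared by junction and d_gene."""
    best = 0
    # prev[j]: length of the common run ending at the previous junction base
    # and at d_gene[j - 1].
    prev = [0] * (len(d_gene) + 1)
    for a in junction:
        cur = [0] * (len(d_gene) + 1)
        for j, b in enumerate(d_gene, start=1):
            if a == b:
                cur[j] = prev[j - 1] + 1
                best = max(best, cur[j])
        prev = cur
    return best


def permuted_match_lengths(junction, d_genes, n_perm=200, seed=0):
    """Best D-gene match length in each of n_perm shuffled junctions."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    lengths = []
    for _ in range(n_perm):
        bases = list(junction)
        rng.shuffle(bases)
        shuffled = "".join(bases)
        lengths.append(max(longest_match(shuffled, d) for d in d_genes))
    return lengths


# Toy input: IGHD1-7*01 (slide 8) flanked by two hypothetical bases per side.
d_genes = ["GGTATAACTGGAACTAC"]
junction = "AA" + d_genes[0] + "CC"
observed = max(longest_match(junction, d) for d in d_genes)
null = permuted_match_lengths(junction, d_genes)
print("observed match:", observed, "bp")
print("mean match in permuted junctions:", sum(null) / len(null), "bp")
```

On this slide, the shortest of the two putative D genes averages 11.6 bp against 11.3 bp in permuted sequences, which is why the second "D gene" is read as a chance match.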

  12. Chromosome 15 OR (orphon genes) • 10 OR resembling D genes on chromosome 15 • High homology to conventional D genes

  13. >IGHD5-12-01|X13972|IGHD5-12-01|Homo sapiens|F|D- (275 aa) vs. >IGHD5-OR15-5|X55583 and X55584 (253 aa) • 91.3% identity; global alignment score: 1563
      IGHD5-12-01   FINDFASTADTEMPLATESDATIGHD-SATSEPMNIELHMEPRECTSMNIELANTIBDYH
                    :::::::::::::::::::::::::: :::::::::::::::::::::::::::::::::
      IGHD5-OR15-5  FINDFASTADTEMPLATESDATIGHDRSATSEPMNIELHMEPRECTSMNIELANTIBDYH

      IGHD5-12-01   MMDGENANALYSISIRIXRGANISMIPCMMANDLINEPARAMETERSSETTLARGVISAN
                    ::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
      IGHD5-OR15-5  MMDGENANALYSISIRIXRGANISMIPCMMANDLINEPARAMETERSSETTLARGVISAN

      IGHD5-12-01   AMELISTDEFALTISNAMEXEXCLDENAMESMFINDENTRYBYCMMNSBMATCHAFINDA
                    ::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
      IGHD5-OR15-5  AMELISTDEFALTISNAMEXEXCLDENAMESMFINDENTRYBYCMMNSBMATCHAFINDA

      IGHD5-12-01   LLENTRIESSNAMEMSTBEINSTARTFNAMENMBERFFASTAENTRIESREADFRMFILE
                    ::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
      IGHD5-OR15-5  LLENTRIESSNAMEMSTBEINSTARTFNAMENMBERFFASTAENTRIESREADFRMFILE

      IGHD5-12-01   DTEMPLATESDATGTGGATATAGTGGCTACGATTAC
                    :::::::::::::
      IGHD5-OR15-5  DTEMPLATESDAT-----------------------

  14. Chromosome 15 OR (orphon genes) • 10 OR resembling D genes on chromosome 15 • High homology to conventional D genes • Very few OR15 in unmutated sequences • Median length does not differ from that of hits in permuted sequences • => No evidence for use of OR15 genes

  15. D gene reading frames • The recombination mechanism utilises each D gene reading frame at the same frequency
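To make "reading frame" concrete, the sketch below translates a single D gene in its three forward frames using the standard genetic code. IGHD1-7*01 is taken from slide 8; the helper names are illustrative, and this only shows what the three frames encode, not how frame usage was counted in the data set.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(seq: str, frame: int) -> str:
    """Translate seq from offset frame (0, 1 or 2); '*' marks a stop codon."""
    codons = (seq[i:i + 3] for i in range(frame, len(seq) - 2, 3))
    return "".join(CODON_TABLE[c] for c in codons)


IGHD1_7 = "GGTATAACTGGAACTAC"
frames = [translate(IGHD1_7, f) for f in range(3)]
for number, peptide in enumerate(frames, start=1):
    print(f"frame {number}: {peptide}")
```

The three frames of the same gene give very different peptides, one of them containing a stop codon, which is why equal use of the frames at the recombination step still leaves room for later selection.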

  16. N nucleotide dependence on end nucleotide
      Position X    Position X+1
                    A       T       G       C       P-value
      A             0.292   0.146   0.292   0.271   0.04
      T             0.260   0.290   0.207   0.243   0.016
      G             0.204   0.172   0.453   0.172   0.0004
      C             0.136   0.204   0.231   0.430   <0.0001
      Expected      0.210   0.201   0.292   0.298   -
      • N addition is not random but depends on the end nucleotide
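One way to read the frequencies on this slide is to divide each observed value by the expected frequency of that nucleotide. The sketch below assumes that rows are the end nucleotide (position X) and columns the following N nucleotide (position X+1); that orientation is inferred from the slide layout, and the names are illustrative.

```python
NUCLEOTIDES = ("A", "T", "G", "C")

# Frequencies copied from the slide: end nucleotide -> next-nucleotide
# frequencies in A, T, G, C order (orientation is an assumption).
observed = {
    "A": (0.292, 0.146, 0.292, 0.271),
    "T": (0.260, 0.290, 0.207, 0.243),
    "G": (0.204, 0.172, 0.453, 0.172),
    "C": (0.136, 0.204, 0.231, 0.430),
}
expected = (0.210, 0.201, 0.292, 0.298)


def enrichment(row):
    """Observed / expected ratio for each possible next nucleotide."""
    return {n: o / e for n, o, e in zip(NUCLEOTIDES, row, expected)}


most_enriched = {}
for end, row in observed.items():
    ratios = enrichment(row)
    most_enriched[end] = max(ratios, key=ratios.get)
    formatted = ", ".join(f"{n}: {r:.2f}" for n, r in ratios.items())
    print(f"after {end}: {formatted}")

print("most enriched next nucleotide:", most_enriched)
```

Under this reading, each end nucleotide is most often followed by an identical N nucleotide relative to expectation.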

  17. Trimming of gene ends • Average: 3.8 bp • Trimming depends on the gene end and cannot be described solely as the removal of one nucleotide at a time

  18. VDJsolver performance • Panels: unmutated sequences; mutated sequences • Significance markers: #: p<0.01; §: p<0.001

  19. Results regarding recombination and diversity and open questions • DIR, OR15, multiple D genes and VH replacements are not used at a significant rate • Inverted D genes are used rarely • Not all D genes are used at the same frequency • What determines whether a D gene is used? • D gene usage is somewhat dependent on the JH gene • Do multiple D-J recombination steps take place? • All D gene reading frames are used at an equal rate at the recombination step • At what step in development does the selection for the hydrophilic reading frame happen?

  20. Results regarding recombination and diversity and open questions (cont.) • N addition is not random but depends on the end nucleotide • Does nucleotide availability or the specificity of TdT determine the N addition? • Trimming is not random but depends on the gene and sequence • Which enzyme(s) are responsible for the trimming?

  21. Numbering Schemes • The Kabat numbering scheme is a widely adopted standard for numbering the residues in an antibody in a consistent manner. However, the scheme has problems. • The Chothia numbering scheme is identical to the Kabat scheme, but places the insertions in CDR-L1 and CDR-H1 at the structurally correct positions. This means that topologically equivalent residues in these loops get the same label (unlike in the Kabat scheme). • The IMGT unique numbering for all IG and TR V-REGIONs of all species relies on the high conservation of the structure of the variable region. This numbering, set up after aligning more than 5 000 sequences, takes into account and combines the definition of the framework (FR) and complementarity determining regions (CDR), structural data from X-ray diffraction studies, and the characterization of the hypervariable loops. • Sources: http://www.bioinf.org.uk/abs/#kabatnum http://imgt.cines.fr/

  22. Identification of CDR regions • Identifying the CDRs
      CDR-L1 • Start: approx. residue 24 • Residue before: always C • Residue after: always W; typically WYQ, but also WLQ, WFQ, WYL • Length: 10 to 17 residues
      CDR-L2 • Start: always 16 residues after the end of CDR-L1 • Residues before: generally IY, but also VY, IK, IF • Length: always 7 residues
      CDR-L3 • Start: always 33 residues after the end of CDR-L2 • Residue before: always C • Residues after: always FGXG • Length: 7 to 11 residues
      CDR-H1 • Start: approximately residue 31 (always 9 after a C) (Chothia/AbM definition starts 5 residues earlier) • Residues before: always CXXXXXXXX • Residues after: always W; typically WV, but also WI, WA • Length: 5 to 7 residues (Kabat definition); 7 to 9 residues (Chothia definition); 10 to 12 residues (AbM definition)
      CDR-H2 • Start: always 15 residues after the end of the Kabat/AbM definition of CDR-H1 • Residues before: typically LEWIG, but a number of variations • Residues after: K[RL]IVFT[AT]SIA (where residues in square brackets are alternatives at that position) • Length: Kabat definition 16 to 19 residues (AbM definition and most recent Chothia definition end 7 residues earlier; earlier Chothia definition starts 2 residues later and ends 9 earlier)
      CDR-H3 • Start: always 33 residues after the end of CDR-H2 (always 3 after a C) • Residues before: always CXX (typically CAR)
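The flanking-residue rules lend themselves to pattern matching. As a minimal sketch, the CDR-L3 rule (preceded by C, followed by FGXG, 7 to 11 residues long) can be written as a regular expression. The light-chain fragment below is a made-up toy string for illustration only, and the function name is hypothetical; a real implementation would also apply the positional rules (for example, 33 residues after the end of CDR-L2) to rule out spurious matches.

```python
import re

# C, then 7 to 11 residues (captured), then FGXG as a lookahead so the
# flanking motif is required but not included in the loop itself.
CDR_L3_PATTERN = re.compile(r"C([A-Z]{7,11})(?=FG[A-Z]G)")


def find_cdr_l3(light_chain: str):
    """Return the first CDR-L3 candidate, or None if the motif is absent."""
    match = CDR_L3_PATTERN.search(light_chain)
    return match.group(1) if match else None


# Hypothetical toy fragment, not a sequence from the presentation.
toy_fragment = "DFATYYCQQYNSYPLTFGGGTKVEIK"
cdr_l3 = find_cdr_l3(toy_fragment)
print("CDR-L3 candidate:", cdr_l3)
```

The same approach extends to the other loops by swapping in their flanking motifs, such as C before and W after CDR-L1.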
