Mapping of BAC-end sequences on the Chicken Genome: American Alligator, Painted Turtle and Emu • Charles Chapus Lab meeting 04/07/2007
Introduction • BAC Libraries • Methodology • Blast results • Paired Blast results • Conclusion
Introduction
Sauropsid relationships • The chicken is, for the moment, the closest relative of the reptiles with a sequenced genome
American Alligator (Alligator mississippiensis) • Karyotype: 16 macrochromosomes & no microchromosomes • No sex chromosomes • Genome size: 2.49 Gb • Valleley et al., Chromosoma, 1994
Painted Turtle (Chrysemys picta) • Presumed karyotype: 24 or 26 pairs of chromosomes (12-14 macrochromosomes & 12-14 microchromosomes) • No sex chromosomes • Genome size: 2.57 Gb • Bickham & Baker, Chromosoma, 1976
Emu (Dromaius novaehollandiae) • Karyotype: 40 chromosomes (10 macro- & 30 microchromosomes) • Presence of sex chromosomes: W & Z (5th largest) • Genome size: 1.63 Gb • Takagi et al., Chromosoma, 1972
BAC Libraries
Alligator/Turtle • Alligator mississippiensis • 1675 BAC clones sequenced • 3128 BAC-end sequences • 2.5 Mb (average length: 770 nt) • Chrysemys picta • 1828 BAC clones sequenced • 3461 BAC-end sequences • 2.4 Mb (average length: 703 nt)
Emu • Dromaius novaehollandiae • 8 plates have been BAC-end sequenced • After cleaning of the sequences: 5288 BAC-end sequences • 2936 BAC clones • 3.5 Mb (average length: 662 nt)
Tuatara/Garter Snake • Sphenodon punctatus • 5172 cleaned BAC-end sequences • 3 Mb • 7.61% repeat elements • Thamnophis sirtalis • 3867 cleaned BAC-end sequences • 2.4 Mb • 5.29% repeat elements
Methodology
Protocol • BAC clone: length ~150 to 160 kb • Blast mapping (Blastn/tBlastx) of each BAC end on the chicken genome (Assembly 2.1, 1.25 Gb) • Hits kept for each end: e-value < 10^-5 • The two end hits should span ~ the same length on the chicken genome as the BAC clone • Chicken genome: 38 chromosomes + 1 pair of sex chromosomes • Sequenced genome: 1.1 Gb • Chr W: 258 kb sequenced out of 10 Mb • Chr Z: 76 Mb
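The e-value filter of the protocol can be sketched as follows. This is only an illustration, assuming the Blast results are available in the standard 12-column tabular format; the `Hit` type, the `parse_hits` function, the sequence identifiers and the sample lines are all hypothetical. Only the cutoff of 10^-5 comes from the slide.

```python
from typing import NamedTuple

E_VALUE_CUTOFF = 1e-5  # threshold given on the protocol slide


class Hit(NamedTuple):
    query: str       # BAC-end sequence identifier (hypothetical naming)
    subject: str     # chicken chromosome
    identity: float  # percent identity
    length: int      # alignment length
    s_start: int     # start of the hit on the chicken chromosome
    s_end: int       # end of the hit on the chicken chromosome
    evalue: float


def parse_hits(lines, cutoff=E_VALUE_CUTOFF):
    """Parse tab-separated Blast lines and keep hits with e-value < cutoff."""
    hits = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        f = line.split("\t")
        # Tabular columns: query, subject, % identity, length, mismatches,
        # gap opens, q.start, q.end, s.start, s.end, e-value, bit score
        hit = Hit(f[0], f[1], float(f[2]), int(f[3]),
                  int(f[8]), int(f[9]), float(f[10]))
        if hit.evalue < cutoff:
            hits.append(hit)
    return hits


# Made-up lines, for illustration only
sample = [
    "emu_001_F\tchr1\t88.5\t412\t40\t3\t1\t412\t1000\t1411\t1e-50\t210",
    "emu_001_R\tchr1\t71.0\t60\t12\t1\t5\t64\t5000\t5059\t0.002\t38.2",
]
kept = parse_hits(sample)
print([h.query for h in kept])
```

The second sample line is dropped because its e-value (0.002) is above the cutoff.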
Literature examples • Human/Chimpanzee comparison (fosmid clones) (Newman et al., Gen. Res., 2005) • Human/Mouse BAC-end comparisons and repeat element analysis (Zhao et al., Gen. Res., 2001) • Papaya/Arabidopsis thaliana (BAC-end) (Lai et al., Mol. Gen. Genomics, 2006) • Ginseng/Arabidopsis thaliana (BAC-end) (Hong et al., Mol. Gen. Genomics, 2004)
Bioinformatics • BAC library sequences and all Blast results are stored in MySQL databases • Python scripts are used to query the databases, parse the sequences and run the Blast analyses • Results are computed using Python/JMP/RepeatMasker
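A minimal sketch of the store-then-query pattern described above. It uses Python's built-in `sqlite3` module as a stand-in for the MySQL server so that it runs without any setup; the table name `blast_hit`, its columns, and the sample rows are assumptions made for illustration and do not reflect the actual schema.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for the MySQL server.
conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE blast_hit (
           bac_end    TEXT,
           species    TEXT,
           chromosome TEXT,
           hit_length INTEGER,
           identity   REAL,
           evalue     REAL)"""
)

# Made-up rows, for illustration only
rows = [
    ("emu_001_F", "emu", "chr1", 412, 88.5, 1e-50),
    ("emu_001_R", "emu", "chr1", 380, 86.0, 1e-42),
    ("all_007_F", "alligator", "chr2", 120, 79.0, 1e-8),
]
conn.executemany("INSERT INTO blast_hit VALUES (?, ?, ?, ?, ?, ?)", rows)


def hits_per_species(connection):
    """Return a {species: number of stored Blast hits} dictionary."""
    cur = connection.execute(
        "SELECT species, COUNT(*) FROM blast_hit "
        "GROUP BY species ORDER BY species"
    )
    return dict(cur.fetchall())


counts = hits_per_species(conn)
print(counts)
```

Keeping the hits in a database means that per-species summaries become single queries rather than repeated passes over the raw Blast output.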
Blast results
Number of Blast hits per BAC-end sequence • Alligator: 75% have fewer than 19 hits • Turtle: 75% have fewer than 5 hits • Emu: 75% have fewer than 2 hits
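A "75% have fewer than N hits" figure corresponds to the third quartile of the per-sequence hit counts. The sketch below shows one way to compute it with the Python standard library; the function name and the input list are hypothetical, and the interpolation method is an assumption, since the slides do not say how the quartiles were obtained.

```python
from collections import Counter
from statistics import quantiles


def third_quartile_of_hit_counts(hit_queries):
    """Third quartile of the number of hits per BAC-end sequence.

    hit_queries holds one entry (the BAC-end identifier) per Blast hit.
    BAC ends without any hit do not appear in the list and are therefore
    not counted here.
    """
    counts = Counter(hit_queries)
    q1, q2, q3 = quantiles(counts.values(), n=4, method="inclusive")
    return q3


# Made-up hit list: end_a has 3 hits, end_b 1, end_c 2, end_d 1
hit_queries = ["end_a", "end_a", "end_a", "end_b", "end_c", "end_c", "end_d"]
q3 = third_quartile_of_hit_counts(hit_queries)
print(q3)
```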
Length of Blast hits • Distribution of the Blast hit length per species (Alligator, Turtle, Emu) • Similar distributions (median, variance) • In Emu, many more long hits
Identities of the Blast hits • Percentage of identity of the Blast hits per species (Alligator, Turtle, Emu)
Paired Blast results
Definition of a paired BAC-end hit • A paired BAC-end hit is a BAC clone whose two end sequences both have a significant hit on the same chicken chromosome, with the orientation of the BAC clone conserved • A paired BAC-end hit should span a length close to the average BAC clone length
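The definition above can be sketched as a small check. This is an illustration under stated assumptions, not the scripts used for the analysis: the `EndHit` type and the function name are hypothetical, "conserved orientation" is interpreted as the two end hits lying on opposite strands and pointing toward each other, and the 100 to 200 kb window is an arbitrary tolerance around the ~150 to 160 kb BAC length given in the protocol.

```python
from typing import NamedTuple, Optional


class EndHit(NamedTuple):
    chromosome: str
    start: int   # lower coordinate of the hit on the chicken chromosome
    end: int     # upper coordinate of the hit
    strand: str  # "+" or "-"


def paired_hit_length(first: EndHit, second: EndHit,
                      min_len: int = 100_000,
                      max_len: int = 200_000) -> Optional[int]:
    """Return the span of a paired hit, or None if the pair does not qualify."""
    # Both ends must hit the same chicken chromosome.
    if first.chromosome != second.chromosome:
        return None
    # Assumed orientation rule: hits on opposite strands ...
    if first.strand == second.strand:
        return None
    # ... with the "+" hit upstream of the "-" hit (reads face each other).
    plus, minus = (first, second) if first.strand == "+" else (second, first)
    if plus.start >= minus.start:
        return None
    # The span must be close to the BAC clone length (arbitrary tolerance).
    span = minus.end - plus.start + 1
    if not (min_len <= span <= max_len):
        return None
    return span


# Made-up coordinates, for illustration only
fwd = EndHit("chr1", 1_000_000, 1_000_400, "+")
rev = EndHit("chr1", 1_154_600, 1_155_000, "-")
print(paired_hit_length(fwd, rev))
```

A pair on two different chromosomes, on the same strand, or spanning far more or less than a BAC length returns `None`.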
Number of paired Blast hits • Alligator mississippiensis • 63 BAC clones have a paired hit (22,881 in total) • Histogram of the number of hits per BAC clone
Number of paired Blast hits • Chrysemys picta • 60 BAC clones have a paired hit (5,751 in total) • Histogram of the number of hits per BAC clone
Number of paired Blast hits • Dromaius novaehollandiae • 545 BAC clones have a paired hit (44,099 in total) • Histogram of the number of hits per BAC clone
Different types of paired hits • Alligator: 34 “good” paired BAC-end hits
Different types of paired hits • Turtle: 27 “good” paired BAC-end hits
Different types of paired hits • Emu: 479 “good” paired BAC-end hits
Comparison of the length of good paired hits • Distributions of length are significantly different (Van der Waerden test, p<0.0001) • Alligator/Emu: t test, p<<0.0001 • Alligator/Turtle: t test, p<0.0003 • Emu/Turtle: t test, p<<0.0001
Correlation between the chicken paired hit length and the BAC clone length
Mapping and gene content (Alligator, Turtle) • PDZRN4 (PDZ domain containing RING finger 4) • similar to RIKEN cDNA 9430097H08; hypothetical protein MGC28016; similar to RIKEN cDNA D130059P03 gene • SRY (sex determining region Y)-box 5 • EPHA6 (Eph receptor A6) • 144039358-144039558, D26321.1: very conserved across vertebrates • LOC418979 (dynein, cytoplasmic, heavy chain 2); DCUN1D5 (DCN1, defective in cullin neddylation 1, domain containing 5 (S. cerevisiae)) • LOC417920 (similar to PCTAIRE protein kinase 2; serine/threonine-protein kinase PCTAIRE-2; protein kinase cdc2-related PCTAIRE-2) • ODZ4 (odz, odd Oz/ten-m homolog 4 (Drosophila))
Conclusion
Conclusion • Framework very easy to adapt to new libraries • Few syntenies detected between Alligator/Turtle and Chicken; many more with the Emu • Some problems with repeat elements • Very good correlation between the length of the BAC clones and the length of the hits • Possible identification of genes in Emu, Alligator and Turtle
What next? • Looking at the mapping and the gene content in more detail • Validation of new genes in the Emu BAC clones (Dan) • Work on the MHC (Zebra finch with Chris B, Anolis with Ricardo)
Thanks!!!!! • Scott Edwards for the ideas and support • Andy Shedlock for the discussion and the help with repeat elements • You all for helping a newbie in biochemistry look less ignorant