
Progress on the sequencing of tomato chromosome 6



Presentation Transcript


  1. Progress on the sequencing of tomato chromosome 6 • Roeland van Ham, Sander Peters, Taco Jesse, Hans de Jong, Erwin Datema, Rene Klein Lankhorst

  2. Outline • Project overview • Results mapping and FISH • Sequencing status & planning • Annotation (Test BAC annotation)

  3. Dutch tomato chr. 6 sequencing • Centre for BioSystems Genomics (CBSG) • developments 2004-2005: • only a limited number of initial seed BACs anchored (23) • anchoring problems revealed by FISH analysis • reduction in sequencing costs: more BACs can be sequenced with the same budget • novel SOL resources: • additional BAC libraries (MboI and EcoRI) • BAC-end sequences • new genome-wide selection of seed BACs from the Cornell F2.2000 genetic map • ~70 candidate seed BACs for chr. 6

  4. CBSG tomato chr. 6 sequencing goals adjusted: • (originally: draft sequence of 10 Mb) • anchoring of seed BACs over the entire euchromatic part • chromosome walking from seed BACs by STC/AFLP fingerprinting approach • special regions of interest remain: Mi, Cf2/5, Ol1/3 loci • sequence the complete euchromatic part of chr. 6 (~20.5 Mb) • estimated number of BACs: ~215

  5. Dutch tomato chr. 6 sequencing effort • Centre for BioSystems Genomics (CBSG) and associated projects • (EU-)SOL resources • InF1 infrastructure • TS1 mapping • BAC selection (TS3) FISH • TS2 sequencing • InD1 assembly & annotation technology development • bioinformatics projects

  6. Mapping results • Selection of extension BACs: a chromosome walking pilot (Peters et al. Plant Phys, 2006)

  7. Mapping results • Approach (Peters et al. Plant Phys, 2006)

  8. Analysis of candidate extension BACs: BlastN & assembly

  9. Analysis of candidate extension BACs: AFLP and FPC • smallest overlap, largest extension: sequencing pipeline
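The "smallest overlap, largest extension" rule on slide 9 can be sketched as a simple ranking. This is only an illustration: the `Candidate` fields, the BAC names, the numbers and the `min_overlap_kb` threshold are all hypothetical and do not come from the presentation.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    name: str            # hypothetical BAC identifier
    overlap_kb: float    # estimated overlap with the seed BAC
    extension_kb: float  # estimated new sequence beyond the seed BAC


def pick_extension_bac(candidates, min_overlap_kb=2.0):
    """Pick the candidate with the smallest confirmed overlap.

    Candidates below min_overlap_kb (an assumed threshold) are treated as
    unconfirmed and skipped. Ties on overlap go to the largest extension.
    """
    confirmed = [c for c in candidates if c.overlap_kb >= min_overlap_kb]
    if not confirmed:
        return None
    return min(confirmed, key=lambda c: (c.overlap_kb, -c.extension_kb))


# Made-up example values, for illustration only.
candidates = [
    Candidate("bac_A", overlap_kb=35.0, extension_kb=60.0),
    Candidate("bac_B", overlap_kb=12.0, extension_kb=85.0),
    Candidate("bac_C", overlap_kb=12.0, extension_kb=70.0),
    Candidate("bac_D", overlap_kb=0.5, extension_kb=110.0),
]
best = pick_extension_bac(candidates)
print(best.name if best else "no confirmed candidate")
```

Ranking on a tuple keeps the two criteria in one explicit place, so the priority order is easy to change.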

  10. FISH results • TS3 (Ludmila Khrustaleva, Hans de Jong) • 8 out of latest set of 12 seed BACs unambiguously positioned

  11. FISH Results chr. 6 sequencing

  12. FISH pipeline chr. 6 sequencing • list of new candidate seed BACs (AGI) analyzed for: • marker or marker in reliable contig • availability of good BAC-end sequences • presence of repeats in marker or in associated BAC-end sequence • 51 candidates analyzed • 24 new seed BACs selected • currently in FISH pipeline
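The three screening criteria on slide 12 could be expressed as a filter like the one below. How the project actually combined the criteria is not stated on the slide; the sketch assumes all three must pass, and the field names and sample records are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class SeedCandidate:
    name: str                  # hypothetical BAC identifier
    marker_in_contig: bool     # marker on the BAC or in a reliable contig
    good_bac_ends: bool        # good BAC-end sequences available
    repeat_detected: bool      # repeat in the marker or associated BAC-end


def passes_screen(candidate):
    """Assumed rule: marker present, good BAC ends, and no repeats."""
    return (
        candidate.marker_in_contig
        and candidate.good_bac_ends
        and not candidate.repeat_detected
    )


def select_seed_bacs(candidates):
    """Return the names of candidates that pass every criterion."""
    return [c.name for c in candidates if passes_screen(c)]


# Made-up records, for illustration only.
pool = [
    SeedCandidate("cand_1", True, True, False),
    SeedCandidate("cand_2", True, False, False),
    SeedCandidate("cand_3", True, True, True),
    SeedCandidate("cand_4", False, True, False),
]
selected = select_seed_bacs(pool)
print(selected)
```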

  13. FISH pipeline chr. 6 sequencing • multi-FISH experiment in preparation • determine relative physical position of seed BACs • do we have cross seed BAC oceans?

  14. Sequencing status chr. 6 & planning • Results: • BACs finished to Phase 1-2 / 3: 53 / 2 (14 ext. BACs) • BACs in sequencing pipeline: 5 (4 ext. BACs) • ready for sequencing: 3 • new seed BACs in FISH pipeline: 24 • Planning: • data release: 15 BACs at SGN, 24 to be released from August 1st (CBSG partner approval pending) • start phase 3 sequencing (gap closure) in Q4 2006 (EU-SOL) • 454 BAC sequencing pilot underway • extension BAC selection from seed BACs (SNaPshot FP)

  15. CBSG tomato chr. 6 sequencing: Annotation • project connections: • InD1 assembly & annotation technology development • bioinformatics projects: • structural annotation and curation of chr. 6 • functional annotation: Bayesian gene function prediction • alternative splicing • miRNA prediction

  16. Results chr. 6 sequencing • InD1: development of software • TOPAAS: genome assembly & extension BAC selection • Cyrille2: system for automated, high-throughput genome annotation • CBSG genome sequence database

  17. Cyrille2: system overview (1) • end user • databases • annotator & admin • core software • cluster (Linux & Condor) • third-party tools, e.g. BLAST, InterPro, Genscan

  18. Cyrille2: system overview (2) • pipeline database • status database • biological database • end user • annotator & admin • user interface • scheduler • executor • cluster (Linux & Condor) • third-party tools, e.g. BLAST, InterPro, Genscan

  19. Cyrille2: data storage & transport • diagram: upload sequence (s), gene prediction (g), blast (b) • BioMOBY • easy interaction with 3rd party servers • <moby:MOBY> <moby:mobyContent> <moby:mobyData moby:queryID='data'> <moby:Simple> <moby:GenericDnaSequence moby:id="073H08F00068"> <moby:Length>2332</moby:Length> <moby:Sequence> AATCGACGATCTACGTA.... </moby:Sequence> </moby:GenericDnaSequence> .....
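To show how a message shaped like the one on slide 19 might be read, here is a minimal sketch using Python's standard `xml.etree.ElementTree`. The slide's fragment is simplified and truncated, so the example uses its own complete message with a made-up id and sequence. The namespace URI is an assumption, since the slide does not show the `xmlns:moby` declaration.

```python
import xml.etree.ElementTree as ET

# Assumed namespace URI; the slide omits the xmlns declaration.
MOBY_NS = "http://www.biomoby.org/moby"

# Hypothetical, complete message modelled on the slide's simplified layout.
MESSAGE = f"""<moby:MOBY xmlns:moby="{MOBY_NS}">
  <moby:mobyContent>
    <moby:mobyData moby:queryID="data">
      <moby:Simple>
        <moby:GenericDnaSequence moby:id="example_001">
          <moby:Length>12</moby:Length>
          <moby:Sequence>
            AATCGACGATCT
          </moby:Sequence>
        </moby:GenericDnaSequence>
      </moby:Simple>
    </moby:mobyData>
  </moby:mobyContent>
</moby:MOBY>"""


def read_sequences(xml_text):
    """Extract id, declared length and sequence from each DNA sequence node."""
    ns = {"moby": MOBY_NS}
    root = ET.fromstring(xml_text)
    records = []
    for node in root.iterfind(".//moby:GenericDnaSequence", ns):
        raw = node.findtext("moby:Sequence", default="", namespaces=ns)
        length = node.findtext("moby:Length", default="0", namespaces=ns)
        records.append({
            # Prefixed attributes are stored under "{namespace}name" keys.
            "id": node.get(f"{{{MOBY_NS}}}id"),
            "declared_length": int(length),
            "sequence": "".join(raw.split()),  # drop layout whitespace
        })
    return records


records = read_sequences(MESSAGE)
for rec in records:
    # Compare the declared length with the actual sequence length.
    print(rec["id"], rec["declared_length"] == len(rec["sequence"]))
```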

  20. Cyrille2: job execution • diagram labels: biological database; get from database; pipeline database; data; pointer; Cyrille2 core; node; tool wrapper; tool; status database; store in db; pointer; data; biological database; cluster / BioMOBY service; BioMOBY

  21. Cyrille2: BAC annotation pipeline • Ab initio gene predictors: Genscan (Arabidopsis), GlimmerHMM (Arabidopsis), GeneId (Solanaceae), SNAP (Arabidopsis); under development: JIGSAW (consensus gene modelling) • Other feature predictors: Marscan (EMBOSS), Tandem Repeats Finder, RepeatMasker (tomato-specific library), miRNA, InterPro; under development: functional annotation • Transcript datasets (blastn -> Sim4): SGN tomato UniGenes, SGN potato UniGenes, TIGR LeGI TCs, Kazusa microtom UniGenes, GenBank full-length cDNAs (filtered), SGN Coffee UniGenes • Protein datasets (tblastn -> GeneWise): Swiss-Prot Plant, Arabidopsis TAIR6 annotation

  22. Cyrille2: pipeline programming

  23. Cyrille2: pipeline programming genome annotation pipeline miRNA & target prediction pipeline

  24. Cyrille2: summary • fully automated, high-throughput • generic bioinformatics workflow management • modular, extensible • generic tool wrapper module • open communication standard • BioMOBY, access to external services • iterative execution • background execution • automated updating • database independent (GGB / Ensembl) • independent GUI
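The idea of a modular workflow with generic tool wrappers can be illustrated with a toy scheduler. This is not Cyrille2's implementation. It is a minimal sketch that orders steps by their dependencies with the standard-library `graphlib` (Python 3.9+). The step names follow the slide 19 diagram, and the step bodies are trivial stand-ins rather than real gene predictors or BLAST calls.

```python
from graphlib import TopologicalSorter


def run_pipeline(steps, dependencies, data):
    """Run each step once, after all of the steps it depends on.

    steps:        name -> callable taking the dict of results so far
    dependencies: name -> set of step names that must run first
    """
    results = {"input": data}
    order = []
    for name in TopologicalSorter(dependencies).static_order():
        results[name] = steps[name](results)
        order.append(name)
    return results, order


def find_atg(sequence):
    """Stand-in for gene prediction: positions of ATG start codons."""
    return [i for i in range(len(sequence) - 2) if sequence[i:i + 3] == "ATG"]


# Hypothetical wrappers; each reads earlier results and returns its own.
steps = {
    "upload": lambda r: r["input"].upper(),
    "gene_prediction": lambda r: find_atg(r["upload"]),
    "blast": lambda r: len(r["gene_prediction"]),  # stand-in: count hits
}
dependencies = {
    "upload": set(),
    "gene_prediction": {"upload"},
    "blast": {"gene_prediction"},
}

results, order = run_pipeline(steps, dependencies, "ccatgaaatgtt")
print(order)
print(results["gene_prediction"], results["blast"])
```

Because every step sees only the shared results dictionary, a new tool can be added by registering one callable and one dependency entry.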

  25. GGB visualization • tomato and potato genome annotations • storage, access, visualization • http://appliedbioinformatics.wur.nl/cbsg-site • Sept. 1st: public access to released data

  26. GGB visualization

  27. Test BAC annotation Erwin Datema

  28. Conclusions • ~26% (5.5 Mb out of 20.5 Mb) of the euchromatic part draft sequenced • BAC walking strategy successful, continue with SNaPshot FP • start closure sequencing of BACs in Q4 2006 (EU-SOL) • assessment of physical distribution of current set of seed BACs by FISH • improve, deepen and curate structural annotation • integrated in EU-SOL

  29. Acknowledgements • Yuling Bai, Song-Bin Chang, Erwin Datema, Mark Fiers, Mark van Haaren, Jan van Haarst, Marleen Henkens, Thamara Hesselink, Taco Jesse, Hans de Jong, Ludmila Khrustaleva, Pim Lindhout, Bas te Lintel Hekkert, Fien Meijer, Sander Peters, Marjo van Staveren, Willem Stiekema • Keygene NV • PRI, Applied Bioinformatics/Greenomics • WU, Genetics • WU, Plant Breeding • Rene Klein Lankhorst
