
INDIAN INITIATIVE FOR TOMATO GENOME SEQUENCING Tomato Finishing Workshop

T. R. Sharma, National Research Centre on Plant Biotechnology, Indian Agricultural Research Institute, New Delhi 110012. trsharma@nrcpb.org



  1. INDIAN INITIATIVE FOR TOMATO GENOME SEQUENCING. Tomato Finishing Workshop. T. R. Sharma, National Research Centre on Plant Biotechnology, Indian Agricultural Research Institute, New Delhi 110012. trsharma@nrcpb.org

  2. Tomato Genome Sequencing Project. Project partners (labels as shown on the slide): Spain, USA, USA, Italy, France, Japan, India, Netherlands, USA, Korea, China, UK

  3. Sequence Type. Capillary sequencers: ABI-3700, MegaBACE-1000/4000. Collection of DNA sequence data.

  4. Data Flow at NRCPB

  5. Software Developed for Performing HTGS Analysis
     • rename - renames any number of files from the ABI- or MegaBACE-generated format to the St. Louis naming convention
     • fsplit - splits a file containing multiple sequences in FASTA format
     • fmerge - converts multiple FASTA files into a single FASTA file
     • coverage - calculates the depth of coverage of an assembly by the most stringent method
     • extract_reads - extracts all the reads from a particular contig or contigs in an assembly
     • comhits - compares two BLAST outputs stored as text for common hits
     • confasta - converts a file of nucleotide sequences containing numbers and/or blank spaces into a FASTA sequence file for BLAST searches
     • format2xls - converts FASTA sequence files to a tab-delimited format
     • format2fasta - converts a file stored in database format into FASTA format for further analysis
     • prefinish96 - an Excel macro program that arranges templates in alphabetical order, along with their custom primers, in a 96-well format
     • prefinish384 - a similar Excel macro program for template arrangement in a 384-well format
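The tools listed on this slide are not shown in the presentation, so the following is only a minimal Python sketch of what FASTA splitting, merging and tab-delimited conversion might look like. The function names `parse_fasta`, `split_fasta`, `merge_fasta` and `fasta_to_tab` are hypothetical and are not the NRCPB programs themselves.

```python
from typing import Iterator, List, Tuple


def parse_fasta(text: str) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from multi-FASTA text."""
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def split_fasta(text: str) -> List[str]:
    """Return one single-record FASTA string per record (in the spirit of fsplit)."""
    return [f">{h}\n{s}\n" for h, s in parse_fasta(text)]


def merge_fasta(records: List[str]) -> str:
    """Concatenate FASTA strings into one multi-FASTA string (in the spirit of fmerge)."""
    return "".join(r if r.endswith("\n") else r + "\n" for r in records)


def fasta_to_tab(text: str) -> str:
    """Return 'header<TAB>sequence' lines (in the spirit of format2xls)."""
    return "\n".join(f"{h}\t{s}" for h, s in parse_fasta(text))
```

For example, `split_fasta(">r1\nACGT\nAC\n>r2\nGG\n")` returns two strings, one per read, with each sequence joined onto a single line.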

  6. Strategies for closing sequence gaps

  7. Genome Sequence Types Submitted to GenBank
     [Figure: contigs A to H of a clone shown at three stages.
     Phase I: contigs unordered.
     Phase II: contigs ordered, with remaining gaps, a single-clone area and a single-strand area; custom primers marked.
     Phase III: multiple clone coverage on both strands.]

  8. DNA Sequence Finishing

  9. Finishing DNA Sequences
     Finishing is the process of polishing raw sequences, transforming the fragmented rough draft into a long, continuous final product without breaks or errors.
     Goals:
     • Resolve sequence ambiguities and discrepancies, such that the error rate is less than one in 10,000 bases.
     • Provide “double-stranded” coverage for every base:
        • minimum of two different clones
        • two different directions
        • two different chemistries
     • Achieve contiguity.
     • Delineate vector/insert junctions.
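Two of the goals above can be checked numerically. The sketch below assumes that per-base Phred quality values are available, using the standard relation P = 10^(-Q/10), and that each read is described by a hypothetical `(start, end, strand)` tuple. It checks only the strand-direction part of the double-stranded rule, not clone or chemistry diversity, and it is not the method used by the workshop's own tools.

```python
from typing import List, Tuple

Read = Tuple[int, int, str]  # (start, end_exclusive, strand '+' or '-'); hypothetical layout


def expected_error_rate(phred: List[int]) -> float:
    """Mean per-base error probability, using P = 10 ** (-Q / 10)."""
    if not phred:
        raise ValueError("no quality values given")
    return sum(10 ** (-q / 10) for q in phred) / len(phred)


def meets_error_goal(phred: List[int], max_rate: float = 1e-4) -> bool:
    """True if the expected error rate is below one in 10,000 bases."""
    return expected_error_rate(phred) < max_rate


def single_stranded_positions(length: int, reads: List[Read]) -> List[int]:
    """Return consensus positions not covered by reads in both directions."""
    plus = [False] * length
    minus = [False] * length
    for start, end, strand in reads:
        target = plus if strand == "+" else minus
        for i in range(max(start, 0), min(end, length)):
            target[i] = True
    return [i for i in range(length) if not (plus[i] and minus[i])]
```

Positions returned by `single_stranded_positions` are candidates for additional reads in the missing direction.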

  10. Finishing DNA Sequences: How
     • Scan assembly to pick linker clones for Tn Seq
     • Custom oligo dye terminator
     • Reverse dye terminator
     • Special chem (dGTP) reactions
     • Custom oligo for BAC DNA sequencing
     • PCR amplification of problem areas
     Software used: Consed, a graphical tool for viewing and editing sequence assembly data (chromat_dir, phd_dir, edit_dir)

  11. Methods to resolve Seq. Gaps: 1. Transposon method (linker clones)
     • Identify linker clones
     • Perform transposon insertions
     • Transform DH10B cells
     • Pick at least 24 white colonies
     • Prepare template
     • Seq. all the templates
     • Add new Seq. data
     (New England BioLabs)

  12. Methods to resolve Seq. problems: 2. Custom primer method
     • Identify problem areas (poor quality region)
     • Design primers (custom primer)
     • Seq. at least 3 shotgun clones spanning the region, with the same or a different chemistry
     • Add new seq. data
     • Editing
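The first two steps of the custom primer method can be sketched as follows. Everything specific here is an assumption made for illustration: the Q20 threshold, the minimum run length, the 20-base primer length, the 50-base offset, and the function names `low_quality_regions` and `primer_upstream`. The slide does not state how problem areas were scored or how primers were chosen.

```python
from typing import List, Tuple


def low_quality_regions(phred: List[int], min_q: int = 20,
                        min_len: int = 3) -> List[Tuple[int, int]]:
    """Return (start, end_exclusive) runs where quality stays below min_q."""
    regions = []
    start = None
    for i, q in enumerate(phred):
        if q < min_q:
            if start is None:
                start = i          # a low-quality run begins here
        elif start is not None:
            if i - start >= min_len:
                regions.append((start, i))
            start = None
    # Close a run that extends to the end of the consensus.
    if start is not None and len(phred) - start >= min_len:
        regions.append((start, len(phred)))
    return regions


def primer_upstream(consensus: str, region_start: int,
                    primer_len: int = 20, offset: int = 50) -> str:
    """Take a primer that ends `offset` bases before the problem region."""
    end = region_start - offset
    begin = end - primer_len
    if begin < 0:
        raise ValueError("not enough upstream sequence for a primer")
    return consensus[begin:end]
```

A real design step would also screen candidate primers for repeats, secondary structure and melting temperature, which this sketch leaves out.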

  13. Methods to resolve Seq. problems: 3. PCR method
     • Primers on Contig 1 and Contig 2
     • PCR amplification [Figure: gel with marker lane M and lanes 1 to 8; 1 kb band marked]
     • Cleaning of PCR products
     • Seq. of PCR products
     • New reads
     • Joining 2 contigs by PCR

  14. Sequencing Status, IITGS: Phase III = 24, Phase II = 25, Phase I = 10, Library = 9. Total BACs Seq. = 68
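Joining two contigs by PCR needs one primer reading out of the end of Contig 1 and one reading back from the start of Contig 2. The sketch below assumes both contigs are given in the same orientation, with Contig 1 to the left of the gap. It estimates melting temperature with the Wallace rule (2 °C per A/T, 4 °C per G/C), which is only a rough guide for short oligos. The names `gap_primers` and `wallace_tm` and the default lengths are hypothetical.

```python
from typing import Tuple

_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def wallace_tm(primer: str) -> int:
    """Wallace rule estimate: 2 degC per A/T plus 4 degC per G/C."""
    p = primer.upper()
    return 2 * (p.count("A") + p.count("T")) + 4 * (p.count("G") + p.count("C"))


def gap_primers(contig1: str, contig2: str,
                primer_len: int = 20, inset: int = 0) -> Tuple[str, str]:
    """Pick a primer pair that amplifies across the gap between two contigs.

    The forward primer is taken from the 3' end of contig 1; the reverse
    primer is the reverse complement of the 5' start of contig 2.
    `inset` moves both primers away from the contig ends.
    """
    need = primer_len + inset
    if len(contig1) < need or len(contig2) < need:
        raise ValueError("contig too short for the requested primer")
    stop = len(contig1) - inset
    forward = contig1[stop - primer_len:stop]
    reverse = reverse_complement(contig2[inset:inset + primer_len])
    return forward, reverse
```

Setting `inset` above zero is useful when the last bases of a contig are themselves of low quality.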

  15. BAC clones in Phase III (IITGS). Total Seq. = 1.168 Mb

  16. BAC clones in Phase III (IITGS). Total Seq. = 1.283 Mb

  17. BAC clones on other Chromosomes / Redundant BAC Clones. Total Seq. = 631 kb. Total Seq. = 3.082 Mb (equal to 1.168 Mb + 1.283 Mb + 631 kb from slides 15 to 17)

  18. Examples of Problematic Regions

  19. Highly misassembled clone C05SLm0050C14

  20. Aligned region showing single base mismatch in C05SLm0050C14 consensus

  21. Approach to solve the misassembly in C05SLm0050C14
     • Manually re-arranging reads on the basis of:
        • Read-pair information of sub-clones.
        • PCR of different regions within the BAC to reconfirm assembly.
        • Digestion pattern of BAC obtained from six different restriction enzymes.
        • Sequence obtained after assembling individual sub-clones following transposition.
     Current status of C05SLm0050C14: region yet to be resolved

  22. Misassembly C05HBa0089M06

  23. A typical GC rich region
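GC-rich stretches like the one pictured on this slide can be located with a sliding-window GC fraction. The sketch below is an illustration only: the window size, step and threshold are assumed values, and `gc_rich_windows` is a hypothetical helper, not part of the workshop's software.

```python
from typing import List, Tuple


def gc_fraction(seq: str) -> float:
    """Fraction of G and C bases in a sequence (0.0 for an empty string)."""
    s = seq.upper()
    return (s.count("G") + s.count("C")) / len(s) if s else 0.0


def gc_rich_windows(seq: str, window: int = 50, step: int = 10,
                    threshold: float = 0.65) -> List[Tuple[int, float]]:
    """Return (start, gc_fraction) for each full window at or above threshold."""
    hits = []
    for start in range(0, max(len(seq) - window + 1, 0), step):
        frac = gc_fraction(seq[start:start + window])
        if frac >= threshold:
            hits.append((start, frac))
    return hits
```

Windows flagged this way would be natural candidates for the special chemistry (dGTP) reactions listed on slide 10.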

  24. ACKNOWLEDGEMENTS: All members of the Indian Tomato Genome Sequencing Group, and DBT for financial assistance
