Assembling a shotgun sequenced BAC clone from Anopheles funestus genome

By Irene Kasumba, Faruck Morcos, and Jeffrey Spies. Bioinformatics Computing, University of Notre Dame. Goal of project: gene annotation of a BAC clone from the newly sequenced An. funestus genome.

Presentation Transcript


  1. Assembling a shotgun sequenced BAC clone from Anopheles funestus genome, by Irene Kasumba, Faruck Morcos, and Jeffrey Spies. Bioinformatics Computing, University of Notre Dame

  2. Goal of Project: Gene annotation of a BAC clone from the newly sequenced An. funestus genome.

  3. Genetic engineering/recombinant DNA technology: methods developed to study genes in detail. GENE CLONING: isolating a gene and producing many identical copies of it so that it can be studied in detail. CLONE GENE INTO A VECTOR.

  4. Vector • A vehicle to transport DNA into a host cell (bacteria) and replicate the DNA there. • E.g. plasmids and bacteriophages; plasmids occur naturally as circular DNA in bacteria. • Vectors have: • An origin of replication • An antibiotic resistance gene • A selectable marker

  5. Cloning and Transformation

  6. BAC Clone Assembly: The original DNA is a 150 kb BAC clone (1 contig), which is too big to be sequenced in one pass. Break the BAC into random fragments (8-10x coverage).

  7. BAC Clone Assembly: (1) Fragments of differing size (2-3 kb) are subcloned into a vector. (2) Recombinant vector DNA is isolated from bacteria. (3) 600 bp from each end is sequenced. In total, about 1760 clones were sequenced from the BAC clone. (Slightly modified from Neil Lobo ppt.)
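The coverage figure on slides 6 and 7 can be checked with simple arithmetic: average depth is total sequenced bases divided by target length. The sketch below uses only the numbers on the slides (150 kb, 600 bp reads, about 1760 clones). Whether each clone contributed one or two usable reads is an assumption, since slide 7 mentions both ends while slide 9 mentions a FASTA file with 1760 sequences, so both cases are shown. The function name is illustrative.

```python
def raw_coverage(n_clones, reads_per_clone, read_len_bp, target_len_bp):
    """Average sequencing depth c = (number of reads * read length) / target length."""
    total_bases = n_clones * reads_per_clone * read_len_bp
    return total_bases / target_len_bp


if __name__ == "__main__":
    bac_len = 150_000   # 150 kb BAC clone (slide 6)
    read_len = 600      # bp sequenced per end (slide 7)
    clones = 1760       # approximate clone count (slide 7)

    # Assumption: either one or two usable reads per clone.
    for reads_per_clone in (1, 2):
        c = raw_coverage(clones, reads_per_clone, read_len, bac_len)
        print(f"{reads_per_clone} read(s) per clone: {c:.2f}x raw coverage")
```

With these inputs the two cases come out at roughly 7x and 14x, which brackets the stated 8-10x. Usable coverage after vector clipping and failed reads would presumably be lower than the raw figure.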

  8. Sequence using plasmid-specific primers: forward primer and reverse primer on the plasmid vector pHos2. (Slightly modified from Neil Lobo ppt.)

  9. Step 1. Clip vector sequence from fragments: Obtained a FASTA file with 1760 sequences. Clip the vector sequence using PHRAP or local alignment. (Slightly modified from Neil Lobo ppt.)
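The "local alignment" option on slide 9 can be illustrated with a small Smith-Waterman sketch: align each read against the vector sequence and mask the aligned part of the read with X. This is a toy illustration, not the method used in the project. The scoring values, the `min_score` threshold, and the example sequences are all assumptions, and a real pipeline would more likely rely on a dedicated screening tool such as cross_match, which is distributed alongside phrap.

```python
def smith_waterman(read, vector, match=2, mismatch=-1, gap=-2):
    """Best local alignment of vector against read.

    Returns (score, start, end) where read[start:end] is the aligned part of the read.
    """
    rows, cols = len(read) + 1, len(vector) + 1
    H = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)
    for i in range(1, rows):
        for j in range(1, cols):
            s = match if read[i - 1] == vector[j - 1] else mismatch
            H[i][j] = max(0, H[i - 1][j - 1] + s, H[i - 1][j] + gap, H[i][j - 1] + gap)
            if H[i][j] > best:
                best, best_pos = H[i][j], (i, j)
    # Trace back from the best cell until the score drops to zero.
    i, j = best_pos
    end = i
    while i > 0 and j > 0 and H[i][j] > 0:
        s = match if read[i - 1] == vector[j - 1] else mismatch
        if H[i][j] == H[i - 1][j - 1] + s:
            i, j = i - 1, j - 1
        elif H[i][j] == H[i - 1][j] + gap:
            i -= 1
        else:
            j -= 1
    return best, i, end


def mask_vector(read, vector, min_score=20):
    """Replace the vector-like part of a read with X; leave the read alone below min_score."""
    score, start, end = smith_waterman(read, vector)
    if score < min_score:
        return read
    return read[:start] + "X" * (end - start) + read[end:]


if __name__ == "__main__":
    toy_vector = "GATTACAGATTACA"      # hypothetical vector fragment, not pHos2
    toy_insert = "TTTTGGGGCCCC"        # hypothetical insert sequence
    print(mask_vector(toy_vector + toy_insert, toy_vector))
    print(mask_vector(toy_insert, toy_vector))
```

Masking rather than deleting keeps read coordinates unchanged, which is a design choice of this sketch. The quadratic dynamic-programming table is fine for 600 bp reads but would be slow against a long vector.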

  10. Step 2. Assemble sequence fragments. Tool used: PHRAP

  11. Step 3. Blast assembled sequence • Purpose: • Select the actual An. funestus sequences • How: • Blast (nr) all assembled sequences and eliminate non-mosquito sequences (i.e. human, vector, bacteria, etc.) • Which is An. funestus? Possibly a contig with no known Blast hit, and probably the longest sequence because of the 8 to 10x coverage
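The selection rule on slide 11 (drop contigs whose best hit is human, vector, or bacterial, then prefer the longest remaining contig) can be sketched as below. Everything here is hypothetical: the keyword list, the contig names, and the hit descriptions are made-up stand-ins for whatever the actual Blast reports contained, and the best-hit descriptions are assumed to have been collected by hand or by a separate parser.

```python
# Assumed contaminant keywords; adjust to the hits actually observed.
CONTAMINANT_KEYWORDS = ("homo sapiens", "cloning vector", "escherichia coli")


def candidate_contigs(contigs, top_hits, contaminants=CONTAMINANT_KEYWORDS):
    """Return contig names that survive contaminant filtering, longest first.

    contigs:  {name: sequence}
    top_hits: {name: description of the best Blast hit, or None if no hit}
    """
    kept = {}
    for name, seq in contigs.items():
        hit = (top_hits.get(name) or "").lower()
        if any(keyword in hit for keyword in contaminants):
            continue  # best hit looks like contamination
        kept[name] = seq
    return sorted(kept, key=lambda n: len(kept[n]), reverse=True)


if __name__ == "__main__":
    # Toy data for illustration only.
    contigs = {"ctg1": "A" * 500, "ctg2": "A" * 3000, "ctg3": "A" * 92604, "ctg4": "A" * 1000}
    top_hits = {
        "ctg1": "Cloning vector, complete sequence",
        "ctg2": "Homo sapiens chromosome 1 clone",
        "ctg3": None,  # no known hit
        "ctg4": "Anopheles gambiae genomic sequence",
    }
    print(candidate_contigs(contigs, top_hits))
```

A contig with no hit at all is kept on purpose, matching the slide's remark that the An. funestus sequence may be the one Blast does not recognise.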

  12. Step 4. Gene prediction • GENSCAN • http://genes.mit.edu/GENSCAN.html • Change "Print options" to "Predicted CDS and peptides" • Fgenesh • http://www.softberry.com/berry.phtml • Select human, Drosophila and An. gambiae • GeneID • http://www1.imim.es/geneid.html • Select human and Drosophila

  13. GENSCAN

  14. Fgenesh

  15. GeneID

  16. GeneID output (excerpt):
      ## source-version: geneid v 1.2 -- geneid@imim.es
      # Sequence AF1B_consensus_seq10_ctg3 - Length = 92604 bps
      # Optimal Gene Structure. 15 genes. Score = 66.16
      # Gene 1 (Reverse). 1 exons. 78 aa. Score = 0.58
      AF1B_consensus_seq10_ctg3 geneid_v1.2 Single 1308 1541 0.58 - 0 AF1B_consensus_seq10_ctg3_1
      # Gene 2 (Forward). 3 exons. 162 aa. Score = 0.96
      AF1B_consensus_seq10_ctg3 geneid_v1.2 First 2471 2684 -2.23 + 0 AF1B_consensus_seq10_ctg3_2
      AF1B_consensus_seq10_ctg3 geneid_v1.2 Internal 4590 4803 3.53 + 2 AF1B_consensus_seq10_ctg3_2
      AF1B_consensus_seq10_ctg3 geneid_v1.2 Terminal 9949 10006 -0.33 + 1 AF1B_consensus_seq10_ctg3_2
      # Gene 3 (Forward). 3 exons. 297 aa. Score = 5.97
      AF1B_consensus_seq10_ctg3 geneid_v1.2 First 11182 11564 4.65 + 0 AF1B_consensus_seq10_ctg3_3
      AF1B_consensus_seq10_ctg3 geneid_v1.2 Internal 15006 15360 0.25 + 1 AF1B_consensus_seq10_ctg3_3
      AF1B_consensus_seq10_ctg3 geneid_v1.2 Terminal 15421 15573 1.08 + 0 AF1B_consensus_seq10_ctg3_3
      # Gene 4 (Reverse). 5 exons. 314 aa. Score = 5.72
      AF1B_consensus_seq10_ctg3 geneid_v1.2 Terminal 22289 22526 3.12 - 1 AF1B_consensus_seq10_ctg3_4
      AF1B_consensus_seq10_ctg3 geneid_v1.2 Internal 23735 23882 -0.43 - 2 AF1B_consensus_seq10_ctg3_4
      AF1B_consensus_seq10_ctg3 geneid_v1.2 Internal 31511 31568 1.38 - 0 AF1B_consensus_seq10_ctg3_4
      AF1B_consensus_seq10_ctg3 geneid_v1.2 Internal 37306 37576 2.40 - 1 AF1B_consensus_seq10_ctg3_4
      AF1B_consensus_seq10_ctg3 geneid_v1.2 First 39378 39604 -0.74 - 0 AF1B_consensus_seq10_ctg3_4
      # Gene 5 (Forward). 2 exons. 133 aa. Score = 2.22
      AF1B_consensus_seq10_ctg3 geneid_v1.2 First 40892 41118 1.49 + 0 AF1B_consensus_seq10_ctg3_5
      ...
      # Gene 15 (Reverse). 1 exons. 42 aa. Score = 0.47
      AF1B_consensus_seq10_ctg3 geneid_v1.2 Terminal 91952 92077 0.47 - 0 AF1B_consensus_seq10_ctg3_15
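The GeneID listing on slide 16 is whitespace-separated, one exon per line, so it is straightforward to parse. The sketch below assumes the nine columns follow the usual GFF-style order (sequence, source, exon type, start, end, score, strand, frame, gene id), which is how the excerpt reads. The names `Exon`, `parse_geneid`, and `coding_length` are illustrative and are not taken from the authors' parser.

```python
from collections import namedtuple

Exon = namedtuple("Exon", "seqid source kind start end score strand frame gene")


def parse_geneid(text):
    """Parse geneid exon lines; comment lines (starting with '#') and odd lines are skipped."""
    exons = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        f = line.split()
        if len(f) != 9:
            continue  # not an exon record in the assumed 9-column layout
        exons.append(Exon(f[0], f[1], f[2], int(f[3]), int(f[4]),
                          float(f[5]), f[6], int(f[7]), f[8]))
    return exons


def coding_length(exons, gene):
    """Total exon length in bp for one gene (coordinates are 1-based, inclusive)."""
    return sum(e.end - e.start + 1 for e in exons if e.gene == gene)


if __name__ == "__main__":
    # Gene 3 lines copied from the slide.
    sample = """
# Gene 3 (Forward). 3 exons. 297 aa. Score = 5.97
AF1B_consensus_seq10_ctg3 geneid_v1.2 First 11182 11564 4.65 + 0 AF1B_consensus_seq10_ctg3_3
AF1B_consensus_seq10_ctg3 geneid_v1.2 Internal 15006 15360 0.25 + 1 AF1B_consensus_seq10_ctg3_3
AF1B_consensus_seq10_ctg3 geneid_v1.2 Terminal 15421 15573 1.08 + 0 AF1B_consensus_seq10_ctg3_3
"""
    exons = parse_geneid(sample)
    bp = coding_length(exons, "AF1B_consensus_seq10_ctg3_3")
    print(bp, "bp,", bp // 3, "codons")
```

For gene 3 the exon lengths sum to 891 bp, that is 297 codons, which agrees with the "297 aa" in the header line. The same check is a quick way to catch run-together coordinates in a copied listing.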

  17. Step 5. Visualize overlap and select best predictions: Use WormBase to visualize the overlap between predictions made by the different gene prediction programs: http://wormbase.org/db/seq/frend Parser: http://www.nd.edu/~jspies/bio/
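The idea behind comparing programs, that a prediction is more credible when another program predicts a gene in the same place, can also be expressed as a simple interval check. This is a minimal sketch and not the parser linked on the slide. The GeneID span is gene 3 from slide 16, while the GENSCAN and Fgenesh coordinates are invented for illustration. Strand is ignored here, which a real comparison would need to handle.

```python
def overlap_bp(a, b):
    """Overlap in bp between two (start, end) intervals, 1-based and inclusive."""
    return max(0, min(a[1], b[1]) - max(a[0], b[0]) + 1)


def consensus_predictions(predictions, min_programs=2):
    """Return (program, interval, supporting programs) for genes seen by >= min_programs.

    predictions: {program name: [(start, end), ...]}
    """
    results = []
    for prog, genes in predictions.items():
        for gene in genes:
            supporters = {prog}
            for other, other_genes in predictions.items():
                if other != prog and any(overlap_bp(gene, o) > 0 for o in other_genes):
                    supporters.add(other)
            if len(supporters) >= min_programs:
                results.append((prog, gene, sorted(supporters)))
    return results


if __name__ == "__main__":
    predictions = {
        "GeneID": [(11182, 15573)],                     # gene 3 span from slide 16
        "GENSCAN": [(11000, 15600), (50000, 52000)],    # hypothetical coordinates
        "Fgenesh": [(70000, 71000)],                    # hypothetical coordinates
    }
    for prog, gene, supporters in consensus_predictions(predictions):
        print(prog, gene, supporters)
```

Requiring any overlap at all is a loose criterion. A stricter version might demand a minimum overlap fraction or matching exon boundaries.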

  18. WormBase Visualization

  19. Step 6. Select "best" predictions

  20. Step 7. Blast predictions • Use Ensembl and NCBI • Blast proteins • nr, Drosophila, An. gambiae • Use conservative scoring matrices (BLOSUM90) for within-species Ensembl Blasts

  21. Gene Identity Determination: Determine the identity/putative function of predicted genes in order to annotate possible genes in An. funestus

  22. Predicted Gene 12 Ensembl

  23. Ensembl (Dr)

  24. Ensembl Chromosome View (Dr)

  25. Ensembl (Ag)

  26. Ensembl Chromosome View (Ag)

  27. Blast Conserved Domains: Unknown but predicted gene. Hit: gnl|CDD|16610 pfam00078, RVT, Reverse transcriptase (RNA-dependent DNA polymerase).

  28. 3D Structure of RVT

  29. Blast Hits (hit, score, E-value):
      gi|51950578|gb|AAA70222.2| putative ORF2 [Drosophila melanogaste   263   6e-68
      gi|6635955|gb|AAF20019.1| pol-like protein [Aedes aegypti]   261   1e-67
      gi|11323019|emb|CAC16871.1| pol [Drosophila melanogaster]   251   2e-64

  30. Conclusions • Bioinformatics tools are important for predicting and annotating genes in a newly sequenced genome (e.g. An. funestus) • It is imperative to perform gene prediction with several programs, since agreement between them provides more credible biological insight

  31. Thanks to Neil Lobo.