Join us for a guided tour of DNA Subway and learn to analyze genetic data of plants and animals. Includes hands-on practice and post-lab worksheet. Get ready to dive into the world of DNA barcoding!
DNA Barcoding Analysis in DNA Subway Genus? species? DNA Subway Blue Line analysis of plant sample *Note: all DNA Subway images used with permission from the Cold Spring Harbor Labs DNA Learning Center https://www.dnalc.org/ and DNA Subway https://dnasubway.cyverse.org/
Recap: Analyze sequence data using DNA Subway. Handy video resources: https://www.youtube.com/watch?v=7WF--Ba2P10 https://www.youtube.com/watch?v=LNkv_UBGIZw Image credit: Centre for Biodiversity Genomics
Today • Guided tour of DNA Subway • Analyze a practice data set as a class • Analyze your own sequence data working individually • Record main results in your Canvas lab notebook • Answer questions in postlab worksheet Hyman et al. CURE-all: DNA barcoding in introductory biology
Obtain Practice DNA files from instructor • These two files (forward read and reverse read) represent a single specimen. Why are there two files for 1 specimen? [Figure: forward and reverse reads, each with 5’ and 3’ ends labeled, covering the region we want to Barcode; Genus? species?]
DNA Subway Blue Line Sequence Analysis • Sign in to your account and click on the Blue Square to start a Barcoding project. • Choose a new rbcL Barcoding project • Name project “last name_specimen type_BIO140_Sec_????” • Upload both AB1 practice sequence files that you got from Canvas – hold down the Command key (Mac) or Control key (PC) to select both files at the same time • What kind of “project type” would we start if we were analyzing invertebrate data? COI [Figure labels: plant, animal, prokaryote, fungi]
The Blue Line • “Stops” in the Blue Line are streamlined for easy analysis of DNA sequences • 3 “main” stops, each with sub stops
First Stop: Sequence Assembly Goal: clean up your sequence data to make the longest, most accurate sequence possible • 4 main steps: • View sequence • Trim Sequence • Pair F and R Sequences • Build Consensus
Step 1: Sequence Viewer Goal: Does my sequence look “good”?
Step 1: Sequence Viewer How long is our forward read? • Chromatogram (also called trace file) view gives you: • Base call (top) • Quality score: blue line (20) is threshold for “good” quality • Sequence length (intervals of 10 base pairs) • Fluorescence trace indicating which ddNTP was incorporated
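To make the quality threshold concrete, here is a minimal sketch that assumes the scores shown in the viewer are standard Phred scores, where Q = -10 · log10(P) and P is the probability that the base call is wrong. The function name is hypothetical and is not part of DNA Subway.

```python
def phred_to_error_probability(q: float) -> float:
    """Convert a Phred quality score into the probability that the base call is wrong."""
    return 10 ** (-q / 10)


# Under this assumption, a score of 20 means about a 1-in-100 chance of a wrong call.
for q in (10, 20, 30):
    p = phred_to_error_probability(q)
    print(f"Q{q}: error probability {p:.3f}, accuracy {100 * (1 - p):.1f}%")
```

Read this way, the blue line at 20 marks base calls that are expected to be correct about 99% of the time.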
What does good sequence look like? Great Sequence Good Sequence Bad Sequence Horrible Sequence
How do I know if my sequence is high or low quality? BAD GOOD
Step 2: Sequence Trimmer • Goal: Trim off low quality base calls from 5’ and 3’ ends of reads
Step 1: Sequence Viewer Goal: Does my sequence look good? • Extreme 5’ and 3’ ends of sequence are low quality (Ns) [Figure: forward and reverse reads with 5’ and 3’ ends labeled]
Step 2: Sequence Trimmer • Trims all low quality base calls off of 5’ and 3’ ends of reads
Step 2: Sequence Trimmer • How long is each of our trimmed sequences? Forward ~540 bp Reverse ~550 bp
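The idea behind end trimming can be sketched in a few lines. This is only an illustration under a simple assumed rule (drop bases from each end until one meets the quality threshold). It is not DNA Subway's actual trimming algorithm, and the function name and toy data are hypothetical.

```python
def trim_low_quality_ends(bases: str, quals: list[int], threshold: int = 20) -> str:
    """Remove bases from the 5' and 3' ends until a base at or above the threshold is found."""
    if len(bases) != len(quals):
        raise ValueError("bases and quals must be the same length")
    start = 0
    while start < len(quals) and quals[start] < threshold:
        start += 1
    end = len(quals)
    while end > start and quals[end - 1] < threshold:
        end -= 1
    return bases[start:end]


# Toy read: the two bases at each end are low-quality Ns.
read = "NNACGTACGTNN"
scores = [5, 8, 30, 35, 40, 12, 38, 40, 33, 25, 9, 4]
print(trim_low_quality_ends(read, scores))  # prints ACGTACGT
```

Note that the low-scoring base in the middle of the toy read is kept: this sketch only trims the ends, which is where the slides say the poor calls concentrate.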
Step 3: Pair Builder • Goal: link and align your forward and reverse reads [Figure: forward read and reverse read with 5’ and 3’ ends labeled]
Step 3: Pair Builder • For the Reverse sequence (with an R in the name), click on the letter ‘F’ to change the sequence to Reverse orientation • Check the boxes to the right of the letters to select F & R pairs • Save your pairs
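Why does the reverse read need to be flipped before pairing? It was sequenced from the opposite strand, so it must be reverse-complemented before it can line up with the forward read. A minimal sketch of that operation (the function name is hypothetical, and this is not DNA Subway's own code):

```python
# Translation table mapping each base to its complement; N (unknown) stays N.
COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq: str) -> str:
    """Return the sequence of the opposite strand, read 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


print(reverse_complement("ATGCCN"))  # prints NGGCAT
```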
Step 4: Consensus Builder • Goal: combine F and R pairs to create a longer, more accurate consensus sequence [Figure: forward read and reverse read overlapping, with the consensus spanning both]
Step 4: Consensus Builder • Combines the reads that you manually designated as F and R pairs, creating a consensus sequence • How long is our consensus? How much longer is this than our longest trimmed read? 582 − 550 = 32 bp • Extra information in our Barcode! Increased accuracy
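The length arithmetic can be checked with a short sketch. It assumes the consensus is a simple merge of the two trimmed reads, so that consensus length = forward length + reverse length − overlap, and it uses the approximate read lengths from the practice data (~540 bp and ~550 bp). The function names are hypothetical.

```python
def overlap_length(forward_len: int, reverse_len: int, consensus_len: int) -> int:
    """Number of positions covered by both reads, assuming a simple merge."""
    return forward_len + reverse_len - consensus_len


def gain_over_longest_read(forward_len: int, reverse_len: int, consensus_len: int) -> int:
    """Extra bases in the consensus compared with the longer of the two reads."""
    return consensus_len - max(forward_len, reverse_len)


print(gain_over_longest_read(540, 550, 582))  # 32 bp of extra barcode
print(overlap_length(540, 550, 582))          # roughly 508 bp covered by both reads
```

Under these assumptions most of the consensus is covered by both reads, which is where the increased accuracy comes from: each base in the overlap has been called twice.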
Step 4: Consensus Editor • Can trim your consensus sequence (but don’t!) • Can edit the Name (I recommend doing this) – click on Save to keep the new name
1st Stop Recap: Sequence Assembly Goal: make the longest, most accurate sequence • 4 main steps: • View DNA sequence(s) – allowed us to visualize the accuracy of each base call in our “reads” • Trim Sequence – got rid of “bad calls” • Pair Sequences – linked forward and reverse reads • Build Consensus – compared and combined forward and reverse reads to get longer, more accurate DNA sequence [Figure: forward read + reverse read → consensus = longest, most accurate barcode]
2nd Stop: Adding Sequences Goal: Find DNA sequences that closely match your barcode and select some to build a tree • What do you need to build a phylogenetic tree from barcodes? • Known barcodes of closely related species • Known barcodes of distantly related species for comparison (outgroup)
Step 1: BLASTN Goal: Find known DNA sequences that most closely match your barcode BLAST searches your consensus sequence against a national database of known sequences (it’s like a Google search for DNA sequences)
Step 1: BLASTN • BLAST searches your consensus sequence against a huge database of known sequences • Click on BLAST to start the search, then results will pop up
Step 1: BLASTN Results (“Hits”) • Each sample will result in a list of database hits ranked by Bit Score • Score is calculated based on: • alignment length (length of sequence used to make match) • mismatches (# of mismatched bases between query and hit) • Expect value (E) is the number of matches at least this good that we would expect to find by chance in a database of this size; the closer E is to zero, the less likely it is that the match between our query sequence and the hit happened by chance
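The link between Bit Score and E-value can be sketched with the standard BLAST relationship E = m · n · 2^(−S′), where S′ is the bit score, m is the query length, and n is the database length. The sketch below ignores the effective-length corrections that BLAST applies, and the lengths and scores used are hypothetical values chosen only for illustration.

```python
def expect_value(bit_score: float, query_len: float, database_len: float) -> float:
    """Approximate BLAST E-value from a bit score: E = m * n * 2**(-S')."""
    return query_len * database_len * 2.0 ** (-bit_score)


# Hypothetical numbers: a 582 bp query against a database of one billion bases.
for score in (40, 100, 1000):
    print(f"bit score {score}: E = {expect_value(score, 582, 1e9):.3g}")
```

Each additional bit of score halves the E-value, which is why long, near-perfect barcode matches report E-values that are effectively zero.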
Step 1: BLASTN Results • A couple of rules of thumb • If the (# of mismatches/Aln. Length) > 1%, this often means the match is a different species (though this is highly variable across taxonomic groups and hotly debated) • If your Aln. Length is less than 400 bp, your data are often not reliable enough to ID at the species level (though this is highly variable across taxonomic groups and hotly debated)
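These two rules of thumb can be written as a small check. The sketch below simply encodes the 1% and 400 bp cut-offs from the slide; the function name, the returned keys, and the example mismatch count are hypothetical, and the cut-offs remain rough guides rather than firm rules.

```python
def assess_hit(mismatches: int, aln_length: int,
               max_mismatch_fraction: float = 0.01,
               min_aln_length: int = 400) -> dict:
    """Apply the mismatch-percentage and alignment-length rules of thumb to one BLAST hit."""
    if aln_length <= 0:
        raise ValueError("aln_length must be positive")
    fraction = mismatches / aln_length
    return {
        "mismatch_percent": 100 * fraction,
        "within_mismatch_cutoff": fraction <= max_mismatch_fraction,
        "long_enough_for_species_id": aln_length >= min_aln_length,
    }


# Hypothetical hit: 2 mismatches over a 562 bp alignment.
print(assess_hit(mismatches=2, aln_length=562))
```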
Step 1: BLASTN Results
Step 1: BLASTN Results • Select the top 7 taxonomically unique (different species!) BLAST hits. • Select “Add BLAST hits to project” • You can’t choose again, so choose carefully!
Step 2: Reference Data Goal: Find known DNA sequences that can serve as an outgroup
Step 2: Reference Data • Select reference group you want (Common plants in this case) • Select “Add ref data”
3rd Stop: Analyzing Sequences Goal: Choose DNA sequences to compare. • Create a “Multiple sequence alignment” to visualize their relationships • Create phylogenetic trees of these sequences to visualize their relationships • Helps to ID your specimen
Analyzing Sequences: 1st Step Select Data to compare
Analyzing Sequences: 1st Step • Select samples to compare DNA Barcodes of related taxa • select 1 of your samples (from user data; this is the unknown sample) • select the BLASTN hits that you added to the project • select a single outgroup from your reference data (e.g., Ginkgo) • Save Selections
2nd Step: Sequence Alignment (MUSCLE) • MUSCLE algorithm will align the sequences you selected. Click MUSCLE • Click on MUSCLE again to view the alignment
2nd Step: MUSCLE DNA Barcode Alignments • View DNA Barcodes of your sample compared to similar taxa • Colored vertical lines indicate ATCG polymorphisms (differences) between Barcodes • Trim alignments to cut off 5’ and 3’ ends that don’t line up
2nd Step: MUSCLE DNA Barcode Alignments • Hit ATCG button in upper left to see letter codes for nucleotides • Use controls in upper left to zoom in/out or down to nucleotide level • Colored vertical lines or letters indicate ATCG polymorphisms between Barcodes. How many polymorphisms at the site marked with the arrow? 2
2nd Step: MUSCLE DNA Barcode Alignments • Colored vertical lines indicate ATCG polymorphisms between Barcodes Which taxon is least similar to ours? Which taxon is most similar to our unknown? Sequence similarity matrix can help!
2nd Step: MUSCLE DNA Barcode Alignments Which taxon is most divergent? #1 Ginkgo Which taxon is most similar? #9 Dicentra formosa
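What the sequence similarity matrix reports can be illustrated with a short sketch: for each pair of aligned sequences, count the columns where the bases match. The sketch assumes the sequences are already aligned to equal length and skips any column with a gap ("-"). DNA Subway's exact calculation may differ, and the toy sequences and names are hypothetical.

```python
def percent_identity(a: str, b: str) -> float:
    """Percent of gap-free alignment columns in which two aligned sequences match."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must be the same length")
    compared = matches = 0
    for x, y in zip(a, b):
        if x == "-" or y == "-":
            continue  # skip columns where either sequence has a gap
        compared += 1
        if x == y:
            matches += 1
    return 100 * matches / compared if compared else 0.0


# Hypothetical toy alignment (real barcodes are hundreds of bases long).
aligned = {
    "unknown":        "ACGTACGTAC",
    "close_relative": "ACGTACGTAT",
    "outgroup":       "ACTTAGGTTT",
}
scores = {name: percent_identity(aligned["unknown"], seq)
          for name, seq in aligned.items() if name != "unknown"}
print(scores)
print("most similar:", max(scores, key=scores.get))
print("most divergent:", min(scores, key=scores.get))
```

The taxon with the lowest percentage against the unknown is the most divergent one, which makes it a natural candidate for the outgroup.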
Last Step: Phylogenetic Trees • Build Neighbor joining (NJ) or Maximum Likelihood (ML) trees to visualize sequence relationships between selected samples • NJ trees are simpler and faster to build but generally less accurate • ML trees are more complex to build but generally more accurate
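To give a feel for why NJ is the "simpler" method, here is a sketch of its first step only: from a matrix of pairwise distances, compute Q(i, j) = (n − 2) · d(i, j) − Σ d(i, k) − Σ d(j, k) and join the pair with the smallest Q. This is a textbook illustration, not the PHYLIP implementation that DNA Subway runs, and the distance matrix is a hypothetical five-taxon example.

```python
def nj_q_matrix(dist: list[list[float]]) -> list[list[float]]:
    """Compute the neighbor-joining Q matrix from a symmetric distance matrix."""
    n = len(dist)
    row_sums = [sum(row) for row in dist]
    return [
        [0 if i == j else (n - 2) * dist[i][j] - row_sums[i] - row_sums[j]
         for j in range(n)]
        for i in range(n)
    ]


def first_pair_to_join(dist: list[list[float]]) -> tuple[int, int]:
    """Return the indices of the two taxa with the lowest Q value."""
    q = nj_q_matrix(dist)
    n = len(dist)
    _, i, j = min((q[i][j], i, j) for i in range(n) for j in range(i + 1, n))
    return i, j


# Hypothetical distances between five taxa (a, b, c, d, e).
distances = [
    [0, 5, 9, 9, 8],
    [5, 0, 10, 10, 9],
    [9, 10, 0, 8, 7],
    [9, 10, 8, 0, 3],
    [8, 9, 7, 3, 0],
]
print(first_pair_to_join(distances))  # (0, 1): taxa a and b are joined first
```

A full NJ run repeats this step, replacing each joined pair with a new internal node until the tree is complete. ML methods instead score candidate trees under an explicit model of sequence evolution, which is more work but generally gives more accurate trees.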
Last Step: NJ Phylogenetic Tree • Click on PHYLIP NJ • Select outgroup from dropdown Which species is most closely related to our unknown? #9 Dicentra formosa How did we know to use Ginkgo as the outgroup? It was the least similar in our % similarity matrix
Last Step: ML Phylogenetic Tree (click on this) Which species is most closely related to our unknown? Dicentra formosa Which species is the least related? Ginkgo • Horizontal lines represent sequence differences (evolutionary change) • Internal nodes represent a common ancestor shared by taxa on tips • The longer the horizontal branch between internal nodes and the tips, the larger the amount of evolutionary change
So what is our unknown species? Dicentra formosa, Lamprocapnos spectabilis, Dicentra eximia What evidence do we have? Our Unknown: • %S.S. – Dicentra formosa • BLAST – Dicentra formosa • NJ – Dicentra formosa • ML – Dicentra formosa • % sequence similarity is over 99% (beats 1% cut-off) • Aln. Length is 562 bp (beats 400 bp cut-off) • Google search: looks like the thing in the photo • Wikipedia: range – D. eximia is the species native to VA. D. formosa is west coast, but commonly used in gardens! D. formosa is also commonly confused with D. eximia, and often sold under that name.
Follow Directions on Pg. 2 • Download your F + R sequences from Canvas (ab1 files) • If you don’t have sequence data, choose one from the “Unknowns” link on Canvas • Be sure to look at the photo in the folder to know what primer set to choose • Create a Blue Line project and analyze your data using the instructions from the lab manual • Fill in the required information in your Canvas notebook • Take screenshots for lab notebooks • Submit a lab notebook entry as individuals – due at the end of class today! • Answer postlab questions in the lab manual – due at the end of class today!