Locating markers from the genetic to the physical rice map
Review: Linkage map of molecular markers (Genetic maps)
• Shows order & map distance of various molecular markers along a linkage group (chromosome)
• Generated by linkage analysis
[Figure labels: SSR marker, RFLP marker]
Contig Assembly
• A physical map of the rice genome that is aligned with genetic maps is available
[Figure labels: “gap”, 1 contig, Genetic map (Cornell), Physical map]
Terms and facts about the physical map
• Chromosome = assembly of contigs
• Contig = assembly of clones
• Clone – the only physical entity (BAC or PAC clone)
  • A very small sequenced piece of a chromosome; 100+ kb of DNA
  • Assigned a GenBank accession ID
• Gap = introduced by
  • Recombination
  • No data (assembly unfinished or data not yet available)
• Anchor marker – molecular marker that
  • is physically located on a clone (its sequence is there)
  • has an assigned cM location
• 1 cM = ~247 kb (from 420 Mb japonica genome / 1700 cM total)
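The cM-to-kb figure above is a genome-wide average, so it can be used for rough estimates only; local recombination rates vary along a chromosome. A minimal sketch of the arithmetic, using the 420 Mb and 1700 cM values from the slide (the function names are illustrative, not from any library):

```python
# Genome-wide average conversion between genetic and physical distance.
# The two constants come from the slide; everything else is illustrative.
GENOME_SIZE_KB = 420_000   # 420 Mb japonica genome, expressed in kb
TOTAL_MAP_CM = 1700        # total genetic map length in cM

KB_PER_CM = GENOME_SIZE_KB / TOTAL_MAP_CM


def cm_to_kb(cm: float) -> float:
    """Approximate physical distance (kb) for a genetic distance (cM)."""
    return cm * KB_PER_CM


def kb_to_cm(kb: float) -> float:
    """Approximate genetic distance (cM) for a physical distance (kb)."""
    return kb / KB_PER_CM


print(f"1 cM is roughly {KB_PER_CM:.0f} kb")
print(f"5 cM is roughly {cm_to_kb(5):.0f} kb")
```

With these numbers, a gene mapped 5 cM from a marker could lie more than a megabase away, which spans several BAC/PAC clones of 100+ kb each.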
Molecular markers from the genetic map can be located on the physical map
[Figure labels: Genetic map (Cornell), clone AL772426, anchor marker C460]
Locating a marker in the physical map enables you to...
• Determine in silico the gene(s) that may be responsible for the phenotypic effect of a marker
  • Gene structures in the region of the clone where the marker sequence can be found
• Identify more markers that can be used for fine mapping
  • Use existing markers
  • Design STS primers = new markers
Where are these map databases?
• Gramene
  • http://www.gramene.org/
• TIGR rice genome map
  • http://www.tigr.org/tdb/e2k1/osa1/BACmapping/description.shtml
• Arizona Genomics Institute rice physical map
  • http://www.genome.arizona.edu/fpc/rice/
• RGP (Japan)
  • http://rgp.dna.affrc.go.jp/
Sample problem 1
• You mapped a gene for salt tolerance (SALTY) 5 cM away from marker RZ649 on chromosome 5. What SSR marker(s) can you use near this region?
• Use Gramene – comparative maps resource::Cornell SSR map
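Once the map positions are taken from Gramene, the search itself is a simple window query. The sketch below is only an illustration: the cM positions and the SSR marker names are hypothetical placeholders (real values would come from the Cornell SSR map), and both sides of RZ649 are searched because the problem does not say on which side SALTY lies.

```python
from typing import Dict, List, Tuple


def markers_within(map_cm: Dict[str, float], anchor: str,
                   window_cm: float) -> List[Tuple[str, float]]:
    """Return (marker, offset in cM) for markers within window_cm of the anchor.

    Offsets are signed (negative = lower cM position than the anchor) and the
    result is sorted from nearest to farthest.
    """
    anchor_pos = map_cm[anchor]
    hits = [
        (name, pos - anchor_pos)
        for name, pos in map_cm.items()
        if name != anchor and abs(pos - anchor_pos) <= window_cm
    ]
    return sorted(hits, key=lambda hit: abs(hit[1]))


# HYPOTHETICAL positions (cM) on chromosome 5, for illustration only.
chr5_map = {
    "RZ649": 50.0,
    "SSR_A": 46.5,
    "SSR_B": 53.0,
    "SSR_C": 58.2,
    "SSR_D": 40.1,
}

nearby = markers_within(chr5_map, "RZ649", window_cm=5.0)
for name, offset in nearby:
    print(f"{name}: {offset:+.1f} cM from RZ649")
```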
Gramene output
• Further exercise:
  • Which clones contain these SSR markers?
  • How far apart (in cM) are these clones?
Sample problem 2
• You have a clone of a putative salt-tolerance gene (SALTY) with known sequence
  • Candidate gene approach
• You want to map it by RFLP/SSR combination
• Strategy
  • Locate the sequence in the rice genome in silico for better targeting (BLAST is a good tool)
    • For sequences > 50 nt, ~50% identity = hybridization signal
  • Locate SSR/RFLP markers flanking the “location”
    • Gramene or other databases mentioned
• Additional tips
  • Also use markers anchored to clones
  • Use standard-spaced markers for good genome coverage
Step 1. Locating the clone
• Find similar sequences in the rice genome (IRGSP) using BLAST
• Predict the genes in the clone, in the region with a significant BLAST hit, using FGENESH
  • Ensures that you are hitting a functional gene
• Identify the gene
  • Search the Protein Family (Pfam) database for proteins homologous to the predicted gene using HMMPFAM
  • Ensures you are working with the target gene of interest
Gene as a data text file
• FASTA format – the most common format
  • Loosely formatted text file containing a descriptor line & the sequence data
  • Saved as a text-only file
  • Best to use Notepad or a text-only editor
  • Most sequence database centers offer this option
Sample FASTA file
>SALTY gene| Oryza sativa putative salt tolerance gene mRNA, complete cds
TTCTCTCTCTCTCTCTTCTTCTTCTTCTTCTTCTCCATATCTCCTACTCCTCGTGAAGATCGATCGACCATCGGCAATTT
CATTCGGTAATAGTTAAGCTAAGATCAAATCAAGATTGGCGAAACGATGGAGATGGTGCTGCAGAGGACGAGCCACCACC
CGGTGCCCGGGGAGCAGCAGGAGGCGGCGGCGGAGCTGTCGTCGGCTGAGCTCCGGCGAGGGCCGTGGACCGTCGACGAG
GACCTCACCCTCATCAATTACATCTCTGATCACGGCGAGGGCCGCTGGAACGCACTCGCACGCGCCGCCGGTCTGAAGAG
GACTGGGAAGAGCTGCCGGCTCCGGTGGCTGAACTATCTCCGGCCGGATGTGAAGCGCGGCAACTTCACCGCAGAGGAGC
AGCTGCTCATCCTCGACCTCCACTCCCGATGGGGCAACCGATGGTCCAAGATAGCACAACATTTGCCTGGGAGGACCGAC
AACGAGATCAAGAACTACTGGAGGACCAGAGTGCAAAAGCATGCCAAGCAACTCAATTGTGATGTCAACAGCAAGAGGTT
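Because FASTA is just a descriptor line starting with ">" followed by sequence lines, it is easy to read with a few lines of code. The following is a minimal sketch of a reader, not a full-featured parser; the function name is illustrative and the embedded record is a shortened version of the sample above.

```python
from typing import Dict, Iterable


def parse_fasta(lines: Iterable[str]) -> Dict[str, str]:
    """Map each descriptor line (without '>') to its concatenated sequence."""
    records: Dict[str, str] = {}
    header = None
    chunks = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue                      # skip blank lines
        if line.startswith(">"):
            if header is not None:        # store the previous record
                records[header] = "".join(chunks)
            header = line[1:].strip()
            chunks = []
        elif header is not None:
            chunks.append(line.upper())   # sequence may span many lines
    if header is not None:                # store the last record
        records[header] = "".join(chunks)
    return records


# Shortened copy of the sample record, for illustration only.
example = """>SALTY gene| Oryza sativa putative salt tolerance gene mRNA, complete cds
TTCTCTCTCTCTCTCTTCTTCTTC
TTCTTCTTCTCCATATCTCC
"""

records = parse_fasta(example.splitlines())
for name, seq in records.items():
    print(name)
    print(len(seq), "nt")
```

To read a saved file instead, pass an open file handle, for example `parse_fasta(open("salty.fasta"))`, where the file name is a placeholder.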
Database search using Basic Local Alignment Search Tool (BLAST)
• Most popular sequence alignment tool available
• Similarity/Homology ~ Alignment
• BLAST hit significance is quantified by various parameters
• Alignments for the Maximal-scoring Segment Pairs (MSPs) are reported
• Several BLAST programs from NCBI
  • BLASTN, BLASTP, BLASTX, TBLASTN, TBLASTX
• Usually we use BLASTN
BLAST Score & its Statistical Significance
• Each alignment has an associated score, S
• Scores of local random alignments follow a probability density function called the extreme value distribution (EVD)
• Relating an observed alignment score (S) to the EVD gives a measure of statistical significance known as the E value
• E value is the number of alignments with scores ≥ S that would be expected by chance alone
• The lower the E value, the more significant the MSP match
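The relationship between score and E value is usually written in the Karlin-Altschul form E = K · m · n · e^(−λS), where m and n are the query and database lengths. The sketch below shows how E falls as S rises. The λ and K values are illustrative placeholders only (the real values depend on the scoring system), and BLAST itself uses effective rather than raw lengths, so the numbers will not match a real BLAST report.

```python
import math


def e_value(raw_score: float, query_len: int, db_len: int,
            lam: float, k: float) -> float:
    """Expected number of chance alignments scoring >= raw_score."""
    return k * query_len * db_len * math.exp(-lam * raw_score)


def bit_score(raw_score: float, lam: float, k: float) -> float:
    """Normalized score, so that E = m * n * 2 ** (-bit_score)."""
    return (lam * raw_score - math.log(k)) / math.log(2)


# ILLUSTRATIVE parameters, not taken from any particular BLAST run.
LAM, K = 1.37, 0.71
QUERY_LEN = 560            # length of the example query, in nt
DB_LEN = 420_000_000       # roughly the 420 Mb genome size

for score in (20, 30, 40, 50):
    e = e_value(score, QUERY_LEN, DB_LEN, LAM, K)
    print(f"S = {score:3d}  bits = {bit_score(score, LAM, K):6.1f}  E = {e:.3g}")
```

Each additional unit of raw score multiplies E by e^(−λ), which is why long, high-identity hits have E values very close to zero.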
Where to BLAST
• Web-based – one to tens of sequences
  • NCBI
  • TIGR – http://tigrblast.tigr.org/euk-blast/index.cgi?project=osa1 (my favorite)
• IRRI local computer, via command line
  • Palay Alphaserver
  • Good for heavy-duty searches
Gene Prediction
• FGENESH – predicts multiple genes in genomic DNA sequences (Solovyev 2001)
  • Used by the rice genome sequence authors (BGI, TMRI)
  • Available as a web server (http://www.softberry.com/berry.phtml?topic=gfind&prg=FGENESH) and as a command-line version (Palay)
FGENESH command line
$ fgenesh /usr/local/fgenesh/Monocot seqfilename > outfilename
Identify the gene: Search the established protein database for homology
• PFAM database HMM search
  • Web-based through the TIGR site
    • http://tigrblast.tigr.org/web-hmm/
Locate the surrounding SSR markers in the region
• BLAST reports the accession ID where you have significant hit(s)
• Get more information on this clone from Gramene or AGI
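Pulling the accession IDs out of a BLAST report can be scripted. The sketch below assumes the search was saved in the 12-column tab-separated format (`-m 8` in legacy blastall, `-outfmt 6` in BLAST+). The clone IDs and all numbers in the sample report are made up for illustration, and the 1e-10 cutoff is an arbitrary assumption rather than a recommended threshold.

```python
from typing import List, NamedTuple


class Hit(NamedTuple):
    query: str
    subject: str      # accession ID of the clone that was hit
    identity: float   # percent identity
    evalue: float
    bitscore: float


def significant_hits(tabular_text: str, max_evalue: float = 1e-10) -> List[Hit]:
    """Parse 12-column tabular BLAST output; keep hits with E <= max_evalue."""
    hits = []
    for line in tabular_text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue                          # skip blanks and comment lines
        fields = line.split("\t")
        if len(fields) < 12:
            continue                          # not a standard hit line
        hit = Hit(fields[0], fields[1], float(fields[2]),
                  float(fields[10]), float(fields[11]))
        if hit.evalue <= max_evalue:
            hits.append(hit)
    return sorted(hits, key=lambda h: h.evalue)


# MADE-UP report lines: query, subject, %id, length, mismatches, gap opens,
# q.start, q.end, s.start, s.end, E value, bit score.
report = "\n".join([
    "SALTY\tCLONE_X\t99.10\t560\t5\t0\t1\t560\t10001\t10560\t1e-150\t540",
    "SALTY\tCLONE_Y\t85.00\t120\t18\t0\t200\t319\t5000\t5119\t2e-20\t110",
    "SALTY\tCLONE_Z\t78.00\t50\t11\t0\t10\t59\t700\t749\t0.5\t30",
])

best = significant_hits(report)
for hit in best:
    print(hit.subject, hit.evalue)
```

The subject IDs that survive the filter are the accessions to look up in Gramene or AGI to find the anchored SSR/RFLP markers around the hit.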