NGS Bioinformatics Workshop 1.5 Tutorial – Genome Annotation
April 5th, 2012, IRMACS 10900
Facilitator: Richard Bruskiewich, Adjunct Professor, MBB

Workflow for Today
• Prepare to visualize annotation
• Get a genomic sequence from GenBank
• Repeat mask it

Retrieve a genomic sequence…
• Retrieve a relatively small (<100 kb) eukaryotic genomic sequence clone from GenBank
• Query the Nucleotide division, e.g. Arabidopsis BAC clone (HE601748.1)
• Select FASTA
• Save to file as “FASTA” (rename?)
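
The same retrieval can also be scripted instead of clicked through. The sketch below assumes the NCBI E-utilities `efetch` endpoint and the `nuccore` database are the right target for this accession; the function names and the output file name are hypothetical, and the download itself needs network access.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

# NCBI E-utilities efetch endpoint (assumed to be reachable from your network).
EFETCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def efetch_fasta_url(accession):
    """Build an efetch URL asking for a nucleotide record in FASTA format."""
    params = {
        "db": "nuccore",      # the Nucleotide database
        "id": accession,      # e.g. "HE601748.1"
        "rettype": "fasta",
        "retmode": "text",
    }
    return EFETCH + "?" + urlencode(params)


def download_fasta(accession, out_path):
    """Download the record and save it; raise if the reply is not FASTA."""
    with urlopen(efetch_fasta_url(accession)) as response:
        data = response.read().decode("utf-8")
    if not data.startswith(">"):
        raise ValueError("Reply for %s does not look like FASTA" % accession)
    with open(out_path, "w") as handle:
        handle.write(data)
    return out_path


# Example call (hypothetical file name, requires network access):
# download_fasta("HE601748.1", "HE601748.fasta")
```

Saving under a short, accession-based file name covers the "rename?" step and makes the later uploads easier to keep track of.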

BLAST is low-hanging fruit…
• Use BLAST to quickly survey for similar sequences
• Megablast against the nucleotide database
• e.g. is HE601748 closest to A. thaliana chr. 5?
• Megablast against the reference RNA sequence database
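
To answer a question like "which subject is closest?" from a saved result, one option is to export the hits as a tab-delimited table and pick the top-scoring subject per query. This is a minimal sketch, assuming the export uses the standard 12-column tabular layout (query id, subject id, % identity, alignment length, mismatches, gap opens, query start/end, subject start/end, e-value, bit score); the function name is hypothetical.

```python
# Column names of the standard 12-column BLAST tabular layout.
FIELDS = [
    "qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
    "qstart", "qend", "sstart", "send", "evalue", "bitscore",
]


def best_hit_per_query(tabular_text):
    """Return {query id: best hit}, where 'best' means highest bit score."""
    best = {}
    for line in tabular_text.splitlines():
        # Skip blank lines and '#' comment lines found in some exports.
        if not line.strip() or line.startswith("#"):
            continue
        values = line.split("\t")
        if len(values) < len(FIELDS):
            continue  # not a hit row in the assumed layout
        hit = dict(zip(FIELDS, values))
        hit["pident"] = float(hit["pident"])
        hit["evalue"] = float(hit["evalue"])
        hit["bitscore"] = float(hit["bitscore"])
        current = best.get(hit["qseqid"])
        if current is None or hit["bitscore"] > current["bitscore"]:
            best[hit["qseqid"]] = hit
    return best


# Example call (hypothetical file name):
# with open("megablast_hits.tsv") as handle:
#     for query, hit in best_hit_per_query(handle.read()).items():
#         print(query, hit["sseqid"], hit["pident"], hit["evalue"])
```

Ranking by bit score is one reasonable choice; sorting by e-value or percent identity would be equally easy to swap in.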

Repeat Masking
• Upload the clone file to RepeatMasker on the web and run with appropriate parameters: http://www.repeatmasker.org/cgi-bin/WEBRepeatMasker
• Save the results (including the masked sequence) to your computer
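
Once the masked sequence is saved, a quick sanity check is to see how much of the clone was masked. The sketch below assumes masked bases appear as `N` or `X` (hard masking) or as lowercase letters (soft masking); which of these you get depends on the options chosen, and the function names are hypothetical.

```python
def read_fasta(text):
    """Parse FASTA text into {record id: sequence}."""
    parts = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:].split()[0]  # id = first word of the header
            parts[name] = []
        elif name is not None:
            parts[name].append(line)
    return {rec_id: "".join(chunks) for rec_id, chunks in parts.items()}


def masked_fraction(sequence):
    """Fraction of bases that are N/X or lowercase.

    Note: N bases already present in the unmasked input are counted too.
    """
    if not sequence:
        return 0.0
    masked = sum(1 for base in sequence if base in "NX" or base.islower())
    return masked / len(sequence)


# Example call (hypothetical file name):
# with open("HE601748.fasta.masked") as handle:
#     for rec_id, seq in read_fasta(handle.read()).items():
#         print(rec_id, len(seq), round(100 * masked_fraction(seq), 1))
```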

ab initio Gene Predictions
• GENSCAN: http://genes.mit.edu/GENSCAN.html
• Cut and paste the results as text to a file
• FGENESH: www.softberry.com

Blast2GO: http://www.blast2go.com
• Annotation workbench, via Gene Ontology (GO) terms
• First, save the predicted peptides (e.g. from FGENESH)
• The FASTA headers need to be fixed to assign proper identifiers (could write a script?)
• Start the Blast2GO workbench (Java Web Start)
• Load in the peptides
• Do the analysis… e.g. run blastp, GO mapping, annotation, InterPro, etc.
• See www.geneontology.org for details on GO
• See http://www.ebi.ac.uk/interpro/ for InterPro info
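
The header-fixing script mentioned above could look roughly like this. It is a minimal sketch that replaces every FASTA header with a short sequential identifier and keeps a lookup of the original headers; the `pep` prefix, the zero-padded numbering, and the function name are all assumptions chosen for illustration.

```python
def rename_fasta_headers(fasta_text, prefix="pep"):
    """Give each record a short, unique id such as pep0001.

    Returns the rewritten FASTA text and a {new id: original header} map,
    so that annotations can be traced back to the original predictions.
    """
    out_lines = []
    mapping = {}
    count = 0
    for line in fasta_text.splitlines():
        if line.startswith(">"):
            count += 1
            new_id = "%s%04d" % (prefix, count)
            mapping[new_id] = line[1:].strip()
            out_lines.append(">" + new_id)
        else:
            out_lines.append(line)
    return "\n".join(out_lines) + "\n", mapping


# Example call (hypothetical file names):
# with open("predicted_peptides.fasta") as handle:
#     fixed, mapping = rename_fasta_headers(handle.read(), prefix="HE601748_p")
# with open("predicted_peptides.fixed.fasta", "w") as handle:
#     handle.write(fixed)
```

Keeping the mapping matters: the short identifiers are what the workbench will display, and the original headers usually carry the gene coordinates.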

EMBOSS
• European Molecular Biology Open Software Suite (EMBOSS): http://emboss.sourceforge.net
• Download and install the version of interest (e.g. Linux, Mac OS X, Windows…)
• Decide what to do: http://emboss.sourceforge.net/apps/groups.html
• Let’s try a CpG island plot (cpgplot)
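
To see what a CpG island plot is measuring, the sliding-window statistics can be sketched in a few lines. This is not cpgplot's own implementation: it only computes, per window, the %GC and the observed/expected CpG ratio, (number of CpG × window length) / (number of C × number of G). The default window size and thresholds below are assumptions modelled on commonly used CpG island criteria.

```python
def cpg_stats(window):
    """Return (%GC, observed/expected CpG ratio) for one window."""
    window = window.upper()
    n = len(window)
    c = window.count("C")
    g = window.count("G")
    cpg = window.count("CG")  # CpG dinucleotides
    gc_percent = 100.0 * (c + g) / n if n else 0.0
    obs_exp = (cpg * n) / (c * g) if c and g else 0.0
    return gc_percent, obs_exp


def scan_cpg(sequence, window=100, step=1, min_oe=0.6, min_gc=50.0):
    """List (start, %GC, obs/exp) for windows passing both thresholds.

    The thresholds are illustrative assumptions, not cpgplot's exact rules;
    cpgplot additionally applies a minimum island length.
    """
    hits = []
    for start in range(0, len(sequence) - window + 1, step):
        gc_percent, obs_exp = cpg_stats(sequence[start:start + window])
        if obs_exp >= min_oe and gc_percent >= min_gc:
            hits.append((start, gc_percent, obs_exp))
    return hits


# Example call, with `seq` holding the clone sequence as a string:
# for start, gc_percent, obs_exp in scan_cpg(seq, step=50):
#     print(start, round(gc_percent, 1), round(obs_exp, 2))
```

Comparing these windows with the regions cpgplot reports is a useful check that the plot is being read correctly.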

Study Genes by Comparative Genomics
• JGI Vista toolkit: http://genome.lbl.gov/vista
• GenomeVista
• rVista