GenSAS is a user-friendly web platform for the annotation of model and non-model organisms. It offers easy-to-use interfaces, secure user accounts, and collaborative annotation projects.
A web-based platform for structural and functional annotation of model and non-model organisms www.gensas.org Jodi Humann, Taein Lee, Stephen Ficklin, Chun-Huai Cheng, Heidi Hough, Sook Jung, Jill Wegrzyn, David Neale, Dorrie Main jhumann@wsu.edu
What is genome annotation? • Annotation produces predicted gene models to use in lab experiments
What is GenSAS? • Web-based platform, no software installation by user • Just need a user account, internet browser, and an internet connection • User accounts keep data private and secure and allow for collaborative annotation projects • Easy-to-use interfaces and detailed user manual
Account Limits
• User accounts remain active as long as there is an active project
• Projects expire after 60 days unless the user resets the expiration date
• 250 GB of storage space on the server
• Assembly files must be high quality:
  • Fewer than 25,000 sequences
  • Over 50% of sequences longer than 2,500 bases
• Up to seven jobs can run at one time; additional jobs wait in the queue
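The two assembly-quality limits above can be checked locally before uploading. The following is a minimal sketch, not GenSAS code: the function names and the plain-text FASTA parsing are assumptions made for illustration, and the thresholds are taken directly from the slide.

```python
from typing import Iterable, List

MAX_SEQUENCES = 25_000   # slide: fewer than 25,000 sequences
MIN_LENGTH = 2_500       # slide: "longer than 2,500 bases"
MIN_FRACTION = 0.5       # slide: over 50% of sequences


def sequence_lengths(fasta_lines: Iterable[str]) -> List[int]:
    """Return the length of every record in a FASTA file (hypothetical helper)."""
    lengths: List[int] = []
    current = None  # length of the record being read; None before the first header
    for line in fasta_lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if current is not None:
                lengths.append(current)
            current = 0
        elif current is not None:
            current += len(line)
    if current is not None:
        lengths.append(current)
    return lengths


def meets_assembly_limits(lengths: List[int]) -> bool:
    """Check both limits from the slide against a list of sequence lengths."""
    if not lengths:
        return False
    long_fraction = sum(1 for n in lengths if n > MIN_LENGTH) / len(lengths)
    return len(lengths) < MAX_SEQUENCES and long_fraction > MIN_FRACTION
```

For a file on disk, `meets_assembly_limits(sequence_lengths(open("assembly.fasta")))` would give a quick yes/no answer; `assembly.fasta` is a placeholder path.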
User-provided files
• Required:
  • Genome assembly
• Optional:
  • Assembled transcripts or ESTs
  • Species-specific repeats or proteins
  • Species-specific GenBank gene structures
  • Filtered Illumina RNA-seq reads
  • Aligned RNA-seq reads in the BAM file format
  • Previous annotations in the GFF3 format
GenSAS-provided information
• RepeatMasker:
  • Repbase repeat libraries
• Transcript and protein alignment tools:
  • NCBI RefSeq transcripts and proteins (archaea, bacteria, fungi, invertebrate, mitochondrion, plant, plasmid, plastid, protozoa, vertebrate-mammalian, vertebrate-other, viral)
  • SwissProt
  • TrEMBL
GenSAS Homepage • Request free account • Login to GenSAS • Access User’s Guide and contact us • Learn about tools and libraries • Access the GenSAS interface
GenSAS Interface Once jobs are in queue, users can log out of GenSAS
Sequences Step • Once uploaded, assembly metrics are calculated using PRINSEQ • Users can run BUSCO on assembly
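As an illustration of the kind of assembly metric reported at this step, N50 is one commonly used summary of contiguity. The sketch below is not PRINSEQ's implementation; the function name `n50` is an assumption, and it takes a plain list of sequence lengths.

```python
from typing import List


def n50(lengths: List[int]) -> int:
    """Length of the shortest sequence in the smallest set of longest
    sequences that together cover at least half of the total bases."""
    total = sum(lengths)
    running = 0
    for n in sorted(lengths, reverse=True):
        running += n
        if running * 2 >= total:
            return n
    return 0  # empty input


print(n50([2, 3, 4, 5, 6, 7, 8, 9, 10]))  # prints 8
```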
Project Step • Fillable web form • Select previously uploaded assembly • Email options
Repeats and Masking Steps • The Masking step produces a repeat consensus; masking can also be skipped
Consensus Step • Optional step using EVM (EVidenceModeler) • Weights can be adjusted or removed for gene predictions, protein alignments, and transcript alignments
OGS Step Select “Official Gene Set”
Refine and Functional Steps Optional step to further refine OGS using PASA prior to functional annotation
Annotate Step Edits added to “User-created Annotations” will be merged into final results
Publish Step • OGS and repeat consensus automatically prepared • FASTA and GFF formats • User can select other jobs
Final Annotation Results • Summary table of annotation project • Project Summary file with details about tool settings • Option to create merged GFF3 file • Add repeats, tRNA, rRNA • Add functional job annotation to column 9
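To make "add functional job annotation to column 9" concrete: column 9 of a GFF3 line holds `tag=value` attributes separated by semicolons. The sketch below appends a `Note` attribute to features whose `ID` has a functional description. It is an illustration only: the slides do not say which attribute tags GenSAS writes, so the choice of `Note`, the function name `add_note`, and the `notes` lookup table are all assumptions.

```python
from typing import Dict
from urllib.parse import quote


def add_note(gff3_line: str, notes: Dict[str, str]) -> str:
    """Append Note=<description> to column 9 when the feature ID is in `notes`."""
    # Leave comments, directives, and blank lines untouched.
    if gff3_line.startswith("#") or not gff3_line.strip():
        return gff3_line
    cols = gff3_line.rstrip("\n").split("\t")
    if len(cols) != 9:
        return gff3_line  # not a feature line
    attrs = dict(f.split("=", 1) for f in cols[8].split(";") if "=" in f)
    note = notes.get(attrs.get("ID", ""))
    if note is None:
        return gff3_line
    # GFF3 reserves ';', '=', '&' and ',' in column 9, so percent-encode them.
    cols[8] = cols[8].rstrip(";") + ";Note=" + quote(note, safe=" ()/:+")
    newline = "\n" if gff3_line.endswith("\n") else ""
    return "\t".join(cols) + newline


line = "chr1\tGenSAS\tmRNA\t100\t900\t.\t+\t.\tID=mRNA1;Parent=gene1"
print(add_note(line, {"mRNA1": "kinase; putative"}))
```

The example line and the description "kinase; putative" are made-up inputs; the semicolon in the description is encoded as `%3B` so that it is not read as an attribute separator.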
Final Annotation Results • All results files are listed and can be downloaded individually, or all at once with the “Download all” option • Option to run BUSCO on proteins from the final annotation
www.gensas.org Funding GenSAS Poster – PO0085