EMBOSS as an Efficient DAS Annotation Source
  Peter Rice, EBI (pmr@ebi.ac.uk)
  Mahmut Uludag, EBI (uludag@ebi.ac.uk)
  10th March 2009
EMBOSS: History
  European Molecular Biology Open Software Suite
  1996: Started at Sanger Centre
  2000: Release 1.0.0 and moved to HGMP
  2005: Moved to EBI (HGMP closed)
  2008: Release 6.0.0
  http://emboss.sourceforge.net
EMBOSS: Status
  Open source package
  Sequence analysis
  200 applications
  100 third-party applications
  Reads 40 sequence formats
  Writes 40 sequence formats
  Reads 6 feature formats
  Writes 10 feature formats
EMBOSS: Interfaces
  Over 100 interfaces / packages containing EMBOSS
  Command line
  Web interfaces
  GUIs
  SOAP Web services (EMBRACE)
  Taverna workflows
  Galaxy
Overview
  EMBOSS produces annotations in DASGFF format
  Protein sequences are referenced by UniProt protein identifiers
  Nucleotide sequences are referenced by Ensembl gene identifiers
  MyDAS-based annotation server
    Executes EMBOSS programs based on the incoming requests
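To make the DASGFF output concrete, here is a minimal sketch of how a client might read features out of a DAS features response. The element names (DASGFF, GFF, SEGMENT, FEATURE, TYPE, METHOD, START, END, SCORE) follow the DAS 1.53 features response; the sample document, the accession P12345, the feature values and the function name parse_features are illustrative assumptions, not output from the EMBOSS server.

```python
import xml.etree.ElementTree as ET

# Hand-written, illustrative DASGFF document (not real server output).
SAMPLE_RESPONSE = """<DASGFF>
  <GFF version="1.0" href="http://example.org/das/pepcoil/features?segment=P12345">
    <SEGMENT id="P12345" start="1" stop="430">
      <FEATURE id="pepcoil.1" label="coiled coil">
        <TYPE id="coiled_coil">coiled coil</TYPE>
        <METHOD id="pepcoil">pepcoil</METHOD>
        <START>45</START>
        <END>72</END>
        <SCORE>1.8</SCORE>
      </FEATURE>
    </SEGMENT>
  </GFF>
</DASGFF>"""


def parse_features(xml_text):
    """Return one dict per FEATURE element found in a DASGFF document."""
    root = ET.fromstring(xml_text)
    features = []
    for segment in root.iter("SEGMENT"):
        for feature in segment.findall("FEATURE"):
            features.append({
                "segment": segment.get("id"),
                "id": feature.get("id"),
                "type": feature.findtext("TYPE"),
                "method": feature.findtext("METHOD"),
                "start": int(feature.findtext("START")),
                "end": int(feature.findtext("END")),
            })
    return features


if __name__ == "__main__":
    for f in parse_features(SAMPLE_RESPONSE):
        print(f["segment"], f["method"], f["start"], f["end"])
```

A real response may carry further elements per feature (for example ORIENTATION, PHASE, NOTE or LINK); the sketch ignores them.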
Protein sequence annotation, EMBOSS programs used so far
  pepcoil: predicted coiled coil regions in protein sequences
  patmatmotifs: motifs from the PROSITE database
  helixturnhelix: nucleic acid-binding motifs in protein sequences
  garnier: predicted protein secondary structures using the GOR method
  sigcleave: predicted signal cleavage sites in protein sequences
  digest: protein proteolytic enzyme or reagent cleavage sites
  antigenic: predicted antigenic regions in protein sequences
Nucleotide sequence annotation, EMBOSS programs used so far
  equicktandem, tandem: tandem repeats in nucleotide sequences
  silent: restriction enzyme sites in a nucleotide sequence which can be inserted (mutated) without changing the translation
  jaspscan: transcription factor binding sites from the JASPAR database
  marscan: matrix/scaffold recognition (MRS) signatures in DNA sequences
  restrict: restriction enzyme cleavage sites in nucleotide sequences
  tcode: protein-coding regions identified using the Fickett TESTCODE statistic
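As a rough illustration of how a server could turn a request into an EMBOSS run, the sketch below picks the sequence type from the program lists on these slides and assembles a command line. The function names, the "uniprot:P12345" sequence address and the assumption that a database called uniprot is configured locally are all hypothetical. The qualifiers -sequence, -outfile, -rformat and -auto are standard EMBOSS qualifiers, but -rformat only applies to programs that write report output, the format name dasgff should be checked against the installed EMBOSS version, and some programs need further required qualifiers. The command is only built here, not executed.

```python
# Program names as listed on the slides, grouped by the sequence type they annotate.
PROTEIN_PROGRAMS = (
    "pepcoil", "patmatmotifs", "helixturnhelix", "garnier",
    "sigcleave", "digest", "antigenic",
)
NUCLEOTIDE_PROGRAMS = (
    "equicktandem", "tandem", "silent", "jaspscan",
    "marscan", "restrict", "tcode",
)


def sequence_type(program):
    """Return 'protein' or 'nucleotide' for a program named on the slides."""
    if program in PROTEIN_PROGRAMS:
        return "protein"
    if program in NUCLEOTIDE_PROGRAMS:
        return "nucleotide"
    raise ValueError(f"program not in the annotation lists: {program}")


def build_command(program, usa, outfile, report_format="dasgff"):
    """Assemble an argument list for one EMBOSS run (hypothetical helper).

    usa is an EMBOSS sequence address such as 'uniprot:P12345' (assumed
    database name). report_format is passed to -rformat; 'dasgff' is an
    assumption to verify against the local installation.
    """
    sequence_type(program)  # raises ValueError for unknown programs
    return [
        program,
        "-sequence", usa,
        "-outfile", outfile,
        "-rformat", report_format,
        "-auto",  # run without interactive prompts, using default values
    ]


if __name__ == "__main__":
    print(" ".join(build_command("pepcoil", "uniprot:P12345", "P12345.pepcoil")))
```

Returning a list rather than a shell string keeps the identifier from the request out of shell parsing, which matters when the sequence address comes from an incoming URL.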
Other EMBOSS programs that can be used for annotation
  26 EMBOSS programs producing graphical outputs
    Possibly using stylesheet support in Ensembl & DAS
  13 EMBOSS alignment programs
    DAS 1.53E has an alignment extension
Test clients used
  Dasty2: for protein annotations
    Good at displaying individual features
    Useful links for further exploration
      Links to ontology terms used
      Links to original DAS responses
  Ensembl: for gene and protein annotations
    Displays features in genomic context
    Possible to use DAS resources that are not in the registry
Work in progress
  Need to register on dasregistry.org
  Experimental DAS server available at http://wwwdev.ebi.ac.uk/soaplab/das
  DAS servers as data sources
  Common coordinate systems
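For readers who want to try the experimental server, a features request in DAS 1.53 has the form <das root>/<source>/features?segment=<id>[:start,stop]. The sketch below only builds such a URL. The data source name "pepcoil", the accession P12345 and the function name features_url are assumptions; the slides do not list the source names the server actually exposes, and the server may no longer be reachable.

```python
from urllib.parse import urlencode

# DAS root taken from the slide above.
DAS_ROOT = "http://wwwdev.ebi.ac.uk/soaplab/das"


def features_url(source, segment_id, start=None, stop=None, root=DAS_ROOT):
    """Build a DAS 1.53 features request URL for one segment.

    source is the DAS data source name (hypothetical here); segment_id is a
    UniProt or Ensembl identifier; start and stop optionally restrict the range.
    """
    if (start is None) != (stop is None):
        raise ValueError("give both start and stop, or neither")
    segment = segment_id
    if start is not None:
        segment = f"{segment_id}:{start},{stop}"
    # Keep ':' and ',' unescaped so the segment reads as id:start,stop.
    query = urlencode({"segment": segment}, safe=":,")
    return f"{root.rstrip('/')}/{source}/features?{query}"


if __name__ == "__main__":
    print(features_url("pepcoil", "P12345", 1, 100))
```

The resulting URL can be opened in a browser or passed to any HTTP client; the response, if the source exists, is a DASGFF document.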
The EMBOSS Team
  Peter Rice
  Alan Bleasby
  Jon Ison
  Mahmut Uludag