Introducing EMBOSS/Jemboss European Molecular Biology Open Software Suite Dr. Erik Bongcam-Rudloff
History: In the beginning was EGCG. 1988 - EGCG is started to provide extensions to the GCG package. Late 1990s - GCG/EGCG is the de facto standard sequence analysis package worldwide. EGCG sought to support the needs of major sequencing initiatives such as the Human Genome Project. Oxford Molecular (then the commercial owners of GCG) closed access to the program code, preventing further development of EGCG past version 8.1. EGCG is used by up to 10,000 users at 150 centers as an addition to the GCG package.
The Birth of EMBOSS: Spurred by the continuing demand for new sequence analysis programs, the EMBOSS project is started. Core development is funded by the UK research councils and the Wellcome Trust as part of their commitment to the Human Genome Project. The experience of the EGCG team is used to write an entirely new package from the ground up. EMBOSS has been licensed as 'Open Source' to ensure continued access to the program code. This prevents anyone from taking your programs away. EMBOSS has been designed from scratch by scientists for scientists, so it can readily be integrated with the web or other packages.
EMBOSS today: Core libraries of routines for sequence manipulation, database access, and so on are available. These libraries are prewritten functions that any programmer can use. They range from simple things like extracting subsequences to complex things like sequence alignments and comparisons. These make writing new programs much easier. More than 80 programs have been written, replacing more than 90% of the functionality of GCG and adding many functions you will find in no other package. Programs are being contributed at an impressive rate from all over the world, and EMBOSS is installed in many laboratories worldwide. Open source means that you have permission to modify and customise the programs to do what you need, without constraint.
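To make the "simple things" end of that range concrete, here is a minimal sketch in plain C of two such operations: extracting a subsequence and reverse-complementing a DNA string. It deliberately does not use the EMBOSS libraries; the function names `subseq`, `complement` and `revcomp` are hypothetical and only illustrate the kind of routine a core library saves you from rewriting.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not an EMBOSS library call): copy positions
 * start..end (1-based, inclusive) of seq into a newly allocated string.
 * Returns NULL if the range is invalid or memory runs out.
 * The caller must free() the result. */
char *subseq(const char *seq, size_t start, size_t end)
{
    size_t len, n;
    char *out;

    assert(seq != NULL);
    len = strlen(seq);
    if (start < 1 || end < start || end > len)
        return NULL;

    n = end - start + 1;
    out = malloc(n + 1);
    if (out == NULL)
        return NULL;
    memcpy(out, seq + start - 1, n);
    out[n] = '\0';
    return out;
}

/* Complement a single DNA base; anything other than A, C, G, T
 * (upper or lower case) is returned unchanged. */
static char complement(char base)
{
    switch (base) {
    case 'A': return 'T';
    case 'T': return 'A';
    case 'C': return 'G';
    case 'G': return 'C';
    case 'a': return 't';
    case 't': return 'a';
    case 'c': return 'g';
    case 'g': return 'c';
    default:  return base;
    }
}

/* Hypothetical helper: reverse-complement a DNA string in place. */
void revcomp(char *seq)
{
    size_t len, i;

    assert(seq != NULL);
    len = strlen(seq);
    for (i = 0; i < len / 2; i++) {
        char tmp = complement(seq[i]);
        seq[i] = complement(seq[len - 1 - i]);
        seq[len - 1 - i] = tmp;
    }
    if (len % 2 == 1)
        seq[len / 2] = complement(seq[len / 2]);
}
```

With these helpers, `subseq("ACGTACGT", 3, 6)` yields a new string holding bases 3 to 6, and `revcomp` rewrites a writable buffer in place. A program built on the EMBOSS libraries would call the library's own routines for this instead of hand-written code.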
EMBOSS present and future: EMBOSS under development. Training courses and documentation: these are being actively developed by users and EMBnet. Graphical/Web interfaces: now that the initial EMBOSS release is stable, graphical interfaces are being developed. Web-based: W2H, Pise and others. Java: Jemboss. Your own programs: writing an EMBOSS application is quick and easy for a C programmer.
Comparing EMBOSS and GCG • Some examples (GCG program - EMBOSS equivalent): • DISTANCES - PHYLIP package • EXTRACTPEPTIDE - transeq • MAP - restrict, remap • MOTIFS - patmatmotifs • PEPDATA - getorf
Using EMBOSS: All EMBOSS programs can be run from the command line. There is no need to specifically initialise EMBOSS. By default EMBOSS programs will not ask you lots of questions, just the minimum needed to run the program. You can specify everything with options or have EMBOSS prompt you for the inputs to the program. You can get help on any program with the '-help' option on the command line. This will list all the inputs a program needs in order to run. If you put the '-opt' option on the command line, EMBOSS will ask you for more detailed options.
EMBOSS: Instant bioinformatics! Just add science and 'make'! Fully GPL. No purchase necessary.
Writing EMBOSS programs
Three steps to a new program:
1. Write the ACD file to describe the input to your program.
2. Write the program code to initialise your program in EMBOSS using the templates provided. Retrieve the parameters. You can test that you have your program described correctly with the command 'acdc'.
3. Now just add the science. Write the code to do the manipulations you need.

    #include "emboss.h"

    int param1;

    int main(int argc, char **argv)
    {
        embInit("program", argc, argv);    /* initialise the program in EMBOSS */
        param1 = ajAcdGetInt("param1");    /* retrieve a parameter described in the ACD file */
        ...
    }

EMBOSS has many common bioinformatic functions in the AJAX and NUCLEUS libraries.
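As an illustration of step 3, "the science" can be as small as a single function. The sketch below computes the GC fraction of a sequence in plain C. It is an assumption-laden stand-in: the function name `gc_fraction` is hypothetical, and it takes an ordinary C string rather than the sequence objects an EMBOSS program would obtain after initialisation.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical "science" step (not an EMBOSS library call):
 * return the fraction of bases in seq that are G or C,
 * ignoring case. An empty sequence gives 0.0. */
double gc_fraction(const char *seq)
{
    size_t total = 0;
    size_t gc = 0;

    assert(seq != NULL);
    for (; *seq != '\0'; seq++) {
        int base = toupper((unsigned char)*seq);
        if (base == 'G' || base == 'C')
            gc++;
        total++;
    }
    return total == 0 ? 0.0 : (double)gc / (double)total;
}
```

In a real EMBOSS application this kind of function would sit after the initialisation and parameter retrieval, operating on whatever inputs the ACD file describes.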
Interfaces • Web • EMBOSS-W2H • PISE • EMBOSS-GUI • X-Windows • STADEN-SPIN (+ others coming) • Ssh/xterm/Character-based • emnu
Web interface details • Many are being developed: • W2H (http://www.hgmp.mrc.ac.uk/Registered/Webapp/emboss-w2h/) • Pise (http://www-alt.pasteur.fr/~letondal/Pise/) • wEMBOSS (http://liv.bmc.uu.se/EMBOSS)
X-Windows interfaces • At least three are being developed: • Spin (Staden package) • Kaptain (http://userpage.fu-berlin.de/~sgmd/) • Arka (http://www.bioinformatics.org/genpak/)
EMBOSS/Jemboss • Jemboss is the new Graphical User Interface (GUI) to EMBOSS, designed to facilitate the use of programs. It is written in the programming language Java, enabling the interface to be used in both PC and UNIX environments.
EMBOSS/Jemboss • Older Mac platforms do not support this GUI; only Macs running Mac OS X can run Jemboss. • Web Start installed by default
EMBOSS/Jemboss • The interface has been written at the HGMP-RC in collaboration with the EMBOSS team • First release January 2002
EMBOSS/Jemboss • A web launch tool (Java Web Start) must be installed on the client (i.e. the user's computer) before Jemboss can be accessed. • It allows this Java program to be downloaded and launched from the web.
EMBOSS/Jemboss • The Jemboss server has been installed under Linux, AIX, Mac OS X, IRIX, Solaris and HP-UX. • The server setup is very much dependent on the local environment and the level of security necessary for a site.
EMBOSS/Jemboss • It is possible to set up a basic non-authenticated and non-encrypted server. • This may be suitable for sites in which the server is only available internally. • A more secure server can be set up which uses SSL for data encryption.
EMBOSS/Jemboss • SOAP is used to communicate between the client and the server. • Apache Tomcat is used to deploy the Jemboss services.
EMBOSS/Jemboss • And now all this in practice!
Concluding remarks • If you want to install • Central server with system manager • Pros and Cons of the EMBOSS package
The EMBOSS Cocktail • jakarta-tomcat-*.tar.gz • SOAP (Simple Object Access Protocol) • Apache-x.x.tar.gz • Libpng-tar.gz • Z-lib.tar.gz • EMBOSS-2..x.x.tar.gz • The latest Java
EMBOSS minus • The major deficiencies in the EMBOSS package are: • BLAST, FASTA, ASSEMBLY • You should use the publicly available software: • Blast - NCBI, HGMP, many other sites • Fasta - HGMP • Assembly - Staden package
EMBOSS plus • Much effort is put into removing arbitrary limits, e.g. maximum sequence length: 2 Gb • Many programs are limited only by available memory • Source code is available for inspection, change and writing your own programs • EMBOSS is FREE! • GNU General Public License • Open Source Software
THE END • Questions?