PORTING HMMER AND INTERPROSCAN TO THE GRID. Daniel Alberto Burbano Sefair (dburbano@uniandes.edu.co), Michael Angel Pérez Cabarcas (mic-pere@uniandes.edu.co). University of The Andes, Information Technology Division, Colombia. November 2008.
Topics • Introduction • HMMER • InterProScan • What do we have? • What do we want with your help? • Questions
INTRODUCTION • Our users, from the Biology department, want an easy way to run HMMER and InterProScan that saves processing time. • They want a graphical user interface instead of a command line interface. • There are few users, and they submit many jobs (1000 - 3000). • They submit jobs with files larger than 10 MB. • They want to reduce processing time by using other computers. • Depending on the job, the run time can be 1 h to 12 h. • Some InterProScan jobs fail and must be submitted again.
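The last point above, resubmitting failed jobs, can be sketched as a small retry loop. This is only an illustration: `submit_with_retries`, `flaky_run`, the job names, and the limit of three attempts are all hypothetical and are not part of HMMER, InterProScan, or any grid middleware.

```python
def submit_with_retries(jobs, run_job, max_attempts=3):
    """Run each job, resubmitting on failure up to max_attempts times.

    run_job is any callable that returns True on success (an assumption).
    Returns two lists: jobs that succeeded and jobs that never did.
    """
    succeeded, failed = [], []
    for job in jobs:
        for _attempt in range(max_attempts):
            if run_job(job):
                succeeded.append(job)
                break
        else:
            # The loop ended without a successful run.
            failed.append(job)
    return succeeded, failed


# Hypothetical stand-in for a real submission: seq_002 fails once,
# seq_003 always fails, everything else succeeds immediately.
attempts = {}

def flaky_run(job):
    attempts[job] = attempts.get(job, 0) + 1
    if job == "seq_002":
        return attempts[job] >= 2
    return job != "seq_003"


done, gave_up = submit_with_retries(["seq_001", "seq_002", "seq_003"], flaky_run)
print(done, gave_up)
```

With 1000 - 3000 jobs per user, keeping the list of jobs that never succeeded makes it possible to report them instead of retrying forever.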
HMMER: Profile Hidden Markov Models 1. What is HMMER? - “HMMER is a sequence analysis tool using profile Hidden Markov Models”. - It is a set of 9 command line applications: hmmpfam, hmmsearch, hmmalign, hmmbuild, hmmconvert, hmmcalibrate, hmmemit, hmmindex, hmmfetch. The above definition is taken from: ftp://selab.janelia.org/pub/software/hmmer/CURRENT/Userguide.pdf Home page: http://hmmer.janelia.org/
HMMER 2. How can I use HMMER by command, PBS, and JDL? HMMER is a command line application. This is an example: hmmsearch file.hmm MySequence.fasta >> output
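The slide shows only the direct command. As a minimal sketch of the PBS route, the command can be wrapped in a batch script. The helper names, the job name, and the 12-hour walltime are assumptions for illustration (the walltime follows the upper run time mentioned in the introduction); the `#PBS -N` and `#PBS -l walltime=` directives and `PBS_O_WORKDIR` are standard PBS.

```python
import shlex


def hmmsearch_command(hmm_file, seq_file):
    """Return the hmmsearch invocation from the slide as an argument list."""
    return ["hmmsearch", hmm_file, seq_file]


def pbs_script(command, output_file, job_name="hmmsearch", walltime="12:00:00"):
    """Wrap a command in a minimal PBS batch script.

    job_name and walltime are illustrative defaults, not site settings.
    """
    quoted = " ".join(shlex.quote(arg) for arg in command)
    lines = [
        "#!/bin/sh",
        f"#PBS -N {job_name}",
        f"#PBS -l walltime={walltime}",
        # PBS starts jobs in the home directory; return to the submit directory.
        'cd "$PBS_O_WORKDIR"',
        f"{quoted} >> {shlex.quote(output_file)}",
    ]
    return "\n".join(lines) + "\n"


script = pbs_script(hmmsearch_command("file.hmm", "MySequence.fasta"), "output")
print(script)
```

The generated text would be saved to a file and handed to `qsub`; the queue and resource requests depend on the local cluster.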
InterProScan 1. What is InterProScan? The following definition is taken from the European Bioinformatics Institute: http://www.ebi.ac.uk/2can/tutorials/function/InterProScan.html “InterProScan is a tool that combines different protein recognition methods into one resource. It scans a given protein sequence against the protein signatures of the InterPro member databases (PROSITE, PRINTS, Pfam, ProDom, SMART, TIGRFAMs).” Home page: http://www.ebi.ac.uk/Tools/InterProScan/
InterProScan 2. How does InterProScan work? 1. The user submits a protein sequence. 2. Protein sequence applications are launched and search against specific databases. 3. Each application returns a list of hits. 4. The results are combined and the information is returned to the user. Information and schema are taken from: http://www.ebi.ac.uk/2can/tutorials/images/scan_schema.gif
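The four steps above follow a fan-out and combine pattern, which can be sketched as below. This is not InterProScan code: `scan`, `toy_motif_finder`, the application names, and the example sequence are hypothetical stand-ins, and the "applications" here only look for a fixed substring.

```python
def scan(sequence, applications):
    """Run every application on the sequence and merge the hit lists.

    applications maps a name to a callable returning a list of hits.
    The result is a list of (application name, hit) pairs.
    """
    combined = []
    for name, search in applications.items():
        for hit in search(sequence):
            combined.append((name, hit))
    return combined


def toy_motif_finder(motif):
    """Build a toy 'application' that reports each start position of motif."""
    def search(sequence):
        last_start = len(sequence) - len(motif)
        return [i for i in range(last_start + 1) if sequence.startswith(motif, i)]
    return search


# Step 1: the user submits a sequence. Steps 2 and 3: each application
# searches and returns hits. Step 4: the hits are combined and returned.
apps = {"toy_A": toy_motif_finder("GK"), "toy_B": toy_motif_finder("ST")}
hits = scan("MGKSTGKA", apps)
print(hits)
```

Because each application works independently on the same input, the searches in step 2 are natural candidates for running as separate grid jobs.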
InterProScan 3. How can I use InterProScan by command, PBS, and JDL? InterProScan is a command line application. This is an example: iprscan -cli -i input.seq -o test.out -format raw -goterms -iprlookup
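For the JDL route, a job description could be generated around the same command. The sketch below is illustrative only: `iprscan_jdl` is a hypothetical helper, it assumes `iprscan` is already installed on the worker node, and it assumes the input file is small enough to travel in the input sandbox. The attribute names (`Executable`, `Arguments`, `StdOutput`, `StdError`, `InputSandbox`, `OutputSandbox`) are standard JDL.

```python
def iprscan_jdl(input_file, output_file, executable="iprscan"):
    """Return a minimal JDL description for one InterProScan run.

    Assumes the executable is available on the worker node (an assumption).
    """
    arguments = (
        f"-cli -i {input_file} -o {output_file} "
        "-format raw -goterms -iprlookup"
    )
    lines = [
        f'Executable = "{executable}";',
        f'Arguments = "{arguments}";',
        'StdOutput = "std.out";',
        'StdError = "std.err";',
        # Files shipped to the worker node with the job.
        f'InputSandbox = {{"{input_file}"}};',
        # Files brought back when the job finishes.
        f'OutputSandbox = {{"std.out", "std.err", "{output_file}"}};',
    ]
    return "\n".join(lines) + "\n"


jdl = iprscan_jdl("input.seq", "test.out")
print(jdl)
```

Input files larger than the sandbox limit of the site would instead need to be staged through grid storage, which this sketch does not cover.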
What do we have? • A Bioinformatic Grid Wrapper (BGW) for HMMER and InterProScan, with a command line interface (CLI).
What do we want with your help? Architecture
Thanks ?
“Profile hidden Markov models (profile HMMs) can be used to do sensitive database searching using statistical descriptions of a sequence family's consensus. HMMER is a freely distributable implementation of profile HMM software for protein sequence analysis.”