SMBL and Blast Joe Rinkovsky Unix Systems Support Group Indiana University
Introduction • IU has around 2000 Windows PCs in public Student Technology Centers • Condor is used to harvest unused cycles • Simple Message Brokering Library (SMBL) is used for parallelizing applications on Windows • Web portal for user interaction
Project History • SETI@home was used as the initial test of Condor • SMBL was created to address the lack of a general-purpose parallel library on Windows that could tolerate sporadically available systems • FastDNAml was ported to SMBL • Web portal created • Other apps ported to SMBL (MEME, BLAST)
System Architecture • Condor “server” running on Linux • BLAST databases served via Samba on a second Linux machine • Apache/MySQL/PHP web portal • Windows “clients”
What is SMBL? • Simple Message Brokering Library • Open source (http://smbl.sf.net) • Uses master/worker model • Process and Port Manager (PPM) manages SMBL servers and master processes • Number of master/foreman processes is different for each application • SMBL workers contact the SMBL master to get work • SMBL server terminates workers when they are no longer needed
Condor and SMBL • Condor is used as the scheduling and delivery system for SMBL workers • SMBL workers contact the SMBL server when they start running to begin receiving work • SMBL server separates the work into smaller pieces depending on the number of workers • Work is redistributed if a worker is “lost” • SMBL server terminates workers when there is no work left
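The split-and-redistribute behavior described on this slide can be sketched roughly as follows. This is a minimal illustration in Python, not SMBL's actual code or API: the names `split_work`, `Master`, `request_work`, `submit_result`, and `worker_lost` are hypothetical, and real SMBL workers talk to the server over the network rather than through method calls.

```python
from collections import deque


def split_work(items, n_workers):
    """Split items into roughly equal chunks, one per worker."""
    n = max(1, min(n_workers, len(items)))
    size, extra = divmod(len(items), n)
    chunks, start = [], 0
    for i in range(n):
        # The first `extra` chunks get one additional item.
        end = start + size + (1 if i < extra else 0)
        chunks.append(items[start:end])
        start = end
    return chunks


class Master:
    """Hypothetical stand-in for the SMBL server/master role."""

    def __init__(self, items, n_workers):
        self.pending = deque(split_work(items, n_workers))
        self.in_flight = {}  # worker id -> chunk currently assigned
        self.results = []

    def request_work(self, worker_id):
        """Hand the next pending chunk to a worker, or None if none is left."""
        if not self.pending:
            return None
        chunk = self.pending.popleft()
        self.in_flight[worker_id] = chunk
        return chunk

    def submit_result(self, worker_id, result):
        """Record a finished chunk and forget the assignment."""
        self.in_flight.pop(worker_id)
        self.results.append(result)

    def worker_lost(self, worker_id):
        """Put a lost worker's chunk back on the queue for someone else."""
        chunk = self.in_flight.pop(worker_id, None)
        if chunk is not None:
            self.pending.append(chunk)

    def done(self):
        """True when no work is pending or assigned; workers can be terminated."""
        return not self.pending and not self.in_flight
```

Keeping a record of which chunk each worker holds is what makes redistribution possible: when a pool machine is reclaimed by its user, only that worker's chunk has to be redone.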
Applications using SMBL • FastDNAml – Generates phylogenetic trees from molecular data • MEME – Detects patterns in nucleotide and protein sequences • NCBI BLAST (blastall) – Queries molecular sequences against sequence databases
The Challenges of porting BLAST to SMBL • BLAST relies on the availability of large database files • Files are too large for efficient delivery via Condor • Local copies of databases on pool machines would be difficult to manage • Sharing DB files via Samba is the best solution • Samba was moved to a separate server to increase performance
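As a rough illustration of the Samba approach, a Windows worker could point blastall at a database on the share instead of a local copy. The sketch below is in Python and only builds the command line; the share name `\\dbserver\blastdb` and the helper `blastall_command` are hypothetical, while `-p`, `-d`, `-i`, and `-o` are the standard blastall options for program, database, query file, and output file.

```python
import subprocess

# Hypothetical UNC path of the Samba share that holds the BLAST databases.
DB_SHARE = r"\\dbserver\blastdb"


def blastall_command(program, db_name, query_file, out_file, db_share=DB_SHARE):
    """Build a blastall command line that reads the database from the share."""
    db_path = db_share.rstrip("\\") + "\\" + db_name
    return [
        "blastall",
        "-p", program,     # BLAST program, e.g. blastn or blastp
        "-d", db_path,     # database on the Samba share, not a local copy
        "-i", query_file,  # query sequences delivered with the job
        "-o", out_file,    # results file returned to the submitter
    ]


def run_blast(program, db_name, query_file, out_file):
    """Run blastall; raises CalledProcessError on a non-zero exit status."""
    subprocess.run(blastall_command(program, db_name, query_file, out_file), check=True)
```

Only the small query and result files travel with the job this way; the large database files stay on the file server.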
The Challenges of porting BLAST to SMBL (cont.) • BLAST jobs take more time to complete than FastDNAml and MEME • Disappearing worker problem • Pool machines would end up in CLAIMED/IDLE state • Size of our Condor pool made the problem hard to track • Only jobs taking more than 30 minutes were affected • Problem was determined to be state table “sessions” timing out on the machine room firewall • Machines were removed from behind the firewall and switched to a host-based iptables firewall
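The timeout effect can be illustrated with a toy model of a stateful firewall's connection table. This is a simplified Python sketch, not the behavior of any particular firewall product: the `StateTable` class is hypothetical, and the 1800-second idle timeout used in the example is an assumption suggested only by the observation that jobs longer than 30 minutes were affected.

```python
class StateTable:
    """Toy connection-tracking table with a fixed idle timeout."""

    def __init__(self, idle_timeout_s):
        self.idle_timeout_s = idle_timeout_s
        self.last_seen = {}  # connection id -> time of last packet (seconds)

    def open(self, conn, now_s):
        """Register a newly established connection."""
        self.last_seen[conn] = now_s

    def packet(self, conn, now_s):
        """Return True if a mid-stream packet is let through."""
        last = self.last_seen.get(conn)
        if last is None:
            return False  # no state entry: packet is dropped
        if now_s - last > self.idle_timeout_s:
            del self.last_seen[conn]  # entry expired while the link was silent
            return False
        self.last_seen[conn] = now_s  # traffic refreshes the entry
        return True


# A worker that reports back within the timeout keeps its session alive;
# one that stays silent for longer finds its session gone.
fw = StateTable(idle_timeout_s=1800)
fw.open("worker-1", now_s=0)
short_job_ok = fw.packet("worker-1", now_s=600)          # 10 minutes of silence
long_job_ok = fw.packet("worker-1", now_s=600 + 2400)    # 40 more minutes of silence
```

In this model a worker whose connection stays quiet for the whole of a long job cannot report its result, which is consistent with the symptom that only long-running jobs lost their workers.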
Web portal • Apache/MySQL/PHP based • Jobs are submitted via portal ONLY • Condor submit files are dynamically generated based on user input • Status of jobs can be checked using the portal • Results retrieved from the portal
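Generating a submit file from portal input might look roughly like the sketch below. It is written in Python for illustration even though the portal itself is PHP-based, and the function `render_submit_file`, the worker executable name, and its arguments are hypothetical. The submit commands used (`universe`, `executable`, `arguments`, `output`, `error`, `log`, `should_transfer_files`, `when_to_transfer_output`, `queue`) are standard Condor submit description keywords; the settings the real portal emits are not given in the slides.

```python
import re


def render_submit_file(executable, arguments, n_workers, job_tag):
    """Render a Condor submit description for n_workers identical worker jobs."""
    # job_tag comes from user input, so restrict it to a safe character set
    # before using it in file names.
    if not re.fullmatch(r"[A-Za-z0-9_-]+", job_tag):
        raise ValueError("job_tag may contain only letters, digits, '_' and '-'")
    if n_workers < 1:
        raise ValueError("n_workers must be at least 1")
    lines = [
        "universe = vanilla",
        f"executable = {executable}",
        f"arguments = {arguments}",
        f"output = {job_tag}.$(Process).out",  # one stdout file per worker
        f"error = {job_tag}.$(Process).err",
        f"log = {job_tag}.log",
        "should_transfer_files = YES",
        "when_to_transfer_output = ON_EXIT",
        f"queue {n_workers}",                  # start n_workers copies of the job
    ]
    return "\n".join(lines) + "\n"


# Hypothetical worker binary and master address, for illustration only.
submit_text = render_submit_file(
    "smbl_worker.exe", "master.example.edu 5000", n_workers=25, job_tag="blast_42"
)
```

Validating the tag before it reaches a file name matters here because the portal is the only submission path, so everything in the submit file ultimately originates from a web form.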