UNIVERSITÀ DEGLI STUDI DI MILANO
Facoltà di Scienze del Farmaco

A new distributed paradigm for parallel computing
Alessandro Pedretti

Typical scenario in a lab
[Diagram: Internet, firewall, servers, PCs, network devices; Ethernet infrastructure, 100-1000 Mbit/s]
• Several PCs with heterogeneous hardware and operating systems.
• Very high computational power, but “fragmented” across the local network.
• Hard to use all of that computational power to run a single complex calculation.

Main features
• Parallel computing without the cluster paradigm.
• Client/server architecture with hot-plug capabilities.
• Calculations can be performed with different pieces of software without changing the main code.
• Expandable through scripting languages.
• High-level database interface integrated in the main code and supporting the most common SQL database engines (Access, MySQL, SQLite, SQL Server, etc.).
• Easy configuration through a graphic interface.
• High performance and security.

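As an illustration of the database-interface feature, here is a minimal sketch of pulling molecule records from an SQL database in batches, ready to be handed to clients. It uses SQLite because it is one of the engines listed above and ships with Python. The `molecules` table, its columns, and the function names are hypothetical and are not taken from WarpEngine.

```python
import sqlite3
from typing import Iterator, List, Tuple


def build_demo_database() -> sqlite3.Connection:
    """Create an in-memory database with a hypothetical `molecules` table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE molecules (id INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany(
        "INSERT INTO molecules (name) VALUES (?)",
        [(f"mol_{i}",) for i in range(7)],
    )
    conn.commit()
    return conn


def iter_molecule_batches(
    conn: sqlite3.Connection, batch_size: int
) -> Iterator[List[Tuple[int, str]]]:
    """Yield (id, name) rows in batches of at most `batch_size` rows."""
    cursor = conn.execute("SELECT id, name FROM molecules ORDER BY id")
    while True:
        rows = cursor.fetchmany(batch_size)
        if not rows:
            break
        yield rows


conn = build_demo_database()
batches = list(iter_molecule_batches(conn, batch_size=3))
for batch in batches:
    print([name for _, name in batch])
```

Fetching in batches keeps memory use bounded when the table holds a large number of molecules, since the whole result set is never loaded at once.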
What we need …
[Diagram labels: property calculation, molecule editing, MM / MD calculations, surface mapping, trajectory analysis, file format conversion, database engine, plug-in expandability, graphic interface, scripting languages]
… to develop WarpEngine:
• High-level database interface.
• Fast customizable Web server.
• Script engine.
• Graphic environment.

Server scheme
[Diagram components: main program, project manager, job manager, database engine, client manager, UDP server, HTTP server, IP filter, VEGA ZZ core, PowerNet plug-in]
[Link to clients: TCP/IP, HTTP, broadcast; optional encrypted tunnel provided by WarpGate]

Client scheme
[Diagram components: main program, project manager, multithreaded worker, UDP client, HTTP client, VEGA ZZ core, PowerNet plug-in]
[Link to the server: TCP/IP, HTTP, broadcast]

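The server and client schemes pair a job manager with multithreaded workers. The sketch below illustrates that general pattern inside a single process: a shared queue stands in for the job manager and threads stand in for the workers. It does not use the network and is not WarpEngine's actual protocol. The names `run_jobs` and `work` are hypothetical.

```python
import queue
import threading
from typing import Callable, Dict, List


def run_jobs(
    jobs: List[int], work: Callable[[int], int], n_workers: int
) -> Dict[int, int]:
    """Run `work` on every job using `n_workers` threads pulling from one queue."""
    pending: "queue.Queue[int]" = queue.Queue()
    for job in jobs:
        pending.put(job)

    results: Dict[int, int] = {}
    lock = threading.Lock()

    def worker() -> None:
        # Each worker keeps taking jobs until the queue is empty, so a faster
        # worker naturally ends up processing more jobs than a slower one.
        while True:
            try:
                job = pending.get_nowait()
            except queue.Empty:
                return
            outcome = work(job)
            with lock:
                results[job] = outcome

    threads = [threading.Thread(target=worker) for _ in range(n_workers)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return results


results = run_jobs(list(range(20)), lambda job: job * job, n_workers=4)
print(len(results), "jobs completed")
```

Letting workers pull jobs, rather than assigning a fixed share to each in advance, is one simple way to balance load across machines of different speeds, which matches the heterogeneous hardware described earlier.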
Hardware for the test
Total: 37 cores, 42 GB RAM, > 3 TB storage
• 1 PC configured as client and server:
  • Quad-core
• 9 PCs configured as clients:
  • 1 six-core
  • 7 quad-core
  • 1 dual-core
  • 1 single-core
• Operating systems:
  • 6 Windows 7 Pro x64
  • 3 Windows 7 Pro
  • 1 Windows XP Pro
• Network connection:
  • Ethernet 100 Mbit/s

Preliminary performance test
• Communication stress test:
  • Delivery of empty jobs to the clients and receipt of the results from them.
  • 79,651.78 jobs / min.
• Apache Bench 2.0.41:
  • 100 requests with a concurrency level of 5.
  • 3,205.13 pages / sec.
  • 1.560 ms / request.
  • Microsoft IIS 6.0: 1,066.67 pages / sec, 4.688 ms / request.
• Database stress test:
  • Extraction (by SQL query), decompression and delivery of molecules to the clients, and receipt of their answers.
  • 41,115.00 molecules / min.

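The two Apache Bench figures on this slide are linked by simple arithmetic. Assuming the ms / request value is ApacheBench's mean time per request, it equals the concurrency level divided by the throughput. The short check below recomputes it from the reported pages / sec. The first set of figures is taken here to belong to WarpEngine's HTTP server, which the slide implies but does not state. The function name is hypothetical.

```python
def mean_time_per_request_ms(requests_per_sec: float, concurrency: int) -> float:
    """Mean time per request in ms for a given throughput and concurrency level."""
    return concurrency / requests_per_sec * 1000.0


CONCURRENCY = 5  # concurrency level reported on the slide

# Throughput figures reported on the slide (pages / sec).
warpengine_ms = mean_time_per_request_ms(3205.13, CONCURRENCY)
iis_ms = mean_time_per_request_ms(1066.67, CONCURRENCY)
throughput_ratio = 3205.13 / 1066.67

print(f"first figure: {warpengine_ms:.3f} ms / request")
print(f"IIS 6.0:      {iis_ms:.3f} ms / request")
print(f"throughput ratio: {throughput_ratio:.2f}")
```

Both recomputed values agree with the slide to within rounding, and the throughput ratio between the two servers comes out at roughly 3.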
Software & data for the test
• APBS – Adaptive Poisson-Boltzmann Solver
  • Calculation of solvation energy.
• PLANTS – Protein-Ligand ANT System
  • Structure-based virtual screening.
Both programs are single-threaded.
• Database of drugs in .mdb format
  • 174,398 molecules, average MW 353.70.
• Human M2 muscarinic receptor
  • PDB ID: 3UON.

Real case tests
• APBS – Solvation energy calculation.
  • 174,398 molecules, two APBS calculations for each molecule (reference and solvated state).
  • Time required by a single-thread calculation: 13 days 5 hours
  • Time required by WarpEngine: 8 hours 36 minutes
  • WarpEngine speed: 339.10 jobs / min.
• PLANTS – Virtual screening.
  • 174,398 molecules, M2 target, search speed 2.
  • Time required by a single-thread calculation: 36 days 22 hours
  • Time required by WarpEngine: 1 day 0 hours 1 minute
  • WarpEngine speed: 121.00 jobs / min.

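The speedup implied by these timings can be worked out directly. The sketch below divides each single-thread time by the corresponding WarpEngine time and relates the result to the 37 cores of the test hardware. Treating all 37 cores as equivalent is an assumption made only for this estimate, since the machines are heterogeneous and the slides do not say which machine ran the single-thread reference.

```python
from datetime import timedelta

TOTAL_CORES = 37  # total cores reported for the test hardware


def speedup(single_thread: timedelta, parallel: timedelta) -> float:
    """Ratio of single-thread wall time to parallel wall time."""
    return single_thread / parallel


# Timings as reported on the slide.
apbs_speedup = speedup(
    timedelta(days=13, hours=5), timedelta(hours=8, minutes=36)
)
plants_speedup = speedup(
    timedelta(days=36, hours=22), timedelta(days=1, minutes=1)
)

# Parallel efficiency under the equal-cores assumption stated above.
apbs_efficiency = apbs_speedup / TOTAL_CORES
plants_efficiency = plants_speedup / TOTAL_CORES

for name, s, e in [
    ("APBS", apbs_speedup, apbs_efficiency),
    ("PLANTS", plants_speedup, plants_efficiency),
]:
    print(f"{name}: speedup {s:.1f}x, efficiency {e:.1%}")
```

Both calculations come out at a speedup of roughly 37, that is, close to one unit of speedup per core under this rough estimate.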
The future …
WarpEngine is easily expandable through scripting languages, so other calculation types can be added:
• Semi-empirical calculations: MOPAC
• Ab initio calculations: FireFly / PC GAMESS
• Other virtual screening methods: AutoDock, Vina
• Rescoring of docking poses: VEGA, XScore
• Molecular mechanics calculations: AMBER, AMMP, NAMD
• Other applications …