1. 1 BOINC-VM and Volunteer Cloud Computing
Ben Segal / CERN
and:
Predrag Buncic, Jakob Blomer, Pere Mato / CERN
Carlos Aguado Sanchez, Artem Harutyunyan / CERN
Jie Wu, Wenjing Wu / CAS-IHEP, Beijing
Rohit Yadav / Banaras Hindu University
David Garcia Quintas, Yushu Yao / Lawrence Berkeley Laboratory
Jarno Rantala / Tampere University of Technology
David Weir / Imperial College, London
6th BOINC Workshop, London
August 31, 2010
2. 2 LHC@home
Calculates stability of proton orbits in CERN's new LHC accelerator
System is nonlinear and unstable, so numerically very sensitive. Hard to get identical results on all platforms
About 40 000 users and 70 000 PCs; over 1500 CPU years of processing
Objectives: extra CPU power and raising public awareness of CERN and the LHC - both successfully achieved.
Started as an outreach project for CERN's 50th Anniversary in 2004; used for the Year of Physics (Einstein Year) in 2005
4. 4 BOINC & LHC physics code
Problems with normal BOINC used for LHC physics:
A project's application(s) must be ported to each volunteer platform: most clients run Windows, but CERN runs Scientific Linux and porting to Windows is impractical. Each experiment has its own environment.
The physics code changes frequently so apps must be updated often.
The project's work must be fed into the BOINC server for distribution, and results recovered. Job submission scripts must be developed for this, but CERN physics experiments won't change their current setups.
Job management is primitive in BOINC, whereas physicists want to know where their jobs are and be able to manage them.
5. 5 BOINC & Virtualization
Summary of the basic approach
Solve client application porting problems using VMs
Use VMwrapper to link VMs to BOINC core client & server
Provide a host <-> guest-VM communication/control layer
.. and in addition ..
Solve the image size problem and physics job production interfaces using the CernVM project together with the Co-Pilot adapter system.
8. 8 Thin Software Appliance
9. 9 Key Building Blocks
rBuilder from rPath (www.rpath.org)
A tool to build VM images for various virtualization platforms
rPath Linux 1
Slim Linux OS, binary compatible with Red Hat / SLC4
rAA - rPath Linux Appliance Agent
Web user interface
XMLRPC API
Can be fully customized and extended by means of plugins (?401)
CVMFS - CernVM file system
Read-only file system optimized for software distribution
Aggressive caching
Operational in offline mode, for as long as you stay within the cache
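The caching behaviour listed above can be pictured with a toy read-through cache. This is only a sketch of the general idea, not the CVMFS implementation: the class name, the `fetch` callable and the path-hash naming of cache entries are all assumptions made for illustration.

```python
import hashlib
import os
import tempfile


class ReadThroughCache:
    """Toy read-only cache: fetch a file once, then serve it from local disk."""

    def __init__(self, cache_dir, fetch):
        self.cache_dir = cache_dir
        # Hypothetical callable: path -> bytes, raising OSError when offline.
        self.fetch = fetch
        os.makedirs(cache_dir, exist_ok=True)

    def _local(self, path):
        # Assumption: cache entries are named by a hash of the repository path.
        return os.path.join(self.cache_dir, hashlib.sha1(path.encode()).hexdigest())

    def read(self, path):
        local = self._local(path)
        if os.path.exists(local):
            # Cache hit: no network access needed.
            with open(local, "rb") as f:
                return f.read()
        data = self.fetch(path)  # cache miss: requires the network
        with open(local, "wb") as f:
            f.write(data)
        return data


# Demo with an in-memory "server" and a switchable network state.
server = {"/sw/lib.so": b"binary"}
state = {"online": True}


def fetch(path):
    if not state["online"]:
        raise OSError("offline")
    return server[path]


cache = ReadThroughCache(tempfile.mkdtemp(), fetch)
first = cache.read("/sw/lib.so")    # fetched and stored locally
state["online"] = False
second = cache.read("/sw/lib.so")   # served from the cache while offline
try:
    cache.read("/sw/other.so")      # not cached, so this fails offline
    miss_failed = False
except OSError:
    miss_failed = True
```

Reads keep working offline only for files that were fetched earlier, which is the "stay within the cache" condition.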
10. 10 CernVM File System
12. 12 Co-Pilot: CernVM's Cloud Interface
CernVM cloud worker nodes are sets of VMs created on clusters or Grids or Amazon EC2, etc.
They may be untrusted.
They each run a CernVM image with a Co-Pilot Agent inside.
(In our case nodes are simply BOINC Volunteer PCs)
The Co-Pilot Agents fetch jobs from, and return results to, an experiment's own job management system (e.g. ATLAS/PanDA, ALICE/AliEn, etc.)
BOINC work-unit scheduling is not used at all.
The BOINC volunteer just sees a (very long-running) task which returns credit regularly, in proportion to CPU resources used.
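The pull model described on this slide, where agents ask for work instead of receiving BOINC work units, might look roughly like the loop below. It is a sketch only: `JobManager`, `get_job` and `return_result` are hypothetical stand-ins, not the Co-Pilot protocol or any experiment's real interface.

```python
from collections import deque


class JobManager:
    """Hypothetical stand-in for an experiment's own job management system."""

    def __init__(self, jobs):
        self.pending = deque(jobs)   # (job_id, payload) pairs waiting to run
        self.results = {}            # job_id -> output returned by an agent

    def get_job(self):
        # Hand out the next job, or None when the queue is empty.
        return self.pending.popleft() if self.pending else None

    def return_result(self, job_id, output):
        self.results[job_id] = output


def agent_loop(manager, run):
    """Pull jobs until none remain; return how many were processed."""
    done = 0
    while True:
        job = manager.get_job()
        if job is None:
            break
        job_id, payload = job
        manager.return_result(job_id, run(payload))
        done += 1
    return done


manager = JobManager([(1, 3), (2, 4)])
count = agent_loop(manager, run=lambda n: n * n)
```

From the BOINC side this whole loop would sit inside one long-running task, so the BOINC scheduler never sees the individual jobs.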
17. 17 BOINC & Virtualization
So CernVM and Co-Pilot allow us to solve our three problems (porting, job preparation, and job management), but to run guest VMs within a BOINC host we need a cross-platform solution for (at least):
control of (multiple) VMs on a host, including:
Start|Stop|pause|resume|reset|poweroff|savestate
as well as (for the general case):
command execution on guest VMs
file transfers from guests to host (and reverse)
18. 18 BOINC & Virtualization
Details of the VM controller package:
(developed by David Garcia Quintas / LBL, ex-CERN)
Cross-platform support - based on Python (Windows, MacOSX, Linux).
Uses Python packages:
Netifaces, Stomper, Twisted, Zope, simplejson, Chirp
Does asynchronous message passing between host and guest entities via a broker (e.g. ActiveMQ). Messages are XML-RPC based.
Supports:
control of multiple VMs on a host, including:
Start|Stop|pause|resume|reset|poweroff|savestate
command execution on guest VMs
file transfers from guests to host (and reverse) using Chirp
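The control commands listed above can be read as a small state machine driven by XML-RPC-encoded messages. The sketch below uses only the Python standard library and leaves out the broker entirely. The transition table, the state names and the `VMController` class are assumptions for illustration, not the actual VM controller package.

```python
import xmlrpc.client

# Assumed transition table: (current state, command) -> new state.
TRANSITIONS = {
    ("off", "start"): "running",
    ("saved", "start"): "running",
    ("running", "pause"): "paused",
    ("paused", "resume"): "running",
    ("running", "reset"): "running",
    ("running", "stop"): "off",
    ("running", "poweroff"): "off",
    ("paused", "poweroff"): "off",
    ("running", "savestate"): "saved",
    ("paused", "savestate"): "saved",
}


def encode(command, vm_name):
    """Encode a control command as an XML-RPC method call."""
    return xmlrpc.client.dumps((vm_name,), methodname=command)


class VMController:
    """Hypothetical controller tracking the state of several VMs on one host."""

    def __init__(self):
        self.states = {}  # VM name -> state; unknown VMs count as "off"

    def handle(self, message):
        # Decode the XML-RPC call back into (parameters, method name).
        (vm_name,), command = xmlrpc.client.loads(message)
        state = self.states.get(vm_name, "off")
        new_state = TRANSITIONS.get((state, command))
        if new_state is None:
            raise ValueError(f"cannot {command} a VM that is {state}")
        self.states[vm_name] = new_state
        return new_state


controller = VMController()
history = [
    controller.handle(encode(cmd, "vm1"))
    for cmd in ("start", "pause", "savestate", "start")
]
try:
    controller.handle(encode("resume", "vm2"))  # vm2 was never started
    rejected = False
except ValueError:
    rejected = True
```

Keeping the transitions in one table makes it easy to reject commands that make no sense in the current state, for each VM independently.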
19. 19 Host to VM Guest communication
20. 20 BOINC & Virtualization
Details of the new BOINC VMwrapper:
(developed by Jarno Rantala / CERN openlab student, 2009)
Written in Python, therefore multi-platform
Uses VM controller infrastructure described above
Fully backward compatible with original BOINC Wrapper
Supports standard BOINC job.xml files
For VM case, supports extra tags in the job.xml file
Able to measure the VM guest resources and issue credit requests
Uses BOINC trickle message facility for very long-running work units
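As an illustration of the job.xml handling, the sketch below parses a standard BOINC Wrapper job description (`job_desc` / `task` / `application` / `command_line`) plus one extra VM section. The `<vm>` and `<image>` tags and the file name are invented for this example; the slides do not list the extra tags that VMwrapper actually supports.

```python
import xml.etree.ElementTree as ET

# Standard wrapper tags, plus a HYPOTHETICAL <vm> section for the VM case.
JOB_XML = """
<job_desc>
    <task>
        <application>worker</application>
        <command_line>--events 10</command_line>
    </task>
    <vm>
        <image>cernvm.vmdk</image>
    </vm>
</job_desc>
"""


def parse_job(text):
    """Return (list of task dicts, VM image name or None)."""
    root = ET.fromstring(text.strip())
    tasks = [
        {
            "application": task.findtext("application"),
            "command_line": task.findtext("command_line", default=""),
        }
        for task in root.findall("task")
    ]
    vm = root.find("vm")  # absent in a plain (non-VM) job file
    image = vm.findtext("image") if vm is not None else None
    return tasks, image


tasks, image = parse_job(JOB_XML)
plain_tasks, plain_image = parse_job(
    "<job_desc><task><application>worker</application></task></job_desc>"
)
```

A job file without the extra section still parses, which is the sense in which a wrapper of this kind can stay compatible with existing job.xml files.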
21. 21 BOINC VMwrapper architecture
22. 22 BOINC & Virtualization
The above BOINC-VM General Solution is being finished and packaged as part of a Bachelor Thesis project by:
Rohit Yadav / CERN summer student, 2010
with help from: David Garcia Quintas / LBL and ex-CERN
Has general host <-> guest-VM functionality (not restricted to BOINC)
Will support VMware as well as VirtualBox hypervisors
Supports multiple VMs per host
Supports Linux, Windows and Mac OSX platforms
Etc, etc.
23. 23 BOINC & Virtualization
Details of the new BOINC CernVMwrapper:
(developed by Jie Wu / CERN openlab student, 2010)
Written in C++, easy to port to Linux, Windows and Mac OSX
Based on original BOINC Wrapper, with enough BOINC support to run our LHC Cloud application using CernVM and Co-Pilot
Minimal VM functionality, enough to launch and control a VirtualBox VM
Uses the VirtualBox COM programming interface
Able to measure the VM guest resources and issue credit requests
Uses BOINC trickle message facility for very long-running work units
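Regular credit requests for a very long-running work unit could be derived from periodic readings of the guest's cumulative CPU time, one request per trickle message. The function below is a sketch under that assumption. The name `trickle_claims` and the credit rate are hypothetical, and it does not use the BOINC API.

```python
def trickle_claims(cpu_samples, credit_per_cpu_second=1.0):
    """Turn cumulative CPU-second readings into per-interval credit claims."""
    claims = []
    previous = 0.0
    for total in cpu_samples:
        # Claim only the CPU time used since the last reading;
        # a counter that goes backwards (e.g. VM reset) claims nothing.
        delta = max(0.0, total - previous)
        claims.append(delta * credit_per_cpu_second)
        previous = total
    return claims


# Four readings of cumulative guest CPU time, in seconds.
claims = trickle_claims([60.0, 150.0, 150.0, 300.0], credit_per_cpu_second=0.5)
```

Each claim covers one interval, so an idle VM (the third reading here) earns nothing for that interval.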
24. 24 BOINC Virtual Cloud
Summary of the method:
New BOINC wrapper (VMwrapper) used to start a guest VM in a BOINC client PC, and execute a CernVM image.
The CernVM image has all LHC software and Co-Pilot code.
Host-to-VM communication/control provided for BOINC client.
The new VMwrapper gives BOINC client and server all the functions they need - they are unaware of VMs.
The Co-Pilot connection allows LHC job production to proceed without changes, using the experiments' own job control systems.
25. 25 Building a Volunteer Cloud
Final Summary:
Solved porting problem to all client platforms
Solved image size problem
Solved job production interface problem
All done without changing existing BOINC infrastructure (client or server side)
All done without changing physicists' code or procedures
We have built a Volunteer Cloud: demo follows