Berkeley Open Infrastructure for Network Computing An open-source middleware system for volunteer and grid computing
(Many of the images and much of the text in this presentation are borrowed under fair use and/or Creative Commons from boinc.berkeley.edu, wikipedia.com, westminster.ac.uk, cs.umd.edu, and cern.ch.)
What is BOINC? • Software that lets computers put otherwise unused CPU and GPU cycles to use for computationally intensive projects, usually in science or math-related fields • Originally developed to manage the SETI@home project • Designed to address the security and malicious-user issues that arose in SETI@home before BOINC was developed • Generalized into an open platform that can run any sufficiently parallelized application
Why BOINC? • Supercomputing on the cheap (all you need is a ~$5000 server to coordinate a project) • Utilization of normally under-utilized computing resources (think about the computer labs on your campus) • Spreading of awareness • Sense of ownership and participation by the public
BOINC stats • 287,945 volunteers • 835,757 computers • 50+ scientific projects open for the public to volunteer • 24-hour average computational output: 5.752 petaFLOPS • Near infinite expandability
For comparison... • The fastest supercomputer in use in the world is the Tianhe-1A • Peak performance of 2.507 petaFLOPS • Cost $88 million to build • Costs $20 million annually to power and operate • Requires a full-time staff of 200 to operate
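The figures on the two slides above can be put side by side with a little arithmetic. This is only a rough sketch: it compares BOINC's 24-hour average output with the Tianhe-1A's peak performance, which are not the same kind of measurement, and the function names are made up for illustration.

```python
# Figures quoted on the slides above.
BOINC_PFLOPS = 5.752        # 24-hour average output
BOINC_COMPUTERS = 835_757   # computers attached to BOINC projects
TIANHE_PFLOPS = 2.507       # Tianhe-1A peak performance


def throughput_ratio():
    """How many times the Tianhe-1A peak the BOINC average represents."""
    return BOINC_PFLOPS / TIANHE_PFLOPS


def gflops_per_computer():
    """Average contribution of one volunteer computer, in gigaFLOPS."""
    return BOINC_PFLOPS * 1e6 / BOINC_COMPUTERS  # 1 petaFLOPS = 1e6 gigaFLOPS


if __name__ == "__main__":
    print(f"BOINC / Tianhe-1A: {throughput_ratio():.2f}x")
    print(f"Average per computer: {gflops_per_computer():.2f} GFLOPS")
```

By these numbers the volunteer pool delivers a bit more than twice the supercomputer's peak, at an average of a few gigaFLOPS per computer.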
Volunteer computing vs Grid computing • In Grid computing, organizations share resources. Any organization that is part of a grid computing effort can act as either a producer or a consumer of resources. • In Volunteer computing, volunteer individuals or organizations act as producers of resources, and only the coordinating organization may act as a consumer of resources.
How is BOINC used? • Volunteer computing projects for science and math • Virtual campus supercomputing centers (e.g. the University of Westminster in London) • Desktop grids for business • Integration with Condor to allow Globus-based grids to run jobs for BOINC projects (e.g. the Open Science Grid Einstein@OSG project)
Basic overview of BOINC jobs • Your PC gets a set of tasks from the project's scheduling server. Available tasks are constrained by the capabilities of your PC. • Your PC downloads executable and input files from the project's data server. If the project releases new versions of its applications, the executable files are downloaded automatically to your PC. • Your PC runs the application programs, producing output files. • Your PC uploads the output files to the data server. • Later, your PC reports the completed tasks to the scheduling server, and gets new tasks.
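The five steps above can be sketched as a toy simulation. This is not the BOINC client or server code: `Task`, `Project`, and `run_client_cycle` are hypothetical names, the "servers" are plain in-memory objects, and memory is used as the only example of a PC capability.

```python
from dataclasses import dataclass, field


@dataclass
class Task:
    name: str
    input_file: str
    min_memory_mb: int  # hypothetical capability requirement


@dataclass
class Project:
    """Hypothetical in-memory stand-in for the scheduling and data servers."""
    queue: list                                   # tasks waiting to be sent
    inputs: dict                                  # data server: file name -> contents
    uploads: dict = field(default_factory=dict)   # uploaded output files
    reported: list = field(default_factory=list)  # tasks reported as complete

    def request_tasks(self, host_memory_mb):
        # Scheduling server: hand out only tasks this host is able to run.
        eligible = [t for t in self.queue if t.min_memory_mb <= host_memory_mb]
        self.queue = [t for t in self.queue if t.min_memory_mb > host_memory_mb]
        return eligible


def run_client_cycle(project, host_memory_mb, app):
    tasks = project.request_tasks(host_memory_mb)   # 1. get tasks
    finished = []
    for task in tasks:
        data = project.inputs[task.input_file]      # 2. download input
        output = app(data)                          # 3. run the application
        project.uploads[task.name] = output         # 4. upload output
        finished.append(task.name)
    project.reported.extend(finished)               # 5. report completed tasks
    return finished
```

For example, a host with 4096 MB of memory would receive a task that needs 1024 MB but not one that needs 8192 MB, which stays in the queue for a more capable PC.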
The BOINC client Notices tab: displays news from the projects in which you participate
The BOINC client Projects tab: overview of all your projects
The BOINC client Tasks tab: info on specific tasks within each project
The BOINC client Transfers tab: shows status of file transfers for all projects
The BOINC client Statistics tab: graphs your contribution to projects over time
The BOINC client Disk tab: shows BOINC's overall hard disk usage, and usage of each project
Using the BOINC client Selecting a project to volunteer for
Using the BOINC client Preferences dialog controls CPU, GPU, disk, network, and memory usage so that BOINC only consumes resources as directed by the user.
For each application, the BOINC core client creates a segment of shared memory that is used to pass messages between the core client and the application.
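As a rough illustration of message passing through shared memory, the sketch below uses Python's standard `multiprocessing.shared_memory` module. It is not BOINC's implementation: the length-prefixed message layout, the 1024-byte segment size, and the "suspend" message are all assumptions made for the example.

```python
import struct
from multiprocessing import shared_memory

# Assumed layout: a 4-byte little-endian length, then the UTF-8 message bytes.
HEADER = struct.Struct("<I")


def write_message(shm, text):
    data = text.encode("utf-8")
    if HEADER.size + len(data) > shm.size:
        raise ValueError("message too large for segment")
    HEADER.pack_into(shm.buf, 0, len(data))
    shm.buf[HEADER.size:HEADER.size + len(data)] = data


def read_message(shm):
    (length,) = HEADER.unpack_from(shm.buf, 0)
    return bytes(shm.buf[HEADER.size:HEADER.size + length]).decode("utf-8")


def demo():
    # The "core client" side creates the segment...
    client_side = shared_memory.SharedMemory(create=True, size=1024)
    try:
        write_message(client_side, "suspend")
        # ...and the "application" side attaches to it by name.
        app_side = shared_memory.SharedMemory(name=client_side.name)
        try:
            return read_message(app_side)
        finally:
            app_side.close()
    finally:
        client_side.close()
        client_side.unlink()


if __name__ == "__main__":
    print(demo())
```

In a real setup the two sides would live in separate processes; here both handles are opened in one process only to keep the example self-contained.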
The BOINC client keeps track of how many CPU cycles are used in computation, and reports this information back to the project server. • When at least 2 clients have reported completion of a task, the lower of their reported CPU cycles is used to calculate credit. • BOINC's unit of credit, the Cobblestone (named after Jeff Cobb of SETI@home), is 1/200 of a day of CPU time on a reference computer that does 1,000 MFLOPS based on the Whetstone benchmark.
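The credit rule above works out to 200 Cobblestones per day of CPU time on the reference computer. The sketch below follows that definition; the function names are hypothetical, and scaling the claim by the host's own Whetstone score is an assumption about how a faster or slower host would be normalized, not something stated on the slide.

```python
SECONDS_PER_DAY = 86_400
COBBLESTONES_PER_REFERENCE_DAY = 200
REFERENCE_MFLOPS = 1_000  # Whetstone score of the reference computer


def claimed_credit(cpu_seconds, host_whetstone_mflops=REFERENCE_MFLOPS):
    """Cobblestones for cpu_seconds of work (host scaling is an assumption)."""
    reference_days = (cpu_seconds / SECONDS_PER_DAY) * (
        host_whetstone_mflops / REFERENCE_MFLOPS
    )
    return reference_days * COBBLESTONES_PER_REFERENCE_DAY


def granted_credit(reported_cpu_seconds, host_whetstone_mflops=REFERENCE_MFLOPS):
    """Use the lowest report once at least two clients have finished the task."""
    if len(reported_cpu_seconds) < 2:
        return None  # not enough reports yet
    return claimed_credit(min(reported_cpu_seconds), host_whetstone_mflops)
```

Under this reading, one Cobblestone corresponds to 432 seconds of CPU time on the reference computer, and a task reported at 900 and 864 seconds would be credited for the 864-second report.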
If an application encounters an error, its standard error output is written to a file and transmitted back to the project server for analysis. • If an application crashes or is aborted, a stack trace is written to standard error.
Generating work in BOINC (server side) Multiple jobs with different input files (file_1, file_2, etc)
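The idea of one job per input file can be sketched as below. This only builds plain job descriptions in memory; it does not call any BOINC server tool, and `make_jobs`, the dictionary fields, and the naming scheme are hypothetical.

```python
def make_jobs(app_name, input_files):
    """Build one job description per input file (hypothetical record format)."""
    return [
        {"name": f"{app_name}_{index}", "app": app_name, "input_file": input_file}
        for index, input_file in enumerate(input_files, start=1)
    ]


if __name__ == "__main__":
    for job in make_jobs("example_app", ["file_1", "file_2", "file_3"]):
        print(job)
```

Each job runs the same application on a different input file, which is what makes this kind of workload easy to spread across many volunteer computers.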
Popular BOINC projects • SETI@home - analysis of radio telescope data, looking for patterns that may indicate the presence of extra-terrestrial intelligent life • Folding@home - simulates protein folding for biological research (runs on its own client software rather than BOINC) • Climateprediction.net - forecasts weather and other climate conditions • Einstein@home - search for spinning neutron stars (also called pulsars) using data from the LIGO and GEO gravitational wave detectors
Popular BOINC projects • MilkyWay@home - 3D modeling of the Milky Way galaxy • PrimeGrid - discovers large prime numbers • World Community Grid - general purpose grid for humanitarian research • Rosetta@home - tries to determine the 3-dimensional shapes of proteins in research that may ultimately lead to finding cures for some major human diseases
Atom smasher?? It may not look like much, but this computer simulates high-energy particle collisions every day.
Test4theory • A scientific application that uses BOINC as middleware to facilitate volunteer computing. • A project of CERN that runs simulations of Large Hadron Collider experiments on volunteer machines. • Computer simulations of high-energy particle collisions provide a detailed theoretical reference for the measurements performed at accelerators like the Large Hadron Collider (LHC), against which models of both known and 'new' physics can be tested, down to the level of individual particles.
By looking for discrepancies between the simulations and the data, we are searching for any sign of disagreement between the current theories and the physical universe. Ultimately, such a disagreement could lead us to the discovery of new phenomena, which may be associated with new fundamental principles of Nature. • Less spectacular discrepancies also help guide us towards the most accurate possible description of the Standard Model of Particle Physics and its phenomena, refining the simulations of the known physical laws by pointing to areas where current simulations succeed and where they fail.
Top 10 test4theory participants … and me, ranked #451.
Campus Supercomputing Grid - University of Westminster • Grid consists of ~1500 desktop PCs in labs all over campus • Represents only about half of the PCs owned by the university • Computing power equivalent to a £500,000 cluster procurement or supercomputer • Cost to maintain is negligible: PCs on the grid are replaced and upgraded as normal from existing budgets. • Environmental impact and energy usage are also negligible (for the same reasons)
Campus Supercomputing Grid - University of Westminster • PCs simply need to have the BOINC client installed. Settings can be locked down by campus IT admins so that only the campus's own BOINC projects can be run on university-owned PCs. • Project is coordinated from a single web server housed on campus. • Any university department with projects that can benefit from supercomputing resources can use the grid.
University of Maryland: The Lattice Project • The Lattice Project incorporates the Globus toolkit and BOINC services with a higher-level grid scheduler. • Jobs are submitted to the grid, and are then assigned to either the pool of BOINC clients, a cluster master node, or an instance of Condor. • Distribution of jobs is based on estimated run time and other resources needed. • Since its inception, The Lattice Project has performed 21,064.71 CPU years of computation.
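A scheduler that routes jobs by estimated run time and resource needs might look roughly like the sketch below. The rules and thresholds here are invented purely for illustration and are not The Lattice Project's actual policy; `choose_resource`, the 16 GB memory limit, and the 12-hour cutoff are all hypothetical.

```python
# Hypothetical thresholds, chosen only to make the example concrete.
MAX_DESKTOP_MEMORY_GB = 16
MAX_BOINC_HOURS = 12


def choose_resource(estimated_hours, memory_gb):
    """Pick a target for a job: 'boinc', 'condor', or 'cluster'."""
    if memory_gb > MAX_DESKTOP_MEMORY_GB:
        return "cluster"   # too large for a typical desktop PC
    if estimated_hours <= MAX_BOINC_HOURS:
        return "boinc"     # short job that fits on a volunteer machine
    return "condor"        # longer job that still fits on a desktop-class node


if __name__ == "__main__":
    for hours, memory in [(2, 4), (48, 8), (6, 64)]:
        print(hours, memory, choose_resource(hours, memory))
```

The point is only the shape of the decision: each submitted job carries an estimate of its run time and resource needs, and the grid-level scheduler uses those estimates to pick one of the three back ends.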
Applications that run on the Lattice Project