Monte Carlo simulation for radiotherapy in a distributed computing environment. S. Chauvie (2,3), S. Guatelli (2), A. Mantero (2), J. Moscicki (1), M.G. Pia (2). Affiliations: (1) CERN, (2) INFN, (3) S. Croce e Carle Hospital, Cuneo. Monte Carlo 2005, 18-21 April 2005, Chattanooga, TN, USA.
Monte Carlo methods in radiotherapy: Monte Carlo methods have been explored for years as a tool for precise dosimetry, as an alternative to analytical methods. De facto, Monte Carlo simulation is not used in clinical practice (only in side studies). The major limiting factor is speed.
The reality: treatment planning is performed by means of commercial software, which calculates the dose distribution delivered to the patient. Commercial systems are based on analytical methods. Advantages: quick response. Disadvantages: they fail to calculate the dose in heterogeneities and for small or complex fields. Open issues: each treatment planning software is specific to one radiotherapy technique.
Project: develop a dosimetric system for radiotherapy treatments based on Monte Carlo methods. Calculation precision: Geant4 as simulation toolkit. Quick response: parallelisation and access to distributed computing resources.
Pilot project: distributed simulation for brachytherapy
• Explore Geant4-based Monte Carlo simulations in a distributed computing environment
• Parallel execution in a local PC farm
• Geographically distributed execution on the GRID
• Pilot project based on an existing simulation for brachytherapy
• Focus on architectural design
• Transparent execution on a single machine, in parallel on a local farm or on the GRID
• Preliminary evaluation of performance
• Application to other radiotherapy simulations currently in progress
Brachytherapy
• Simulation of the energy deposited by a radioactive source in a phantom
• Requirement from clinical practice: real-time response
[Figure: dose distribution in the plane containing the radioactive source (Bebig Isoseed I-125 source)]
Talk: “A general purpose dosimetric system for brachytherapy”, 20th April, MC 2005, Room 5
Performance in sequential mode (on an “average” PIII machine)
• Endocavitary brachytherapy: 1M events, 61 minutes
• Superficial brachytherapy: 1M events, 65 minutes
• Interstitial brachytherapy: 1M events, 67 minutes
At this speed, Monte Carlo simulation is not practical for clinical application, even though it is more precise.
Speed adequate for clinical use. Parallelisation: transparent configuration in sequential or parallel mode. Access to distributed computing resources: transparent access to the GRID through an intermediate software layer.
Access to distributed computing: Geant4 simulation and Anaphe analysis on a dedicated Beowulf cluster (S. Chauvie et al., IRCC Torino, Siena 2002). Speed OK, but expensive hardware investment and maintenance. [Diagram labels: Hospital LAN, switch, Node01, Node02, Node03, Node04; IMRT]
Access to distributed computing, alternative strategy: DIANE. Transparent access to a distributed computing environment: parallelisation and access to the GRID.
Active Workflow Framework for Parallel Jobs
• Applications run inside an Active Workflow Framework
• For applications:
  • the underlying environment is transparent
  • code changes to use the framework are minimal
• The Framework provides:
  • automatic communication and synchronization of tasks
  • error recovery
  • optimization
DIANE (DIstributed ANalysis Environment): prototype for an intermediate layer between applications and the GRID. Developed by J. Moscicki, CERN. It hides the complex details of the underlying technology.
Parallel cluster processing
• make fine tuning and customisation easy
• transparently using GRID technology
• application independent
http://cern.ch/DIANE
DIANE architecture: Master-Worker model. Parallel execution of independent tasks, very typical in many scientific applications and usually applied in local clusters. R&D in progress for large-scale Master-Worker computing.
Master-Worker Computing
• Workers are started up and register to the Master
• Client connects to the Master and starts up the job
• Master controls the execution, dispatches tasks to Workers and combines the results
• Client receives notifications about the current status of the job and collects the final result
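The master-worker flow above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it uses the standard-library thread pool in place of DIANE, and `run_task`, `run_job` and the task sizes are hypothetical names and values, not part of DIANE or Geant4.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def run_task(task_id, n_events):
    """Hypothetical worker task: stands in for simulating n_events events."""
    # A real worker would run the simulation; here we only report the count.
    return {"task_id": task_id, "events": n_events}


def run_job(n_tasks, events_per_task, n_workers):
    """Master: dispatch independent tasks to workers and combine the results."""
    total_events = 0
    completed = []
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        futures = [pool.submit(run_task, i, events_per_task) for i in range(n_tasks)]
        # Results are combined as soon as each task finishes, in any order.
        for future in as_completed(futures):
            result = future.result()
            total_events += result["events"]
            completed.append(result["task_id"])
    return total_events, sorted(completed)


# Client side: start the job and collect the final result.
job_total, job_tasks = run_job(n_tasks=8, events_per_task=1000, n_workers=4)
print(job_total, job_tasks)
```

Because the tasks are independent, the order in which workers finish does not affect the combined result.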
Running in a distributed environment
• The original code of the application is not affected
  • the standalone and distributed cases use the same code
• Good separation of the subsystems
  • the application does not need to know that it runs in a distributed environment
  • the distributed framework (DIANE) does not need to care about what actions an application performs internally
DIANE shields the application developer from the complexity of the underlying technology.
Distributed environments: local computing farm, GRID
Parallel mode: local cluster / GRID
• Both applications have the same computing model
  • a job consists of a number of independent tasks which may be executed in parallel
  • the result of each task is a small data packet (a few kilobytes), which is merged as the job runs
• In a cluster:
  • computing resources are used for parallel execution
  • user connects to a possibly remote cluster
  • input data for the job must be available on the site
  • typically there is a shared file system and a queuing system
  • network is fast
• GRID computing uses resources from multiple computing centres
  • typically there is no shared file system
  • (parts of) input data must be replicated in remote sites
  • network connection is slower than within a cluster
Development costs
• Strategy to minimise the cost of migrating a Geant4 simulation to a distributed environment for users
• DIANE Active Workflow framework
  • provides automatic communication/synchronization mechanisms
  • application is “glued” to the framework using a small Python module; in most cases no code changes to the original application are required
  • load balancing and error recovery policies may be plugged in, in the form of simple Python functions
• Transparent adaptation for Clusters/GRIDs, shared/local file systems, shared/private queues
• Cost in the runtime phase:
  • near zero (except for loading networking libraries for the first time)
• Development/modification of application code
  • original source code unmodified
  • addition of an interface class which binds together application and M-W framework
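To illustrate what "policies plugged in as simple Python functions" could look like, here is a minimal sketch. It does not use or reproduce the DIANE API: `least_loaded`, `retry_up_to`, `run_with_policies` and `flaky_execute` are hypothetical names, and the failure is simulated.

```python
def least_loaded(load):
    """Load-balancing policy: pick the worker with the fewest assigned tasks."""
    return min(load, key=load.get)


def retry_up_to(max_attempts):
    """Error-recovery policy factory: allow a limited number of attempts per task."""
    def policy(task, attempts):
        return attempts < max_attempts
    return policy


def run_with_policies(tasks, workers, execute, pick_worker, should_retry):
    """Toy scheduler (hypothetical): the two policies are ordinary functions."""
    load = {w: 0 for w in workers}
    results, failed = {}, []
    for task in tasks:
        attempts = 0
        while True:
            worker = pick_worker(load)
            load[worker] += 1
            attempts += 1
            try:
                results[task] = execute(task, worker)
                break
            except RuntimeError:
                # Ask the error-recovery policy whether to resubmit the task.
                if not should_retry(task, attempts):
                    failed.append(task)
                    break
    return results, failed, load


_already_failed = set()


def flaky_execute(task, worker):
    """Simulated task execution: task 3 fails once, then succeeds."""
    if task == 3 and task not in _already_failed:
        _already_failed.add(task)
        raise RuntimeError("simulated worker failure")
    return task * 10


results, failed, load = run_with_policies(
    range(5), ["w1", "w2"], flaky_execute, least_loaded, retry_up_to(2)
)
print(results, failed, load)
```

Swapping in a different policy means passing a different function; the scheduler and the application code stay unchanged, which is the property the slide emphasises.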
Interfacing a Geant4 simulation to DIANE [Figure: UML deployment diagram for Geant4 applications]
Practical example: G4 simulation with analysis
• Each task produces a file with histograms
• The job result is the sum of histograms produced by tasks
• Master-worker model
  • client starts a job
  • workers perform tasks and produce histograms
  • master integrates the results
• Distributed Processing for Geant4 Applications
  • task = N events
  • job = M tasks
  • tasks may be executed in parallel
  • tasks produce histograms/ntuples
  • task output is automatically combined (add histograms, append ntuples)
• Master-Worker Model
  • Master steers the execution of the job, automatically splits the job and merges the results
  • Worker initializes the Geant4 application and executes macros
  • Client gets the results
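The "job = M tasks, task = N events, add histograms" model can be sketched as follows. This is a toy illustration, not Geant4 or DIANE code: `simulate_task` fills a histogram with pseudo-random entries as a stand-in for a real simulation, and all names, the bin count and the seeds are assumptions.

```python
import random


def simulate_task(n_events, seed, n_bins=5):
    """Hypothetical task: fill a histogram with n_events pseudo-random entries."""
    rng = random.Random(seed)  # one independent seed per task
    hist = [0] * n_bins
    for _ in range(n_events):
        hist[rng.randrange(n_bins)] += 1
    return hist


def merge_histograms(total, part):
    """Combine task output by adding histograms bin by bin."""
    return [a + b for a, b in zip(total, part)]


def run_histogram_job(m_tasks, n_events, n_bins=5):
    """Job = M tasks of N events each; the result is the sum of the task histograms."""
    job_hist = [0] * n_bins
    for task_id in range(m_tasks):
        job_hist = merge_histograms(job_hist, simulate_task(n_events, seed=task_id, n_bins=n_bins))
    return job_hist


job_hist = run_histogram_job(m_tasks=10, n_events=500)
print(job_hist, sum(job_hist))
```

Since bin-wise addition is associative and commutative, the tasks can run in any order or in parallel and the merged histogram is the same.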
DIANE Prototype and Testing
• Scalability tests
  • 70 worker nodes
  • 140 million Geant4 events
Performance: parallel mode (preliminary: further optimisation in progress)
• Endocavitary brachytherapy: 1M events, 4 minutes 34’’
• Superficial brachytherapy: 1M events, 4 minutes 25’’
• Interstitial brachytherapy: 5M events, 4 minutes 36’’
On up to 50 workers, LSF at CERN, PIII machines, 500-1000 MHz.
Performance is adequate for clinical application, but it is not realistic to expect any hospital to own and maintain a PC farm.
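As a rough worked example, the parallel timings can be compared with the sequential timings for 1M events reported earlier in the talk (61 and 65 minutes). The sketch below only does the arithmetic. It assumes the two measurements are comparable, which the slides do not guarantee, since the sequential run used an "average" PIII and the parallel run used "up to 50" workers of varying clock speed.

```python
def to_seconds(minutes, seconds=0):
    """Convert a minutes/seconds timing to seconds."""
    return minutes * 60 + seconds


# Timings for 1M events, as quoted on the slides.
sequential = {"endocavitary": to_seconds(61), "superficial": to_seconds(65)}
parallel = {"endocavitary": to_seconds(4, 34), "superficial": to_seconds(4, 25)}

MAX_WORKERS = 50  # "up to 50 workers": an upper bound, not the exact count

speedup = {name: sequential[name] / parallel[name] for name in sequential}
# Efficiency relative to 50 workers is therefore a conservative estimate.
efficiency = {name: speedup[name] / MAX_WORKERS for name in speedup}

for name in speedup:
    print(f"{name}: speedup {speedup[name]:.1f}x, efficiency >= {efficiency[name]:.0%}")
```

Under these assumptions the speedup is roughly 13x to 15x, consistent with the slide's note that the numbers are preliminary and further optimisation was in progress.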
Parallel mode: distributed resources. Distributed Geant4 simulation: DIANE framework and generic GRID middleware.
Grid: wave of interest in grid technology as a basis for a “revolution” in e-Science and e-Commerce. From Ian Foster and Carl Kesselman's book: “A computational Grid is a hardware and software infrastructure that provides dependable, consistent, pervasive and inexpensive access to high-end computational capabilities”. An infrastructure and standard interfaces capable of providing transparent access to geographically distributed computing power and storage space in a uniform way. Many GRID R&D projects, many related to HEP: US projects, European projects.
Running on the GRID
• Via DIANE
• Same application code as running on a sequential machine or on a dedicated cluster
  • completely transparent to the user
A hospital is not required to own and maintain extensive computing resources to exploit the scientific advantages of Monte Carlo simulation for radiotherapy.
Any hospital (even small ones, or those in less wealthy countries, that cannot afford expensive commercial software systems) may have access to advanced software technologies and tools for radiotherapy.
Traceback from a run on the CrossGrid testbed: 5000 events, 2 workers, 10 tasks (500 events each). Resource broker running in Portugal, matchmaking CrossGrid computing elements in Spain, Greece, Poland and Portugal.
Current #Grid setup (computing elements):
- aocegrid.uab.es:2119/jobmanager-pbs-workq
- bee001.ific.uv.es:2119/jobmanager-pbs-qgrid
- cgnode00.di.uoa.gr:2119/jobmanager-pbs-workq
- cms.fuw.edu.pl:2119/jobmanager-pbs-workq
- grid01.physics.auth.gr:2119/jobmanager-pbs-workq
- xg001.inp.demokritos.gr:2119/jobmanager-pbs-workq
- xgrid.icm.edu.pl:2119/jobmanager-pbs-workq
- zeus24.cyf-kr.edu.pl:2119/jobmanager-pbs-infinite
- zeus24.cyf-kr.edu.pl:2119/jobmanager-pbs-long
- zeus24.cyf-kr.edu.pl:2119/jobmanager-pbs-medium
- zeus24.cyf-kr.edu.pl:2119/jobmanager-pbs-short
- ce01.lip.pt:2119/jobmanager-pbs-qgrid
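The job configuration quoted for this run (5000 events in 10 tasks of 500 events each) corresponds to an even split of the event count. A minimal sketch of such a split is shown below. `split_job` is a hypothetical helper, not a DIANE function, and the handling of a remainder is an assumption, since the slide only shows an evenly divisible case.

```python
def split_job(total_events, n_tasks):
    """Split total_events into n_tasks task sizes that differ by at most one event."""
    base, extra = divmod(total_events, n_tasks)
    # The first `extra` tasks each take one additional event.
    return [base + (1 if i < extra else 0) for i in range(n_tasks)]


task_sizes = split_job(5000, 10)
print(task_sizes)

uneven_sizes = split_job(5003, 10)
print(uneven_sizes)
```

With 2 workers and 10 tasks, each worker would process several tasks in turn, which lets a faster worker pick up more of them.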
Study in progress
• Capability of transparent execution of the radiotherapy simulation on the GRID has been demonstrated
• Quantitative evaluation of performance (speed and stability) currently in progress
• A comprehensive study will be submitted for publication in the coming weeks
• Optimisation of load balancing, error handling and other issues concerning access to distributed resources currently under study
Application to IMRT simulations
• Determine the dose distribution in a phantom generated by the head of a linear accelerator
• Requirement from clinical practice: fast response
Without parallelisation: 10^10 events, 100 CPU days on a Pentium IV 3 GHz
[Figure: lateral profile, 6 MV, 5x5 cm field, 15 mm depth]
Talk: “Geant4 Simulation of an Accelerator Head for Intensity Modulated RadioTherapy”, 19th April, MC 2005, Room 6
Conclusions
• Fast performance
  • parallel processing
• Access to geographically distributed computing resources
  • GRID
• Demonstrated with Geant4 simulation applications + DIANE
• More information
  • cern.ch/diane
  • http://www.ge.infn.it/geant4
  • www.ge.infn.it/geant4/techtransf
  • aida.freehep.org