240 likes | 410 Views
Parallel Tomography. Shava Smallen CSE Dept. U.C. San Diego. Parallel Tomography. Tomography: reconstruction of 3D image from 2D projections Used for electron microscopy at National Center for Microscopy and Imaging Research (NCMIR) Coarse-grain, embarrassingly parallel application
E N D
Parallel Tomography Shava Smallen CSE Dept. U.C. San Diego
Parallel Tomography • Tomography: reconstruction of 3D image from 2D projections • Used for electron microscopy at National Center for Microscopy and Imaging Research (NCMIR) • Coarse-grain, embarrassingly parallel application • Goal of project: Achieve performance using AppLeS scheduling and Globus services
driver preprocessor ptomo source reader writer dest ptomo ptomo Parallel Tomography Structure Off-line Solid lines = data flow dashed lines = control
driver ptomo source writer dest ptomo reader ptomo Parallel Tomography Structure On-line preprocessor Solid lines = data flow dashed lines = control
Parallel Tomography Structure • driver: directs communication among processes, controls work queue • preprocessor: serially formats raw data from microscope for parallel processing • reader: reads preprocessed data and passes it to a ptomo process • ptomo: performs tomography processing • writer: writes processed data to destination (i.e. visualization device, tape)
NCMIR Environment • Platform: collection of heterogeneous resources • workstations (e.g. sgi, sun) • NPACI supercomputer time • (e.g. SP-2, T3E) • How can users achieve execution performance in heterogeneous, multi-user environments?
Performance and Scheduling • Users want to achieve minimum turnaround time in the following scenarios: • off-line: already collected data set • on-line: data streamed from electron microscope • Goal of our project is to develop adaptive scheduling strategies which promote both performance and flexibility for tomography in multi-user Globus environments
Adaptive Scheduling strategy • Develop schedule which adapts to deliverable resource performance at execution time • Application scheduler will dynamically • select sets of resources based on user-defined performance measure • plan possible schedules for each set of feasible resources • predict the performance for each schedule • implement best predicted schedule on selected infrastructure
AppLeS = Application-Level Scheduling • Each AppLeS is an adaptive application scheduler • AppLeS+application = self-scheduling application • scheduling decisions based on • dynamic information (i.e. resource load forecasts) • static application and system information
Resource Selection • available resources • workstations • run immediately • execution may be slow due to load • supercomputers • may have to wait in a queue • execution fast on dedicated nodes • We want to schedule using both types of resources together for an improved execution performance
Allocation Strategy • We have developed a strategy which simultaneously schedules on both • workstations • immediately available supercomputer nodes • avoid wait time in the batch queue • information is exported by batch scheduler • Overall, this strategy performs better than running on either type of resource alone
SDSC SP-2 PCL W W W W W W NCMIR W W W W W W Preliminary Globus/AppLeS Tomography Experiments • Resources • 6 workstations available at Parallel Computation Laboratory (PCL) at UCSD • immediately available nodes on SDSC SP-2 (128 nodes) • Maui scheduler exports the number of immediately available nodes • e.g. 5 nodes available for the next 30 mins 10 nodes available for the next 10 mins • Globus installed everywhere
Allocation Strategies compared • 4 strategies compared: • SP2Immed/WS: workstations and immediately available SP-2 nodes • WS: workstations only • SP2Immed: immediately available SP-2 nodes only • SP2Queue(n): traditional batch queue submit using n nodes • experiments performed in production environment • ran experiments in sets, each set contains all strategies • e.g. SP2Immed, SP2Immed/WS, WS, SP2Queue(8) • within a set, experiments ran back-to-back
Targeting Globus • AppLeS uses Globus services • GRAM, GSI, & RSL • process startup on workstations and batch queue systems • remote process control • Nexus • interprocess communication • multi-threaded • callback functions for fault tolerance
Experiences with Globus • What we would like to see improved: • free node information in MDS (e.g. time availability, frequency) • steep learning curve to initial installation • knowledge of underlying Globus infrastructure • startup scripts • more documentation • more flexible configuration
Experiences with Globus • What worked: • once installed, software works well • responsiveness and willingness to help from Globus team • mailing list • web pages
Future work • Develop contention model to address network overloading which includes • network capacity information • model of application communication requirements • Expansion of allocation policy to include additional resources • other Globus supercomputers, reserved resources (GARA) • availability of resources • NPACI Alpha project with NCMIR and Globus
AppLeS • AppLeS/NCMIR/Globus Tomography Project • Shava Smallen, Jim Hayes, Fran Berman, Rich Wolski, Walfredo Cirne (AppLeS) • Mark Ellisman, Marty Hadida-Hassan, Jaime Frey (NCMIR), • Carl Kesselman, Mei-Hui Su (Globus) • AppLeS Home Page • www-cse.ucsd.edu/groups/hpcl/apples • ssmallen@cs.ucsd.edu