
Cost Effective Parallel Seismic Computing Using a PC Cluster

Paul L. Stoffa, The University of Texas at Austin, Institute for Geophysics, 4412 Spicewood Springs Road, Building 600, Austin, Texas 78759-8500


Presentation Transcript


  1. Cost Effective Parallel Seismic Computing Using a PC Cluster. Paul L. Stoffa, The University of Texas at Austin, Institute for Geophysics, 4412 Spicewood Springs Road, Building 600, Austin, Texas 78759-8500.

  2. Abstract. Nearly every seismic imaging algorithm falls into the class of 'embarrassingly parallel' problems. Decomposition of seismic data into frequency slices, plane-wave volumes, or simply spatial cubes makes it straightforward to design parallel algorithms that take advantage of these global decompositions. However, important software-design compromises must be made to optimize the input/output of data to and from mass storage devices. Recent experience with an SGI Origin 2000 and a PC cluster using Alpha processors shows that a very cost-effective solution can be achieved for large seismic imaging problems without the need for, and cost of, a shared-memory architecture.

  3. For small seismic images that fit in the combined memory of the available nodes, a multi-node in-memory transpose can be performed efficiently, as sketched below.
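A minimal sketch of such an in-memory transpose, assuming an n x n matrix whose columns are divided evenly among the nodes and using MPI_Alltoall for the block exchange. The routine and variable names are illustrative, not those of the UTIG library:

    ! Distributed in-memory transpose: each node owns nloc = n/nprocs
    ! columns of an n x n matrix.  Blocks are exchanged with MPI_Alltoall,
    ! then each received nloc x nloc block is transposed locally.
    subroutine dist_transpose(a, at, n, nloc)
      use mpi
      implicit none
      integer, intent(in) :: n, nloc          ! nloc = n / nprocs
      real, intent(in)  :: a(n, nloc)         ! local column slab of A
      real, intent(out) :: at(n, nloc)        ! local column slab of transpose(A)
      real, allocatable :: sbuf(:,:,:), rbuf(:,:,:)
      integer :: nprocs, p, i, j, ierr
      call MPI_Comm_size(MPI_COMM_WORLD, nprocs, ierr)
      allocate(sbuf(nloc, nloc, nprocs), rbuf(nloc, nloc, nprocs))
      do p = 1, nprocs                        ! pack: block p goes to node p-1
        sbuf(:, :, p) = a((p-1)*nloc+1 : p*nloc, :)
      end do
      call MPI_Alltoall(sbuf, nloc*nloc, MPI_REAL, &
                        rbuf, nloc*nloc, MPI_REAL, MPI_COMM_WORLD, ierr)
      do p = 1, nprocs                        ! unpack with a local block transpose
        do j = 1, nloc
          do i = 1, nloc
            at((p-1)*nloc + i, j) = rbuf(j, i, p)
          end do
        end do
      end do
      deallocate(sbuf, rbuf)
    end subroutine dist_transpose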

  4. [Figure: decomposition of the data volume over the plane-wave coordinates, spanning (Px1, Py1) to (Px2, Py2).]

  5. [Figure: decomposition of the data volume over the offset coordinates, spanning (Ox1, Oy1) to (Ox2, Oy2).]

  6. Programming is accomplished using a suite of Fortran90 subroutines built upon MPI or PVM. These perform the basic data-transfer functions between nodes. A basic node-to-node transfer and a broadcast from one node to all other nodes are two of the fundamental building blocks for all the parallel seismic imaging algorithms, as sketched below.
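To make the two building blocks concrete, here is a minimal Fortran90/MPI sketch. Only standard MPI calls are used; the routine and argument names are illustrative, not the library's:

    ! Minimal sketch of the two building blocks on top of MPI.
    ! Point-to-point transfer of a trace buffer between two nodes.
    subroutine node_to_node(buf, n, src, dst)
      use mpi
      implicit none
      integer, intent(in) :: n, src, dst
      real, intent(inout) :: buf(n)
      integer :: rank, ierr, status(MPI_STATUS_SIZE)
      call MPI_Comm_rank(MPI_COMM_WORLD, rank, ierr)
      if (rank == src) then
        call MPI_Send(buf, n, MPI_REAL, dst, 0, MPI_COMM_WORLD, ierr)
      else if (rank == dst) then
        call MPI_Recv(buf, n, MPI_REAL, src, 0, MPI_COMM_WORLD, status, ierr)
      end if
    end subroutine node_to_node

    ! Broadcast a buffer (e.g. the velocity model) from one node to all.
    subroutine broadcast_from(buf, n, root)
      use mpi
      implicit none
      integer, intent(in) :: n, root
      real, intent(inout) :: buf(n)
      integer :: ierr
      call MPI_Bcast(buf, n, MPI_REAL, root, MPI_COMM_WORLD, ierr)
    end subroutine broadcast_from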

  7. As each node completes its image, the results are merged into the final composite image using a power-of-2 global sum, sketched below.
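The same merge can be expressed with a single MPI_Reduce(..., MPI_SUM, ...) call, but the power-of-2 structure can also be written out explicitly. A hedged sketch, assuming the node count is a power of two and using illustrative names:

    ! Power-of-2 (tree) merge of partial images: at each level, half of
    ! the remaining nodes send their partial sum to a partner; after
    ! log2(nprocs) steps, node 0 holds the composite image.
    subroutine merge_images(image, n)
      use mpi
      implicit none
      integer, intent(in) :: n
      real, intent(inout) :: image(n)
      real, allocatable :: partial(:)
      integer :: rank, nprocs, step, ierr, status(MPI_STATUS_SIZE)
      call MPI_Comm_rank(MPI_COMM_WORLD, rank, ierr)
      call MPI_Comm_size(MPI_COMM_WORLD, nprocs, ierr)
      allocate(partial(n))
      step = 1
      do while (step < nprocs)
        if (iand(rank, 2*step - 1) == step) then
          ! this node's partial image is passed down the tree
          call MPI_Send(image, n, MPI_REAL, rank - step, 1, MPI_COMM_WORLD, ierr)
        else if (iand(rank, 2*step - 1) == 0 .and. rank + step < nprocs) then
          call MPI_Recv(partial, n, MPI_REAL, rank + step, 1, MPI_COMM_WORLD, status, ierr)
          image = image + partial   ! accumulate the partner's contribution
        end if
        step = 2*step
      end do
      deallocate(partial)
    end subroutine merge_images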

  8. Data are transferred cyclically between nodes using a generalized token ring transfer.
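One way to realize a token-ring step with standard MPI is sketched below (names are illustrative). Each call shifts every node's current block one position around the ring, so after nprocs-1 steps every block has visited every node:

    ! One step of a cyclic (token-ring) transfer: send the current block
    ! to the right-hand neighbour while receiving from the left.
    ! MPI_Sendrecv_replace avoids a separate receive buffer and deadlock.
    subroutine ring_pass(block, n)
      use mpi
      implicit none
      integer, intent(in) :: n
      real, intent(inout) :: block(n)
      integer :: rank, nprocs, left, right, ierr, status(MPI_STATUS_SIZE)
      call MPI_Comm_rank(MPI_COMM_WORLD, rank, ierr)
      call MPI_Comm_size(MPI_COMM_WORLD, nprocs, ierr)
      right = mod(rank + 1, nprocs)
      left  = mod(rank - 1 + nprocs, nprocs)
      call MPI_Sendrecv_replace(block, n, MPI_REAL, right, 2, &
                                left, 2, MPI_COMM_WORLD, status, ierr)
    end subroutine ring_pass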

  9. A data matrix distributed across nodes can be transposed using one of the subroutines in the library. A fast Fourier transform can also be folded into the transpose algorithm.
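One common way to fuse the two is sketched below: transform the axis that is currently local, transpose so the distributed axis becomes local, then transform it in turn. Here fft_1d stands in for a real FFT routine (e.g. an FFTW call) and dist_transpose_cplx for a complex variant of the transpose sketch shown earlier; neither is a library name from the source:

    ! Sketch: 2D FFT of a distributed matrix folded around the transpose.
    subroutine fft2_distributed(a, at, n, nloc)
      implicit none
      integer, intent(in) :: n, nloc
      complex, intent(inout) :: a(n, nloc)
      complex, intent(out)   :: at(n, nloc)
      integer :: j
      do j = 1, nloc
        call fft_1d(a(:, j), n)       ! transform the local (first) axis
      end do
      call dist_transpose_cplx(a, at, n, nloc)   ! distributed axis becomes local
      do j = 1, nloc
        call fft_1d(at(:, j), n)      ! transform the formerly distributed axis
      end do
    end subroutine fft2_distributed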

  10. Imaging programs originally coded for the Cray T3e have been ported to the SGI Origin 2000 and the Alpha cluster using these subroutines. The split-step imaging algorithm described here requires data transposes, fast Fourier transforms, and complex phase shifts to downward continue the seismic data.
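For reference, the split-step kernel for one frequency slice and one depth step looks roughly like the following. This is a sketch under the usual split-step assumptions; fft2_forward and fft2_inverse are stand-ins for 2D FFT routines, and all names are illustrative rather than the authors' code:

    ! Split-step downward continuation of one frequency slice by dz.
    ! w: angular frequency, v0: reference velocity, v(x,y): actual velocity.
    subroutine split_step(p, nx, ny, kx, ky, w, v0, v, dz)
      implicit none
      integer, intent(in) :: nx, ny
      real, intent(in) :: kx(nx), ky(ny), w, v0, v(nx, ny), dz
      complex, intent(inout) :: p(nx, ny)
      real :: kz2
      integer :: i, j
      call fft2_forward(p, nx, ny)               ! (x,y) -> (kx,ky)
      do j = 1, ny
        do i = 1, nx
          kz2 = (w / v0)**2 - kx(i)**2 - ky(j)**2
          if (kz2 > 0.0) then
            p(i, j) = p(i, j) * exp(cmplx(0.0, sqrt(kz2) * dz))  ! phase shift
          else
            p(i, j) = p(i, j) * exp(-sqrt(-kz2) * dz)            ! damp evanescent energy
          end if
        end do
      end do
      call fft2_inverse(p, nx, ny)               ! back to (x,y)
      ! split-step correction for the laterally varying velocity
      p = p * exp(cmplx(0.0, 1.0) * w * (1.0 / v - 1.0 / v0) * dz)
    end subroutine split_step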

  11. Most 3D seismic imaging problems do not fit in the available combined node memory, so the transpose process requires disk storage and must be repeated over several memory loads.

  12. As each new data load is transposed, the partial slices are re-read from disk, combined with the newly transposed data, and then stored again for the next transpose cycle, as the loop sketch below shows.
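The cycle can be sketched as the following loop skeleton. All the I/O routines here (read_load, read_slices, write_slices, merge) are hypothetical stand-ins for the actual disk-handling code, and dist_transpose is the in-memory sketch shown earlier:

    ! Out-of-core transpose: process the volume in nloads memory loads,
    ! folding each newly transposed load into the partial slices on disk.
    do load = 1, nloads
      call read_load(in_unit, load, a)            ! next slab of input data
      call dist_transpose(a, at, n, nloc)         ! in-memory multi-node transpose
      do s = 1, nslices
        call read_slices(out_unit, s, partial)    ! partial result from earlier loads
        call merge(partial, at, s, load)          ! add this load's contribution
        call write_slices(out_unit, s, partial)   ! restore for the next cycle
      end do
    end do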

  13. UTIG purchased an Origin 2000 for 3D seismic imaging. Its shared-memory architecture and high I/O bandwidth make it suitable for a wide range of seismic imaging algorithms. Programs originally developed for the Cray T3e and based on the parallel subroutine library described earlier were easily ported to this machine.

  14. The Origin 2000 could not be expanded cost-effectively, since another chassis was required to go beyond the original 8-node configuration. Also, desktop PCs using the DEC Alpha processor were found to have better (2-3x) floating-point performance on the seismic imaging kernels.

  15. The performance differences between the two machines come down to processor floating-point speed and I/O. The Alpha processors, running at 667 MHz and supported by an 8 MB cache, outperform the SGI nodes in all the seismic imaging algorithms. The SGI's I/O is, however, superior to the cluster's as originally configured. The end result is that the net performance of the two systems as described here is comparable.

  16. Two types of 2D input seismic data were used for performance testing. All algorithms are 3D and reduce to the 2D case; x–t and plane-wave data were used for both time and depth migration comparisons.
  • x–t data
    • input seismic data: shot gathers in the x–t domain
    • number of shots: 1000
    • number of offsets/shot: 60
    • number of samples/trace (time/depth): 2000
    • migration results: CIG gathers
    • number of CIGs: 1000
    • number of traces/CIG: 60
    • number of samples: 1000
  • τ–p data
    • input seismic data: shot gathers in the τ–p domain
    • number of shots: 1000
    • number of rays/shot: 61
    • number of samples/trace: 2000
    • migration results: CIG gathers
    • number of CIGs: 1000
    • number of rays/CIG: 61
    • number of samples/trace (time/depth): 1000
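For scale, and assuming 4-byte samples (the sample format is not stated on the slide), the x–t test volumes work out to:

    input:  1000 shots × 60 traces × 2000 samples × 4 bytes ≈ 480 MB
    output: 1000 CIGs  × 60 traces × 1000 samples × 4 bytes ≈ 240 MB

The τ–p volumes are essentially the same size, with 61 rays in place of 60 offsets.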

  17. Plane-wave time migration requires less I/O than conventional x–t time migration, so the Alpha cluster slightly outperforms the SGI. Note the significant decrease in performance of the SGI when the 8th node is included. All tests were performed on an otherwise quiet system.

  18. Split-step depth migration was also faster on the Alpha cluster, since the amount of computation per I/O request is larger than in the other imaging algorithms.

  19. x–t time migration was marginally faster on the SGI, since it requires more I/O than the previous two algorithms. Again, the addition of the 8th SGI node actually degrades performance.

  20. Conclusions
  • The Beowulf Alpha cluster is a cost-effective solution for parallel seismic imaging
  • Performance is 'comparable' to the SGI Origin 2000 for the imaging algorithms tested
  • Cost is $71,915 vs. $465,098 for the SGI Origin 2000
  • Price/performance improvement of 6.5
  • The network interconnect from the cluster to the large UNIX network is not a performance bottleneck
  • Better cluster 'management' tools are needed to facilitate use
  • A true 64-bit cluster operating system is needed
