Accelerating large-scale simulation of seismic wave propagation by multi-GPUs and three-dimensional domain decomposition

Taro Okamoto, Hiroshi Takenaka, Takeshi Nakamura, Takayuki Aoki

Research output: Contribution to journal › Article

23 Citations (Scopus)

Abstract

We adopted the GPU (graphics processing unit) to accelerate large-scale finite-difference simulation of seismic wave propagation. The simulation benefits from the high memory bandwidth of the GPU because it is a "memory-intensive" problem. In the single-GPU case we achieved a performance of about 56 GFlops, about 45-fold faster than a single core of the host central processing unit (CPU). We confirmed that the optimized use of fast shared memory and registers was essential for this performance. In the multi-GPU case with three-dimensional domain decomposition, the non-contiguous memory alignment of the ghost zones was found to make data transfer between the GPU and the host node very time-consuming. This problem was solved by packing the ghost zones into contiguous memory buffers. We achieved a performance of about 2.2 TFlops using 120 GPUs and 330 GB of total memory: nearly (or more than) 2200 cores of host CPUs would be required to achieve the same performance. The weak scaling was nearly proportional to the number of GPUs. We therefore conclude that GPU computing is a promising approach for large-scale simulation of seismic wave propagation, as it enables faster simulation with fewer computational resources than CPUs.
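The single-GPU optimization mentioned in the abstract (fast shared memory plus registers) can be illustrated with a short CUDA sketch. This is a minimal, generic 7-point-stencil example, not the authors' velocity-stress finite-difference code: the tile sizes, the x-fastest array layout, and the assumption that nx and ny are multiples of the tile dimensions are ours. Each thread marches along z, keeping its z-direction neighbours in registers and staging the current x-y plane in shared memory so in-plane neighbours are read from fast on-chip storage rather than global memory.

```cuda
// Minimal 7-point-stencil sketch of the shared-memory + register technique
// (illustrative only -- not the paper's actual kernel).
// Assumptions: single precision, x-fastest layout in[(iz*ny + iy)*nx + ix],
// nx % TILE_X == 0 and ny % TILE_Y == 0, launched with
// blockDim = (TILE_X, TILE_Y) and gridDim = (nx/TILE_X, ny/TILE_Y).
#define TILE_X 32
#define TILE_Y 8

__global__ void stencil7(const float *in, float *out,
                         int nx, int ny, int nz, float c0, float c1)
{
    __shared__ float tile[TILE_Y + 2][TILE_X + 2];

    const int ix = blockIdx.x * TILE_X + threadIdx.x;  // global x
    const int iy = blockIdx.y * TILE_Y + threadIdx.y;  // global y
    const int tx = threadIdx.x + 1;                    // tile x (skip halo)
    const int ty = threadIdx.y + 1;                    // tile y (skip halo)
    const int stride = nx * ny;                        // distance between z-planes

    int idx = iy * nx + ix;                            // start in plane z = 0

    // z-direction neighbours live in registers while marching in z.
    float below, cur = in[idx], above = in[idx + stride];

    for (int iz = 1; iz < nz - 1; ++iz) {
        below = cur;  cur = above;
        idx  += stride;
        above = in[idx + stride];

        // Stage the current x-y plane (interior + halo) in shared memory.
        tile[ty][tx] = cur;
        if (threadIdx.x == 0          && ix > 0)      tile[ty][0]          = in[idx - 1];
        if (threadIdx.x == TILE_X - 1 && ix < nx - 1) tile[ty][TILE_X + 1] = in[idx + 1];
        if (threadIdx.y == 0          && iy > 0)      tile[0][tx]          = in[idx - nx];
        if (threadIdx.y == TILE_Y - 1 && iy < ny - 1) tile[TILE_Y + 1][tx] = in[idx + nx];
        __syncthreads();

        if (ix > 0 && ix < nx - 1 && iy > 0 && iy < ny - 1)
            out[idx] = c0 * cur
                     + c1 * (tile[ty][tx - 1] + tile[ty][tx + 1]
                           + tile[ty - 1][tx] + tile[ty + 1][tx]
                           + below + above);
        __syncthreads();  // tile is rewritten on the next z step
    }
}
```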
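The multi-GPU bottleneck described in the abstract can likewise be sketched. With an x-fastest layout, a ghost face normal to x is strided in memory (stride nx between consecutive values), so moving it element by element between device and host is slow; packing it into a contiguous device buffer first permits a single large cudaMemcpy and a single MPI message. The kernel and helper below are a hedged sketch under that layout assumption: pack_x_face, send_x_face, and the one-point-wide ghost zone are our illustrative names, not the paper's, and error checking is omitted.

```cuda
// Ghost-zone packing sketch (illustrative; error checks omitted).
// The x = ix face of the field is strided in memory, so it is first packed
// into a contiguous device buffer, then moved with one cudaMemcpy and sent
// as one MPI message instead of ny*nz tiny strided transfers.
#include <cuda_runtime.h>
#include <mpi.h>

__global__ void pack_x_face(const float *field, float *buf,
                            int nx, int ny, int nz, int ix)
{
    int iy = blockIdx.x * blockDim.x + threadIdx.x;
    int iz = blockIdx.y;
    if (iy < ny && iz < nz)
        buf[iz * ny + iy] = field[(size_t)(iz * ny + iy) * nx + ix];
}

void send_x_face(const float *d_field, float *d_buf, float *h_buf,
                 int nx, int ny, int nz, int ix, int dest, MPI_Comm comm)
{
    dim3 block(128, 1);
    dim3 grid((ny + 127) / 128, nz);
    pack_x_face<<<grid, block>>>(d_field, d_buf, nx, ny, nz, ix);
    cudaMemcpy(h_buf, d_buf, sizeof(float) * ny * nz,  // one contiguous copy;
               cudaMemcpyDeviceToHost);                // syncs with the kernel
    MPI_Send(h_buf, ny * nz, MPI_FLOAT, dest, /*tag=*/0, comm);
}
```

The receive side would mirror this (MPI_Recv into a host buffer, one cudaMemcpy, an unpack kernel); with this layout the z-normal faces are already contiguous and need no packing.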

Original language: English
Pages (from-to): 939-942
Number of pages: 4
Journal: Earth, Planets and Space
Volume: 62
Issue number: 12
DOIs: 10.5047/eps.2010.11.009
Publication status: Published - 2010
Externally published: Yes


Keywords

  • Finite-difference method
  • GPU
  • Parallel computing
  • Seismic wave propagation
  • Three-dimensional domain decomposition

ASJC Scopus subject areas

  • Geology
  • Space and Planetary Science

Cite this

Accelerating large-scale simulation of seismic wave propagation by multi-GPUs and three-dimensional domain decomposition. / Okamoto, Taro; Takenaka, Hiroshi; Nakamura, Takeshi; Aoki, Takayuki.

In: Earth, Planets and Space, Vol. 62, No. 12, 2010, p. 939-942.

Research output: Contribution to journal › Article

@article{08e24829473248fb81e7d58b4711af62,
title = "Accelerating large-scale simulation of seismic wave propagation by multi-GPUs and three-dimensional domain decomposition",
abstract = "We adopted the GPU (graphics processing unit) to accelerate large-scale finite-difference simulation of seismic wave propagation. The simulation benefits from the high memory bandwidth of the GPU because it is a {"}memory-intensive{"} problem. In the single-GPU case we achieved a performance of about 56 GFlops, about 45-fold faster than a single core of the host central processing unit (CPU). We confirmed that the optimized use of fast shared memory and registers was essential for this performance. In the multi-GPU case with three-dimensional domain decomposition, the non-contiguous memory alignment of the ghost zones was found to make data transfer between the GPU and the host node very time-consuming. This problem was solved by packing the ghost zones into contiguous memory buffers. We achieved a performance of about 2.2 TFlops using 120 GPUs and 330 GB of total memory: nearly (or more than) 2200 cores of host CPUs would be required to achieve the same performance. The weak scaling was nearly proportional to the number of GPUs. We therefore conclude that GPU computing is a promising approach for large-scale simulation of seismic wave propagation, as it enables faster simulation with fewer computational resources than CPUs.",
keywords = "Finite-difference method, GPU, Parallel computing, Seismic wave propagation, Three-dimensional domain decomposition",
author = "Taro Okamoto and Hiroshi Takenaka and Takeshi Nakamura and Takayuki Aoki",
year = "2010",
doi = "10.5047/eps.2010.11.009",
language = "English",
volume = "62",
pages = "939--942",
journal = "Earth, Planets and Space",
issn = "1880-5981",
publisher = "Terra Scientific Publishing Company",
number = "12",

}

TY - JOUR

T1 - Accelerating large-scale simulation of seismic wave propagation by multi-GPUs and three-dimensional domain decomposition

AU - Okamoto, Taro

AU - Takenaka, Hiroshi

AU - Nakamura, Takeshi

AU - Aoki, Takayuki

PY - 2010

Y1 - 2010

N2 - We adopted the GPU (graphics processing unit) to accelerate large-scale finite-difference simulation of seismic wave propagation. The simulation benefits from the high memory bandwidth of the GPU because it is a "memory-intensive" problem. In the single-GPU case we achieved a performance of about 56 GFlops, about 45-fold faster than a single core of the host central processing unit (CPU). We confirmed that the optimized use of fast shared memory and registers was essential for this performance. In the multi-GPU case with three-dimensional domain decomposition, the non-contiguous memory alignment of the ghost zones was found to make data transfer between the GPU and the host node very time-consuming. This problem was solved by packing the ghost zones into contiguous memory buffers. We achieved a performance of about 2.2 TFlops using 120 GPUs and 330 GB of total memory: nearly (or more than) 2200 cores of host CPUs would be required to achieve the same performance. The weak scaling was nearly proportional to the number of GPUs. We therefore conclude that GPU computing is a promising approach for large-scale simulation of seismic wave propagation, as it enables faster simulation with fewer computational resources than CPUs.

AB - We adopted the GPU (graphics processing unit) to accelerate large-scale finite-difference simulation of seismic wave propagation. The simulation benefits from the high memory bandwidth of the GPU because it is a "memory-intensive" problem. In the single-GPU case we achieved a performance of about 56 GFlops, about 45-fold faster than a single core of the host central processing unit (CPU). We confirmed that the optimized use of fast shared memory and registers was essential for this performance. In the multi-GPU case with three-dimensional domain decomposition, the non-contiguous memory alignment of the ghost zones was found to make data transfer between the GPU and the host node very time-consuming. This problem was solved by packing the ghost zones into contiguous memory buffers. We achieved a performance of about 2.2 TFlops using 120 GPUs and 330 GB of total memory: nearly (or more than) 2200 cores of host CPUs would be required to achieve the same performance. The weak scaling was nearly proportional to the number of GPUs. We therefore conclude that GPU computing is a promising approach for large-scale simulation of seismic wave propagation, as it enables faster simulation with fewer computational resources than CPUs.

KW - Finite-difference method

KW - GPU

KW - Parallel computing

KW - Seismic wave propagation

KW - Three-dimensional domain decomposition

UR - http://www.scopus.com/inward/record.url?scp=79952614920&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=79952614920&partnerID=8YFLogxK

U2 - 10.5047/eps.2010.11.009

DO - 10.5047/eps.2010.11.009

M3 - Article

VL - 62

SP - 939

EP - 942

JO - Earth, Planets and Space

JF - Earth, Planets and Space

SN - 1880-5981

IS - 12

ER -