We adopted graphics processing units (GPUs) to accelerate large-scale finite-difference simulation of seismic wave propagation. We describe the main components of our implementation: memory optimization, three-dimensional domain decomposition, and the overlapping of communication with computation. Using 1,200 GPUs and 1.5 TB of total memory, our GPU program achieved a single-precision performance of about 61 TFlops, with scalability nearly proportional to the number of GPUs, on TSUBAME–2.0, the recently installed GPU supercomputer at the Tokyo Institute of Technology, Japan. In a realistic application using 400 GPUs, a complex structure model with more than 13 billion unit cells and 20,000 time steps required a wall-clock time of only 2,068 s, including the overhead of snapshot output. We therefore conclude that GPU computing is a promising approach for large-scale simulation of seismic wave propagation.
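
To put the quoted figures in per-GPU terms, the following minimal sketch derives a few rough averages from the numbers stated above. The variable names are illustrative, the decimal conversion 1 TB = 1,000 GB is an assumption, and "more than 13 billion" unit cells is taken as exactly 13 billion, so the cell-based values are lower bounds. Because the wall-clock time includes snapshot output, the update rate is an end-to-end average rather than a kernel throughput.

```python
# Rough per-GPU averages derived only from the figures quoted in the text.
# Assumptions (not from the source): 1 TB = 1000 GB, and exactly 13e9 cells.

# Benchmark run
total_tflops = 61.0        # single-precision performance, about 61 TFlops
n_gpus_benchmark = 1200
total_memory_tb = 1.5

# Realistic application run
n_gpus_app = 400
unit_cells = 13e9          # "more than 13 billion", used as a lower bound
time_steps = 20_000
wall_clock_s = 2068.0      # includes snapshot output overhead

# Average sustained performance and memory per GPU in the benchmark run
gflops_per_gpu = total_tflops * 1000 / n_gpus_benchmark
memory_gb_per_gpu = total_memory_tb * 1000 / n_gpus_benchmark

# Average subdomain size and end-to-end update rate in the application run
cells_per_gpu = unit_cells / n_gpus_app
cell_updates_per_s = unit_cells * time_steps / wall_clock_s
cell_updates_per_s_per_gpu = cell_updates_per_s / n_gpus_app

print(f"GFlops per GPU (benchmark):      {gflops_per_gpu:.1f}")
print(f"Memory per GPU (benchmark), GB:  {memory_gb_per_gpu:.2f}")
print(f"Unit cells per GPU (application): {cells_per_gpu:.3e}")
print(f"Cell updates per second (total):  {cell_updates_per_s:.3e}")
print(f"Cell updates per second per GPU:  {cell_updates_per_s_per_gpu:.3e}")
```

Under these assumptions, the benchmark averages roughly 51 GFlops and 1.25 GB per GPU, and the application run sustains on the order of 10^11 cell updates per second overall.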