Parallel solving of linear equations systems on hybrid architecture CPU+GPU

Free access

The article discusses a parallel implementation for solving systems of linear algebraic equations on computational nodes that contain a central processing unit (CPU) and graphics accelerators (GPUs). When the CPU and GPU are used together, the performance of parallel algorithms based on the classical conjugate gradient schemes is significantly limited by synchronization points. The article investigates a pipelined version of the conjugate gradient method with a single synchronization point, the possibility of asynchronous computation, and load balancing between the CPU and GPU when solving large linear systems. Numerical experiments were carried out on test matrices and on computational nodes of differing performance in a heterogeneous cluster, which made it possible to estimate the contribution of communication costs. The algorithms are implemented using MPI, OpenMP, and CUDA together. In addition to reducing execution time, the proposed algorithms make it possible to solve large linear systems that do not fit into the memory of a single GPU or a single computing node. Furthermore, the block algorithm with pipelining decreases the total execution time by reducing the number of synchronization points and aggregating several messages into one.
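To illustrate what a conjugate gradient iteration with a single synchronization point can look like, here is a minimal serial sketch in Python with NumPy. It assumes the widely used pipelined formulation of Ghysels and Vanroose. The abstract does not state which pipelined variant the article uses, so this is an illustration of the general idea, not the authors' implementation. The function name `pipelined_cg` and its parameters are hypothetical. In a distributed MPI setting, the two dot products would be combined into one non-blocking global reduction that overlaps with the matrix-vector product; here both steps simply run one after the other.

```python
import numpy as np

def pipelined_cg(A, b, tol=1e-10, max_iter=1000):
    """Pipelined CG sketch for a symmetric positive definite matrix A.

    Assumed variant: Ghysels-Vanroose recurrences, no preconditioner.
    Returns the approximate solution and the number of iterations used.
    """
    x = np.zeros_like(b, dtype=float)
    r = b - A @ x          # residual
    w = A @ r              # auxiliary vector w = A r
    z = np.zeros_like(r)   # recurrence for A s
    s = np.zeros_like(r)   # recurrence for A p
    p = np.zeros_like(r)   # search direction
    gamma_prev = alpha_prev = 1.0
    b_norm = np.linalg.norm(b)

    for i in range(max_iter):
        # The only reduction of the iteration: both dot products at once.
        # With MPI this would be a single (non-blocking) all-reduce.
        gamma, delta = np.array([r, w]) @ r   # (r, r) and (w, r)

        # This product does not depend on gamma or delta, so in a
        # parallel code it can overlap with the reduction above.
        q = A @ w

        if np.sqrt(gamma) <= tol * b_norm:
            return x, i

        if i == 0:
            beta = 0.0
            alpha = gamma / delta
        else:
            beta = gamma / gamma_prev
            alpha = gamma / (delta - beta * gamma / alpha_prev)

        # Vector recurrences (purely local work, no communication).
        z = q + beta * z
        s = w + beta * s
        p = r + beta * p
        x = x + alpha * p
        r = r - alpha * s
        w = w - alpha * z

        gamma_prev, alpha_prev = gamma, alpha

    return x, max_iter
```

A hypothetical call would be `x, iters = pipelined_cg(A, b)` for a dense symmetric positive definite `A`. The price of removing the second synchronization point is the three extra vector recurrences (`z`, `s`, `w`), which add local work and memory traffic and can reduce the attainable accuracy compared with classical CG.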


Parallel computing, conjugate gradient method, communication reduction

Short URL: https://sciup.org/147234271

IDR: 147234271   |   DOI: 10.14529/cmse200203

Research article