An approach to estimate the locality of grained computation processes organized to a two-dimensional structure

Бесплатный доступ

Implementing algorithms for distributed memory computers, one should pay attention to locality, a computational property of the algorithm that reflects the usage of fast access memory, that plays an important role in achieving high performance. For computing systems with distributed memory the local memory of each node is considered fast. Usually the parallel algorithms have better locality if it has a smaller total size of communication data and a better structure of communication operations. In this work, a new parallel algorithm for the numerical solution of a three-dimensional linear heat equation is constructed on the basis of a splitting method with weights. The algorithm is specified for distributed memory computers, combines pipeline and natural parallelism, and uses a 2D process structure. For the case of 1D process structure it was shown that usage of a pipeline parallelism instead of part of natural parallelism reduces the total size of communication operations. The proposed 2D algorithm generalizes the known 1D algorithm. The usage of two-dimensional structures allows to reduce the total volume of communication operations and to reduce idle time of a pipeline. Therefore, the 2D algorithm has better locality compared to the 1D process structure. The computational experiments on a distributed computing sytem have shown the advantage of a new parallel algorithm. Proposed method can be used to obtain and to examine parallel algorithms for other schemes of the fractional step method. Additionally, the implemented splitting scheme algorithm shows an approach to obtain asymptotic estimation of the communication volume, presented method operates with granular (the macrooperation level) parallel computing processes organized logically into a two-dimensional structure. The estimations can be used to compare communication cost for alternative versions of parallel algorithms. The approach is developed on the basis of early result that allows to obtain asymptotic estimations of the communication operations volume of computational processes organized in a one-dimensional structure

Еще

Parallel computing, distributed memory parallel computer, data communication reduction, fractional step method.

Короткий адрес: https://sciup.org/147234531

IDR: 147234531   |   DOI: 10.14529/cmse210303

Статья научная