Justification of methods for accelerating iterative loops nests

Бесплатный доступ

The acceleration of iterative algorithms, found in solving problems of mathematical physics, mathematical modeling, and image processing, is considered. In the software implementation of these algorithms, there are nested loops (sections of the program that consist of nested loops). These loop nests can be accelerated by combination of optimizing transformations, including tiling, hyperplane method, and parallelization on shared memory. The equivalence of this combination of program transformations is substantiated. A method for changing the order of tile traversal is proposed and justified. The method provides acceleration by increasing data readings from registers instead of slower memory. Considering this method, a formula for calculating optimal tile sizes is obtained. The combination of transformations presented in this article results in an acceleration that is 1.4 times greater than the well-known optimization algorithm implemented in PLUTO. In some cases using an 8-core processor, numerical experiments show a significant increase in speed compared to the original sequential algorithms. The findings of this article can be applied to manual and automatic program optimization.

Еще

Tiling, wavefront, parallelization, shared memory, iterative stencil loops

Короткий адрес: https://sciup.org/143182404

IDR: 143182404   |   DOI: 10.25209/2079-3316-2024-15-1-63-94

Статья научная