Efficient fragmented implementation of the two phase fluid boundary value problem
Authors: Kudryavtsev A.A., Malyshkin V.E., Nushtaev Yu.Yu., Perepelkin V.A., Spirin V.A.
Journal: Проблемы информатики (Problems of Informatics) @problem-info
Section: Parallel system programming and computing technologies
Article in issue: 2 (59), 2023.
Free access
Automated program construction is an approach that can potentially reduce the complexity and laboriousness of developing, debugging and modifying numerical parallel programs for multicomputers. In high performance computing it is important not only to construct a valid program but also to make it efficient, which is a challenging problem with no satisfactory general solution. Consequently, existing programming systems can provide high efficiency of the constructed programs only for a limited range of applications. To achieve this, the systems employ various heuristics and particular effective solutions. The evolution of tools for automated parallel program construction consists in accumulating such heuristics and particular solutions in order to improve the efficiency of constructed programs and to widen the range of applications the system can handle effectively. It is therefore important to investigate particular manual implementations of numerical programs from the perspective of further automating such construction. Fragmented programming technology is an approach to the development and automated construction of numerical parallel programs. The approach is based on the theory of parallel program synthesis on the basis of computational models. The approach is partially supported by the LuNA system, a system for automating the construction of numerical parallel programs for distributed memory systems (multicomputers). The paper studies a particular application: a solver for the two-phase fluid boundary value problem in the 3D case with wells present. The application is implemented as a fragmented program in two versions: the first is based on conventional means (MPI and OpenMP), and the second uses the LuNA system. The basic idea behind fragmented programming is to consider a parallel program as an aggregate of sequential parts called computational fragments (CFs).
Each CF is implemented by a conventional sequential subroutine with no side effects. The input and output arguments of CFs are immutable pieces of data called data fragments (DFs). Execution is considered as the execution of a set of CFs in a data-flow manner, where each CF becomes ready for execution once all its input DFs have been computed. Executing a CF produces a number of output DFs. If the program is represented as a set of CFs and DFs, a system can be used to perform the execution and provide its dynamic properties, such as dynamic load balancing. The LuNA system offers a domain-specific language, LuNA, to describe the set of CFs and DFs as a LuNA program. The system then translates the program into an intermediate representation executable by the runtime subsystem. The runtime subsystem is essentially a distributed virtual machine that executes CFs in a data-flow manner. Such an approach significantly simplifies parallel program construction, since the programmer does not do parallel programming as such: the programmer only describes the set of CFs and DFs and provides conventional sequential subroutines in C++ that implement the CFs. No programming of communications, synchronization, memory management or other low-level details is required. However, the execution efficiency of LuNA programs may be significantly lower than that of a program developed manually with conventional parallel programming means. This is because constructing an efficient parallel program from its high-level specification is algorithmically hard in the general case. To help the LuNA system construct more efficient programs, the programmer is provided with means to tune the construction process, called recommendations and directives. Using these means can significantly increase the efficiency of the constructed program by supplying the system with the programmer's insight into how the fragments should be executed.
Such information includes hints on the distribution and redistribution of CFs and DFs to nodes, the order of CF execution, garbage collection directives, etc. The paper provides an in-depth analysis of the considered application in order to elaborate an efficient parallel implementation of the numerical algorithm in a multi-core distributed environment. An efficient conventional distributed program is then developed using MPI and OpenMP and described in the paper. Next, a LuNA program is developed and optimized. The development and optimization process of the LuNA program is presented so that the experience can be reused in the future development of similar fragmented programs. Finally, an experimental study of the efficiency of the constructed programs is presented.
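The data-flow execution model described in the abstract (side-effect-free CFs that become ready once all their input DFs exist, and DFs that are immutable once produced) can be illustrated with a small sequential sketch. This is not LuNA code and does not reflect the LuNA runtime or language; the names `CF` and `run_dataflow` and the toy fragments are hypothetical and chosen only for illustration.

```python
class CF:
    """Hypothetical computational fragment: a side-effect-free function
    with named input DFs and named output DFs."""
    def __init__(self, name, func, inputs, outputs):
        self.name = name
        self.func = func              # must return a tuple, one value per output DF
        self.inputs = tuple(inputs)
        self.outputs = tuple(outputs)


def run_dataflow(cfs, initial_dfs):
    """Execute CFs in data-flow order; return (all DFs, execution order)."""
    dfs = dict(initial_dfs)           # DF name -> value
    pending = list(cfs)
    order = []
    while pending:
        # A CF is ready once every one of its input DFs has been computed.
        ready = [cf for cf in pending if all(n in dfs for n in cf.inputs)]
        if not ready:
            raise RuntimeError("no CF is ready: some input DF is never produced")
        # In a real runtime the ready CFs could run in parallel on different nodes;
        # here they run one after another.
        for cf in ready:
            results = cf.func(*(dfs[n] for n in cf.inputs))
            for df_name, value in zip(cf.outputs, results):
                if df_name in dfs:
                    # DFs are immutable: each one is produced exactly once.
                    raise ValueError(f"DF {df_name!r} produced twice")
                dfs[df_name] = value
            order.append(cf.name)
        pending = [cf for cf in pending if cf not in ready]
    return dfs, order


# Toy program: the CFs are listed out of order on purpose; only data
# dependencies decide when each one runs.
cfs = [
    CF("sum", lambda a, b: (a + b,), ["x", "y"], ["s"]),
    CF("scale", lambda s: (2 * s,), ["s"], ["r"]),
    CF("init_x", lambda: (3,), [], ["x"]),
    CF("init_y", lambda: (4,), [], ["y"]),
]
dfs, order = run_dataflow(cfs, {})
print(order, dfs["r"])
```

The sketch leaves out everything that makes the real problem hard: distributing CFs and DFs across nodes, communication, and garbage collection of DFs that are no longer needed. These are the aspects that the recommendations and directives mentioned in the abstract are meant to tune.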
Keywords: Fragmented programming, LuNA system, parallel programs construction automation, high performance computing, case study
Short URL: https://sciup.org/143181000
IDR: 143181000 | DOI: 10.24412/2073-0667-2023-2-45-73