Fine-grained parallelism and higher core performance: advantages of vector dataflow processor
Authors: Dikarev Nikolay Ivanovich, Shabanov Boris Mikhaylovich, Shmelev Aleksandr Sergeyevich
Journal: Program Systems: Theory and Applications (Программные системы: теория и приложения) @programmnye-sistemy
Section: Software and hardware for distributed and supercomputer systems
Published in issue: 4 (43), vol. 10, 2019.
Free access
The reserves for increasing the performance of modern processors are currently almost exhausted. This stagnation is evidenced by the lack of growth in both the clock frequency and the number of instructions executed per clock, which together determine the scalar performance of a processor core. In the vector dataflow processor under development, core performance is increased to up to 256 flops per clock, eight times higher than that of the latest Intel Xeon processors, owing to a higher fraction of vector execution. We show that the vector dataflow processor achieves a higher ratio of real to peak performance than the best traditional-architecture processors on programs such as bitonic sorting, matrix multiplication, and 2D stencil.
Keywords: vector processor, dataflow architecture, shared-memory multiprocessor, performance evaluation
Short URL: https://sciup.org/143169808
IDR: 143169808 | DOI: 10.25209/2079-3316-2019-10-4-201-217