Fine-grained parallelism and higher core performance: advantages of vector dataflow processor

Free access

The reserves for increasing the performance of modern processors are now almost exhausted. This stagnation is evidenced by the lack of growth in both clock frequency and the number of instructions executed per clock, which together determine the scalar performance of a processor core. In the vector dataflow processor under development, core performance is raised to as much as 256 flops per clock, eight times higher than that of the latest Intel Xeon processors, owing to a higher fraction of vector execution. We show that the vector dataflow processor achieves a higher ratio of real to peak performance than the best processors of traditional architecture on programs such as bitonic sort, matrix multiplication, and a 2D stencil.
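
The two quantities the abstract relies on, peak flops per clock and the ratio of real to peak performance, can be made concrete with a little arithmetic. The Python sketch below is illustrative only: the 256 flops per clock and the factor of eight come from the abstract, while the function names, the 1 GHz clock, and the measured rate of 128 Gflop/s are hypothetical values chosen to show the calculation, not results from the paper.

```python
# Figures stated in the abstract.
VECTOR_DATAFLOW_FLOPS_PER_CLOCK = 256
ADVANTAGE_OVER_XEON = 8

# Implied per-core figure for the Xeon baseline (derived, not quoted).
implied_xeon_flops_per_clock = VECTOR_DATAFLOW_FLOPS_PER_CLOCK / ADVANTAGE_OVER_XEON


def peak_flops_per_second(flops_per_clock: float, clock_hz: float) -> float:
    """Peak core performance: flops per clock times clock frequency."""
    return flops_per_clock * clock_hz


def real_to_peak_ratio(measured_flops_per_second: float, peak: float) -> float:
    """Fraction of peak performance actually achieved by a program."""
    return measured_flops_per_second / peak


# Hypothetical numbers for illustration only (not from the paper):
# a 1 GHz clock and a program sustaining 128 Gflop/s on one core.
assumed_clock_hz = 1e9
assumed_measured = 128e9

peak = peak_flops_per_second(VECTOR_DATAFLOW_FLOPS_PER_CLOCK, assumed_clock_hz)
ratio = real_to_peak_ratio(assumed_measured, peak)

print(f"Implied Xeon flops per clock: {implied_xeon_flops_per_clock}")
print(f"Assumed peak: {peak:.3e} flop/s, real-to-peak ratio: {ratio}")
```

A higher ratio means a program uses more of the hardware's nominal capability, which is the sense in which the abstract compares the vector dataflow processor with traditional architectures.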


Vector processor, dataflow architecture, shared-memory multiprocessor, performance evaluation

Short URL: https://sciup.org/143169808

IDR: 143169808   |   DOI: 10.25209/2079-3316-2019-10-4-201-217

Research article