Using fused multiply-adders in a vector dataflow processor
Authors: Dikarev Nikolay Ivanovich, Shabanov Boris Mikhaylovich, Shmelev Aleksandr Sergeyevich
Journal: Program Systems: Theory and Applications @programmnye-sistemy
Section: Artificial intelligence, intelligent systems, neural networks
Article in issue: 4 (27), vol. 6, 2015.
Free access
A processor with a dataflow architecture can issue up to 16 instructions per clock cycle, compared with 4-6 instructions per cycle for the best von Neumann architecture processors. Simulation of the vector dataflow processor showed that its performance on a matrix multiplication program can be raised to 256 flops per clock cycle while issuing fewer than 8 instructions per cycle, and that performance stays close to peak at much smaller sizes of the processed matrices. The paper discusses the advantages and disadvantages of using pipelined fused ("doubled") multiply-adders for vector processing in this processor instead of separate floating-point multipliers and adders.
Key words and phrases: supercomputer, vector processor, dataflow architecture, performance evaluation, fine-grained parallelism, fused multiply-add arithmetic
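As an illustration of the arithmetic behind a fused multiply-adder, the sketch below contrasts a separate multiply followed by an add (two roundings) with a fused multiply-add (one rounding). It is not taken from the paper: the function names are hypothetical, and the fused operation is emulated in software with exact rational arithmetic rather than with hardware, on the assumption that a fused unit computes a*b + c exactly and rounds once.

```python
from fractions import Fraction


def separate_multiply_add(a: float, b: float, c: float) -> float:
    """Multiply, round to double, then add and round again (two roundings)."""
    product = a * b          # first rounding happens here
    return product + c       # second rounding happens here


def fused_multiply_add(a: float, b: float, c: float) -> float:
    """Emulate a fused multiply-add: exact a*b + c, rounded once at the end.

    Assumption: exact rational arithmetic stands in for the wide internal
    datapath of a hardware fused unit.
    """
    exact = Fraction(a) * Fraction(b) + Fraction(c)
    return float(exact)      # single rounding to the nearest double


# Operands chosen so that the exact product 1 - 2**-60 does not fit in a double.
a = 1.0 + 2.0 ** -30
b = 1.0 - 2.0 ** -30
c = -1.0

separate = separate_multiply_add(a, b, c)
fused = fused_multiply_add(a, b, c)
print("separate:", separate)
print("fused:   ", fused)
```

With these operands the separate path loses the small term when the product is rounded, while the fused path keeps it. This says nothing about the throughput trade-offs the paper evaluates; it only shows why the two unit types can return different results.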
Short URL: https://sciup.org/14336167
IDR: 14336167