EE482C (EE482S) Advanced Computer Organization:
Stream Processor Architecture - Spring 2001/2002

Arithmetic is very inexpensive on a modern VLSI chip, so bandwidth, not computation, is the factor limiting system performance. In 0.13 µm CMOS, for example, a 64-bit floating-point arithmetic unit takes less than 1 mm² of chip area and consumes less than 10 fJ of energy per operation. It is possible to put hundreds of such arithmetic units on a single chip and to operate them at 1 GHz with a total power of no more than a few watts. The challenge is to supply these hundreds of arithmetic units with the terabytes per second of instruction and data bandwidth they require.
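As a rough check on the bandwidth requirement stated above, the following back-of-the-envelope sketch estimates operand traffic alone. The unit count (200), the three 64-bit operands per operation (two sources and one result), and the function name are illustrative assumptions rather than figures from the course. Instruction bandwidth is ignored.

```python
def operand_bandwidth_bytes_per_s(num_units, clock_hz,
                                  operands_per_op=3, bytes_per_operand=8):
    """Estimate the operand bandwidth needed to keep every unit busy.

    Assumes (hypothetically) that each unit starts one operation per
    cycle and that each operation touches `operands_per_op` operands
    of `bytes_per_operand` bytes each.
    """
    return num_units * clock_hz * operands_per_op * bytes_per_operand


# Assumed configuration: 200 arithmetic units clocked at 1 GHz.
demand = operand_bandwidth_bytes_per_s(num_units=200, clock_hz=1e9)
print(f"{demand / 1e12:.1f} TB/s")
```

Under these assumptions the operand demand comes to 4.8 TB/s, consistent with the terabytes-per-second scale the text describes.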

Stream processing has recently emerged as a method for optimizing the arithmetic-to-bandwidth ratio of a computation. Casting an application as a stream program, that is, as streams of data passing through computation kernels, exposes the parallelism and locality in the program. A stream architecture exploits the parallelism of the stream program using hundreds of arithmetic units and exploits its locality using a bandwidth hierarchy. The bandwidth hierarchy significantly reduces both the demand for memory bandwidth and the power consumption, which in modern processors is mostly due to data movement.

During Spring Quarter 2001/2002, EE482C (EE482S) will examine the architecture and programming of stream processors. The topics to be covered are listed in the schedule.
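To make the idea of a stream program concrete, here is a minimal sketch in Python in which streams are iterables and kernels are generator functions. The kernel names (`scale`, `offset`) and the helper `run_pipeline` are hypothetical, and Python generators are only an analogy. The sketch shows the program structure, not how a stream processor actually executes it.

```python
def scale(stream, factor):
    """Kernel: multiply every element of the input stream by a constant."""
    for x in stream:
        yield x * factor


def offset(stream, bias):
    """Kernel: add a constant to every element of the input stream."""
    for x in stream:
        yield x + bias


def run_pipeline(data):
    """Chain two kernels over an input stream.

    The intermediate stream produced by `scale` is consumed element by
    element by `offset` and is never stored as a full array. This loosely
    mirrors how producer-consumer locality keeps intermediate data away
    from main memory.
    """
    return list(offset(scale(data, 2.0), 1.0))


result = run_pipeline([1.0, 2.0, 3.0])
print(result)
```

Each element is processed independently of the others, which is the kind of data parallelism a stream architecture can spread across many arithmetic units. Only the input and the final output need to cross the boundary to memory.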

