Spring 2001/2002 EE482C Handout #1

EE482C Advanced Computer Organization:
Stream Processor Architecture
Course Policy

Room: Gates-100

TTh 4:15 to 5:30

Instructor: William J. Dally


Gates Room 301


Hours: TTh 3:00 to 4:00 (except as noted; see announcements)

TA: Mattan Erez mattan.erez@stanford.edu

Gates Room 224

Hours: M 1:00 to 2:00, T 5:30 to 6:30, W 1:30 to 4:30

TA: Abhishek Das abhishek@cva.stanford.edu

Gates Room 255

Hours: Th 10:00 - 12:00, F 10:00 - 12:00, F 3:00 - 4:00
Support: Pamela Elliott

Gates 303


On-Line Info: available via http://www.stanford.edu/class/ee482


This course will introduce students to current research topics in the design of stream processors, architectures built to take advantage of modern VLSI processes. Arithmetic is very inexpensive on a modern VLSI chip, making bandwidth the factor that limits system performance. In 0.13 µm CMOS, for example, a 64-bit floating-point arithmetic unit takes less than 1 mm² of chip area and consumes less than 10fJ of energy per operation. It is possible to put hundreds of such arithmetic units on a single chip and to operate them at 1 GHz with a total power of less than a few watts. The challenge is to supply these hundreds of arithmetic units with the terabytes per second of instruction and data bandwidth they require.
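As a rough check on the "terabytes per second" figure, the short Python sketch below estimates operand bandwidth for an array of arithmetic units. The 1 GHz clock and 64-bit word size come from the paragraph above; the unit count of 100 (the low end of "hundreds") and the three words moved per operation (two sources, one result) are assumptions chosen only for illustration, and instruction bandwidth is ignored.

```python
# Back-of-the-envelope operand bandwidth for an array of arithmetic units.
# NUM_UNITS and WORDS_PER_OP are illustrative assumptions, not handout figures.

NUM_UNITS = 100            # assumed: low end of "hundreds" of arithmetic units
CLOCK_HZ = 1_000_000_000   # 1 GHz, as stated in the text
WORD_BYTES = 8             # 64-bit operands
WORDS_PER_OP = 3           # assumed: two source operands and one result


def operand_bandwidth(units, clock_hz, words_per_op, word_bytes):
    """Bytes per second moved if every unit starts one operation per cycle."""
    return units * clock_hz * words_per_op * word_bytes


bw = operand_bandwidth(NUM_UNITS, CLOCK_HZ, WORDS_PER_OP, WORD_BYTES)
print(f"{bw / 1e12:.1f} TB/s")  # prints "2.4 TB/s"
```

Under these assumptions even 100 units demand a few terabytes per second of operand traffic, far more than off-chip memory can supply, which is why the data must be kept close to the arithmetic units.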

Stream processing has recently emerged as a method for optimizing the arithmetic-to-bandwidth ratio of a computation. Casting an application as a stream program, that is, as streams of data passing through computation kernels, exposes the parallelism and locality in the program. A stream architecture exploits the parallelism of the stream program using hundreds of arithmetic units and exploits the locality of the stream program using a bandwidth hierarchy. The bandwidth hierarchy significantly reduces both the demand for memory bandwidth and the power consumption, which in modern processors is mostly due to data movement. During Spring Quarter 2001/2002, EE482A (EE482S) will examine the architecture and programming of stream processors. The topics to be covered are listed in the schedule.
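To make the idea of "streams of data passing through computation kernels" concrete, here is a minimal sketch in plain Python. It is not Imagine or Brook code; the kernel names (scale, offset, clamp) and the input data are hypothetical and chosen only for illustration.

```python
# Toy stream program: elements flow through a chain of kernels.
# Kernel names and data are hypothetical; this is not Imagine or Brook code.

def scale(stream, gain):
    """Kernel 1: multiply every element by a constant."""
    for x in stream:
        yield x * gain


def offset(stream, bias):
    """Kernel 2: add a constant to every element."""
    for x in stream:
        yield x + bias


def clamp(stream, lo, hi):
    """Kernel 3: limit every element to the range [lo, hi]."""
    for x in stream:
        yield min(max(x, lo), hi)


def run_pipeline(data):
    # Each element passes through all three kernels before the next one is
    # read, so intermediate values are consumed as soon as they are produced
    # instead of being stored in a full-size array between stages.
    return list(clamp(offset(scale(data, 2), -3), 0, 10))


result = run_pipeline([0, 1, 4, 9])
print(result)  # prints [0, 0, 5, 10]
```

Each kernel treats every element independently, which is the parallelism a stream architecture can spread across many arithmetic units. The values handed from one kernel to the next never need to reach main memory, which is the locality a bandwidth hierarchy is meant to capture.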


The course is organized as a combination of lectures and discussions, with each class meeting allocated to a particular topic. Before each meeting, all students are expected to read a research paper describing recent research on that topic. One student in each meeting will be a scribe and will be responsible for recording and writing up the discussion. There will be two written assignments and a final project; there are no exams.


Program a simple application for Imagine

Each student (or a pair of students) will program a simple application targeted for the Imagine stream processor. This will involve:

  1. Casting the described algorithm into stream form.
  2. Understanding the Imagine programming system and coding the algorithm.
  3. Simulating the application on the Imagine cycle-accurate simulator.
  4. Analyzing the results and performing simple optimizations.

Program another application in a high-level stream language (Brook)

This assignment follows steps similar to those of the first, but starts with a high-level program before targeting Imagine.


The project will involve investigating some aspect of stream processor architecture. For example, you may develop and evaluate techniques for converting arbitrary control flow into the limited control flow allowed by streams. We suggest that the project investigate one of the areas that we will be discussing during class, but you are free to propose any research topic in this area. Project ideas will be listed here, and one meeting will be devoted to brainstorming and discussing project ideas. Once you begin working on the project, some class time will also be allocated for project updates and reviews. You may work on these projects in groups of up to four students. Each group will submit a detailed project proposal and a final report, and will present its work to the entire class.
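As an illustration of the control-flow example mentioned above, the sketch below shows one well-known transformation: replacing a data-dependent branch with "compute both paths, then select". It is written in plain Python with hypothetical function names, and it is meant only to show the kind of question a project might study, not a technique prescribed by the course.

```python
# Hypothetical example: removing a data-dependent branch from a kernel.

def branchy(xs):
    """Original form: each element may take a different path through the loop."""
    out = []
    for x in xs:
        if x < 0:
            out.append(-x * 2)
        else:
            out.append(x + 1)
    return out


def select_form(xs):
    """Branch-free form: run both paths on every element, then select."""
    neg_path = [-x * 2 for x in xs]   # kernel for the "then" path
    pos_path = [x + 1 for x in xs]    # kernel for the "else" path
    mask = [x < 0 for x in xs]        # per-element predicate
    return [n if m else p for n, p, m in zip(neg_path, pos_path, mask)]


data = [-2, 0, 3]
print(branchy(data), select_form(data))
```

The select form gives every element the same instruction sequence at the cost of doing the work of both paths. Evaluating when that trade is worthwhile, and what alternatives exist, is the sort of issue a project could explore.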


Two assignments 30%

Scribing 10%

Discussion participation 20%

Project 40%

Text: None

Prerequisites