Stanford Concurrent VLSI Architecture Group

The Stanford Concurrent VLSI Architecture (CVA) Group investigates methods for applying VLSI technology to information processing problems. Our current group goals are:

High-Speed Signaling: We are developing methods and circuits that enable high-speed, low-energy signaling between CMOS integrated circuits. Using 0.5 micron CMOS technology, we expect to achieve signaling rates of 4Gb/s over a 3m differential pair between two VLSI chips. Our approach overcomes the limitations of present-day signaling techniques by using pre-emphasis and equalization to compensate for the frequency-dependent attenuation of the line, which enables operation at frequencies well above the line's 3dB point. We also employ a number of novel methods to reduce timing uncertainty and cancel noise. By overcoming a number of fundamental problems in signaling, we expect our methods to enable continued scaling of signaling rates with improvements in IC process technology.
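The effect of pre-emphasis can be illustrated with a small discrete-time sketch. Everything here is an assumption made for illustration: the line is modeled as a single-pole low-pass filter with a hypothetical coefficient `P`, the transmitter uses a two-tap FIR filter, and the function names are invented. It is not the group's circuit, only a toy model of why boosting transitions helps when a lone bit follows a long run of the opposite value.

```python
def to_symbols(bits):
    """Map bits to differential-style levels: 1 -> +1.0, 0 -> -1.0."""
    return [1.0 if b else -1.0 for b in bits]

def pre_emphasize(symbols, p):
    """Two-tap FIR pre-emphasis, y[n] = (s[n] - p*s[n-1]) / (1 + p).

    Dividing by (1 + p) keeps the transmitted peak amplitude at 1.
    """
    out, prev = [], 0.0
    for s in symbols:
        out.append((s - p * prev) / (1.0 + p))
        prev = s
    return out

def lossy_line(samples, p):
    """Assumed channel model: r[n] = (1 - p)*x[n] + p*r[n-1] (low-pass)."""
    out, state = [], 0.0
    for x in samples:
        state = (1.0 - p) * x + p * state
        out.append(state)
    return out

def slice_bits(samples):
    """Receiver decision: positive sample -> 1, otherwise 0."""
    return [1 if r > 0.0 else 0 for r in samples]

P = 0.6  # hypothetical channel coefficient; larger means more attenuation
bits = [0, 0, 0, 0, 1, 0, 0, 0]  # a lone 1 after a run of 0s

plain = slice_bits(lossy_line(to_symbols(bits), P))
equalized = slice_bits(lossy_line(pre_emphasize(to_symbols(bits), P), P))

print("sent:     ", bits)
print("plain:    ", plain)
print("equalized:", equalized)
```

In this model the unequalized line loses the isolated 1, because the preceding run of 0s has not decayed by the time it arrives. With pre-emphasis every received sample has the same magnitude, smaller than the transmitted swing but free of intersymbol interference.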

The Imagine Image Rendering Engine: Today's image rendering hardware operates at a very low level, with the simple polygon as the primitive data type. Raising the level of abstraction will allow faster imaging of more complex scenes and reduce data storage requirements. By combining the efficiency, ease of modeling, and level-of-detail advantages of image-based rendering with the high performance (16.2 GFlops) of the Imagine architecture, the Imagine single-chip processor, with a modest amount of external RAM, will perform high-quality (1024x768), real-time (30 frames/sec) animation of complex, realistic scenes in support of applications such as flight simulation, distributed battlefield simulation, walkthroughs of virtual buildings and vehicles, and visualization of terrain databases. The Imagine architecture will also provide order-of-magnitude performance improvements on other image and signal processing applications, including synthetic aperture radar. We expect Imagine to serve as a model for future generations of commercial signal and image processors.
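The figures quoted above imply a per-pixel compute budget, which the following short calculation works out. It assumes the 16.2 GFlops figure is a peak rate that is fully sustained and that every pixel of every frame is recomputed; neither assumption is stated on this page, so the result is an upper bound for illustration only.

```python
# Figures taken from the project description above.
PEAK_FLOPS = 16.2e9          # 16.2 GFlops
WIDTH, HEIGHT = 1024, 768    # target resolution
FRAMES_PER_SEC = 30          # real-time frame rate

pixels_per_frame = WIDTH * HEIGHT
pixels_per_sec = pixels_per_frame * FRAMES_PER_SEC

# Upper bound: assumes peak throughput is sustained (an assumption).
flops_per_pixel = PEAK_FLOPS / pixels_per_sec

print(f"pixels per frame:  {pixels_per_frame}")
print(f"pixels per second: {pixels_per_sec}")
print(f"flops per pixel:   {flops_per_pixel:.1f}")
```

Under these assumptions the budget is a little under 700 floating-point operations per output pixel.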

The M-Machine: The M-Machine project is developing computer architecture technology that more efficiently exploits increased circuit density. The M-Machine is a parallel supercomputer based on custom multi-ALU processing nodes (MAP chips). The hardware effort focuses on implementing the following:

The high level design of the M-Machine is complete and the silicon implementation is underway, in conjunction with Cadence Spectrum Design (CSD).

M-Machine Compilation: In the spirit of VLIW architectures, the M-Machine requires some aspects of synchronization and scheduling to be performed by the compiler. Research into managing register synchronization across a partitioned register file has been performed in conjunction with the Scalable Concurrent Programming Laboratory at Syracuse (formerly at Caltech). Current work seeks to develop instruction-level parallelism techniques to partition programs across the multiple clusters of the MAP, and to optimize synchronization and communication among these clusters. A runtime system and a compiler targeted at the M-Machine are currently under construction.
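To make the partitioning problem concrete, here is a minimal greedy sketch. It is not the M-Machine compiler's algorithm, which this page does not describe. The function names, the dependence-graph format, and the cost rule (fewest remote operands first, then lightest load) are all assumptions chosen for illustration.

```python
def partition_ops(deps, num_clusters):
    """Greedily assign operations to clusters.

    deps maps each operation name to the list of operations whose results
    it consumes. Operations must appear in dependence (topological) order.
    Returns a dict from operation name to cluster index.
    """
    assignment = {}
    load = [0] * num_clusters
    for op, inputs in deps.items():
        def cost(cluster):
            # Operands produced on another cluster need communication.
            remote = sum(1 for src in inputs if assignment[src] != cluster)
            return (remote, load[cluster])
        best = min(range(num_clusters), key=cost)
        assignment[op] = best
        load[best] += 1
    return assignment

def count_transfers(deps, assignment):
    """Count operand values that must cross between clusters."""
    return sum(
        1
        for op, inputs in deps.items()
        for src in inputs
        if assignment[src] != assignment[op]
    )

# Hypothetical dependence graph: two independent chains that join at "e".
deps = {"a": [], "b": [], "c": ["a"], "d": ["b"], "e": ["c", "d"]}
assignment = partition_ops(deps, num_clusters=2)

print(assignment)
print("inter-cluster transfers:", count_transfers(deps, assignment))
```

In this example the two chains land on separate clusters and only the final join needs a value from the other cluster. A real compiler would also have to weigh operation latencies and the cost of the synchronization itself.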

Last updated: May 30, 1998
Technical contact: William J. Dally (
Web contact: Ujval Kapasi (