ELM Circuits

Circuit development in the ELM research group focuses on reducing energy by complementing standard-cell design with custom circuits. By exploiting knowledge of the architecture and software, custom circuits will be targeted specifically at the parts of the ELM system that consume the most energy. The perspective gained at the circuit level will also provide feedback about how best to structure code for minimum energy dissipation. Most attention will be paid to the following areas:

Specialization

When designing custom circuits for the Ensemble Processors (EPs), one benefit over standard cells is specialization. By running benchmarks on the placed-and-routed EP, we are able to pinpoint which structures and cell combinations in the chip dissipate the most energy. For example, many of the flip-flops in the system have inputs driven by multiplexers or outputs feeding into them, and this structure accounts for about 2-5% of total EP energy. By combining this structure into one custom cell, we estimate that this energy can be halved. Another example is designing a single 32-bit register that dissipates less energy than thirty-two 1-bit flip-flops.
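The effect of the flip-flop/multiplexer estimate on total EP energy is simple arithmetic. The sketch below uses only the 2-5% range and the halving estimate quoted above; the function name is illustrative and not part of any ELM tool.

```python
def total_energy_saving(structure_fraction, reduction):
    """Fraction of total energy saved when a structure that accounts for
    `structure_fraction` of the total has its own energy cut by `reduction`
    (0.5 means halved)."""
    return structure_fraction * reduction


# Range quoted in the text: the structure is 2-5% of EP energy, and the
# custom cell is estimated to halve it.
low = total_energy_saving(0.02, 0.5)
high = total_energy_saving(0.05, 0.5)
print(f"Estimated saving in total EP energy: {low:.1%} to {high:.1%}")
```

Under these figures, this one specialization would be worth roughly 1% to 2.5% of total EP energy.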

Signaling

The energy needed to communicate a bit across a wire is proportional to the square of the voltage swing. Exploiting this relationship through low-swing signaling will reduce wire energy dissipation while still maintaining high performance. One drawback of low-swing signaling, however, is that it requires a differential wire infrastructure and incurs a fixed energy overhead for sensing. Long-distance signals that have been targeted for low-swing signaling include the write-back bus and the communication buses.
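The trade-off between the quadratic saving and the fixed sensing overhead can be sketched with a simple model, which also suggests why long wires are the natural targets. Everything numeric here is an assumption made for illustration: wire energy is modeled as C * V^2 with the proportionality constant taken as 1, both wires of the differential pair are charged to the reduced swing (a pessimistic choice), the low-swing driver is assumed to draw from a supply equal to the swing voltage, and the supply, swing, and sensing energy are hypothetical values rather than ELM measurements.

```python
def full_swing_energy(cap_f, vdd):
    """Energy per bit (J) for one full-swing wire, modeled as C * Vdd^2."""
    return cap_f * vdd ** 2


def low_swing_energy(cap_f, v_swing, sense_j, wires=2):
    """Energy per bit (J) for low-swing signaling: `wires` wires charged to
    the reduced swing, plus a fixed sensing overhead."""
    return wires * cap_f * v_swing ** 2 + sense_j


def break_even_capacitance(vdd, v_swing, sense_j, wires=2):
    """Wire capacitance (F) above which the low-swing model uses less energy."""
    denom = vdd ** 2 - wires * v_swing ** 2
    if denom <= 0:
        raise ValueError("swing too large for low-swing signaling to win")
    return sense_j / denom


# Hypothetical operating point, chosen only for illustration.
VDD = 1.0         # volts
V_SWING = 0.2     # volts
SENSE_J = 10e-15  # joules of fixed sensing overhead per bit

c_star = break_even_capacitance(VDD, V_SWING, SENSE_J)
for cap in (0.5 * c_star, c_star, 10 * c_star):
    ratio = low_swing_energy(cap, V_SWING, SENSE_J) / full_swing_energy(cap, VDD)
    print(f"C = {cap * 1e15:6.1f} fF   low-swing / full-swing = {ratio:.2f}")
```

In this model the sensing overhead dominates on short, low-capacitance wires, while the quadratic saving dominates once the wire capacitance is well above the break-even value.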

Memories

Many of the architectural gains in the ELM system rely on being able to perform low-energy reads and writes in small SRAMs. Design efforts are under way to make access energies as low as possible. We will explore a variety of options here, since the SRAM structures in our project range from 128 bits to kilobits. Areas of study include a low-swing write topology and a hierarchical register file.