
Contact Information
jjk12 @ cva.stanford.edu

Stanford University
Department of Electrical Engineering
Gates Computer Science Building
Room 212
Stanford, CA 94305
USA
Phone: +1 (650) 723-0948
Router Microarchitecture
A high-radix router (e.g., radix 64 and higher) that provides many narrow ports converts pin bandwidth into reduced latency and reduced cost more effectively than a router with a few wide ports. The use of high-radix routers is a drastic shift from previous-generation networks, which were mainly built using low-radix routers and 2-D or 3-D torus topologies. Existing microarchitectures for low-radix routers cannot scale to high radix, so a new, scalable router microarchitecture is needed. The Cray BlackWidow is one of the first networks to take advantage of high-radix routers.

Cost-Efficient Topology
The cost of an interconnection network is dominated by its long, expensive links. A high-radix Clos network can take advantage of high-radix routers, but because traffic must be routed through a central-stage switch, it incurs a 2x increase in cost. We propose and evaluate a new topology, called the flattened butterfly, which reduces the cost and power of an interconnection network by removing this middle-stage switch. With global adaptive routing, the load can be balanced on this new topology, achieving 2x the performance per cost on benign traffic and the same performance per cost on adversarial traffic patterns.

Adaptive Routing in High-Radix Networks
Adaptive routing can provide a performance benefit over oblivious routing because it takes the state of the network (such as queue depth) into account when making routing decisions. Adaptive routing is especially useful when non-minimal routing can increase network throughput. However, for the high-radix folded Clos network, where all paths are minimal, the benefit of adaptive routing is not obvious. We show the benefits of adaptive routing under transient load imbalance as well as nonuniformities, and discuss techniques to reduce the complexity of implementing adaptive routing.
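The difference between oblivious and adaptive uplink selection can be illustrated with a toy model. The sketch below is not the routing algorithm evaluated in this work; it assumes a single folded-Clos leaf router whose uplinks are all minimal, a hypothetical arrival pattern of one packet per uplink per cycle, and one packet drained per uplink per cycle. All function names and parameters are illustrative.

```python
import random


def choose_oblivious(queue_depths, rng):
    """Pick an uplink uniformly at random, ignoring network state."""
    return rng.randrange(len(queue_depths))


def choose_adaptive(queue_depths, rng):
    """Pick the uplink with the shallowest queue; break ties at random."""
    shallowest = min(queue_depths)
    candidates = [p for p, d in enumerate(queue_depths) if d == shallowest]
    return rng.choice(candidates)


def simulate(choose, num_ports=8, cycles=2000, seed=0):
    """Return the deepest uplink queue observed during a toy run.

    Assumed model: each cycle, num_ports packets arrive and each one
    selects an uplink with `choose`; afterwards every uplink drains
    at most one packet.
    """
    rng = random.Random(seed)
    queues = [0] * num_ports
    worst = 0
    for _ in range(cycles):
        for _ in range(num_ports):
            queues[choose(queues, rng)] += 1
        worst = max(worst, max(queues))
        queues = [max(0, depth - 1) for depth in queues]
    return worst


if __name__ == "__main__":
    print("oblivious worst queue depth:", simulate(choose_oblivious))
    print("adaptive worst queue depth: ", simulate(choose_adaptive))
```

In this model the adaptive policy never lets a queue grow beyond one packet, because each arriving packet fills an empty uplink, while the random policy allows transient imbalance to build up even though every path is minimal.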


Die photo of YARC: radix-64 router from Cray used in the BlackWidow network.
On-Chip Networks
With the increasing use of multicore architectures, on-chip networks are needed. The constraints of on-chip networks differ from those of off-chip networks, which creates the need to re-evaluate the appropriate architecture for on-chip networks. However, some constraints, such as the need for a low-power, low-latency network, remain the same. We evaluate the use of higher-radix routers in on-chip networks and assess their benefits compared to a conventional 2-D mesh network.
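One way to see why radix matters on chip is to compare average hop counts. The sketch below is only an illustration under stated assumptions: it models a k x k grid of routers and contrasts a 2-D mesh with a hypothetical higher-radix arrangement in which every router has a direct link to every other router in its row and column. That second topology is an assumption made for the example, not necessarily the design evaluated here, and all names are illustrative.

```python
from itertools import product


def mesh_hops(a, b):
    """Minimal hops in a 2-D mesh: one hop per grid step (Manhattan distance)."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def high_radix_hops(a, b):
    """Minimal hops when each router links directly to its whole row and
    column (assumed topology): at most one hop per differing dimension."""
    return int(a[0] != b[0]) + int(a[1] != b[1])


def average_hops(k, hops):
    """Average minimal hop count over all ordered pairs of distinct routers."""
    nodes = list(product(range(k), repeat=2))
    total = sum(hops(a, b) for a in nodes for b in nodes if a != b)
    pairs = len(nodes) * (len(nodes) - 1)
    return total / pairs


if __name__ == "__main__":
    for k in (4, 8):
        print(k, average_hops(k, mesh_hops), average_hops(k, high_radix_hops))
```

The hop count of the assumed higher-radix arrangement is bounded by two regardless of k, whereas the mesh average grows with the grid size; the price is more ports per router and longer wires, which this sketch does not model.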
Research Overview
Evolving semiconductor and circuit technology has greatly increased the pin bandwidth available to a router chip. In the early 1990s, routers were limited to 10Gb/s of pin bandwidth. Today 1Tb/s is feasible, and we expect 20Tb/s of I/O bandwidth by 2010. This increasing bandwidth can be used efficiently to provide lower latency and lower cost by increasing the radix of the routers, thus creating high-radix routers.
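The trade-off behind this argument can be sketched with simple arithmetic. The code below is a simplified model, not a result from this work: it assumes a multistage network in which n stages of radix-k routers can reach k**n nodes, and that a fixed pin bandwidth is divided evenly across a router's ports. The function names and the 4096-node, 1Tb/s example values are illustrative assumptions.

```python
def stages_needed(num_nodes, radix):
    """Smallest number of router stages n with radix**n >= num_nodes.

    Uses integer arithmetic to avoid floating-point rounding in logarithms.
    """
    stages, reach = 0, 1
    while reach < num_nodes:
        reach *= radix
        stages += 1
    return stages


def port_bandwidth(pin_bandwidth, radix):
    """Bandwidth per port when a fixed pin budget is split across radix ports."""
    return pin_bandwidth / radix


if __name__ == "__main__":
    num_nodes = 4096          # assumed network size
    pin_bandwidth = 1000.0    # assumed router pin bandwidth in Gb/s (1Tb/s)
    for radix in (4, 8, 16, 64):
        print(
            f"radix {radix:2d}: {stages_needed(num_nodes, radix)} stages, "
            f"{port_bandwidth(pin_bandwidth, radix):.1f} Gb/s per port"
        )
```

Under these assumptions, raising the radix trades port width for fewer router traversals, which is the sense in which many narrow ports can reduce latency and the number of long links.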