University of California, Los Angeles
EDA Laboratory

FPGA Circuits, Architectures and CAD Algorithms for Power Efficiency and Process Variations 

Principal Investigator (PI)

·           Prof. Lei He

Participating Students

·           Lerong Cheng

·           Fei Li

·           Weiping Liao

·           Yan Lin

·           Phoebe Wong

Funding sources

Research Outcomes

Configurability at the circuit level, particularly in the field programmable gate array (FPGA), attracts a great deal of attention because of its short time to market and low non-recurring costs. However, FPGAs consume much more power than custom-designed circuits. My group started working on FPGA power modeling and reduction in 2002, and has since published over 10 papers at the most prominent conferences and journals. Our contributions in the area include:

 

·         Power modeling and characteristics of existing FPGA architecture. In joint work with Prof. Jason Cong and our students Fei Li and Deming Chen, we developed a mixed-level cycle-accurate FPGA power simulator and studied the power characteristics of existing FPGA architecture in ISFPGA’03 (ACM International Symposium on Field Programmable Gate Arrays) [1]. We showed that interconnect power is dominant and that leakage power is of growing significance for the existing FPGA architecture, and that both should be targeted for power reduction. In addition, my students Yan Lin and Fei Li and I further improved the accuracy of power simulation in ISFPGA’05 [2] and showed that, for existing FPGA architecture, the min-area architecture also has the minimal energy but the largest delay. Tuning the LUT (lookup table) and cluster sizes leads to a high-performance architecture with 0.7x the delay but 2.3x the energy of the min-energy architecture. These results have been accepted for publication by TCAD [3].

 

·         Field programmable power supply for power reduction. My students Fei Li and Yan Lin and I showed at ISFPGA’04 [4] that fixed dual-Vt patterns are effective in reducing power, but that fixed dual-Vdd patterns similar to those in ASICs may lead to extra placement constraints and excessively long interconnects, and are therefore not effective in reducing energy in FPGAs. To address this, we introduced the concept of field programmable Vdd (including Vdd-level selection and power gating) for power and timing co-optimization, and designed FPGA circuits and fabrics that provide field programmability of the power supply for both logic blocks and interconnects. Programmable supply voltage provides an extra dimension of field programmability that had not previously been studied for FPGAs, and it reduces the FPGA energy-delay product by 29% compared to single-Vdd FPGAs with the Vdd level suggested by ITRS. This work was presented in DAC’04 [5] and ICCAD’04 [6], considering logic blocks and interconnects respectively, and a journal paper has been submitted to TCAD [7].

 

·         Architecture Evaluation with Vdd programmability. The Vdd-programmable FPGA circuits in [5, 6] may increase the number of SRAM cells for configuration by over 100%. We designed novel circuits that provide field programmable Vdd gating (or Vdd selection) with a negligible chip-level increase in SRAM cells, or field programmable Vdd selection and Vdd gating together with a 28% chip-level increase in SRAM cells. We conducted a comprehensive FPGA architecture evaluation and drew the following conclusions: (i) dual-Vdd selection for logic clusters plus power gating without dual-Vdd for interconnects is the best architecture choice considering timing, area, and power, and it reduces power by 2x at about 17% area overhead but with virtually no SRAM cell increase; (ii) the power-optimal architecture has the same LUT size of 4 as the existing area-optimal architecture; and (iii) a LUT size of 7 leads to minimal delay whether or not Vdd programmability is used. This work was presented in ISFPGA’05 [2] and its full-version paper has been accepted by TVLSI [8].

 

·         Device and Architecture Co-optimization. We conducted the first published co-optimization of device (Vdd, Vt and sleep transistor size) and FPGA architecture. We developed a trace-based FPGA power and timing evaluation method. It is orders of magnitude faster than cycle-accurate simulation [1] and enables effective exploration of the huge solution space of architecture and device co-optimization. Compared to a baseline similar to the state-of-the-art industrial FPGA architecture followed by device tuning, our co-optimization can reduce the energy-delay product by 18.4% and chip area by 23% without power gating, or reduce the energy-delay product by 58% with an 8.3% chip area increase due to sleep transistors for power gating. This work was presented in DAC’05 [9] and its full-version paper has been submitted to TCAD [10]. Very recently, we incorporated process variations into device and architecture co-optimization and concluded that LUT size 4 gives the highest leakage yield and LUT size 7 gives the highest timing yield. Considering both leakage and timing limits, LUT size 5 achieves the maximum combined leakage and timing yield.

 

·         Dual-Vdd interconnects with chip-level timing slack allocation. Our earlier work [6] assumed that Vdd-level converters are used inside dual-Vdd interconnects. These converters add power overhead and diminish the power reduction. Recent papers from other research groups proposed either to combine level converters with existing routing buffers, which still has a high power overhead, or to force all interconnects driven by the same logic block to use the same Vdd level, which removes the need for level converters but reduces the opportunity for dynamic power reduction. In a paper to appear at DAC’05 [9], we proposed that dual-Vdd can be applied inside a routing tree with no Vdd-level converter if we only allow high-Vdd buffers to drive low-Vdd buffers, and never the reverse. This new formulation reduces power significantly compared to all existing dual-Vdd FPGA interconnects (65% power reduction compared to [6]). We also developed a chip-level time slack allocation algorithm that uses linear programming to maximize the power reduction from dual-Vdd buffers. The linear programming formulation is enabled by a closed-form relationship between the slack and the number of low-Vdd buffers that can be used by a routing tree. The full-version paper has been submitted for second-round review by TCAD [11].
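
The converter-free condition in the last item above can be illustrated with a minimal sketch. It only checks the stated rule (a low-Vdd buffer must never drive a high-Vdd buffer) on a routing tree; it is not the assignment or slack-allocation algorithm from the paper. The tree encoding, the buffer names, and the function `converter_free` are hypothetical and chosen for illustration only.

```python
from typing import Dict, List

HIGH, LOW = "high", "low"  # illustrative labels for the two supply levels


def converter_free(children: Dict[str, List[str]], vdd: Dict[str, str]) -> bool:
    """Return True if no low-Vdd buffer drives a high-Vdd buffer.

    children maps each buffer to the buffers it drives (a routing tree);
    vdd maps each buffer to HIGH or LOW. Both encodings are assumptions
    made for this sketch.
    """
    return all(
        not (vdd[parent] == LOW and vdd[child] == HIGH)
        for parent, kids in children.items()
        for child in kids
    )


# A small hypothetical routing tree: driver -> buf_a -> buf_c, driver -> buf_b.
tree = {"driver": ["buf_a", "buf_b"], "buf_a": ["buf_c"]}

# High drives low, and low drives low: no level converter is needed.
ok = {"driver": HIGH, "buf_a": LOW, "buf_b": HIGH, "buf_c": LOW}

# buf_a (low) drives buf_c (high): this assignment would need a converter.
bad = {"driver": HIGH, "buf_a": LOW, "buf_b": HIGH, "buf_c": HIGH}

print(converter_free(tree, ok))   # True
print(converter_free(tree, bad))  # False
```

Under this rule, once a buffer on a path is assigned low Vdd, every buffer downstream of it must also be low Vdd, which is why the amount of timing slack given to a routing tree bounds how many low-Vdd buffers it can use.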

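Several items above report results as an energy-delay product. As a small worked example, the sketch below applies that metric to the two design points quoted in the first item (0.7x delay and 2.3x energy relative to the min-energy architecture). Combining those two ratios this way is our illustration, not a figure reported in the cited papers, and the function name `edp_ratio` is hypothetical.

```python
def edp_ratio(delay_ratio: float, energy_ratio: float) -> float:
    """Energy-delay product of one design relative to a baseline,
    given its delay and energy as multiples of the baseline's."""
    return delay_ratio * energy_ratio


# High-performance architecture relative to the min-energy architecture.
ratio = edp_ratio(0.7, 2.3)
print(f"{ratio:.2f}")  # 1.61
```

By this measure the high-performance point has about 1.61x the energy-delay product of the min-energy point, so its 30% delay gain costs more in energy than it returns in speed.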
References

[1] F. Li, D. Chen, L. He and J. Cong, "Architecture Evaluation for Power Efficient FPGAs", ACM International Symposium on Field Programmable Gate Arrays, pp. 175-184, February 2003. (pdf)

[2] Y. Lin, F. Li and L. He, "Power modeling and architecture evaluation for FPGA with novel circuits for Vdd programmability", the Thirteenth International Symposium on Field Programmable Gate Arrays, pp. 199-207, Feb. 2005. (pdf)

[3] F. Li and L. He, "Power Modeling and Characteristics of Field Programmable Gate Arrays", IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 13 pages, October 2005. (pdf)

[4] F. Li, Y. Lin, L. He and J. Cong, "Low-power FPGA using Dual-Vdd/Dual-Vt Techniques", the Twelfth International Symposium on Field Programmable Gate Arrays, pp. 42-50, February 2004. (pdf)

[5] F. Li, Y. Lin and L. He, "FPGA Power Reduction Using Configurable Dual-Vdd", IEEE/ACM Design Automation Conference, pp. 735-740, June 2004. (pdf)

[6] F. Li, Y. Lin and L. He, "Vdd Programmability to Reduce FPGA Interconnect Power", IEEE/ACM International Conference on Computer-Aided Design, pp. 760-765, San Jose, Nov. 2004. (pdf)

[7] F. Li, Y. Lin and L. He, "Field Programmability of Supply Voltage for FPGA Power Reduction", submitted to IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, March 2005. (pdf).

[8] Y. Lin, F. Li and L. He, "Circuits and Architectures for Field Programmable Gate Array with Configurable Supply Voltage", accepted by IEEE Transactions on Very Large Scale Integration Systems, 13 pages. (pdf)

[9] Y. Lin and L. He, "Leakage efficient chip-level dual-vdd assignment with time slack allocation for FPGA power reduction", Design Automation Conference, pp. 720-725, June 2005. (pdf, ppt)

[10] L. Cheng, P. Wong, F. Li, Y. Lin and L. He, "Device and Architecture Co-optimization for FPGA Power Reduction", submitted to IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, May 2005. (pdf).

[11] J. Chen and L. He, "Modeling and Synthesis of Multi-Port Lossy Transmission Line for Multi-Channel Interconnect," accepted by IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 10 pages. (pdf).

 



Last update: 10-11-2002.