Studies on FPGA Placement and Routing (updated on Jan. 28)

by Yu Hu and Tong Jing

 

n       Incremental Placement

1.      Fundamental work for FPGA Placement in VPR

A.         Alexander (Sandy) Marquardt, Vaughn Betz, and Jonathan Rose1, Timing-Driven Placement for FPGAs, FPGA'00

2.      Incremental placement for retiming

A.         Deshanand P. Singh and Stephen D. Brown, Integrated Retiming and Placement for Field Programmable Gate Arrays, FPGA'02

B.         Deshanand P. Singh and Stephen D. Brown, Incremental Placement for LayoutDriven Optimizations on FPGAs, ICCAD'02

C.         Valavan Manohararajah, D. P. Singh and S. D. Brown. Incremental Retiming for FPGA Physical Synthesis, DAC'05

1.      Layout Driven Retiming Using the Coupled Edge Timing Model, TCAD'03

[Comments]

As I'm working on the integration of retiming and placement at present, I read some papers about FPGA placement and retiming. 1-A is a timing-driven placement algorithm based on simulated annealing, which has been used in VPR. To integrate retiming into placement, 2-A and 2-B aim to perform placement incrementally. A 3-step design flow is proposed, i) retiming-aware placement, ii) minimal placement disruptive retiming, iii) incremental clustering and placement. However, retiming is actually performed after placement as the so-called retiming-aware placement is to add the critical-cycle-ratio into the cost function and consider retiming indirectly in placement. After performing minimal disruptive retiming, an incremental clustering and placement approach is employed to handle the changed netlist and assign the newly added FFs into appropriate position. An important assumption in this work is that only one output of BLE can be used, which indicates that a FF must be associated with a LUT if the FF is used. To handle the newly added FFs, 2-A and 2-B try to move them to available BLE slots, which of course makes certain BLEs move out of its original cluster.

 

n       Adaptive placement for Routing-poor FPGA architecture

2.      Justification for considering routing-poor FPGA architecture

A.         Andrk DeHon, Balancing Interconnect and Computation in a Reconfigurable Computing Array (or, why you don't really want 100% LUT utilization), FPGA'99

B.         Kenneth Eguro and Scott Hauck, Issues of Wirelength Cost Models in Routing-Constrained FPGAs, UWEE Technical Report, Number UWEETR-2004-0006

3.      Adaptive placement algorithms for routing-poor FPGA architecture

A.         Ken Eguro, Scott Hauck and Akshay Sharma, Architecture-Adaptive Range Limit Windowing for Simulated Annealing FPGA Placement, DAC'05

B.         Akshay Sharma, Carl Ebeling and Scott Hauck, Architecture Adaptive Routability-Driven Placement for FPGAs, FPGA'05

C.         Source code and test bench, from Prof. Scott Hauck, U Washington

[Comments]

3-A and 3-B pointed out that "higher LUT usage does not imply lower area and LUT usability is not always directly correlated with area efficiency." The basic idea behind routing-poor architecture is increase the available logic elements while decreasing the routing resource (e.g routing track). As mentioned in 3-A, the interconnect resources consume most of the area on modern FPGAs (80%-90%), it is reasonable to consider how to increase the utilization of interconnect.

4-A and 4-B are two works trying to handle routing-poor architecture in placement. Basically, they employ a fast routing in the placement to achieve a more accurate routing estimation, which includes both wire length and congestion.

[The proposed idea]

Topic: FPGA architecture and circuit co-design for low power/area designs

Motivation: As mentioned above, there exists a tradeoff between the utilization of logic elements and routing tracks. Given a power/area model for logic elements and routing tracks, we have freedom to decide the FPGA architecture (e.g. CLB array size and routing channel width) to make sure of routability. In the same time, we may want to minimize the area/power of the design. This motivates us to study on the FPGA architecture and circuit co-design to find the best trade-off between utilization of logic elements and routing resource in terms of power/area. Note that the previous works just study on how to place/route a circuit on a given architecture. Also, we can study on the sensitivity of different logic elements and routing resource ratios to process variations.

 

n       Pipeline-aware Routing

4.      Fundamentals for interconnect pipelined FPGA architecture

A.         William Tsu, Kip Macy, Atul Joshi, Randy Huang, Norman Walker, Tony Tung, Omid Rowhani, Varghese George John Wawrzynek, and Andr6 DeHon, HSRA: High-Speed, Hierarchical Synchronous Reconfigurable Array, FPGA'99

B.         Deshanand P. Singh and Stephen D. Brown, The Case for Registered Routing Switches in Field Programmable Gate Arrays, FPGA'01

C.         Amit Singh, Arindam Mukherjee, Malgorzata Marek-Sadowska, Interconnect Pipelining in a Throughput-Intensive FPGA Architecture, FPGA'01

D.         Carl Ebeling, Darren C. Cronquist, and Paul Franklin, RaPiD - Reconfigurable Pipelined Datapath, FPL'06

5.      Fundamental algorithm for FPGA routing in VPR

E.          Larry McMurchie and Carl Ebeling, PathFinder: A Negotiation-Based Performance-Driven Router for FPGAs, FPGA95

6.      Pipeline-aware FPGA routing algorithms

A.         Song Li and Carl Ebeling, QuickRoute: A Fast Routing Algorithm for Pipelined Architecturest, ICFPT 2004

B.         Akshay Sharma, Carl Ebeling and Scott Hauck, PipeRoute: A Pipelining-Aware Router for FPGAs, Technical Report, UW

C.         Ken Eguro and Scott Hauck, Armada: Timing-Driven Pipeline-Aware Routing for FPGAs, FPGA'06

[Comments]

As pipelining or retiming can always introduce additional FFs in the interconnect in FPGA, it's a hard problem to find an appropriate location for these FFs. Additional delay (the local interconnect to go through several MUXs to reach the desired FF and another several MUXs to reach the CLB outputs) will be introduced if we split a pipelined interconnect into several segments for available FF slots in CLBs. Therefore, a so-called pipeline FPGA architecture is proposed, where a switch box can be configured to be a FF. So the pipelined interconnect can get enough FFs in the switch boxes without suffering from much additional local interconnect delay to find an available FF. Literature (5-A ~5-D) describes details of the pipelined FPGA architecture.

Pipeline-aware routing problem has been attacked in 7-A, 7-B and 7-C in both routability-driven and timing-driven way.

[The proposed idea]

Topic1: Glitch power minimization by pipeline-aware routing

Motivation: Glitch power contributes a big part in dynamic power, which can be reduced by decreasing the combinational path length between two pipeline stages. A pipeline-aware routing provides us such a possibility to uniform path length between two pipeline stages so that reduce the glitch power.

Topic2: Improve the existing pipeline-aware algorithms

Motivation: We can employ more accurate timing model in timing-driven pipeline-aware routing to further optimize delay. Also, we can consider pipeline-aware placement. Statistical pipeline-aware algorithms can also be studied on.