Paper Review on Technology Mapping
Commented by Yan
Last update:
05/08/2006
1. FlowMap, "FlowMap: An Optimal Technology
Mapping Algorithm for Delay Optimization in Lookup-Table Based FPGA Designs",
Jason Cong and Yuzheng Ding, TCAD 1999 (pdf)
The algorithm consists of labeling phase and mapping phase. The cut with optimal
depth is generated by formulating the problem as a max flow min cut problem.
Area is heuristically minimized by finding the max volume min-cut and further
reduced using post-processing techniques.
2. CutMap, "Simultaneous Depth and Area
Minimization in LUT-Based FPGA Mapping", J. Cong and Y. Hwang, ISFPGA'95 (pdf)
The algorithm is developed based on FlowMap. In the labeling phase, the optimal
depth in LUT-mapped solution is calculated for each node. In the mapping phase,
the min-height cut is calculated for nodes on critical path while the min-cost
K-feasible is calculated for nodes on non-critical path. A heuristic is
presented to speed-up cut enumeration. Compared to FlowMap, number of LUTs
(area) is reduced by 15%.
3. Efficient cut generation and pruning, "Cut
ranking and pruning: enabling a general and efficient FPGA mapping solution", J.
Cong and et al, ISFPGA'99 (pdf)
An efficient cut enumeration method is presented. The method is to recursively
generate cuts for a node based on the cuts of its fan-in nodes. This method has
been used in various state-of-the-art mappers such as [4][5][6][7].
4. Placement driven mapping, "placement-driven
technology mapping for LUT-based FPGAs", J. Lin and et al, ISFPGA'03 (pdf)
Given a placement solution and delay models from placement, another mapping
solution with placement is generated. Cut generation is based on the mathod in
[3]. Delay is updated using look-up table in
placement. The algorithm is in iterative fashion and includes labeling and
mapping phases. The placement solution may be invalid and is legalized through
local refinement including greedy movement and low temperature simulated
annealing method.
5. EMap, power minimization, "On the interaction
between power-aware CAD algorithms for FPGAs", S. Wilton and et al, ICCAD'03 (pdf)
The algorithm is developed based on CutMap. Similar to CutMap, the min-height
cut is selected for critical nodes but min-cost cut is selected for non-critical
nodes. Different from CutMap that only considers area cost, power cost is
considered along with area cost as a weighted sum. Nodes with larger activities
are absorbed into an LUT. Compared to CutMap, an average of 8% energy saving is
achieved. There is virtually no area overhead.
6. DAOMap, "DAOMap, a depth-optimal area
optimization mapping algorithm for FPGA designs", D. Chen and J. Cong, ICCAD'04
(pdf)
The mapper is an incremental work based on previous work from the same group
(i.e. cutmap[2] and cut enumeration technique [3]). Surprisingly, one key
area estimation cost function is almost the same with that in IMap [7]. Besides
of various heuristics presented in this paper, DAOMap and IMap achieves similar
area and therefore the area estimation may be dominant in the cost function.
7. IMap, "heuristics for area minimization in
LUT-based FPGA technology mapping", V. Manohararajah and et al, IWLS'04, (pdf)
The concept of area flow is presented in a recursive way and is proved to be
equal to area in non-duplication mapping. In duplication mapping, area flow is
updated iteratively to catch the real area and is used to minimize area. Depth
is optimized. Compared to CutMap, area is reduced by 13% on average with LUT
size of 4. Cuts are enumerated exhaustively, which work fairly well with LUT
size less than 5.
8. Adaptive logic module (ALM), "Improving FPGA
performance and area using an adaptive logic module", M. Hutton and et al,
FPL'04 (pdf)
This paper more focuses on the structure of ALM instead of mapping algorithm.
The combinations of an ALM is shown in the following table.
The mapping algorithm is only briefly introduced in a pretty high level due to
the confidential information of Altera. The mapper needs to consider number of
ALMs instead of number of LUTs, in other words, the packability of a pair of
LUTs into an ALM. Verified with large industry designs, ALM reduces area and
delay by 15% and 12%, respectively.