RTL Synthesizer Project Meeting Notes (09/27/07) 1. What're missing in the current implementation? * Flattening of module hierachies * Inference for memory blocks and other architecture-dependent structures, such as tri-state buffers, carry chains, small logic gates * Handler for "loop" statement in Verilog language * Handler for dynamic array (e.g., A = B[C], where C is a varaible) in Verilog language * Optimization 2. Verification issues for DSP * Methodology for DSP verification: implement a RTL description of Xilinx DSP48 by Verilog, and declare the instants of the DSP48 in the synthesized circuit, then verify against the original Verilog by Formality. * Problem during DSP verification: certain unused outputs of the registers in the DSP48 instants are forced to be verified and it causes verification failure. * Solution of the above problem: > i. Suggested by Jerry, Formality has problems to handle registers inside the hierachy and users are required to tell the tool not to match those points. In addition, Formality has a limit to verify large data-path, e.g., wide multipliers, which has a huge fanin cone w.r.t. a output node, since the essence of Formality verification is to build BDDs for each matching point and the size of the BDDs cannot be too large. Basically, when the input number of the multiplier is larger than 8, Formality will fail due to extensive memory or runtime. In this case, the solution to verify DSP blocks is to use simulation based approach. Usually, 10K simulation vectors should be sufficient for a medium size circuit. Note that the bound vector, e.g., 0000 and 1111 of a 4-input vector should be included in the simulation vectors. One recommanded simulator is Cadance ModSim, which is available in UCLA. > ii.Jerry will take a look at our Verilog implementation of the DSP48 block and will feed back to me if anything is wrong with it, we will resume formal verification when we figure out where the problem is. 3. Mapping for small logic gates and carry chain * Clarify of the effect of the small logic gates in Xilinx V4 FPGA: for medium size circuits, the number of MUX2 and XOR2 gates is over 10x of LUT-4s in the critical paths; for larger circuits, the MUX2 and XOR2 number is still comparable to that of LUT-4s. * According to Jerry, the small logic gates (XOR2 and MUX2) in Xilixn V4 only serve for carry chain. The inference of carry chain is either caused by adders or wide AND gates, and only when carry chains are inferred, XOR2 and MUX2 gates are used to connect them. The random logics are usually not mapped to XOR2 and MUX2. There is no technical papers talking about the carry chain inference and industrial people use ad-hoc heuristic for carry chain. 4. Tasks for the next move * Yu: i) implement flattening routines ii) literature search for FPGA Floorplanning and low power FPGA synthesis * Zhipeng: i) setup a simulation based verification experimental platform for DSP verification ii) keep testing larger (more complex) benchmarks, verification, report bugs and debug * Zhe: i) read Yu's elaboration code...