We have done the following works in this area.
1. Efficient FPGA Mapping with Retiming for Clock Period Minimization
2. FPGA Synthesis with Retiming for Clock Period Minimization
3. FPGA Mapping with Retiming for Sequential Circuits with Given
Initial States
Current Work:
1. Efficient FPGA Mapping with Area-Depth Tradeoff
2. FPGA Mapping with Retiming for Real Designs with:
initial states
(asynchronous) clock enables
multiple clocks
3. FPGA Mapping with Retiming with Interconnect Delay
Summary of Finished Work:
Conventional FPGA mapping algorithms are designed for combinational
circuits only. For sequential circuits, one needs to cut off all the flipflops
and maps each combinational subcircuit separately and then connects the
mapping solutions with the original flipflops to form a mapping solution.
This kind of approaches usually cannot lead to the optimal solutions for
sequential circuits under retiming. Retiming is a powerful technique for
clock period minimization by repositioning flipflops. With retiming, we
can enlarge the mapping solution space significantly and potentially achieve
much better results.
In usual case, retiming and FPGA mapping are performed separately.
However, the serious problem for this separated approach is that retiming
cannot change the circuit structure and its results are limited by mapping.
On the otherhand, the current mapping algorithms do not know how good a
mapping solution will be after retiming. It is expected that we can achieve
the best possible results by combining retiming and mapping. Along this
line, our first work is an efficient FPGA mapping with retiming algorithm
for clock period minimization. It constitutes a significant improvement
on runtime over the previous work on optimal FPGA mapping with retiming
proposed by Prof. Pan and Liu. With guaranteed optimality of the results,
our algorithm, named TurboMap, is over 1000 times faster than the algorithm
by Pan and Liu. To further reduce the clock period, we combine the BDD-based
functional decomposition with mapping and retiming to form a new algorithm,
named TurboSYN. TurboSYN can reduce the clock period of optimal mapping
with retiming by half in average. Comparing the results by FlowSYN (depth
minimal mapping with resynthesis) followed by separated retiming, TurbSYN
can also achieve results with 41% smaller clock periods.
For sequential circuits with given initial states, our recent result
is an optimal FPGA mapping with forward retiming algorithm with guaranteed
new initial state computation. The motivation of our work due to the fact
that retiming will change the initial state of the original circuit and
it is NP-hard to decide if there exists an equivalent initial state for
a retiming and if there does exist one, how to compute an equivalent one.
Since the new initial state for forward retiming (always moving flipflops
from the inputs of a gate to the outputs) can be computed efficiently in
linear time, we separate a general retiming into forward retiming and backward
time (always moving flipflops from the outputs of a gate to the inputs)
and combine the mapping with forward retiming. The backward retiming can
be performed as a preprocessing to push the flipflops to the primary inputs
with the only consideration of whether an equivalent initial state can
be computed efficiently. In fact, just performing mapping with forward
retiming by our algorithm, named TurboMap-frt, we can reduce the clock
period by 17% comparing with the conventional approach of depth minimal
mapping for each combinational subcircuits followed by separated forward
retiming. On the otherhand, the clock period by TurboMap-frt is only 3%
larger than that by TurboMap of optimal mapping with general retiming,
while many of results by TurboMap cannot compute an equivalent initial
state in reasonable amount of time.
Brief Description of Current Work:
1. Efficient FPGA Mapping with Area-Depth Tradeoff and for Heterogeneous
FPGAs
Most of the LUT-based FPGA mapping algorithms are for homogeneous FPGAswith
LUTs of uniform size and for either area or depth minimization. Although
there are some algorithms considered the area-depth tradeoff, the capability
of reducing area by relaxing depth is limited and most of all, they cannot
guarantee to reduce the area by relaxing depth. We plan to propose new
algorithms with much better area-depth tradeoff capability and for heterogeneous
FPGAs with LUTs of different sizes.
2. FPGA Mapping with Retiming for Real Designs with:
initial states
(asynchronous) clock enables
multiple clocks
We have proposed a good heuristic for mapping with retiming for designs
with given initial states. However, the problem of mapping with retiming
for designs with initial states and clock enables and multiple clocks remain
open. We plan to investigate the problem.
3. FPGA Mapping with Retiming with Interconnect Delay
Most the existing performance-driven FPGA mapping algorithms assume
unit-delay model. As the FPGA technology goes into deep-submicron, the
interconnect delay becomes more and more important. We plan to investigate
the problem by considering either more accurate delay model or combining
mapping with placement.
|
|
|