Retiming is one of the most powerful sequential transformations: it relocates the registers, or flip-flops (FFs), in a circuit while preserving the circuit's functionality. Because relocating FFs can balance the longest combinational paths and reduce the number of circuit states, retiming optimizations can reduce both the clock period and the FF area (or number of FFs) of a circuit. Minimum clock period retiming minimizes the clock period alone and may therefore significantly increase the FF area. Minimum area retiming, in contrast, minimizes the FF area under a given clock period, and can thus be used to minimize the FF area even under the minimum clock period. The min-area retiming problem is therefore more important for sequential circuit design, but it is also of higher complexity. See, e.g., N. Shenoy and R. Rudell, “Efficient implementation of retiming,” ICCAD, pages 226-233 (1994).
All known approaches to min-area retiming that are provably optimal follow the basic ideas of C. E. Leiserson and J. B. Saxe, “Retiming synchronous circuitry,” Algorithmica 6(1):5-35 (1991). Given a circuit represented as a graph of n vertices and m edges, the minimum number of FFs on any path between each pair of vertices, and the maximum delay among the paths that attain that minimum, are first computed. Then two kinds of constraints are generated. There is one constraint for each edge, requiring that the number of FFs on the edge be nonnegative. In addition, for each pair of vertices whose computed path delay is larger than the given clock period, i.e. each timing critical path, there is a constraint requiring that there be at least one FF between them. Minimizing the FF area under those constraints is the dual of a min-cost network flow problem. Since each constraint forms an arc in the flow network, the number of arcs in the network is usually Θ(n²). Even though it is polynomially solvable, min-cost network flow computation, such as described in R. K. Ahuja, T. L. Magnanti, and J. B. Orlin, “Network Flows: Theory, Algorithms, and Applications,” Prentice Hall (1993), over such a dense graph is usually expensive on large problems.
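As an illustration only, the constraint generation described above can be sketched in Python. The sketch assumes the usual Leiserson and Saxe convention in which a retiming label r(v) counts the FFs moved from the outputs of vertex v to its inputs, so that an edge from u to v with w FFs carries w + r(v) - r(u) FFs after retiming. The function names `wd_matrices` and `retiming_constraints`, the tuple encoding of constraints, and the three-vertex ring circuit are hypothetical and are not taken from any of the cited works.

```python
INF = float("inf")

def wd_matrices(n, edges, delay):
    """Return (W, D): W[u][v] is the minimum FF count over paths u -> v,
    D[u][v] is the maximum delay among the paths attaining W[u][v].

    edges is a list of (u, v, w), where w is the FF count on edge u -> v;
    delay[v] is the combinational delay of vertex v (assumed encoding).
    """
    # Each path is scored by the pair (FF count, -delay of all vertices but the last).
    # The lexicographic minimum picks the fewest FFs first, then the largest delay.
    best = [[(INF, 0)] * n for _ in range(n)]
    for v in range(n):
        best[v][v] = (0, 0)
    for u, v, w in edges:
        best[u][v] = min(best[u][v], (w, -delay[u]))
    # Floyd-Warshall all-pairs shortest paths over the pair-valued scores.
    for k in range(n):
        for i in range(n):
            for j in range(n):
                a, b = best[i][k], best[k][j]
                if a[0] == INF or b[0] == INF:
                    continue  # no path through k
                cand = (a[0] + b[0], a[1] + b[1])
                if cand < best[i][j]:
                    best[i][j] = cand
    W = [[best[u][v][0] for v in range(n)] for u in range(n)]
    D = [[delay[v] - best[u][v][1] if W[u][v] != INF else None
          for v in range(n)] for u in range(n)]
    return W, D

def retiming_constraints(n, edges, delay, period):
    """Return constraints as (u, v, c), each meaning r(u) - r(v) <= c."""
    W, D = wd_matrices(n, edges, delay)
    # One constraint per edge: the retimed FF count must stay nonnegative.
    cons = [(u, v, w) for u, v, w in edges]
    # One constraint per timing critical pair: at least one FF between them.
    for u in range(n):
        for v in range(n):
            if D[u][v] is not None and D[u][v] > period:
                cons.append((u, v, W[u][v] - 1))
    return cons

# Hypothetical example: a ring 0 -> 1 -> 2 -> 0 with two FFs on the edge 2 -> 0.
edges = [(0, 1, 0), (1, 2, 0), (2, 0, 2)]
delay = [1, 2, 3]
for u, v, c in retiming_constraints(3, edges, delay, period=4):
    print(f"r({u}) - r({v}) <= {c}")
```

For this ring and a clock period of 4, the sketch yields three edge constraints and four timing constraints. Because the timing constraints range over vertex pairs rather than edges, their number can grow quadratically in n, which is the density problem noted above.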
N. Shenoy and R. Rudell, in “Efficient implementation of retiming,” ICCAD, pages 226-233 (1994), were among the first to consider a practical implementation of the min-area retiming algorithm. They found that the storage required to compute the timing critical paths and the number of constraints are the bottlenecks, and they proposed techniques to reduce memory usage and to prune some redundant constraints. Minaret, proposed by N. Maheshwari and S. S. Sapatnekar in “Efficient retiming of large circuits,” IEEE TVLSI, 6(1):74-83 (March 1998), further pruned redundant constraints to reduce the size of the flow network by exploiting the equivalence of retiming and clock skew optimization as proposed in ASTRA. See S. S. Sapatnekar and R. B. Deokar, “Utilizing the retiming-skew equivalence in a practical algorithm for retiming large circuits,” IEEE TCAD, 15(10):1237-1248 (October 1996). However, as experimental results indicate, even with these pruning techniques the flow networks can still be very dense compared to the original circuit graphs. Experiments have shown that, for a circuit with more than 180K gates, Minaret had to formulate and solve a minimum cost network flow problem with more than 122M arcs, which used up more than 2 GB of virtual memory.
H. Zhou, in “Deriving a new efficient algorithm for min-period retiming,” Asia and South Pacific Design Automation Conference, Shanghai, China (January 2005), proposed an efficient incremental algorithm for minimum period retiming that iteratively moves FFs to decrease the clock period and finds the optimal solution in a short time. To overcome the expense of existing approaches to minimum area retiming, D. P. Singh, V. Manohararajah, and S. D. Brown, in “Incremental retiming for FPGA physical synthesis,” DAC, pages 433-438, Anaheim, Calif. (June 2005), also proposed that FFs be moved incrementally in the circuit. However, since only those moves that improve the cost and remain feasible in timing are allowed, these approaches are heuristic and may end up with a suboptimal solution. An efficient incremental algorithm for minimum area retiming with a provably optimal solution has remained elusive.