Does it make sense to check hold violations at the synthesis stage?

As we know from the basics of STA, the hold timing equation generally takes the form:

Hold_slack = Data_path_delay - hold_time_of_flop - clock_skew

Taking clock_skew as capture_clock_delay - launch_clock_delay, this can be rearranged as follows:

Hold_slack = Data_path_delay + launch_clock_delay - capture_clock_delay - hold_time_of_flop

Data_path_delay + launch_clock_delay can be combined to represent the arrival of data at the capture flip-flop with respect to the clock source. The same is evident from the figure below.



The modified equation, then, becomes:
Hold_slack = Data_arrival_at_capture_flop - Clock_arrival_at_capture_flop - hold_time_of_flop
It is clear from the above equation that a hold check is essentially a race between data and clock at the capture flop, less a fixed number (whether hold_time_of_flop really is a fixed number is a separate question).

Before clock-tree synthesis, one major part of this equation is missing: we do not know the arrival times of the clock at the launch and capture flip-flops. Nor do we know how much the skew and the non-common clock path will contribute to on-chip variation, which in turn calls for extra hold margin. At logic synthesis, we usually do not even have placement data, so the data path delay estimates are not accurate either. Fixing hold at synthesis therefore does not help, since hold fixes will be needed again anyway once the actual data and clock path delays are known.
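To make this concrete, here is a minimal Python sketch of the hold slack equation above, evaluated once with an ideal clock (as seen before clock-tree synthesis) and once with clock latencies. The function name and all delay values (in picoseconds) are hypothetical and chosen only for illustration; they do not come from any real design or tool.

```python
def hold_slack(data_path_delay, launch_clock_delay, capture_clock_delay, hold_time):
    """Hold slack as per the equation in this post (all values in ps)."""
    data_arrival = data_path_delay + launch_clock_delay   # data arrival at capture flop
    clock_arrival = capture_clock_delay                   # clock arrival at capture flop
    return data_arrival - clock_arrival - hold_time

# Assumed numbers: the same data path, seen before and after the clock tree is built.
pre_cts = hold_slack(120, 0, 0, 30)        # ideal clock: zero latency on both flops
post_cts = hold_slack(120, 400, 520, 30)   # capture clock arrives 120 ps later than launch clock

print("pre-CTS hold slack :", pre_cts)
print("post-CTS hold slack:", post_cts)
```

With these assumed numbers, a path that looks comfortably met with an ideal clock turns into a violation once the capture clock arrives later than the launch clock, which is why hold numbers seen before the clock tree is built say little about the final picture.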

But the most important reason for not fixing hold before clock-tree synthesis is that hold fixing is not a very complex task for the available tools. It may be as simple as downsizing logic and/or adding buffers in the data path wherever setup slack is available. So, during synthesis and placement, tools focus on meeting setup targets, and turn to hold only after the clock tree is built.
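The buffer-insertion idea can be sketched in a few lines of Python. This is a deliberately simplified illustration, not how any actual tool works: it assumes each buffer adds the same fixed delay, and that this delay eats into setup slack by the same amount (in reality, setup and hold are checked at different corners with different delays). The function name and the numbers are hypothetical.

```python
import math

def buffers_for_hold_fix(hold_slack, setup_slack, buffer_delay):
    """Return the number of buffers needed to fix a hold violation,
    or None if the path lacks the setup slack to absorb them (values in ps)."""
    if hold_slack >= 0:
        return 0                                   # no violation, nothing to fix
    n = math.ceil(-hold_slack / buffer_delay)      # buffers needed to cover the violation
    if n * buffer_delay > setup_slack:
        return None                                # fix would break setup on this path
    return n

print(buffers_for_hold_fix(-30, 200, 20))  # enough setup slack available
print(buffers_for_hold_fix(-30, 35, 20))   # not enough setup slack
```

In the second call, the two buffers needed would add more delay than the setup slack available, so the sketch reports that the path cannot be fixed this way.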

Another point to note is that after the data nets are routed, data path delay may increase beyond the original estimate because of net detours. So, it may be wise not to fix hold violations of very small magnitude even after clock-tree synthesis, and to wait until the data nets are routed to see how many of those violations remain.

4:1 mux as universal gate

A universal gate is a gate that can implement any given logic function. NAND and NOR gates are the well-known universal gates, since any logic function can be implemented using either of them alone. A multiplexer can, in a sense, also be termed a universal gate, since any function can be realized by using a mux as a look-up-table structure. In this post, we discuss how a 4:1 mux can be used as a universal gate to realize 2-input gates.

Any two-input gate gives a definite value (either 0 or 1) for each combination of its inputs, and can be represented by a truth table as shown in the table below.


Here, A, B, C and D can each be either "0" or "1" depending upon the functionality of the gate. For instance, for a 2-input AND gate, A = B = C = 0 and D = 1.

Using a 4-input mux, this generic 2-input gate can be implemented as shown below:


For instance, a 2-input AND gate will be implemented as follows:
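The same idea can be modelled in a few lines of Python. This is only a behavioural sketch under one assumption: the two gate inputs drive the select lines of the mux, and the truth-table values A, B, C and D are tied to the data inputs for select values 00, 01, 10 and 11 respectively. The names mux4, GATES and gate are hypothetical and used only for this illustration.

```python
def mux4(d0, d1, d2, d3, s1, s0):
    """Behavioural 4:1 mux: select lines (s1, s0) pick one of four data inputs."""
    return (d0, d1, d2, d3)[(s1 << 1) | s0]

# Truth-table constants (A, B, C, D) tied to the data inputs for each gate.
GATES = {
    "AND":  (0, 0, 0, 1),
    "OR":   (0, 1, 1, 1),
    "NAND": (1, 1, 1, 0),
    "NOR":  (1, 0, 0, 0),
    "XOR":  (0, 1, 1, 0),
    "XNOR": (1, 0, 0, 1),
}

def gate(name, x, y):
    """Realize a 2-input gate by driving the mux select lines with x and y."""
    a, b, c, d = GATES[name]
    return mux4(a, b, c, d, x, y)

# Print the output column of each gate for inputs 00, 01, 10, 11.
for name in GATES:
    outputs = [gate(name, x, y) for x in (0, 1) for y in (0, 1)]
    print(name, outputs)
```

Changing the four constants on the data inputs is all it takes to turn the same mux into a different gate, which is what makes it behave like a look-up table.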


This post was written as a response to a query from one of our readers.