: VLSI n EDA

Metastability

What is metastability: Literally speaking, metastable state refers to a state "which is not so much stable" and a slight disturbance will cause the system to lose state. In the context of VLSI, specifically sequential design, in addition to two stable states "0" and "1", there is also a state in-between at which the output may wander for some time due to inherent feedback design of a latch. This state is known as meta-stable state and the phenomenon is known as metastability. In order to understand this, let us study two back-to-back connected inverters as shown in figure below.

Figure 1: Inverter loop schematic

When the output (N2) of top inverter is 1, M4 & M1 are on. Simlilary, when output is 0, M3 & M2 are on. If the voltage of output inverter is left at anything other than VDD or GND, M4 and M1 try to pull the output towards 1 and the other two pull it towards 0. The final settlement value (either 0 or 1) is dependent upon whose initial pull is stronger. However, if the initial pull combined of M4 & M1 is equal to M3 & M1, the output will not move. The output may or may not remain at this level for some time until equilibrium is maintained. However, if forcefully, we change the output by even a small value, it will settle at 0 or 1 depending upon the direction of forced change. This state is the so-called metastable state. The inverter pair can remain in this state as long as there is no disturbance in the voltage levels. This disturbance can be due to external factors such as external forced voltage or internal factors such as crosstalk. So, if it is left to come out of metastability by itself, the time to come out of metastability is unknown. It depends upon:

The value of voltage stimulus: If the level of disturbance is greater than a certain threshold, the output will start to move towards one of the stable levels. Of course, larger the value of initial disturbance, faster will be time to stability

Strengths of inverters to pull towards "0" and "1": The less voltage resolution capability the inverter pair has, it will be able to come of metastability in shorter time

How can we generate a pulse for every edge of the incoming pulse

It is a very common requirement to detect either positive edge, negative edge or both edges of a signal. And the circuit that can detect and generate a single cycle pulse is quite simple. In this post, we will discuss how we can detect positive edge, negative edge and both edges of a signal.

Detect positive edge of a signal: Positive edge of a signal means that the current state of the signal is "1" and previous state is "0". And a pulse signal means that the output of the circuit is "1" for one cycle. So, we need a circuit which generates "1" as output when present state is "1" and previous state is "0". It generates "0" as output otherwise.

Thus, output = D(n-2)' & D(n-1)

The required implementation is shown in figure 1 below:

Figure 1: Detection of positive edge of signal

Detect negative edge of a signal: Negative edge of signal means that the current state of the signal is "0" and previous state is "0". So, we need a circuit which generates "1" as output when present state is "0" and previous state is "1". It generates "0" as output otherwise.

Thus, Neg_edge_detect = D(n-2) & D(n-1)'

The required implementation is shown in figure 2 below:

Figure 2: Detection of negative edge of signal

Detecting both positive and negative edges of the signal: Simply doing an "OR" operation of Pos_edge_detect and Neg_edge_detect signal will produce an output which is a single cycle pulse for any of the edge of incoming signal. The requirement for consecutive edges of incoming signal is to be at least 2 cycles apart otherwise, the output will not be pulse, but will be a continuous signal.

Any_edge_detect = Pos_edge_detect + Neg_edge_detect

The required implementation is shown in figure 3 below:

Figure 3: Detection of both positive and negative edges of signal

The technique we discussed here delays the output by two cycles. Can you think of any other way to detect the edges of a signal which is more efficient?

Interview questions related to clock jitter and duty cycle variations

Below we list few of our posts related to clock jitter and duty cycle variation. Happy learning.

Clock jitter: Disusses the definition and types of clock jitter.

Which type of jitter matters for timing slack calculation?

Can jitter in clock effect setup and hold violations? : Discusses cases of setup and hold slack calculation where jitter in clock path can play a role

Duty cycle of clock: Discusses the definition of duty cycle and how it impacts timing slack of timing paths.

Duty cycle variation: Discusses in detail basics of duty cycle variation and its timing implications.

Duty cycle variation of inter-clock timing paths: Discusses the implication of duty cycle variation in case of root (master) clock to generated clock paths and inter-generated clock paths.

Duty cycle degradation: Discusses the factors responsible for duty cycle degradation

How to fix min pulse width violation: Discusses a few techniques to tackle min-pulse-width violations.

Duty cycle care-abouts for clock paths in reset assertion - Discusses how reset assertion can alter the duty cycle of clock, and what needs to be taken care of.

Our purpose is to make this page a single destination for any questions related to clock jitter and duty cycle variation. Please feel free to ask any question related to clock jitter and duty cycle variations.

Interview questions related to reset design and reset timing

Below we list few of our posts related to reset timing and some design concepts related to reset. Happy learning.

Reset basics - Discusses the purpose and design strategies related to reset

Synchronous and asynchronous resets - Discusses the basics of synchronous reset and asynchronous reset. Also discusses few differences between them

Reset synchronizer - Discusses the definition, need and working of reset synchronizers.

Recovery and removal checks - Discusses the timing aspects of asynchronous reset. Provides the definition of recovery check, removal check, recovery time and removal time.

Asynchronous reset assertion timing scenarios - Discusses if there may arise a need to time the assertion of asynchronous resets

Duty cycle care-abouts for clock paths in reset assertion - Discusses how reset assertion can alter the duty cycle of clock, and what needs to be taken care of.

Our purpose is to make this page a single destination for any questions related to reset design and timing. If you have any source of educational information related to reset, please comment or send an email to myblogvlsiuniverse@gmail.com and we will add it here. Also, feel free to ask any question related to reset design and timing.

Does it make sense to check hold violations at synthesis stage

As we know from basics of STA, hold timing equation is generally of the form:

Hold_slack = Data_path_delay - hold_time_of_flop - clock_skew

Which can be re-organized as follows:

Hold_slack = Data_path_delay + launch_clock_delay - capture_clock_delay - hold_time_of_flop

Data_path_delay + launch_clock_delay may be combined to represent arrival of data at the capture flip-flop with respect to clock source. The same is evident from below figure.

The modified equation, then, becomes:

Hold_slack = Data_arrival_at_capture_flop - Clock_arrival_at_capture_flop - hold_time_of_flop

It is clear from above equation that hold checks generally comprise of a race condition between clock and data minus a fixed number (is hold_time_of_flop really a fixed number is a separate question).

Before clock-tree synthesis, we do not have one major aspect of this equation; i.e. we do not know the arrival times of clock at launch and capture flip-flops. Also, we do not know how much is the skew and uncommon clock path contributing to on-chip variations; thus, requiring extra margin for hold slack. Talking about logic synthesis, we do not even have placement data most of the times, so we do not even have correct data path delay estimates with us. So, fixing hold at synthesis will not help as there will anyways be requirement for hold fixes taking into account actual data and clock path delays.

But the most important reason for not fixing hold before clock-tree synthesis is that the task of hold fixing is not much complex for the tools available. It may be as simple as downsizing logic and/or adding buffers in data path where setup slack is available. So, during synthesis and logic placement, tools focus on getting setup targets met and focus on hold after clock tree is built.

Another point to note is that after data nets routing, data path delay may increase because of detouring of data nets as compared to originally estimated. So, it may be wise not to fix hold violations of very small magnitude even after clock-tree synthesis and wait for data nets routing to see how many of those violations are still there.

4:1 mux as universal gate

A universal gate is a gate which can implement any given logic function. NAND and NOR gates are basically known as universal gates, since you can implement any logic function with these. A multiplexer, in a sense, can also be termed as a universal gate, since, you can realize any function by using a mux as a look-up-table structure. In this post, we discuss how we can utilize a 4:1 mux as a universal gate realizing 2-input gates.

Any two-input gate gives a definite value (either 0 or 1) for all the combinations of its inputs and can be represented in the form of truth table as shown in table below.

Here, A,B,C & D can be either "0" or "1" depending upon the functionality of the gate. For instance, for a 2-input AND gate, A = B = C = 0 and D = 1.

Utilizing a 4-input mux for implementing this generic 2-input gate, we can implement as shown below:

For instance, 2-input AND gate will be implemented as following:

This post was written as a response to a query from one of our readers. You can also post your query at post your query.

Duty cycle care-abouts for clock paths in reset assertion

In the post Asynchronous reset assertion timing scenarios, we discussed how we may need to time the assertion of asynchronous reset as well. In this post, we will talk about how important it is to talk about the duty cycle aspect as well. We will use the same example as we did in our previous post to help make a better connection.

In the figure below (discussed in Asynchronous reset assertion timing scenarios), Q0 goes from 0 ->1 and 1 -> 0 in the same cycle, thereby providing a very diminished high pulse, or very diminished low pulse to BIT_1 flip-flop depending upon the delay in reset path of BIT_0 flip-flop. This can result in violation of minimum pulse width requirement of BIT_1. We will discuss this in some detail in this post.

Let us talk only about clock edge number 4. Q0 goes 1 and BIT_1 receives it as positive edge of clock, thereby changing state too. The delay till BIT_1 receiving positive edge of clock is given as

POS_CK_AT_BIT_1 = (Latency of BIT_0) + (CLK->Q of BIT_0) + (BIT_0/Q -> BIT_1/CK)

Now, both Q0 and Q1 go 1, thereby causing the output of NAND gate going "0", which asserts the reset of both BIT_0 and BIT_1. BIT_1 now receives negative edge of clock, whose delay is

NEG_CK_AT_BIT_1 = (Latency of BIT_0) + (CLK->Q of BIT_0) + NAND_DELAY + (R -> Q of BIT_0) + (BIT_0/Q -> BIT_1/CK)

The width of high pulse that the flip-flop receives is equal to the difference between above two values:

HIGH_PULSE_WIDTH_AT_BIT_1 = NAND_DELAY + (R -> Q of BIT_0)

And low pulse is equal to

LOW_PULSE_WIDTH_AT_BIT_1 = CLK_PERIOD - HIGH_PULSE_WIDTH_AT_BIT_1

Now, depending upon the combinational delays mentioned as well as CLK_PERIOD, BIT_1 may receive a pulse (either high or low) with width less than what is permissible for its proper functionality. Thus, there will be a pulse width violation.

Looking at the equations for high and low pulse widths, it seems more probable to have high pulse width violating for BIT_1, unless either clock period is very less or buffering is there in reset path of BIT_0. We will need to increase buffering to reset pin of BIT_0 to increase width of high pulse and vice-versa.

The discussion we just had is applicable to any design in particular with reset controlling a signal driving another flip-flop's CK pin. However, one may argue following about this particular circuit:

This particular circuit is resistant to high pulse width violation, but may have low pulse width violations.

Can you argue in favor/against this statement? What is the reason for one making this statement. (Hint: Answer lies in the sequence of events causing CK to BIT_1 to go high and then go low).

Also read:

Asynchronous reset assertion timing scenarios

We have always heard that for asynchronous resets, only de-assertion needs to be timed. This is true for most of the designs as the guidelines followed for implementation of asynchronous resets. However, there may be a scenario wherein we may need to consider reset assertion as well for our timing checks. In this post, we will try to put some light on it.

In our post "recovery and removal checks", we have elaborated following for asynchronous resets:

* Reset assertion "combinationally" causes the output to go 0; i.e., assertion of reset does not wait for edge of the clock to alter the state of flip-flop

* Reset de-assertion waits for clock edge to propagate the value at "input" to "output". There are checks corresponding to de-assertion of reset with respect to clock known as "recovery" and "removal" checks.

Thus, we see that "recovery" and "removal" checks are defined only for de-assertion of asynchronous reset. So, one must think that there is no timing requirement for reset assertion. This is true in the sense that there is no requirement for reset assertion at the flip-flop that is receiving reset. And overall no timing requirement for a carefully implemented design. But some-times, there may be a corner-case scenario requiring the combinational path throuhg "reset -> output" to be timed. We will discuss some cases to elaborate our statement.

CASE 1: The output of flip-flop goes to another flip-flop, which itself is getting reset at the same time.

Here, since, the output in fanout is itself in reset state, it will not be sampling the output. Hence, there is no need to time the assertion of reset.

CASE 2: The output of flip-flop goes to an asynchronous domain or to a synchronizer which is not, itself, in reset. Here, the asynchronous domain flip-flop is expected to be a synchronizer in most of the cases. So, there does not arise a need for meeting timing.

CASE 3: The output of flip-flop goes to a flip-flop which is working on a synchronous clock and is expecting synchronous data. In this case, we will have to meet timing through the R -> Q of the source flip-flop and getting captured at the destination. We need to keep in mind that reset synchronizer also transfers through R -> Q for reset assertion. So, this case cannot be valid for global asynchronous reset assertion. For instance, let us consider below as a valid scenario.

For below case, we have to meet reset assertion timing from

ASYNC_RESET_SOURCE -> REG_B/R -> REG_B/Q -> REG_C/R -> REG_C/Q -> REG_D/D

Since, reset source is working on asynchronous clock, this is a design violation to get it captured on a flip-flop which is running functionally and expecting synchronous data.

Thus, there may not be any scenario to time asynchronous reset assertion globally. However, the state machines may locally utilize asynchronous reset assertion to get things done. For instance, you must have gone through a common problem known as "conversion of asynchronous counter to decade counter", wherein asynchronous reset pin is utilized to reset the count whenever count reaches 10. We will simplify it as "conversion of 2-bit asynchronous counter to modulo-3 counter" to serve our purpose.

Consider following design of 2-bit asynchronous counter. We have shown two flip-flops to register the output of this counter for illustration purposes and to make the understanding more clear. Flip-flop "BIT_1" gets the output of "BIT_0" as clock. All other flip-flops get CLK as clock. Whenever output of both "BIT_1" and "BIT_0" goes 1, the reset of both flops will cause output to go "00" in the same cycle. This is expected to reach the registers by next clock cycle so that it can be captured properly. The intermediate state "11" is not captured at the next stage as understood by basics of state machines.

We get following timing equations to be met in a single cycle (minus setup time and skews etc.).

BIT_1/CK -> BIT_1/Q -> BIT_1/R -> BIT_1/Q -> REG_1/D

BIT_1/CK -> BIT_1/Q -> BIT_0/R -> BIT_0/Q -> REG_0/D

BIT_0/CK -> BIT_0/Q -> BIT_1/R -> BIT_1/Q -> REG_1/D

BIT_0/CK -> BIT_0/Q -> BIT_0/R -> BIT_0/Q -> REG_0/D

Thus, this is an example of a scenario wherein we need timing for assertion of reset as well. It is captured at the next flip-flops running combinationally through the flip-flops' "Q" pin. Can you deduce the timing paths to be timed for the original case of "asynchronous counter as a decade counter" as well.

Another possible care-about of asynchronous reset assertion may be degradation of duty cycle, if the output is used as a clock such as highlighted path, where Q0 is being consumed as clock for "BIT_1" flip-flop. This can cause the minimum pulse width requirement to be violated for "BIT_1" flip-flop. This perspective of reset assertion is discussed here.

Also read:

Interview questions related to reset design and reset timing

VLSI UNIVERSE