Showing posts with label STA interview questions. Show all posts

Intricacies in handling of half cycle timing paths

What is a half cycle path? A half cycle timing path is one in which launch and capture happen on different clock edges. A path can be half cycle in terms of setup, hold, or both. In practice, however, the term "half cycle path" usually refers to a path whose setup check is formed as a half cycle check. For instance, following are some examples of half cycle timing paths:


  1. A timing path from a positive edge-triggered flip-flop to a negative edge-triggered flip-flop and vice versa. Here, the hold check is also half cycle, on the previous edge
  2. A timing path from a positive level-sensitive latch to a negative level-sensitive latch and vice versa. Here, the hold check is zero cycle
  3. A timing path from a negative edge-triggered flip-flop forming a clock gating check on AND gate (Here, hold check is zero cycle)
  4. A timing path from a positive edge-triggered flip-flop forming a clock gating check on OR gate (here, hold check is zero cycle)
There are also some cases where the hold check is half cycle and the setup check is single/zero cycle. These are:
  1. A timing path from a negative edge-triggered flip-flop forming a clock gating check on OR gate (Here, setup check is single cycle check)
  2. A timing path from a positive edge-triggered flip-flop forming a clock gating check on AND gate (Here, setup check is single cycle check)
In addition, minimum pulse width checks should also be treated the same way as half cycle timing paths. In this case, however, the start-point and end-point are the same register.

In this post, we will consider only setup timing paths as examples, although the complete discussion applies to all kinds of half cycle paths/checks. To start with, let us note down the simplest setup check equation for half cycle timing paths.

Tck->q + Tprop + Tsetup  < (Tperiod/2) + Tskew

Let us now discuss some of the intricacies that we should be aware of while dealing with half cycle timing paths:

Clock source duty cycle variation: There is always a variation in the duty cycle of the clock source due to uncertainty in the relative timing of positive and negative edges. Duty cycle variation is always measured with respect to corresponding positive and negative edges. In other words, duty cycle variation is the uncertainty in the arrival of the negative edge, given that the positive edge has arrived at a certain fixed point of time. Let us take an example. Suppose we are given a clock with a period of 10 ns and an ideal 50% duty cycle, and the clock has a duty cycle variation of +-5%. So, if we saw the positive edge of the clock at 100 ns, we can expect to see the negative edge at any time between 104.5 ns and 105.5 ns. Following waveform illustrates this. You can read my earlier post duty cycle variation to have a more detailed elaboration.

So, the setup check equation modifies as:



Tck->q + Tprop + Tsetup  < (Tperiod/2 - Tsdc) + Tskew
where Tsdc is the clock source duty cycle variation. Thus, the effective half clock period reduces by an amount equal to duty cycle variation.
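As a rough illustration of the numbers above, the following Python sketch computes the window in which the negative edge can arrive and the resulting worst-case half period. The function names are hypothetical, and the sketch assumes that "+-5%" means the duty cycle can move by 5 percentage points of the period (45% to 55%), which matches the 10 ns example.

```python
def negative_edge_window(t_pos_edge, period, duty=0.5, duty_var=0.05):
    """Return (earliest, latest) arrival time of the negative edge.

    duty_var is the duty cycle variation expressed as a fraction of the
    period (assumption: 0.05 means the duty cycle lies in 45%..55%).
    """
    earliest = t_pos_edge + (duty - duty_var) * period
    latest = t_pos_edge + (duty + duty_var) * period
    return earliest, latest


def effective_half_period(period, duty_var=0.05):
    """Worst-case half period for a setup check: Tperiod/2 - Tsdc."""
    t_sdc = duty_var * period  # source duty cycle variation in time units
    return period / 2 - t_sdc


if __name__ == "__main__":
    # 10 ns clock, positive edge seen at 100 ns, +-5% duty cycle variation
    print(negative_edge_window(100.0, 10.0))  # roughly 104.5 ns to 105.5 ns
    print(effective_half_period(10.0))        # roughly 4.5 ns
```

For the 10 ns clock, Tsdc works out to about 0.5 ns, so the half cycle setup check effectively sees about 4.5 ns instead of 5 ns.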

Duty cycle degradation: In addition to source duty cycle variation, there can be asymmetry in the rise delay vs the fall delay of clock elements. For instance, a buffer may have a nominal rise (0 -> 1) delay of 50 ns but a fall (1 -> 0) delay of 48 ns. So, if a clock pulse passes through it, the buffer will eat a portion of this clock pulse as shown in figure 1 below. For more clarity, we have exaggerated the scenario with a fall delay of 30 ns.

So, a half cycle may be larger or smaller than the actual half cycle at the clock pin. In the above case, the positive -> negative edge setup check will be tighter by 20 ns and the negative -> positive setup check will be relaxed by the same amount (neglecting OCV for now). So, the modified setup equation now becomes:
Tck->q + Tprop + Tsetup  < (Tperiod/2 - Tsdc) + (Tskew - Tdcd)
As discussed above, Tdcd can be positive or negative depending upon whether the rise-fall variation of the cells is helping or opposing.
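The modified equation can be turned into a small slack calculation. The Python sketch below is only illustrative: the function names are hypothetical, slack is taken as the right-hand side minus the left-hand side of the equation, and Tdcd for a positive -> negative check through a non-inverting cell is assumed to be simply (rise delay - fall delay). All numeric values other than the 50 ns / 30 ns buffer delays are made up for the example.

```python
def duty_cycle_degradation(rise_delay, fall_delay):
    """Shrink of the high pulse through a non-inverting cell.

    The rising edge is delayed by rise_delay and the falling edge by
    fall_delay, so the high pulse shrinks by (rise_delay - fall_delay).
    A negative result means the high pulse grows instead.
    """
    return rise_delay - fall_delay


def half_cycle_setup_slack(t_ck2q, t_prop, t_setup, t_period,
                           t_skew=0.0, t_sdc=0.0, t_dcd=0.0):
    """Slack of: Tck->q + Tprop + Tsetup < (Tperiod/2 - Tsdc) + (Tskew - Tdcd)."""
    required = (t_period / 2 - t_sdc) + (t_skew - t_dcd)
    arrival = t_ck2q + t_prop + t_setup
    return required - arrival


if __name__ == "__main__":
    t_dcd = duty_cycle_degradation(rise_delay=50, fall_delay=30)  # 20
    # Hypothetical path on a 200 ns clock (all times in ns)
    slack = half_cycle_setup_slack(t_ck2q=5, t_prop=50, t_setup=5,
                                   t_period=200, t_sdc=10, t_dcd=t_dcd)
    print(t_dcd, slack)
```

A positive slack means the check is met. Passing a negative t_dcd models the opposite (negative -> positive) check, which is relaxed by the same amount.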

Can you think of some other scenario that is specific only to half cycle timing paths? Do share, if you do.

What is the difference between a normal buffer and clock buffer?

A buffer is an element which produces an output signal of the same value as its input signal. We can also refer to a buffer as a repeater, which repeats the signal it is receiving, just as there are repeaters in telephone signal transmission lines. You must have noticed that we have two kinds of buffers (or any logic gate) available in standard cell libraries:

  • Clock buffer: Clock buffers are designed to have specific properties that are good for clock distribution networks (clock trees). The properties required in an ideal clock tree buffer are given below. However, it is not possible to attain these ideal properties for every buffer at every technology node. It may only be possible to get close to them.
    • Equal rise and fall times
    • Low delay
    • Low delay variation with PVT and OCV
  • Normal buffer/data buffer: For a data buffer, the above properties are usually less desired
Usually, we can say that following differences may exist between a clock buffer and a normal buffer:
  • In SoCs, clock routing is done in higher metal layers as compared to signal routing. So, to provide easier access to clock pins from these layers, clock buffers may have pins in higher metal layers. That is, vias are provided in the standard cell itself instead of having to be added in the clock distribution network. For a data buffer, the pins are expected to be in lower layers only.
  • Clock buffers are balanced. In other words, rise and fall times of clock buffers are nearly equal. The reason behind this is that if the clock buffers are not balanced, there will be duty cycle distortion in the clock tree, which can lead to pulse width violations as discussed in minimum pulse width violation example. On the other hand, data buffers can compromise on either of the rise/fall times. In other words, they don't need to have a PMOS/NMOS size ratio of 2:1, and hence can be smaller than clock buffers.
  • Due to above reason, clock buffers consume more power as compared to normal buffers.
  • Generally, you will find clock buffers with higher drive strength as compared to normal buffers, so that a clock buffer can drive long nets and higher fanouts. This helps clock buffers, and hence clock trees, to have less overall delay.

What is meant by drive strength of a standard cell

As we know, cell delay is a function of output load capacitance. The simplest equivalent circuit of a logic gate driving an output can be assumed to be as given in figure 1:


The purpose of a logic gate is to propagate the effect of the logic values available at its inputs to its output. Depending on whether '0' or '1' is to be propagated, this is achieved by discharging or charging the output load capacitance: propagating a logic '0' means discharging the load capacitance, and propagating a logic '1' means charging it. The drive strength of a logic gate is its relative capability to charge/discharge the capacitance present at its output. Now, the time constant, and hence the delay, of the circuit is "RC".
So, for a cell with higher drive strength, the corresponding "R" is lower than for a cell with lower drive strength. Hence, for the same load capacitance "C", delay is lower for a cell with higher drive strength, as it can charge the capacitance in less time.

How drive strength varies with size of a cell: Let us talk in terms of MOSFETs, although this is valid in terms of every device in general. We know that for a given technology standard cell library, length of all transistors is kept constant. For instance, 90 nm technology will have gate length of all transistors as ~90 nm. And channel resistance of the MOSFET is inversely proportional to "W/L" of the transistor. So, a simple way to decrease channel resistance is to increase "W" of the transistor. So, a transistor with more area will have lesser resistance. Or we can say that a logic gate with bigger transistors will have more drive strength.

What is unit drive strength: In a standard cell library, we generally see cells labelled as "1X", "2X" and so on. But what is meant by the number that you see with drive strength? In general, the smallest logic gate is labelled as unit drive strength. The drive strength numbers of other cells are labelled relative to the unit drive strength cell.
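The relationship between drive strength and delay can be sketched with the simple "RC" model used above. This is only a first-order illustration: the function name is hypothetical, the resistance and capacitance values are made up, and the sketch assumes that an "NX" cell has exactly 1/N of the unit cell's output resistance while ignoring the cell's own parasitic capacitance.

```python
def rc_delay(r_unit_ohm, c_load_farad, drive_strength=1):
    """First-order delay (time constant) of a cell driving a load.

    Assumption: an NX cell has N times wider transistors than the 1X cell,
    so its effective output resistance is r_unit_ohm / N.
    """
    r_effective = r_unit_ohm / drive_strength
    return r_effective * c_load_farad


if __name__ == "__main__":
    r_unit = 10e3    # hypothetical 1X output resistance: 10 kilo-ohm
    c_load = 10e-15  # hypothetical load: 10 fF
    for strength in (1, 2, 4):
        delay_ps = rc_delay(r_unit, c_load, strength) * 1e12
        print(f"{strength}X: {delay_ps:.1f} ps")
```

Under these assumptions, doubling the drive strength halves the delay for the same load. Real library cells deviate from this because a larger cell also adds more parasitic capacitance and presents a larger load to the cell driving it.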

Read next: How delay of a cell changes with drive strength


How to fix min pulse width violation

In our previous posts, we discussed duty cycle, duty cycle variation and duty cycle degradation. A bad duty cycle impacts half cycle timing paths and makes it harder to meet the minimum pulse width checks of flip-flops. However, there are certain techniques that can help you improve the duty cycle of the clock. We will discuss these techniques in this post:

1. Dual inversion in clock branch: A certain category of logic cells is more likely to have one of the rise or fall delays greater than the other. A chain of such cells will make either the high pulse or the low pulse of the clock shorter. One can use an inverter in the middle of the chain, as shown in the figure below, to tackle this. What we are essentially doing is converting rising edges to falling edges and vice versa. So, the shortening of the pulse by the first few elements is balanced by the rest of the elements. In the figure below, there are 20 buffers, each shortening the pulse by 10 ps. The output of the 10th buffer will have a shorter pulse as compared to the clock source. The inverter at the output of the 10th buffer will feed an inverted clock to the 11th buffer. This clock will have a high pulse which is longer than its low pulse. The rest of the chain will reduce this pulse. In the end, we get a pulse which is equal to what was available at the source.
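The 20-buffer example can be checked with a small model. The Python sketch below is an idealized illustration with hypothetical names: each buffer is assumed to shorten the high pulse at its input by a fixed 10 ps, and the inverter is assumed to be perfectly balanced, so it only swaps the high and low portions of the clock. The 1000 ps period is an arbitrary choice.

```python
def high_pulse_width(high_ps, period_ps, chain, shrink_ps=10):
    """Propagate the high pulse width (in ps) through a chain of cells.

    chain is a list of "buf" / "inv" entries.
    - "buf": shortens the high pulse seen at its input by shrink_ps
    - "inv": ideal inverter, the old low pulse becomes the new high pulse
    """
    for cell in chain:
        if cell == "buf":
            high_ps -= shrink_ps
        elif cell == "inv":
            high_ps = period_ps - high_ps
        else:
            raise ValueError(f"unknown cell type: {cell}")
    return high_ps


if __name__ == "__main__":
    period, high = 1000, 500  # hypothetical 1 GHz clock, 50% duty cycle
    buffers_only = ["buf"] * 20
    with_inverter = ["buf"] * 10 + ["inv"] + ["buf"] * 10
    print(high_pulse_width(high, period, buffers_only))   # 300
    print(high_pulse_width(high, period, with_inverter))  # 500
```

Without the inverter, the high pulse loses 200 ps over the chain. With the inverter after the 10th buffer, the second half of the chain undoes the distortion of the first half, at the cost of the clock being inverted at the output.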


One can also try an all-inverter clock tree. In an all-inverter clock tree, every element will change the sense of clock pulse; thereby minimizing the clock pulse distortion.

However, this kind of delay balancing will only work where there is an inherent difference between rise and fall delays. It will not work for OCV-induced variations. So, if the chain length is arbitrarily large, our second method will come to the rescue.

2. Even division to tackle duty cycle degradation: Suppose there is a source clock with a very poor duty cycle (say 10%) and you divide the clock by 2 with a flip-flop divider. The resulting clock has almost 50% duty cycle, because the divider output toggles only on one kind of edge of the source clock, so both its high and low phases last one full source period. So, whenever we need an output clock with a near-perfect duty cycle, we can use a divider to divide down the clock. The only drawback of this method is that we need a source clock of twice the required frequency.
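A simple behavioural model shows why the divided clock comes out close to 50%. The sketch below uses hypothetical function names and models an ideal positive edge-triggered toggle flip-flop with zero delay, so it ignores clock-to-q rise/fall asymmetry in the flop itself.

```python
def clock_edges(period, duty, cycles):
    """Return a list of (time, kind) edges for `cycles` full clock cycles."""
    edges = []
    for n in range(cycles):
        edges.append((n * period, "rise"))
        edges.append((n * period + duty * period, "fall"))
    edges.append((cycles * period, "rise"))  # rising edge closing the last cycle
    return edges


def divided_by_two_duty_cycle(edges):
    """Duty cycle of an ideal divide-by-2 flop clocked by `edges`.

    The flop output toggles on every rising edge and ignores falling edges.
    Use an even number of input cycles so whole output periods are measured.
    """
    q = 0
    high_time = 0.0
    first_toggle = last_toggle = None
    for time, kind in edges:
        if kind != "rise":
            continue  # falling edges do not affect a posedge flop
        if last_toggle is not None and q == 1:
            high_time += time - last_toggle
        if first_toggle is None:
            first_toggle = time
        q ^= 1
        last_toggle = time
    return high_time / (last_toggle - first_toggle)


if __name__ == "__main__":
    edges = clock_edges(period=10.0, duty=0.10, cycles=8)  # 10% duty source
    print(divided_by_two_duty_cycle(edges))
```

The input duty cycle only moves the falling edges, which the flop never looks at, so the output duty cycle depends only on how regular the rising edges are.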


There are a few things to be kept in mind for this method:

  • This method will improve the duty cycle of the clock at the output of the flop. Any degradation in duty cycle happening after the divider will still be present.
  • The duty cycle of the input clock at the flip-flop must still satisfy the minimum pulse width requirement of the flip-flop.

Clock jitter

Clock jitter: By definition, clock jitter is the deviation of a clock edge from its ideal position in time. Simply speaking, it is the inability of a clock source to produce a clock with clean edges. As the clock edge can arrive within a range, the difference between two successive clock edges will determine the instantaneous period for that cycle. So, clock jitter is of importance while talking about timing analysis. There are many causes of jitter including PLL loop noise, power supply ripples, thermal noise, crosstalk between signals etc. Let us elaborate the concept of clock jitter with the help of an example:

A clock source (say PLL) is supposed to provide a clock of frequency 10 MHz, amounting to a clock period of 100 ns. If it was an ideal clock source, the successive rising edges would come at 0 ns, 100 ns, 200 ns, 300 ns and so on. However, since, the PLL is a non-ideal clock source, it will have some uncertainty in producing edges. It may produce edges at 0 ns, 99.9 ns, 201 ns etc. Or we can say that the clock edge may come at any time between (<ideal_time>+- jitter); i.e. 0, between 99-101 ns, between 199-201 ns etc (1 ns is jitter). However, counting over a number of cycles, average period will come out to be ~100 ns.

Figure 1 below shows the generic diagram for clock jitter:



Please note that the uncertainty in clock edge can be for both positive as well as negative edges (above example showed only for positive edges). So, there are both full cycle and half cycle jitters. By convention, clock jitter implies full cycle clock jitter.


Types of clock jitter: Clock jitter can be measured in many forms depending upon the type of application. Clock jitter can be categorized into cycle-to-cycle, period jitter and long term jitter.
  • Cycle to cycle jitter: By definition, cycle-to-cycle jitter signifies the change in clock period across two consecutive cycles. For instance, it will be the difference in periods for the 1st and 2nd cycles, the difference in periods for the 10th and 11th cycles etc. It has nothing to do with frequency variation over time. For instance, in figure below, the clock has drifted in frequency (from period = 10 ns to period = 1 ns), still maintaining a cycle-to-cycle jitter of 0.1 ns. In other words, if t2 and t1 are successive clock periods, then cycle_to_cycle_jitter = (t2 - t1).

  • Period jitter: It is defined as the "deviation of any clock period with respect to its mean period". In other words, it is the difference between the ideal clock period and the actual clock period. Period jitter can be specified as either RMS period jitter or peak-to-peak period jitter.
    • Peak-to-peak period jitter: It is defined as the jitter value measuring the difference between two consecutive edges of clock. For instance, if the ideal period of the clock was 20 ns, then for clock shown above,
      • for first cycle, peak-to-peak period jitter = (20 - 20) = 0 ns
      • for second cycle, peak-to-peak period jitter = (20 - 19.9) = 0.1 ns
      • for last cycle, peak-to-peak period jitter = (20 - 1) = 19 ns
    • RMS period jitter: RMS period jitter is simply the root-mean-square of all the peak-to-peak period jitters available.

  • Long term jitter: Long term jitter is the deviation of the clock edge from its ideal position. For instance, for a clock with period 20 ns, ideally, clock edges should arrive at 20 ns, 40 ns and so on. So, if 10th edge comes at 201 ns, we will say that the long term jitter for 10th edge is 1 ns. Similarly, 1000th edge will have a long term jitter of 0.5 ns if it arrives at 20000.5 ns.
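The three definitions above can be computed directly from a list of rising-edge timestamps. The Python sketch below uses hypothetical function names and follows the conventions of this post: cycle-to-cycle jitter is (t2 - t1) for successive periods, the per-cycle period jitter is (ideal period - actual period), and long term jitter is (actual edge time - ideal edge time). It assumes the first edge in the list is edge number 0 and arrives at its ideal time of 0.

```python
import math


def periods(edges):
    """Instantaneous periods between successive edges."""
    return [b - a for a, b in zip(edges, edges[1:])]


def cycle_to_cycle_jitter(edges):
    """Difference between each pair of consecutive periods (t2 - t1)."""
    p = periods(edges)
    return [t2 - t1 for t1, t2 in zip(p, p[1:])]


def period_jitter(edges, ideal_period):
    """Per-cycle deviation from the ideal period (ideal - actual)."""
    return [ideal_period - p for p in periods(edges)]


def rms_period_jitter(edges, ideal_period):
    """Root-mean-square of the per-cycle period jitter values."""
    values = period_jitter(edges, ideal_period)
    return math.sqrt(sum(v * v for v in values) / len(values))


def long_term_jitter(edges, ideal_period):
    """Deviation of the n-th edge from its ideal position n * ideal_period."""
    return [t - n * ideal_period for n, t in enumerate(edges)]


if __name__ == "__main__":
    # Hypothetical edge times (ns) for a clock with an ideal period of 20 ns
    edges = [0.0, 20.0, 39.9, 59.7]
    print(cycle_to_cycle_jitter(edges))
    print(period_jitter(edges, 20.0))
    print(rms_period_jitter(edges, 20.0))
    print(long_term_jitter(edges, 20.0))
```

In this made-up example each period is 0.1 ns shorter than the previous one, so the cycle-to-cycle jitter stays at a constant magnitude while the period jitter and long term jitter keep growing.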

Let us try to understand the difference between all the three kinds of jitter with the help of an illustrative example waveform below:


Reference:
* Understanding SYSCLK jitter


Clock skew


Clock skew is one of the most important parameters of a good physical design implementation. Keeping the clock skew to a minimum is considered to be a good measure of clock tree synthesis. 

Definition of clock skew: Clock skew between two flip-flops represents the difference in arrival times of clock signal at the respective clock pins. If there is a timing path being formed between the two flip-flops, then we can attribute a sign to the clock skew. In that case, clock skew is given as:
Clock skew = (Arrival time at capture clock pin) - (Arrival time at launch clock pin)
Thus, based upon the sign of clock skew, we get two types of clock skew labelled as positive skew and negative skew.

Positive clock skew: If the clock arrival time at the capture flip-flop is greater than that at the launch flip-flop, clock skew is said to be positive. Assuming all buffers have the same delay, figure 1 shows a scenario of positive clock skew.


As shown in figure 1 above for the case of positive clock skew, flip-flop capturing data is getting delayed clock signal. So, the data that is launched gets additional time before it is captured at the next edge. So, setup check gets relaxed by the amount equivalent to clock skew. On the other hand, for hold check, the data has to be kept stable for an extra amount of time equal to the clock skew. So, hold check gets tightened in case clock skew is positive. The same is shown in figure 2 below.




Negative clock skew: Contrary to positive clock skew, if the clock arrival time at the capture flip-flop is less than that at the launch flip-flop, clock skew is said to be negative. Figure 3 shows a scenario of negative clock skew, with the launch flip-flop getting a delayed version of the clock signal.



Since the launching flip-flop is getting a delayed version of the clock, the data launched gets less than one clock period to travel to the capturing flip-flop. So, negative clock skew makes the setup check tighter by the magnitude of the clock skew. On the other hand, for the hold check, data has to be stable for less time after the arrival of the clock edge. In other words, the hold check gets relaxed by the same amount. Figure 4 below shows the scenario of negative clock skew.
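The effect of the sign of clock skew on both checks can be seen with a small calculation. The Python sketch below uses hypothetical function names and made-up delay values, and assumes the usual single-cycle setup check and zero-cycle hold check between two flip-flops on the same clock.

```python
def clock_skew(capture_arrival, launch_arrival):
    """Clock skew = arrival at capture clock pin - arrival at launch clock pin."""
    return capture_arrival - launch_arrival


def setup_slack(t_ck2q, t_prop, t_setup, t_period, skew):
    """Single-cycle setup slack; positive skew adds to the available time."""
    return (t_period + skew) - (t_ck2q + t_prop + t_setup)


def hold_slack(t_ck2q, t_prop, t_hold, skew):
    """Zero-cycle hold slack; positive skew adds to the required stable time."""
    return (t_ck2q + t_prop) - (t_hold + skew)


if __name__ == "__main__":
    # Hypothetical path, all times in ps
    path = dict(t_ck2q=100, t_prop=700)
    for capture, launch in ((300, 200), (200, 300)):
        skew = clock_skew(capture, launch)
        s = setup_slack(t_setup=50, t_period=1000, skew=skew, **path)
        h = hold_slack(t_hold=30, skew=skew, **path)
        print(f"skew={skew:+d} ps  setup slack={s} ps  hold slack={h} ps")
```

With a skew of +100 ps the setup slack improves by 100 ps and the hold slack degrades by 100 ps compared to zero skew. With -100 ps the opposite happens.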



STA

Static timing analysis (STA) is a vast domain involving many sub-fields. It involves computing the limits of delay of elements in the circuit without actually simulating it. In this post, we have tried to list down all the posts that an STA engineer cannot do without. Please add your feedback in comments to make reading it a more meaningful experience.

  • Metastability - This post discusses the basics of metastability and how to avoid it.
  • Lockup latch - The basics of lockup latch, both from timing and DFT perspective have been discussed in this post.

  • Clock latency - Read this if you wish to get acquainted with the terminology related to clock latency

  • Data checks - Non-sequential setup and hold checks have been discussed, very useful for beginners

  • Synchronizers - Different types of synchronizers have been discussed in detail

  • On-chip variations - Describes on-chip variations and the methods undertaken to deal with these
  • Temperature inversion - Discusses the concept of temperature inversion and conductivity trends with temperature

  • Timing arcs - Discusses the basics of timing arcs, positive and negative unateness, cell arcs and net arcs etc.

  • Basics of latch timing - Definition of latch, setup time and hold timing of a latch, latch timing arcs are discussed