Clock gating cell
Clock gating is a very common technique to save power by stopping the clock to a module when the module is not operating. As discussed in Clock switching and clock gating checks, there are two kinds of clock gating checks at combinational gates: AND type and OR type. We also discussed that for an AND type check, the enable must launch from a negative edge-triggered flip-flop, and for an OR type check, the enable must launch from a positive edge-triggered flip-flop. However, it is very difficult to constrain a generic state machine so that all the signals gating a clock launch either from the positive edge or from the negative edge.
Evolution of integrated clock gating cell: To remove the need for the state machine to use only one kind of launch register, an AND type clock gate can always be preceded by a negative level-sensitive latch, and an OR type clock gate can be preceded by a positive level-sensitive latch. This has the same effect as a lockup latch in a scan chain and eases hold timing. It results in a zero cycle hold check from both positive and negative edge-triggered registers, without introducing any additional latency. Since each clock gate has to be preceded by a latch, why not build a special cell with an AND gate + a negative level-sensitive latch (or an OR gate + a positive level-sensitive latch)? This concept served as the motivation for the Integrated Clock Gating Cell. A dedicated cell gives better area, power and timing for the resulting structure.
Test enable pin in integrated clock gating cell: During shift in scan testing, all the clock control signals have to be bypassed so that shifting can happen. This can be achieved by providing a bypass signal called “test enable” that is ORed with the functional enable signal (shown in figure 1 below). As soon as the design goes into shift mode, the test enable signal goes high, thereby bypassing all functional enable signals. So, it makes sense to embed this OR gate into the integrated clock gating cell itself.
Structure of integrated clock gating cell: Figure 1 below shows the structure of the two kinds of integrated clock gating cells. The one on the left has an AND gate preceded by a negative level-sensitive latch. Its enable and test_enable are active high, and clock_out has an inactive low state. The one on the right is complementary: it has an OR gate preceded by a positive level-sensitive latch. Both enable and test_enable are again active high, and the output clock has an inactive high state. In case enable and test_enable are active low, the NOR gate should be replaced by an AND gate.
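The behaviour of the two cells can be sketched in a few lines of Python. This is only an illustrative sample-by-sample model, not a library cell description: the function names, the list-of-samples representation and the initial latch values are assumptions made for the example, and the OR type cell is assumed to latch the NOR of enable and test_enable.

```python
def and_type_icg(clk, enable, test_enable):
    """AND type ICG: negative level-sensitive latch followed by an AND gate.

    Each argument is a list of 0/1 samples taken at the same time steps.
    """
    latch_q = 0  # assumed initial state: clock gated
    clock_out = []
    for c, en, te in zip(clk, enable, test_enable):
        if c == 0:
            # The latch is transparent only while the clock is low.
            latch_q = en | te
        clock_out.append(c & latch_q)  # inactive state is low
    return clock_out


def or_type_icg(clk, enable, test_enable):
    """OR type ICG: positive level-sensitive latch followed by an OR gate."""
    latch_q = 1  # assumed initial state: clock gated (latched NOR is 1)
    clock_out = []
    for c, en, te in zip(clk, enable, test_enable):
        if c == 1:
            # The latch is transparent only while the clock is high.
            latch_q = 1 - (en | te)  # NOR of the two enables
        clock_out.append(c | latch_q)  # inactive state is high
    return clock_out
```

In this model, an enable that changes while the latch is opaque (clock high for the AND type, clock low for the OR type) only takes effect at the next transparent phase, so the output clock is never chopped mid-pulse.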
Minimum pulse width
All sequential elements need a clock pulse of some minimum width (either high or low) to ensure that data is captured correctly. In other words, the clock pulse fed to a flop or latch (or any other sequential element) must be wide enough that it does not interfere with the correct functionality of the element. Correct functionality here means the internal operations of the cell.
Minimum pulse width requirement: To understand minimum pulse width requirement, let us first define pulse width. Formally, pulse width can be defined as:
"If talking in terms of high signal level (high minimum pulse width), it is the time interval between clock signal crossing half the VDD level during rising edge of clock signal and clock signal crossing half the VDD level during falling edge of clock signal. If talking in terms of low signal level (low minimum pulse width), it is the time interval between clock signal crossing half the VDD level during falling edge of the clock signal and clock signal crossing half the VDD level during rising edge of the clock signal."
If the clock fed to a sequential element has a pulse width smaller than the minimum required, any of the following outcomes is possible:
- The flop can capture the correct data, and the FSM will function correctly
- The flop can completely miss the clock pulse and not capture any new data. The FSM will then go into an invalid state
- The flop can go meta-stable
Any of these scenarios can happen, so it is required to ensure that every sequential element always gets a clock pulse wider than the minimum pulse width required. To ensure this, there are ways to communicate the minimum pulse width requirement of each sequential element to the timing analysis tool. The check that ensures minimum pulse width is known as the "minimum pulse width check". The minimum pulse width requirement can be specified in the following ways:
- Through liberty file: By default, every register in a design should have its minimum pulse width defined in the liberty file, since this is the format that conveys standard cell requirements to the STA tool. By convention, minimum pulse width should be defined for clock and reset pins. In the liberty file it is typically specified at pin level, for example through the min_pulse_width_high and min_pulse_width_low attributes.
- Through SDC command: We can also define the minimum pulse width requirement through the SDC command "set_min_pulse_width". For example, the following set of commands constrains the minimum pulse width of clock clk to be 5 ns high and 4 ns low:
set_min_pulse_width -high 5 [get_clocks clk]
set_min_pulse_width -low 4 [get_clocks clk]
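To make the pulse width definition above concrete, here is a small Python sketch that measures high and low pulse widths from a sampled waveform using the half-VDD crossings. The function names and the piecewise-linear waveform representation are assumptions for illustration; an STA tool works from delay models, not from sampled voltages.

```python
def crossing_times(times, volts, threshold):
    """Return (time, direction) for every crossing of the threshold.

    The waveform is treated as piecewise linear between samples.
    """
    crossings = []
    samples = list(zip(times, volts))
    for (t0, v0), (t1, v1) in zip(samples, samples[1:]):
        if (v0 < threshold) != (v1 < threshold):
            # Linear interpolation to locate the crossing instant.
            t = t0 + (threshold - v0) * (t1 - t0) / (v1 - v0)
            crossings.append((t, "rise" if v1 > v0 else "fall"))
    return crossings


def pulse_widths(times, volts, vdd):
    """Return (high_widths, low_widths) measured at half the VDD level."""
    high, low = [], []
    crossings = crossing_times(times, volts, vdd / 2)
    for (ta, da), (tb, db) in zip(crossings, crossings[1:]):
        if da == "rise" and db == "fall":
            high.append(tb - ta)  # rising edge to next falling edge
        elif da == "fall" and db == "rise":
            low.append(tb - ta)  # falling edge to next rising edge
    return high, low
```

Each measured width can then be compared with the requirement, for example 5 ns for the high pulse and 4 ns for the low pulse as in the SDC example.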
Measure time using candle!!
Problem statement: You have two candles that can be burnt from both sides. Each candle takes exactly one hour to get burnt completely. You have to measure 45 minutes with the help of these candles. How will you do it?
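Solution: Here, we cannot assume that the candle burns uniformly. So, we cannot assume that half the candle will burn in half an hour, and so on. But a candle lit from both sides has two flames consuming it at once, so it burns out in exactly half an hour however unevenly it burns. Likewise, if we let a candle burn from one side for half an hour, we can be sure that the rest will take another half an hour to burn from one side.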
The solution of this puzzle follows:
First, light one of the candles from both sides. Simultaneously, light the other candle from one side only. Now, the first candle will get exhausted in half an hour only as discussed above. After half hour, the other candle has another half an hour left in it. As soon as first candle gets burnt up completely, light the other end of the second candle. Now, the second candle will get exhausted after another 15 minutes as it will burn at twice the rate.
This is how we can measure 45 minutes using two candles that can be burnt from both the sides.
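The reasoning can be checked with a small Python simulation. As an assumption made purely for illustration, a candle is modelled as a list of pieces, each holding the number of minutes a single flame needs to burn it, so a non-uniform candle is simply a list of unequal numbers adding up to 60.

```python
def burn_one_end(pieces, minutes):
    """Burn a candle from its first piece for the given minutes; return what is left."""
    remaining = list(pieces)
    for i in range(len(remaining)):
        burnt = min(remaining[i], minutes)
        remaining[i] -= burnt
        minutes -= burnt
    return [r for r in remaining if r > 0]


def burn_time_both_ends(pieces):
    """Minutes until a candle lit at both ends is completely burnt."""
    remaining = list(pieces)
    left, right = 0, len(remaining) - 1
    elapsed = 0.0
    while left < right:
        # Advance time until one of the two flames finishes its piece.
        step = min(remaining[left], remaining[right])
        elapsed += step
        remaining[left] -= step
        remaining[right] -= step
        if remaining[left] == 0:
            left += 1
        if remaining[right] == 0:
            right -= 1
    if left == right:
        # Both flames eat the last piece together, so it lasts half as long.
        elapsed += remaining[left] / 2
    return elapsed


candle_1 = [10, 5, 25, 20]  # non-uniform, 60 minutes in total
candle_2 = [5, 20, 10, 25]  # non-uniform, 60 minutes in total

first_phase = burn_time_both_ends(candle_1)
second_phase = burn_time_both_ends(burn_one_end(candle_2, first_phase))
total_minutes = first_phase + second_phase
```

With these example candles, the first phase lasts 30 minutes and the second lasts 15 minutes, giving 45 minutes in total.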
Setup checks and hold checks for latch-to-flop timing paths
There can be 4 cases of latch-to-flop timing paths as discussed below:
1. Positive level-sensitive latch to positive edge-triggered register: Figure 1 below shows a timing path being launched from a positive level-sensitive latch and being captured at a positive edge-triggered register. In this case, setup check will be full cycle with zero-cycle hold check. Time borrowed by previous stage will be subtracted from the present stage.
Figure 1: Positive level-sensitive latch to positive edge-triggered register timing path
Timing waveforms corresponding to setup check and hold check for a timing path from positive level-sensitive latch to positive edge-triggered register are shown in figure 2 below.
Figure 2: Setup and hold check waveform for positive latch to positive register timing path
2. Positive level-sensitive latch to negative edge-triggered register: Figure 3 below shows a timing path from a positive level-sensitive latch to negative edge-triggered register. In this case, setup check will be half cycle with half cycle hold check. Time borrowed by previous stage will be subtracted from the present stage.
Timing waveforms corresponding to setup check and hold check for a timing path starting from positive level-sensitive latch and ending at negative edge-triggered register are shown in figure 4 below:
Figure 4: Setup and hold check waveform for timing path from positive latch to negative register
3. Negative level-sensitive latch to positive edge-triggered register: Figure 5 below shows a timing path from a negative level-sensitive latch to positive edge-triggered register. Setup check, in this case, as in case 2, is half cycle with half cycle hold check. Time borrowed by previous stage will be subtracted from the present stage.
Timing waveforms for path from negative level-sensitive latch to positive edge-triggered flop are shown in figure 6 below:
Figure 6: Waveform for setup check and hold check corresponding to timing path from negative latch to positive flop
4. Negative level-sensitive latch to negative edge-triggered register: Figure 7 below shows a timing path from a negative level-sensitive latch to a negative edge-triggered register. In this case, setup check will be full cycle with zero cycle hold check. Time borrowed by previous stage will be subtracted from the present stage.
Figure 8 below shows the setup check and hold check waveform from negative level-sensitive latch to negative edge-triggered flop.
Figure 8: Timing waveform for path from negative latch to negative flip-flop
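The four cases above can be summarised in a short Python sketch. The function names and the string arguments are assumptions for illustration, and an ideal clock with 50% duty cycle is assumed so that half a cycle is exactly half the period.

```python
def latch_to_flop_checks(latch_level, flop_edge):
    """Return (setup_cycles, hold_cycles) for a latch-to-flop timing path.

    latch_level and flop_edge are each "positive" or "negative".
    """
    valid = ("positive", "negative")
    if latch_level not in valid or flop_edge not in valid:
        raise ValueError("polarity must be 'positive' or 'negative'")
    if latch_level == flop_edge:
        # Cases 1 and 4: full cycle setup check, zero cycle hold check.
        return (1.0, 0.0)
    # Cases 2 and 3: half cycle setup check, half cycle hold check.
    return (0.5, 0.5)


def setup_time_available(period, latch_level, flop_edge, borrowed=0.0):
    """Time left for this stage after subtracting time borrowed by the previous stage."""
    setup_cycles, _ = latch_to_flop_checks(latch_level, flop_edge)
    return setup_cycles * period - borrowed
```

For example, with a 10 ns clock, a positive latch to negative flop path whose previous stage borrowed 1.5 ns has 3.5 ns available for setup in this simplified model, before accounting for setup time, skew and uncertainty.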
16x1 mux using 4x1 muxes
Implementing 16:1 multiplexer with 4:1 multiplexers: A 16x1 mux can be implemented using five 4x1 muxes. Four of these multiplexers form the first stage, each muxing 4 inputs using the two least significant select lines (S0 and S1). This results in 4 intermediate outputs, which are then muxed again by a fifth 4:1 mux using the two most significant select lines (S2 and S3). The implementation of 16x1 mux using 4x1 muxes is shown below in figure 1:
Figure 1: Implementing 16:1 mux with the help of 4:1 multiplexers
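The two-stage structure can also be written as a short functional model in Python. The function names and the argument order are assumptions for illustration; the select line names S0 to S3 follow the text, with S0 as the least significant bit.

```python
def mux4(inputs, s1, s0):
    """4:1 mux: pick one of four inputs using select bits s1 (MSB) and s0 (LSB)."""
    return inputs[(s1 << 1) | s0]


def mux16(inputs, s3, s2, s1, s0):
    """16:1 mux built from five 4:1 muxes."""
    # First stage: four muxes, each choosing among 4 inputs with S1 and S0.
    stage1 = [mux4(inputs[4 * i:4 * i + 4], s1, s0) for i in range(4)]
    # Second stage: the fifth mux chooses among the 4 intermediate outputs.
    return mux4(stage1, s3, s2)
```

Calling mux16 with select bits that encode the number k returns inputs[k], which is the behaviour expected of a flat 16:1 mux.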
The above approach assumes that all the inputs are of same priority as regards timing. Can you think of a solution which involves timing and prioritizes some of the inputs? A hint for you is that the solution will require 5 select lines instead of four.
Why is body connected to ground for all nmos and not to VDD
To prevent latch-up in CMOS, the body-source and body-drain diodes should not be forward biased; i.e., the body terminal should be at the same or lower voltage than the source terminal (for an NMOS; for a PMOS, it should be at the same or higher voltage than the source). This condition will be satisfied if we connect all the NMOS bodies to their respective sources. But we see that all the body terminals are connected to a common ground.
This is because all the NMOS transistors share a common substrate, and a substrate can be biased to only one voltage. Although this introduces body effect, which makes the transistors slower and makes them deviate from the ideal MOS current equation, there is no practical alternative.
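The body effect mentioned here is usually described by the textbook threshold voltage expression V_T = V_T0 + gamma * (sqrt(2*phi_F + V_SB) - sqrt(2*phi_F)). The Python sketch below evaluates it. The parameter values in the usage lines are hypothetical numbers chosen only for illustration and do not belong to any particular process.

```python
import math


def threshold_voltage(vt0, gamma, phi_f, v_sb):
    """NMOS threshold voltage with body effect (first-order textbook model).

    vt0   : threshold voltage at zero source-to-body voltage, in volts
    gamma : body effect coefficient, in V^0.5
    phi_f : Fermi potential, in volts
    v_sb  : source-to-body voltage, in volts
    """
    return vt0 + gamma * (math.sqrt(2 * phi_f + v_sb) - math.sqrt(2 * phi_f))


# Hypothetical values: source tied to body versus source 0.5 V above body.
vt_no_body_effect = threshold_voltage(0.4, 0.4, 0.3, 0.0)
vt_with_body_effect = threshold_voltage(0.4, 0.4, 0.3, 0.5)
```

When the source of an NMOS sits above the grounded body, as in a stack of series transistors, V_SB is positive, the threshold voltage rises and the transistor becomes slower.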
One could achieve different body voltage for all nmos transistors by putting all transistors in different wells, but that would mean a tremendous penalty in terms of area as there needs to be minimum size and separation that needs to be maintained which is huge in comparison to transistor sizes. This is the reason why body is connected to ground for all NMOS.
Similarly, the body of all PMOS transistors is connected to a common terminal, VDD.











