Showing posts with label clock gating. Show all posts
Showing posts with label clock gating. Show all posts

How clock gating reduces power dissipation

As discussed in clock gating - basics, enable signal coming in data path is transferred into clock path in order to save dynamic power. But the question is exactly how is this power saved. In this post, we will discuss the same. 




A flip-flop implemented as a standard cell mostly has two internal inverters to generate clk' and clk_delay signals. So, even if the flip-flop input is kept constant, there is still toggling of data at these inverters, thereby dissipating dynamic power. In addition to this, there is internal power dissipation inside flip-flop due to charging and discharging of transistors' gates repetitively because of clock toggling, but this component is not a significant factor compared to dynamic power of inverters. Figure 2 below shows the internal structure of flip-flop, which has two latches in master-slave configuration and two inverters in clock path.

Figure 2: Flip-flop internal structure

Every clock cycle, these two inverters toggle regardless of flip-flop output toggling. However, implementation of clock gating will prohibit the toggling of these inverters when data is not toggling. Let us assume that a latch-based ICG is inserted. Thus, a mux in data path is replaced by an ICG in clock path. But there is a difference here. If there are, say 1000, flip-flops with same enable signal, there will be a common ICG inserted for these. Thus, instead of now 2000 inverters (inside 1000 flops) toggling when flip-flop output will be constant, we have only 2 inverters inside ICG consuming dynamic power. This is how dynamic power is saved. However, if only 1 flop had been clock gated in this manner, there would not have been any dynamic power saving, instead we have an ICG instead of a latch, it may result in overall loss in terms of area and power.

Whether there is any net saving is governed by how many flips-flops have been clock gated using a single ICG.

Also, as discussed, many muxes in data path with same enable are replaced by an ICG in clock path. Thus, there are advantages in terms of area and leakage power too, in addition to dynamic power.

Clock gating checks in case of mux select transition when both clocks are running

PROBLEM: In the following figure, it is desired to toggle the select of the mux from CLOCK_DIV to CLOCK and both the clocks are running. What are the architectural and STA considerations for the same?

SOLUTION:
This is a very good example to understand how clock gating checks work, although you may/may not find any practical application for the same. We have to toggle the select of the multiplexer such that there is no glitch at the output. Let us consider architectural considerations first:

Architectural considerations:

Launching flip-flop of 'select' signal: In the post clock gating checks at a multiplexer, we discussed that if there is a mux getting clock at its inputs and select as data, then, there are two possible scenarios:

  • If the other clock is at state "0", then AND type check is formed and select has to launch from negative edge-triggered flip-flop
  • If the other clock is at state "1", then OR type check is formed and select has to launch from positive edge-triggered flip-flop
Now, since both the clocks are running simultaneously, both with act as "other clock" for each other. Let us choose to keep both the clocks at state "0" when select toggles. The same discussion holds true for the other scenario as well, just that appropriate values will hold. Thus,

(i) Both clocks required to be at state '0' when clock toggles
(ii) There is AND-type clock gating check formed between 'select' and both clocks 
(iii) 'select' launches from negative edge-triggered flip-flop.


Valid negative edges when 'select' can toggle: Now, as mentioned above both the clocks should be zero when select toggles. Figure below shows the valid and invalid edges where 'select' can toggle. As it turns out, select can toggle only on edges labelled "VALID" as both "CLOCK" and "DIV_CLOCK" will be zero then.

 So, to ensure that "SEL" toggles only when DIV_CLOCK is "0", we can add logic to the input of the flip-flop launching "SEL" such that it allows to propagate "SEL" only when DIV_CLOCK is "0".


In the above diagram, flip-flop launching "SEL" will hold its value when DIV_CLOCK = 0. We have to keep in mind that this implementation is just a representation of what needs to be done. the actual implementation may be more complex than this depending upon the requirements.

Timing considerations: Now coming to the timing considerations, we need to ensure that the setup and hold conditions are met, which are as shown in the figure below:

Also read:




Design problem: Clock gating for a shift register

Problem: There is an 4-bit shift register with parallel read and write capability as shown in the diagram. We need to find out an opportunity to clock gate the module.

 Mode selection bits ("S1" and "S0") are controlling the operation of this shift register with following settings:

Solution: From the basics of clock gating, we know that if the stae of a flip-flop is not chaging, there lies an opportunity to gate its clock. Observing the table, we see that state of all flip-flops does not change when "S1,S0" are either "00" or "11". So, when mode selection bits are corresponding to these values, we can gate the clock to this shift register. Or, we can say that clock to the module should reach only when (S1 xor S0) is equal to 1.


Can you relate the timing of S1 and S0? Should they be coming from positive edge-triggered flip-flop or negative edge-triggered flip-flop? Clock gating checks explains the timing of clock gating signals with respect to clock.

Also read:




Clock gating - basics

The dynamic power associated with any circuit is related to the amount of switching activity and the total capacitive load. In digital VLSI designs, the most frequently switching element are clock elements (buffers and other gates used to transport clock signal to all the synchronous elements in the design). In some of the designs, clock switching power may be contributing as high as 50% of the total power. Power being a very critical aspect, we need to make efforts to reduce this. Any effort that can be made to save the clock elements toggling can help in reducing the total power by a significant amount. Clock gating is one of the techniques used to save the dynamic power of clock elements in the design.

Principle behind clock gating: The principle behind clock gating is to stop the clock of those sequential elements whose data is not toggling. RTL level code talks only about data transfer. It may have some condition wherein a flip-flop will not toggle its output if that condition is met. Figure 1 below shows such a condition. In it, FF1's output will remain stable as long as EN = 0. On the right hand side, its equivalent circuit is provided, wherein EN has been translated into an AND gate in the clock path.This is a very simplistic version of what modern-day synthesis tools do to implement clock gating.

Figure 1: Clock gating implementation
Implications of clock gating: The implementation of clock gating, as expected, is not so simple. There are multiple things to be taken into account, some of which are:
  • Timing of enable (EN) signal: The gating of clock can cause a glitch in clock, if not taken care of by architectural implementation. Clock gating checks discusses what all needs to be taken care of as regards timing in clock gating implementation.
Area/power/latency trade-off: As is shown in figure 1, clock gating transfers a data-path logic into clock path. This can increase overall clock latency. Also, area penalty can be there, if the area of clock gating structure is more. Power can also increase, instead of decreasing, if only 1-2 flops' structure is replaced by clock gating (depending upon the switching power of clock gating structure vs those inside flip-flop). Normally, a bunch of flops with similar EN condition are chosen, and a common clock gating is inserted for those, thereby minimizing area and power penalties.

Also read:

Clock gating cell

Clock gating is a very common technique to save power by stopping the clock to a module when the module is not operating. As discussed in Clock switching and clock gating checks, there are two kinds of clock gating checks at combinational gates. We also discussed that for an AND type check, enable must launch from a negative edge-triggered flip-flop and for an OR type check, enable must launch from a positive edge-triggered flip-flop. However, it is very difficult to control the generic state machine to launch the signals to gate a clock either all from positive edge or from negative edge.

Evolution of integrated clock gating cell: To reduce the burden of same kind of launch registers from the state machine, an AND type clock gate can always be preceded with a negative level-sensitive latch and an OR type clock gate can be preceded with a positive level-sensitive latch. This has the same impact as a lockup latch in case of scan chain and eases hold timing. It results in zero cycle hold check from both positive and negative edge-triggered registers, without introduction of any additional latency. Since, each clock gate has to be preceded by a latch, why not build a special cell with an AND gate + a negative level-sensitive latch (or an OR gate + a positive level-sensitive latch). This concept served as motivation for Integrated Clock Gating Cell. This will provide more optimum area, power and timing for the resulting structure.

Test enable pin in integrated clock gating cell: During shift in scan testing, all the clock control signals have to be bypassed to let shifting happen. This can be achieved by providing a bypass signal called “test enable” that is ORed with functional enable signal (shown in figure 1 below). As soon as design goes into shift mode, test enable signal goes high, thereby bypassing all functional enable signals. So, it makes sense to embed this OR gate into integrated clock gating cell itself.


Structure of integrated clock gating cell: Figure 1 below shows the structure of the two kinds of integrated clock gating cells. The one on the left has an AND gate preceded by a negative level-sensitive latch. The enable and test_enable are active high. Clock_out has an inactive low state. The one on the right is complementary to this. It has an OR gate preceded by a positive level-sensitive latch. Both enable and test_enable are active high and output clock has an inactive high state. In case enable and test_enable are active low, NOR gate should be replaced by AND gate.

Figure 1: (a) AND type integrated clock gating cell                         Figure 1: (b) OR type integrated clock gating cell

Hope you’ve found this post useful. Let us know what you think in the comments.

Also read:




Quiz : Clock gating check at a complex gate

Problem: Consider a complex gate with internal structure as shown in figure below. One of the inputs gets clock while all others get data signals. What all (and what type of) clock gating checks exist?
Clock gating checks at a complex gate
Figure:Problem figure

Solution: As we know, clock gating checks can be of AND type or OR type. We can find the type of clock gating check formed between a data and a clock signal by considering all other signals as constant. Since, all the 4 data signals control Clk in one or the other way, there are following clock gating checks formed:

      i)        Clock gating check between Data1 and Clk: As is evident, invert of Clk and Data1 meet at OR gate ‘6’. Hence, there is OR type check between invert of Clk and Data1. In other words, Data1 can change only when invert of Clk is high or Clk is low. Hence, there is AND type check formed at gate 6.

      ii)        Clock gating check between Data2 and Clk: Same as in case 1.

      iii)       Clock gating check between Data3 and Clk: There is AND type check between Data3 and Clk.

     iv)     Clock gating check between Data4 and CLK: As in 1 and 2, there is AND type check between Data4 and Clk.

Also read



Clock gating checks

Today’s designs have many functional as well as test modes. A number of clocks propagate to different parts of design in different modes. And a number of control signals are there which control these clocks. These signals are behind switching on and off the design. Let us say, we have a simple design as shown in the figure below. Pin ‘SEL’ selects between two clocks. Also, ‘EN’ selects if clock will be propagating to the sub-design or not. Similarly, there are signals that decide what, when, where and how for propagation of clocks. Some of these controlling signals may be static while some of these might be dynamic. Even with all this, these signals should not play with waveform of the clock; i.e. these should not cause any glitch in clock path. There are both architectural as well as timing care-abouts that are to be taken care of while designing for signals toggling in clock paths. This scenario is widely known as ‘clock gating’. The timing checks that need to be modeled in timing constraints are known as ‘clock gating checks’.

Two clocks are going to a sub-part of design and are controlled by two signals. SEL is used to select which clock will propagate. Further, there is a signal EN which decides if selected clock will propagate or not
Figure 1: A simplest clocking structure
Definition of clock gating check: A clock gating check is a constraint, either applied or inferred automatically by tool, that ensures that the clock will propagate without any glitch through the gate.

Types of clock gating checks: Fundamentally, all clock gating checks can be categorized into two types:


AND type clock gating check: Let us say we have a 2-input AND gate in which one of the inputs has a clock and the other input has a data which will toggle while the clock is still on.

EN signal controlling CLK_in signal
Figure 2: AND type clock gating check; EN signal
controlling CLK_IN through AND gate
Since, the clock is free-running, we have to ensure that the change of state of enable signal does not cause the output of the AND gate to toggle. This is only possible if the enable input toggles when clock is at ‘0’ state. As is shown in figure 3 below, if ‘EN’ toggles when ‘CLK_IN’ is high, the clock pulse gets clipped. In other words, we do not get full duty cycle of the clock. Thus, this is a functional architectural miss causing glitch in clock path. As is evident in figure 4, if ‘EN’ changes during ‘CLK_IN’ are low, there is no change in clock duty cycle. Hence, this is the right way to gate a clock signal with an enable signal; i.e. make the enable toggle only when clock is low.

If the enable signal toggles when clock is high, the output clock from an AND gate will be glitchy
Figure 3: Clock being clipped when ‘EN’ changes when ‘CLK_IN’ is high

If enable signal toggles when clock is low, clock will pass without any glitch
Figure 4: Clock waveform not being altered when ‘EN’ changes when ‘CLK_IN’ is low


Theoretically, ‘EN’ can launch from either positive edge-triggered or negative edge-triggered flops. In case ‘EN’ is launched by a positive edge-triggered flop, the setup and hold checks will be as shown in figure 5. As shown, setup check in this case is on the next positive edge and hold check is on next negative edge. However, the ratio of maximum and minimum delays of cells in extreme operating conditions may be as high as 3. So, architecturally, this situation is not possible to guarantee the clock to pass under all conditions.


When enable signal is launched from a positive edge-triggered register/latch, hold check is on next negative edge, which cannot be met.
Figure 5: Clock gating setup and hold checks on AND gate when 'EN' launches from a positive edge-triggered flip-flop

On the contrary, if ‘EN’ launches from a negative edge-triggered flip-flop, setup check are formed with respect to the next rising edge and hold check is on the same falling edge (zero-cycle) as that of the launch edge. The same is shown in figure 6. Since, in this case, hold check is 0 cycle, both the checks are possible to be met for all operating conditions; hence, this solution will guarantee the clock to pass under all operating condition provided the setup check is met for worst case condition. The inactive clock state, as evident, in this case, is '0'.

When enable launches from negative-edge register/latch, hold check is zero cycle, which is possible to meet under all timing corners.
Figure 6: Clock gating setup and hold checks on AND gate when ‘EN’ launches from negative edge-triggered flip-flop


OR gate forming a clock gating check
Figure 7: An OR gate controlling a clock signal 'CLK_IN'
OR type clock gating check: Similarly, since the off-state of OR gate is 1, the enable for an OR type clock gating check can change only when the clock is at ‘1’ state. That is, we have to ensure that the change of state of enable signal does not cause the output of the OR gate to toggle. Figure 9 below shows if ‘EN’ toggles when ‘CLK_IN’ is high, there is no change in duty cycle. However, if ‘EN’ toggles when ‘CLK_IN’ is low (figure 8), the clock pulse gets clipped. Thus, ‘EN’ must be allowed to toggle only when ‘CLK_IN’ is high.


If the enable signal toggles when clock is low, the output clock from an OR gate will be glitchy
Figure 8: Clock being clipped when 'EN' changes when 'CLK_IN' is low

If enable signal toggles when clock is high, clock will pass without any glitch
Figure 9: Clock waveform not being altered when 'EN' changes when 'CLK_IN' is low


As in case of AND gate, here also, ‘EN’ can launch from either positive or negative edge flops. In case ‘EN’ launches from negative edge-triggered flop, the setup and hold checks will be as shown in the figure 10. The setup check is on the next negative edge and hold check is on the next positive edge. As discussed earlier, it cannot guarantee the glitch less propagation of clock.

When enable signal is launched from a negative register/latch, hold check is on next positive edge, which cannot be met.
Figure 10: Clock gating setup and hold checks on OR gate when ‘EN’ launches from negative edge-triggered flip-flop

If ‘EN’ launches from a positive edge-triggered flip-flop, setup check is with respect to next falling edge and hold check is on the same rising edge as that of the launch edge. The same is shown in figure 11. Since, the hold check is 0 cycle, both setup and hold checks are guaranteed to be met under all operating conditions provided the path has been optimized to meet setup check for worst case condition. The inactive clock state, evidently, in this case, is '1'.
When enable launches from positive-edge register/latch, hold check is zero cycle, which is possible to meet under all timing corners.
Figure 11: Clock gating setup and hold checks on OR gate when 'EN' launches from a positive edge-triggered flip-flop

We have, thus far, discussed two fundamental types of clock gating checks. There may be complex combinational cells other than 2-input AND or OR gates. However, for these cells, too, the checks we have to meet between the clock and enable pins will be of the above two types only. If the enable can change during low phase of the clock only, it is said to be AND type clock gating check and vice-versa.

SDC command for application of clock gating checks: In STA, clock gating checks can be applied with the help of SDC command set_clock_gating_check.

Also read: