How clock gating reduces power dissipation

As discussed in clock gating - basics, the enable signal in the data path is moved into the clock path in order to save dynamic power. But how exactly is this power saved? This post answers that question.




A flip-flop implemented as a standard cell usually has two internal inverters that generate the clk' and clk_delay signals. So, even if the flip-flop input is held constant, these inverters still toggle with the clock, thereby dissipating dynamic power. In addition, there is internal power dissipation inside the flip-flop because the toggling clock repeatedly charges and discharges the transistors' gates, but this component is not significant compared to the dynamic power of the inverters. Figure 2 below shows the internal structure of a flip-flop, which has two latches in master-slave configuration and two inverters in the clock path.

Figure 2: Flip-flop internal structure

Every clock cycle, these two inverters toggle regardless of whether the flip-flop output toggles. Clock gating prevents these inverters from toggling when the data is not changing. Let us assume that a latch-based ICG (integrated clock gating cell) is inserted, so that a mux in the data path is replaced by an ICG in the clock path. But there is a difference here. If there are, say, 1000 flip-flops with the same enable signal, a single common ICG is inserted for all of them. Thus, instead of 2000 inverters (inside 1000 flops) toggling while the flip-flop outputs are constant, only the 2 inverters inside the ICG consume dynamic power. This is how dynamic power is saved. However, if only 1 flop had been clock gated in this manner, there would not have been any dynamic power saving; since we now have an ICG instead of a mux, it may even result in an overall loss in terms of area and power.
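To make the gating behaviour concrete, here is a minimal behavioural sketch in Python. It assumes the common latch-plus-AND structure for a latch-based ICG (a latch that is transparent while the clock is low, feeding an AND gate); the function name `gated_clock` and the sampled-waveform representation are illustrative choices, not taken from any library or from a specific standard cell.

```python
def gated_clock(clk, en):
    """Return the gated clock for sampled clk/en waveforms (lists of 0/1).

    Assumed ICG structure: a latch that is transparent while clk is low,
    followed by an AND gate (gclk = clk AND latched_enable).
    """
    latched_en = 0
    gclk = []
    for c, e in zip(clk, en):
        if c == 0:
            # The latch passes the enable through only while clk is low.
            latched_en = e
        # While clk is high the latch holds its value, so a late change
        # on en cannot chop the current clock pulse.
        gclk.append(c & latched_en)
    return gclk


# Five clock cycles, two samples per cycle (low phase, then high phase).
clk = [0, 1] * 5
# Enable is high during the first two cycles and low during the last three.
en = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]

gclk = gated_clock(clk, en)
print(gclk)  # [0, 1, 0, 1, 0, 0, 0, 0, 0, 0]
```

With the enable low, the gated clock stays flat, so nothing downstream of the ICG (including the clock inverters inside each flop) sees a clock edge.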

Whether there is any net saving is governed by how many flip-flops are clock gated by a single ICG.
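The inverter counting used above can be written out as a rough sketch. It only counts clock inverters that toggle while the enable is low, using the figures from this post (two inverters per flop, two inside the ICG). It deliberately ignores everything else, such as the rest of the ICG and the muxes that were removed, so it is a back-of-the-envelope model rather than a power estimate. The names are hypothetical.

```python
CLOCK_INVERTERS_PER_FLOP = 2  # figure used in this post
CLOCK_INVERTERS_PER_ICG = 2   # figure used in this post


def toggling_inverters(num_flops, gated):
    """Clock inverters that toggle every cycle while the enable is low."""
    if gated:
        # Only the shared ICG still sees the toggling clock.
        return CLOCK_INVERTERS_PER_ICG
    return num_flops * CLOCK_INVERTERS_PER_FLOP


def inverters_saved(num_flops):
    """Reduction in toggling inverters when one ICG gates num_flops flops."""
    return toggling_inverters(num_flops, False) - toggling_inverters(num_flops, True)


for n in (1, 2, 1000):
    print(n, inverters_saved(n))
# 1 0
# 2 2
# 1000 1998
```

With a single flop the count is unchanged, which matches the observation that gating one flop saves nothing on this measure; the saving grows linearly with the number of flops sharing the ICG.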

Also, as discussed, many muxes in the data path with the same enable are replaced by a single ICG in the clock path. Thus, there are advantages in terms of area and leakage power too, in addition to dynamic power.

2 comments:

  1. Nice information. We will get the expected data output from the flip-flop, but we are sending the ANDed result of En and Clk to the flip-flop's clock input. What if the En signal is low for 3 cycles and high for 2 cycles? Then the flip-flop's input clock doesn't have a 50% duty cycle. Is that fine?

    1. Hi, if the En signal is high for 2 cycles and low for 3 cycles, the clock will have a 50% duty cycle for 2 cycles and a 0% duty cycle for 3 cycles (speaking in ideal terms). :-) So there is no issue here.

      Even in reality, however, the clock is never expected to have an exact 50% duty cycle. The buffers/inverters in the clock path distort the clock waveform by at least a minimal amount.

