Performance gain with latches

Because latches are transparent, they have a basic characteristic known as time borrowing: they can capture data over a period of time rather than at an instant. Using this property intelligently can yield a performance advantage in specific design scenarios, especially for designs with asymmetric data paths in successive stages. Let us elaborate with the help of an example.
Let us suppose a design with two pipeline stages whose combinational logic delays are 12 ns and 5 ns respectively, as shown in figure 1 below:

Figure 1: 2-stage pipelining

If we assume a clock period of 16 ns (half cycle of 8 ns), then each latch stage will borrow time from the subsequent stage, as shown in the figure below:

Now, since all the registers get the same clock signal, the minimum clock period is set by the larger of the combinational delays from REGA to REGB and from REGB to REGC.

Tclk > MAX (Tcomb(regA->regB), Tcomb(regB->regC))



Thus, this circuit cannot run with a half clock period of less than 12 ns, or a clock period of less than 24 ns.
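The limit above can be checked with a few lines of Python. This is only a sketch: the function name `min_half_period_registers` is made up for illustration, and it assumes, as the 12 ns and 24 ns figures in the text imply, that each stage gets one half cycle and that setup time, clock-to-Q delay and skew are all zero.

```python
def min_half_period_registers(stage_delays_ns):
    """Smallest half clock period (ns) when every stage ends in a register.

    Assumption for illustration: ideal registers, so only the
    combinational delay of each stage matters.
    """
    # A register cannot borrow time, so the slowest stage sets the limit.
    return max(stage_delays_ns)


half_period_ns = min_half_period_registers([12, 5])
print(half_period_ns, 2 * half_period_ns)  # prints: 12 24
```

Note that the 5 ns stage wastes 7 ns of every half cycle here. That idle time is what a latch can put to use.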

This situation can be eased if we replace REGB with a negative level-sensitive latch. Let us have a look at figure 2 below. Although the number of stages remains the same, LATB can borrow time from the next stage without impacting any logic.

Figure 2: Latch replacing register in the 2-stage pipelining
The same is shown in figure 3 below with the help of a waveform. The clock has a half period of 9 ns (a period of 18 ns). The 12 ns path overruns its half cycle by 3 ns, so the latch borrows 3 ns from the next stage, which still meets setup time with 1 ns to spare (9 ns - 3 ns - 5 ns = 1 ns). Thus, we have succeeded in reducing the half period from 12 ns to 9 ns (the period from 24 ns to 18 ns) just by changing the register to a latch. This is how a latch can help gain performance.
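The arithmetic for the REGA -> LATB -> REGC case can be written out as a small Python sketch. The helper `latch_borrow` is a hypothetical name, and the model is deliberately simple: an ideal latch with zero setup time and zero propagation delay, transparent for exactly one half period.

```python
def latch_borrow(half_period_ns, first_delay_ns, second_delay_ns):
    """Return (borrowed_ns, setup_slack_ns) for a REGA -> LATB -> REGC path.

    Assumptions for illustration: ideal latch and registers, no skew.
    """
    # Time by which the first stage overruns its own half cycle.
    borrowed_ns = max(0, first_delay_ns - half_period_ns)
    # The latch is transparent for only one half period.
    if borrowed_ns > half_period_ns:
        raise ValueError("data arrives after the latch has closed")
    # The second stage starts late by the borrowed amount.
    setup_slack_ns = half_period_ns - borrowed_ns - second_delay_ns
    return borrowed_ns, setup_slack_ns


print(latch_borrow(9, 12, 5))  # prints: (3, 1)
print(latch_borrow(8, 12, 5))  # prints: (4, -1)
```

The second call shows the limit of the trick: the two stages hold 17 ns of logic in total, so a half period of 8 ns (16 ns for both stages) leaves a negative slack at REGC.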

If there are multiple latch stages in series, each can borrow from the subsequent stage such that overall timing is met. For example, figure 3 shows 6 latches in series.
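For a longer chain, the borrowed time simply carries forward from one latch to the next. The sketch below walks such a chain. The function name `borrow_chain` and the six stage delays are hypothetical values chosen for illustration, not the numbers from the figure, and the latches are assumed ideal (zero setup time and propagation delay), with each stage nominally given one half period.

```python
def borrow_chain(half_period_ns, stage_delays_ns):
    """Return the time (ns) borrowed by the latch at the end of each stage.

    Assumptions for illustration: ideal latches on alternating clock
    phases, so every stage nominally owns one half period.
    """
    borrowed_ns = 0
    borrows = []
    for index, delay_ns in enumerate(stage_delays_ns):
        # Data starts `borrowed_ns` late and needs `delay_ns` to propagate.
        late_by_ns = borrowed_ns + delay_ns - half_period_ns
        if late_by_ns > half_period_ns:
            raise ValueError(f"stage {index}: data arrives after the latch closes")
        # Arriving early borrows nothing; the latch just waits to open.
        borrowed_ns = max(0, late_by_ns)
        borrows.append(borrowed_ns)
    return borrows


# Hypothetical delays for six stages with an 8 ns half period.
print(borrow_chain(8, [10, 9, 7, 9, 8, 5]))  # prints: [2, 3, 2, 3, 3, 0]
```

If the chain ends in an edge-triggered register rather than a latch, the last entry must be 0, since a register cannot borrow. That holds in the example above.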


2 comments:

  1. In the case described above of RegA -> LatB -> RegC, how will the setup and hold of LatB be fulfilled, as the combinational delay from RegA -> LatB is 12 ns, which is more than the clock period, assumed to be 9 ns?

    1. Hi

      Yes, the time period should have been 24 ns here. I have corrected the text. Thanks for your valuable feedback.

