Showing posts with label Digital system design. Show all posts
Showing posts with label Digital system design. Show all posts

Divide by 2 clock in VHDL

Clock dividers are ubiquitous circuits used in every digital design. A divide-by-N divider produces a clock that is N times lesser frequency as compared to input clock. A flip-flop with its inverted output fed back to its input serves as a divide-by-2 circuit. Figure 1 shows the schematic representation for the same.

A divide by 2 clock circuit produces output clock that is half the frequency of the input clock
Divide by 2 clock circuit
                                          
Following is the code for a divide-by-2 circuit.
-- This module is for a basic divide by 2 in VHDL.
library ieee;
use ieee.std_logic_1164.all;
entity div2 is
                port (
                                reset : in std_logic;
                                clk_in : in std_logic;
                                clk_out : out std_logic
                );
end div2;

-- Architecture definition for divide by 2 circuit
architecture behavior of div2 is
signal clk_state : std_logic;
begin
                process (clk_in,reset)
                begin
                                if reset = '1' then
                                                clk_state <= '0';
                                elsif clk_in'event and clk_in = '1' then
                                                clk_state <= not clk_state;
                                end if;
                end process;
clk_out <= clk_state;

end architecture;

Hope you’ve found this post useful. Let us know what you think in the comments.

Delay line based Time to digital converter

A time to digital converter is a circuit that digitizes time; i.e., it converts time into digital number. In other words, a time-to-digital converter measures the time interval between two events and represents that interval in the form of a digital number.

TDCs are used in places where the time interval between two events needs to be determined. These two events may, for example, be represented by rising edges of two signals. Some applications of TDCs include time-of-flight measurement circuits and All-Digital PLLs.

Delay line based time-to-digital converter: This is a very primitive TDC and involves a delay-line which is used to delay the reference signal. The other signal is used to sample the state of delay chain. Each stage of delay chain outputs to a flip-flop or a latch which is clocked by the sample signal. Thus, the output of the TDC forms a thermometer code as the stage will show a ‘1’ if the reference signal has passed it, otherwise it will show a zero. The schematic diagram of delay line based time-to-digital converter is shown in figure 1 below:

his is a very primitive TDC and involves a delay-line which is used to delay the reference signal. The other signal is used to sample the state of delay chain. Each stage of delay chain outputs to a flip-flop or a latch which is clocked by the sample signal. Thus, the output of the TDC forms a thermometer code as the stage will show a ‘1’ if the reference signal has passed it, otherwise it will show a zero.
Figure 1: Delay line based Time-to-digital converter


The VHDL code for delay line based time-to-digital converter is given below:
-- This is the module definition of delay line based time to digital converter.
library ieee;
use ieee.std_logic_1164.all;        
entity tdc is
                generic (
                                number_of_bits : integer := 64
                );
                port (
                                retimed_clk : in std_logic;
                                variable_clk : in std_logic;
                                tdc_out : out std_logic_vector (number_of_bits-1 downto 0);
                                reset : in std_logic
                );
end entity;
architecture behavior of tdc is
                component buffd4 is port (
                                I : in std_logic;
                                Z : out std_logic
                );
                end component;
                signal buf_inst_out : std_logic_vector (number_of_bits downto 0);
begin
--buffd4
                buf_inst_out(0) <= variable_clk;
                tdc_loop : for i in 1 to (number_of_bits) generate
                begin
                                buf_inst : buffd4 port map (
                                                I => buf_inst_out(i-1),
                                                Z => buf_inst_out(i)
                                );
                end generate;

                process (reset,retimed_clk)
                begin
                                if reset = '1' then
                                                tdc_out <= (others => '0');
                                elsif retimed_clk'event and retimed_clk = '1' then
                                                tdc_out <= buf_inst_out(number_of_bits downto 1);
                                end if;
                end process;
end architecture;

References:

Hope you’ve found this post useful. Let us know what you think in the comments.

How propagation of ‘X’ happens through different logic gates


‘X’ refers to a signal attaining a value that is ‘unknown’. It can be either ‘0’ or ‘1’. But, the exact value of the signal is not known. If a simulator is not able to decide whether a logic value should be logic ‘0’ or logic ‘1’, it will assign a value ‘X’ to the value. An example of ‘X’ source is a logic block that has not been initialized properly through reset. Having an ‘X’ value at a node can propagate to the logic lying in the fan-out, thereby increasing the uncertainty downstream the logic.

How ‘X’ propagates: An ‘X’ value at the input of a logic gate may or may not propagate to its output depending upon the states at other inputs of the logic gate. Given below is how different logic gates react to ‘X’ values:

1)  OR gate: An OR gate can absorb an ‘X’ if the other input has logic ‘1’. Otherwise, ‘X’ propagates through it. Please refer figure 1 for explanation:
A logic '1' at the other input saves 'X' from propagating through an OR gate, whereas a logic '0' causes 'X' at the other input to propagate to the output.
Figure 1: X-propagation through OR gate

2) AND gate: An AND gate can absorb ‘X’ if the other input has logic ‘0’. Otherwise, ‘X’ propagates through it. Please refer figure 2 for explanation:

Figure 2: X-propagation through AND gate


3) Buffer/inverter: Since, buffers/inverters are single input gates, an ‘X’ at the input means ‘X’ at the output.

4) XOR gate: An ‘X’ at one of the inputs of XOR produces ‘X’ at the output no matter what the 
other input state is. Please refer truth table given in Figure 3 for explanation:

An 'X' at the input of an XOR gate propagates to the output no matter what is the state of the other input.
Figure 3: X-propagation through XOR gate

5) Complex gates: For complex gates, whether ‘X’ will propagate to the output depends upon the overall function of the ‘X’ input with respect to other gates. E.g. suppose a gate with function

Z = A + (B * C)

Then, if B input goes ‘X’, the output will go ‘X’ if A=0 and C=1.


Read also:

On-chip variations – the STA takeaway

Static timing analysis of a design is performed to estimate its working frequency after the design has been fabricated. Nominal delays of the logic gates as per characterization are calculated and some pessimism is applied above that to see if there will be any setup and/or hold violation at the target frequency. However, all the transistors manufactured are not alike. Also, not all the transistors receive the same voltage and are at same temperature.  The characterized delay is just the delay of which there is maximum probability. The delay variation of a typical sample of transistors on silicon follows the curve as shown in figure 1. As is shown, most of the transistors have nominal characteristics. Typically, timing signoff is carried out with some margin. By doing this, the designer is trying to ensure that more number of transistors are covered. There is direct relationship between the margin and yield. Greater the margin taken, larger is the yield. However, after a certain point, there is not much increase in yield by increasing margins. In that case, it adds more cost to the designer than it saves by increase in yield. Therefore, margins should be applied so as to give maximum profits.

Most of the transisors have close to nominal delay. However, some transistors have delay variations. Theoretically, there is no bound existing for delay variations. However, probabilty of having that delay decreases as delay gets far from nominal.
Number of transistors v/s delay for a typical silicon transistors sample


We have discussed above how variations in characteristics of transistors are taken care of in STA. These variations in transistors’ characteristics as fabricated on silicon are known as OCV (On-Chip Variations). The reason for OCV, as discussed above also, is that all transistors on-chip are not alike in geometry, in their surroundings, and position with respect to power supply. The variations are mainly caused by three factors:
  • Process variations: The process of fabrication includes diffusion, drawing out of metal wires, gate drawing etc. The diffusion density is not uniform throughout wafer. Also, the width of metal wire is not constant. Let us say, the width is 1um +- 20 nm. So, the metal delays are bound to be within a range rather than a single value. Similarly, diffusion regions for all transistors will not have exactly same diffusion concentrations. So, all transistors are expected to have somewhat different characteristics.
  • Voltage variation: Power is distributed to all transistors on the chip with the help of a power grid. The power grid has its own resistance and capacitance. So, there is voltage drop along the power grid. Those transistors situated close to power source (or those having lesser resistive paths from power source) receive larger voltage as compared to other transistors. That is why, there is variation seen across transistors for delay.
  • Temperature variation: Similarly, all the transistors on the same chip cannot have same temperature. So, there are variations in characteristics due to variation in temperatures across the chip.


How to take care of OCV: To tackle OCV, the STA for the design is closed with some margins. There are various margining methodologies available. One of these is applying a flat margin over whole design. However, this is over pessimistic since some cells may be more prone to variations than others. Another approach is applying cell based margins based on silicon data as what cells are more prone to variations. There also exist methodologies based on different theories e.g. location based margins and statistically calculated margins. As advances are happening in STA, more accurate and faster discoveries are coming into existence.

Latency and throughput – the two measures of system performance

Performance of the system is one of the most stringent criteria for its success. While performance increases the desirability among customers, cost is what makes it affordable. This is the reason why system designers aim for maximum performance with available resources such as power and area constraints. There are two related parameters that determine the performance output of a system –

Throughput - Throughput is a measure of the productivity of the system. In electronic/communication systems, throughput refers to rate at which output data is produced. Higher the throughput, more productive is the system. In most of the cases, it is measured as time difference between two consecutive outputs (nth and n+1th). Throughput also refers to the rate at which input data can be applied to system.
Let us discuss with the help of an example:

throughput summary diagram


Above figure depicts the throughput of 3 number adder. Result of input set applied at 1st clock cycle appears at output at 3rd clock cycle and in 4th clock cycle next input set is applied and output comes in 6th clock cycle.  Hence, throughput of above design is ⅓ per clock cycle. As we can see from diagram, first input is applied in first clock cycle and 2nd input is applied in 4th clock cycle. Hence we can also say that throughput is rate at which input data can be applied to system.

Latency- Latency is the time taken by a system to produce output after input is applied. It is a measure of delay response of a design. Higher the latency value, slower is the system. in synchronous designs, it is measured in terms of number of clock cycles. In combinational designs, latency is basically propagation delay of circuit. In non pipelined designs, latency improvement is major area of concern. In more general terms, it is time difference between output and input time.
Latency
Relationship between throughput and latency: Both latency and throughput are inter-related. It is desired to have maximum throughput and minimum latency. Increasing latency and/or throughput might make the system costly. Let us take an example. Consider a park with 3 rides and it takes 5 minutes for a ride.  A child can take sequentially these rides; i.e, ride 1, ride 2 and then ride 3. Firstly, let us assume that only one child at a time is allowed to enter park at a time. While he is taking a ride, no one is allowed to enter the park. Thus, the throughput of the park is 15 minutes per child and latency is 15 minutes. Now, let us assume that while a child has finished taking ride1, another child is allowed to enter park. Thus, in this case, throughput will be 5 minutes per child whereas latency is still 15 minutes. Thus, we have increased the throughput of the system without affecting latency and at the same cost.

What is Logic Built-in Self Test (LBIST)

LBIST stands for Logic Built-In Self Test. As VLSI marches to deep sub-micron technologies, LBIST is gaining importance due to the unique advantages it provides. LBIST refers to a self-test mechanism for testing random logic. The logic can be tested with no intervention from the outside world. In other words, a piece of hardware and/or software is inbuilt into an integrated circuit to test itself. By random logic, is meant any form of hardware (logic gates, memories etc.) that can form a part or whole of the chip. A generic LBIST system is implemented using STUMPS (Self-Test Using MISR and PRPG) architecture. A typical LBIST system is as shown in the figure below:

A typical LBIST system consists of a PRPG, An LUT and a MISR controlled by an LBIST controller
Figure 1: A typical LBIST system


Components of an LBIST system: A typical LBIST system comprises following:
  1. Logic to be tested, or, as is called Circuit Under Test (CUT): In case of LBIST, the logic to be tested through LBIST is the Circuit under Test (CUT). Any random logic residing on the chip can be brought under LBIST following a certain procedure.
  2. PRPG (Pseudo-Random Pattern Generator): A PRPG generates input patterns that are applied to internal scan chains of the CUT for LBIST testing. In other words, PRPG acts as a Test Pattern Generator (TPG) for LBIST. A PRPG can either use a counter or an LFSR for pattern generation.
  3. MISR (Multi-Input Signature Register): MISR obtains the response of the device to the test patterns applied. An incorrect MISR output indicates a defect in the CUT. In classical language, MISR acts as a ORA (Output Response Analyzer) for LBIST testing.
  4. A master (LBIST controller): The controller controls the functioning of the LBIST; i.e. clocks propagation, initialization and scan patterns flow in and out of the LBIST scan chains.

One of the most stringent requirements in LBIST testing is the prohibition of X-sources. There cannot be any source of ‘X’ during LBIST testing. By ‘X’, is meant a definite, but unknown value. It might be either ‘0’ or ‘1’, but it is not known what value is being propagated. All X-sources are masked and a known value is allowed to be propagated in LBIST.

Why ‘X’ is prohibited in LBIST: As stated above, there cannot be any ‘X’ propagating during LBIST testing. The reason behind this is that LBIST involves MISR to calculate the signature of the LBIST patterns. Since, the resulting signal is unique, any unknown value can result in the corruption of the signature. So, there cannot be any ‘X’ in LBIST testing.

Advantages of LBIST: As stated above, there are many unique advantages of LBIST that make it desirable, especially in safety critical designs such as those used in automobiles and aeroplanes. LBIST offers many advantages as listed below:
  • LBIST provides self-test capability to logic inside chip; thus, the chip can test itself without any external control and interference.
  • This provides the ability to be tested at higher frequencies reducing test time considerably.
  • LBIST can run while the chip is on field running functionally. Thus, it is very useful in safety critical applications wherein faults developed on field can be easily detectable at startup before chip goes into functional mode.

Overheads due to LBIST: Along with many advantages, there are some overheads due to LBIST as mentioned below:
(i)                  The LBIST implementation involves some hardware on-chip to control LBIST. So, there are area and power impacts due to these. In other words, the cost of chip increases.
(ii)                Also, ‘X’-masking involves addition of extra logic gates in already timing critical functional signals causing impact on timing as well.
(iii)               Another disadvantage of using LBIST is that even the on-chip test equipment may fail. This is not the problem with testing using outside equipment with proven test circuitry
References:
  1. Identification and reduction of safe-stating points in LBIST designs 
  2. Logic built-in self-test
  3. Challenges in LBIST verification of high reliability SoCs
Also read:

Worst Slew Propagation


Worst slew propagation is a phenomenon in Static Timing Analysis. According to it, the worst of the slews at the input pin of a gate is propagated to its output. As we know, the output slew of a logic cell is a function of its input slew and output load. For a multi-input logic gate, the output slew should be different for the timing paths through its different input pins. However, this is not the case. This is due to the reason that to maintain a timing grapth, each node in the design can have only 1 slew. So, to cover the worst scenario for setup timing, the maximum slew at each output pin should be equal to that caused by the input pin having worst of the slews. The output slew calculated is on the basis of worst input slew, even if the timing path for which the output slew is being calculated is not through the input pin with worst slew. Similarly, the best of the slews is calculated based upon the effect of all the input pins for hold timing analysis. We can refer to it as best slew propagation.

Let us illustrate with the help of a 2-input AND gate. As shown in figure below, let the slews at the input pins be denoted as SLEW_A and SLEW_B and that at the output pin as SLEW_OUT. Now, as we know:

SLEW_OUT = func (SLEW_A) if A toggles leading to OUT toggling
And SLEW_OUT = func (SLEW_B) if B toggles leading to OUT toggling

However, even though the timing path as shown through A pin, the resultant slew at output SLEW_OUT will be calculated as:

SLEW_OUT         =  func (SLEW_A) if func(SLEW_A) > func(SLEW_B)

                                =  func (SLEW_B) if func(SLEW_B) > func(SLEW_A)



Worst slew propagation is carried out through the worst of all the slews caused by each input pin
Figure 1: Figure showing worst slew propagation

One may feel this as an over-pessimism inserted by timing analysis tool. Path based timing analysis will not have worst slew propagation phenomenon as it calculates output slew for each timing path rather than one slew per node. 

Similarly, for performing timing analysis for hold violations, the best of the slews at inputs is propagated to the output as mentioned before also. 

Also read:



Depletion MOSFET and negative logic. Why it is not possible?


As we know, depletion MOSFET conducts current even with gate and source at same voltage level. To cut-off the current in depletion MOSFET, a voltage has to be applied at gate so as to exhaust the already existing carriers inside the channel. On the other hand, enhancement type MOSFET is cut-off when gate and source are at same voltage.
Taking the example of NMOS, for a depletion MOS, with source and gate at same level, there is still a channel available, hence, it conducts electric current. To bring it to cut-off, a negative potential is needed to be applied at gate (considering source at ‘X’ potential). Thus, with source at ‘X’ potential and gate at ‘X’ potential, drain attains the potential of source. Since, to cut-off the device, gate has to be given a voltage less than ‘X’, so we can say “when Gate is 1 and source is 1, then drain is 1”.  On the other hand, when source is 1 and gate is 0, drain attains ‘high impedance’. The reverse is true for PMOS.
Similarly, with the same logic, for an enhancement NMOS, “When Gate is 1 and source is 0, drain attains 0 potential”; similarly, “When Gate is 0 and source is 0, drain is 0”. The reverse is true for PMOS.

Source voltage
Gate voltage
Drain voltage for enhancement NMOS
Drain voltage for enhancement PMOS
Drain voltage for depletion NMOS
Drain voltage for depletion PMOS
0
0
Z
Z
0
0
0
1
0
Z
0
Z
1
0
Z
1
Z
1
1
1
Z
Z
1
1

Thus, we can say that it is due to the inherent properties of NMOS and PMOS that that they cannot be used to create negative level logic.

Setup check and hold check for flop-to-latch timing paths



In the post (Setup and hold – basics of timinganalysis), we introduced setup and hold timing requirements and also discussed why these requirements are needed to be applied. In this post, we will be discussing how these checks are applied for different cases for paths starting from flops and ending at latches and vice-versa.
Present day designs are focused mainly on the paths between flip-flops as the elements dominating in them are flip-flops. But there are also some level-sensitive elements involved in data transfer in current-day designs. So, we need to have knowledge of setup and hold checks for flop-to-latch and latch-to-flop paths too. In this post, we will be discussing the former case. In totality, there can be total 4 cases involved in flop-to-latch paths as discussed below:
1)                  Paths launching from positive edge-triggered flip-flop and being captured at positive level-sensitive latch: Figure 1 shows a path being launched from a positive edge-triggered flop and being captured on a positive level-sensitive latch. In this case, setup check is on the same rising edge (without time borrow) and next falling edge (with time borrow) and hold check on the previous falling edge with respect to the edge at which data is launched by the launching flop.


Positive edge flop to positive level latch path

Figure 1: Timing path from positive edge flop to positive level latch


Figure below shows the waveforms for setup and hold checks in case of paths starting from a positive edge triggered flip-flop and ending at a positive level sensitive latch. As can be figured out, setup and hold check equations can be described as:
            Tck->q + Tprop + Tsetup < Tskew + Tperiod/2                                (for setup check)
            Tck->q + Tprop > Thold + Tskew    - (Tperiod/2)                       (for hold check)

Waveform showing setup check for positive flop to negative level latch is on next positive edge and hold check is on next negative edge.


2)                  Paths launching from positive edge-triggered flip-flop and being captured at negative level-sensitive latch: Figure 3 shows a path starting from positive edge-triggered flip-flop and being captured at a negative level sensitive latch. In this case, setup check is on the next falling edge (without time borrow) and on next positive edge (with time borrow). Hold check on the same edge with respect to the edge at which data is launched (zero cycle hold check).
 
Path starting from positive edge triggered flop and ending at negative level sensitive latch

Figure 3: Path from positive edge flop to negative level latch


Figure below shows the waveforms for setup and hold checks in case of paths starting from a positive edge triggered flip-flop and ending at a negative level-sensitive latch. As can be figured out, setup and hold check equations can be described as:
            Tck->q + Tprop + Tsetup < (Tperiod) + Tskew                        (for setup check)
            Tck->q + Tprop > Thold + Tskew                                                   (for hold check)
Setup check for positive flop to negative level latch is on next positive edge and hold check is on next negative edge.

 
3)                  Paths launching from negative edge-triggered flip-flop and being captured at positive level-sensitive latch: Figure 5 shows a path starting from negative edge-triggered flip-flop and being captured at a positive level sensitive latch. In this case, setup check is on the next rising edge (without time borrow) and next falling edge (with time borrow). Hold check on the next rising edge with respect to the edge at which data is launched.
 
Path starting from negative edge triggered flip-flop and ending at positive level sensitive latch

Figure 5: Path from negative edge flop to positive level latch


Figure below shows the waveforms for setup and hold checks in case of paths starting from a negative edge triggered flip-flop and ending at a positive level-sensitive latch. As can be figured out, setup and hold check equations can be described as:

            Tck->q + Tprop + Tsetup < (Tperiod) + Tskew                        (for setup check)
            Tck->q + Tprop > Thold + Tskew                                                   (for hold check)


Setup check for positive flop to negative level latch is on next positive edge and hold check is on next negative edge.

1)                  Paths launching from negative edge-triggered flip-flop and being captured at negative level-sensitive latch: Figure 5 shows a path starting from negative edge-triggered flip-flop and being captured at a negative level sensitive latch. In this case, setup check is on the same edge (without time borrow) and on next rising edge (with time borrow). Hold check on the next rising edge with respect to the edge at which data is launched.

 
Path starting from negative edge-triggered flop and ending at negative level-sensitive latch

Figure 5: Path from negative edge flop to negative level latch

Figure below shows the waveforms for setup and hold checks in case of paths starting from a negative edge triggered flip-flop and ending at a negative level-sensitive latch. As can be figured out, setup and hold check equations can be described as:
            Tck->q + Tprop + Tsetup < Tperiod/2 + Tskew                             (for setup check)
            Tck->q + Tprop > Thold + Tskew   - (Tperiod/2)                       (for hold check)


Setup check for positive flop to negative level latch is on next positive edge and hold check is on next negative edge.