DESIGN OF EFFICIENT SHIFT REGISTERS USING PULSED LATCHES

M. AJAY
ajaymunagala.ajay@gmail.com
1PG Scholar, Dept of ECE, Sreenivasa Institute of Technology and Management Studies (Autonomous) Murkambattu, Chittoor, JNTU Anantapur University

G.SRIHARI
srihari.nan@gmail.com
2Associate Professor, Dept of ECE, Sreenivasa Institute of Technology and Management Studies (Autonomous) Murkambattu, Chittoor JNTU Anantapur University

Abstract: This paper proposes a low-power and area-efficient shift register using digital pulsed latches. The area and power consumption are reduced by replacing flip-flops with pulsed latches. This method solves the timing problem between pulsed latches through the use of multiple non-overlap delayed pulsed clock signals instead of the conventional single pulsed clock signal. The shift register uses a small number of the pulsed clock signals by grouping the latches to several sub shifter registers and using additional temporary storage latches. A 256-bit shift register using pulsed latches was fabricated using a 0.18μm CMOS process with VDD = 1.8V. The core area is 6600μm2. The power consumption is 1.2mW at a 100 MHz clock frequency. The proposed shift register saves 37% area and 44% power compared to the conventional shift register with flip-flops. In digital circuits, a shift register is a cascade of flip-flops, sharing the same clock, in which the output of each flip-flop is connected to the “data” input of the next flip-flop in the chain, resulting in a circuit that shifts by one position the “bit array” stored in it, shifting in the data present at its input and shifting out the last bit in the array, at each transition of the clock input. More generally, a shift register may be multidimensional, such that its “data in” and stage outputs are themselves bit arrays: this is implemented simply by running several shift registers of the same bit-length in parallel.

Keywords: Area-efficient, flip-flop, pulsed clock, pulsed latch, shift register

I. Introduction

Flip-flops are the basic storage elements used extensively in all kinds of digital designs. As the feature size of CMOS technology process scaled down according to Moore’s Law, designers are able to integrate many numbers of transistors onto the same die. The more transistors there will be more switching and more power dissipated in the form of heat or radiation. Heat is one of the phenomenon packaging challenges in this epoch, it is one of the main challenges of low power design methodologies and practices. Another driver of low power research is the reliability of the integrated circuit. More switching implies higher average current is expelled and therefore the probability of reliability issues occurring rises. We are moving from laptops to tablets and even smaller computing digital systems. With this profound trend continuing and without a match trending in battery life expectancy, the more low power issues will have to be addressed. The current trends will eventually mandate low power design automation on a very large scale to match the trends of power consumption of today’s and future integrated chips. Power consumption of Very Large Scale Integrated design is given by Generalized relation, \( P = CV^2f \) [1]. Since power is proportional to the square of the voltage as per the relation, voltage scaling is the most prominent way to reduce power dissipation. However, voltage scaling is results in threshold voltage scaling which bows to the exponential increase in leakage power.

Though several contributions have been made to the art of single edge triggered flip-flops, a need evidently occurs for a design that further improves the performance of single edge triggered flip-flops patterns. The architecture of a shift register is quite simple. An N-bit shift register is composed of series connected N data flip-flops. The speed of the flip-flop is less important than the area and power consumption because there is no circuit between flip-flops in the shift register. The smallest flip-flop is suitable for the shiftregister to reduce the area and power consumption. Recently, pulsed latches have replaced flip-flops in many applications, because
a pulsed latch is much smaller than a flipflop. But the pulsed latch cannot be used in a shift register due to the timing problem between pulsed latches.

II. Shift Registers

A shift register is the basic building block in a VLSI circuit. Shift registers are commonly used in many applications, such as digital filters, communication receivers and image processing ICs. Recently, as the size of the image data continues to increase due to the high demand for high quality image data, the word length of the shifter register increases to process large image data in image processing ICs.

An image-extraction and vector generation VLSI chip uses a 4K-bit shift register. A 10-bit 208 channel output LCD column driver IC uses a 2K-bit shift register. A 16-megapixel CMOS image sensor uses a 45K-bit shift register. As the word length of the shifter register increases, the area and power consumption of the shift register become important design considerations. The smallest flip-flop is suitable for the shift register to reduce the area and power consumption. Recently, pulsed latches have replaced flipflops in many applications, because a pulsed latch is much smaller than a flip-flop. But the pulsed latch cannot be used in a shift register due to the timing problem between pulsed latches.

III. Proposed Architecture

This paper proposes a low-power and area-efficient shift register using pulsed latches. The shift register solves the timing problem using multiple non-overlap delayed pulsed clock signals instead of the conventional single pulsed clock signal. The shift register uses a small number of the pulsed clock signals by grouping the latches to several sub shifter registers and using additional temporary storage latches. Shift registers can have both parallel and serial storage latches. Shift registers can have both parallel and serial storage latches. Shift registers can have both parallel and serial storage latches.

These are often configured as ‘serial-in, parallel-out’ (SIPO) or as ‘parallel-in, serial-out’ (PISO). There are also types that have both serial and parallel input and types with serial and parallel output. There are also ‘bidirectional’ shift registers which allow shifting in both directions: L→R or R→L. The serial input and last output of a shift register can also be connected to create a ‘circular shift register’. Previous work often measured energy consumption using a limited set of data patterns with the clock switching every cycle. But real designs have a wide variation in clock and data activity across different TE instances. For example, lowpower microprocessors make extensive use of clock gating resulting in many TEs whose energy consumption is dominated by input data transitions rather than clock transitions. Other TEs, in contrast, have negligible data input activity but are clocked every cycle. Shift registers, like counters, are a form of sequential logic. Sequential logic, unlike combinational logic is not only affected by the present inputs, but also, by the prior history. In other words, sequential logic remembers past events. Pulsed latch structures employ an edge-triggered pulse generator to provide a short transparency window. Compared to master–slave flip-flops, pulsed latches have the advantages of requiring only one latch stage per clock cycle and of allowing time-borrowing across cycle boundaries. The major disadvantages of pulsed latch structures are the increased susceptibility to timing hazards and the energy dissipation of the local clock pulse generators.

This paper proposes a low-power and area-efficient shift register using pulsed latches. The shift register solves the timing problem using multiple non-overlap delayed pulsed clock signals instead of the conventional single pulsed clock signal. The shift register uses a small number of the pulsed clock signals by grouping the latches to several sub shifter registers and using additional temporary storage latches. Shift registers can have both parallel and serial storage latches. Shift registers can have both parallel and serial storage latches.
half of those of the master-slave flip-flop. The pulsed latch is an attractive solution for small area and low power consumption.

The pulsed latch cannot be used in shift registers due to the timing problem, as shown in Fig. 2. The shift registers in Fig. 2(a) consists of several latches and a pulsed clock signal (CLK_pulse). The operation waveforms in Fig. 2(b) show the timing problem in the shifter register. The output signal of the first latch (Q1) changes correctly because the input signals of the first latch (IN) is constant during the clock pulse width (TPULSE). But the second latch has an uncertain output signal (Q2) because its input signal (Q1) changes during the clock pulse width.

Fig. 2. Shift register with latches and a pulsed clock signal. (a) Schematic. (b) Waveforms

One solution for the timing problem is to add delay circuits between latches, as shown in Fig. 3(a). The output signal of the latch is delayed and reaches the next latch after the clock pulse. As shown in Fig. 3(b) the output signals of the first and second latches (Q1 and Q2) change during the clock pulse width, but the input signals of the second and third latches (D2 and D3) become the same as the output signals of the first and second latches (Q1 and Q2) after the clock pulse. As a result, all latches have constant input signals during the clock pulse and no timing problem occurs between the latches. However, the delay circuits cause large area and power over heads.

Fig. 3. Shift register with latches, delay circuits, and a pulsed clock signal. (a) Schematic. (b) Waveforms

A 4-bitsub shifter register consists of five latches and it performs shift operations with five non overlap delayed pulsed clock signals (CLK_pulse<1:4> and CLK_pulse<T>). In the 4-bit sub shifregister #1, four latches store 4-bit data (Q1-Q4) and the last latch stores 1-bit temporary data (T1) which will be stored in the first latch (Q5) of the 4-bit sub shift register #2. Fig. 4(b) shows the operation waveforms in the proposed shift register.

Fig. 4(b) shows the operation waveforms in the proposed shift register.
The numbers of latches and clock-pulse circuits change according to the word length of the sub shift register. The selection is based on considering the area, power consumption, and speed.

Area optimization: The area optimization can be performed as follows. When the circuit areas are normalized with a latch, the areas of a latch and a clock-pulse circuit are 1 and \( \frac{1}{K} \), respectively. The total area becomes \( \alpha_A \times (K + 1) + N \left( 1 + \frac{1}{K} \right) \). The optimal \( K = \sqrt{N/\alpha_A} \) for the minimum area is obtained from the first-order differential equation of the total area \( 0 = \alpha_A - N/K \). An integer for the minimum area is selected as a divisor of \( \sqrt{N/\alpha_A} \).

Power optimization: The power optimization is similar to the area optimization. The power is consumed mainly in latches and clock-pulse circuits. Each latch consumes power for data transition and clockloading. When the circuit powers are normalized with a latch, the power consumption of a latch and a clock-pulse circuit are 1 and \( \frac{1}{K} \), respectively. The total power consumption is also \( \alpha_P \times (K + 1) + N \left( 1 + \frac{1}{K} \right) \). An integer for the minimum power is selected as a divisor of \( \sqrt{N/\alpha_A} \).

Chip Implementation: The maximum clock frequency in the conventional shift register is limited to only the delay of flip-flops because there is no delay between flip-flops. Therefore, the area and power consumption are more important than the speed for selecting the flip-flop. The proposed shift register uses latches instead of flip-flops to reduce the area and power consumption.

IV. CONCLUSION

This paper proposed a low-power and area-efficient shift register using pulsed latches. The shift register reduces area and power consumption by substituting flip-flops with pulsed latches. The timing problem between pulsed latches is solved using multiple non-overlap delayed pulsed clock signals as an alternative of a single pulsed clock signal.

REFERENCES


