

# Design of Area and Power Optimization Shift Register

<sup>[1]</sup> Akshata G. Shete, <sup>[2]</sup> Aarti Gaikwad
<sup>[1][2]</sup> Dept. of Electronics and Telecommunication Engg
<sup>[1][2]</sup> D. Y. Patil College of Engineering, Akurdi, Pune, India

*Abstract* - This paper describes a new technique for optimizing the area and power of a shift register using a 6T latch. Area and power are reduced by using pulsed latches instead of flip-flops. Pulsed latches introduce a timing problem when they are driven by a conventional single pulsed clock, because the clock pulses of neighbouring latches overlap. Non-overlapped delayed clock signals are used here to solve this problem. Advanced portable devices require area- and power-efficient circuits. The design is implemented in 65 nm technology using the Microwind EDA (Electronic Design Automation) tool. An n-bit shift register using pulsed latches is designed. The simulation results show that the proposed shift register, with its lower transistor count, is a better choice for low-power and area-efficient applications.

Index Terms: area-efficient, flip-flop, pulsed clock, pulsed latch, shift register.

## **I. INTRODUCTION**

Latches and flip-flops are the basic storage elements used extensively in all kinds of digital designs. As the feature size of CMOS technology has scaled down in line with Moore's Law, designers have been able to integrate ever more transistors onto the same die. More transistors mean more switching and more power dissipated as heat. Heat is one of the major packaging challenges today, and it is one of the main motivations for low-power design methodologies and practices. Another driver of low-power research is the reliability of the integrated circuit.

This work considers how data can be shifted through latches instead of flip-flops. Shifting data with flip-flops costs more delay and more power than shifting with latches, because a flip-flop transfers data only on a clock edge: a flip-flop that shifts on the rising edge is positive-edge triggered, and one that shifts on the falling edge is negative-edge triggered. Here, shifting is controlled by the clock pulses applied to the latches rather than by a single clock edge. The smallest flip-flop is suitable for the shift register to reduce the area and power consumption. Recently, pulsed latches have replaced flip-flops in many applications, because a pulsed latch is much smaller than a flip-flop [6]–[9]. However, pulsed latches cannot be used directly in a shift register because of the timing problem between pulsed latches. The proposed shift register solves this timing problem by using multiple non-overlap delayed pulsed clock signals, generated with delay circuits, instead of the conventional single pulsed clock signal. As a result, each latch has a constant input during its clock pulse and no timing problem occurs between the latches. However, this solution also requires many delay circuits and increases the clock power consumption. The pulsed latch nevertheless remains an attractive solution for small area and low power consumption.
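The timing problem can be illustrated with a small behavioural sketch in Python. It is a simplified model, not the circuit itself: latches are idealized as zero-delay transparent elements, and the function names and the last-stage-first pulse order are assumptions made only for illustration.

```python
def shift_with_shared_pulse(latches, data_in):
    """Single pulsed clock: every latch is transparent at the same time,
    so the new input ripples through all stages (idealized, zero delay)."""
    state = list(latches)
    prev = data_in
    for i in range(len(state)):
        state[i] = prev   # stage i follows its already-updated input
        prev = state[i]
    return state


def shift_with_delayed_pulses(latches, data_in):
    """Non-overlap delayed pulses: one latch is pulsed at a time.
    Assumed order: last stage first, so each latch sees a constant input
    (its neighbour's old value) during its own pulse."""
    state = list(latches)
    for i in reversed(range(len(state))):
        state[i] = state[i - 1] if i > 0 else data_in
    return state


if __name__ == "__main__":
    print(shift_with_shared_pulse([0, 0, 0, 0], 1))    # input races through
    print(shift_with_delayed_pulses([0, 0, 0, 0], 1))  # shifts by one stage
```

With the shared pulse the input overwrites every stage in one cycle, whereas with the delayed pulses the register shifts by exactly one position per cycle.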

Consider a basic flip-flop and a latch. A flip-flop is used as a storage element, and a latch can be regarded as a smaller storage element. Because a pulsed latch is much smaller than a flip-flop, pulsed latches have recently replaced flip-flops in many applications. The shift register uses multiple non-overlap delayed pulsed clock signals instead of a single pulsed clock signal to remove the timing problem between pulsed latches.



Figure 1. Master-slave D flip-flop



Figure 2. Pulsed latch



Figure 1 shows a master-slave flip-flop and Figure 2 shows a pulsed latch. The master-slave flip-flop is slower than the latch because the clock has to be applied to both the master and the slave stage, and the power required by the master-slave (MS) flip-flop is twice that of the pulsed latch. The pulsed latch is much smaller than the master-slave D flip-flop, and its performance and power consumption are also better. The area difference between the flip-flop and the pulsed latch is more than 50%. These are the main reasons for choosing a pulsed latch instead of a master-slave flip-flop.

## **II. EXISTING METHOD**

The PowerPC-style flip-flop (PPCFF), built from a master-slave latch pair, is one of the fastest classical structures. Figure 3 shows the schematic of the PPCFF. Its performance is compared with that of the SSASPL, because the PPCFF is regarded as the best among flip-flops: it has good performance and uses a small number of transistors. When counting transistors, those used to generate the differential clock signals and the pulsed clock signals are not included, because they are shared by all latches and flip-flops. The main advantages of the PPCFF are a short direct path and low-power feedback. It also performs better than other flip-flops such as the CCPPCFF. Traditionally, the power consumption of flip-flop and latch designs has been measured using an ungated clock and a small number of input activation patterns. Instead, we adopt a more accurate methodology in which all possible states (e.g., clock value, input value, output value) of the timing element (TE) are enumerated and the energy consumption of each state transition is measured.



Figure 3. Schematic of PPCFF

The SSASPL (static differential sense-amp shared pulse latch), which is the smallest latch, is selected. The original SSASPL with 9 transistors is modified to the SSASPL with 7 transistors shown in Figure 4 by removing the inverter that generates the complementary data input (Db) from the data input (D). In the proposed shift register, the differential data inputs (D and Db) of a latch come from the differential data outputs (Q and Qb) of the previous latch, so this inverter is not needed. The SSASPL uses the smallest number of transistors (7) and consumes the lowest clock power because it has a single transistor driven by the pulsed clock signal. It updates the data with three NMOS transistors and holds the data with four transistors in two cross-coupled inverters. It requires two differential data inputs (D and Db) and a pulsed clock signal. When the pulsed clock signal is high, the data is updated: node Q or Qb is pulled down to ground according to the input data. For this to work, the pull-down current of the NMOS transistors must be larger than the pull-up current of the PMOS transistors in the inverters.
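The update and hold behaviour of the SSASPL described above can be summarized in a logic-level Python sketch. It ignores transistor sizing and timing, and the class name, method name and the choice of which node each input pulls down are illustrative assumptions rather than details taken from the schematic.

```python
from dataclasses import dataclass


@dataclass
class SSASPL:
    """Logic-level model of a 7T SSASPL (no sizing, no delays)."""
    q: int = 0
    qb: int = 1

    def apply(self, pulse, d, db):
        """Apply one set of inputs and return (Q, Qb).

        While the pulsed clock is high, one output node is pulled to ground
        according to the differential inputs; the cross-coupled inverters
        then drive the opposite node high. Otherwise the state is held.
        """
        if pulse and d != db:
            if d:
                self.qb, self.q = 0, 1   # assumed: D high pulls Qb low
            else:
                self.q, self.qb = 0, 1   # assumed: Db high pulls Q low
        return self.q, self.qb


if __name__ == "__main__":
    first, second = SSASPL(), SSASPL()
    first.apply(1, 1, 0)                       # write a 1 into the first latch
    print(second.apply(1, first.q, first.qb))  # next latch takes Q and Qb directly
```

The usage lines show why no inverter is needed inside the chain: the next latch receives both D and Db from the previous latch's Q and Qb.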



Figure 4. Schematic of SSASPL

Figure 4 shows the schematic of the SSASPL, which uses a small number of transistors. The SSASPL flips the states of the cross-coupled inverters (Q and Qb) by pulling current down through either M2 or M3 during the clock pulse width. The clock pulse width is chosen as the minimum time needed to flip the output signals of the latch (Q and Qb) when its input signals (D and Db) are constant. If the input signals change during the clock pulse width, the time for pulling current down through either M2 or M3 becomes shorter than the clock pulse width, so the latch does not have enough time to flip the output signals after the input signals change.

The proposed shift register consists of latches and a delayed pulsed clock generator, which delays the pulse for each latch so that the bits are shifted. As the bits are shifted from latch to latch, the delayed clock pulse moves with them. Figure 5 shows an 8-bit shift register built from latches. It has two sub shift registers and a delayed pulsed clock generator. The input is applied to the first latch, which passes its output to the second latch, and the same bit is passed on in this way through four latches. A fifth latch stores the output of the four latches, and that output is fed to the next sub shift register, which operates in the same way as the first.
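The pulse-width constraint on the SSASPL described above can be expressed as a simple check. The following Python sketch is only an illustration: the function name, the parameters and the time values (arbitrary integer units) are hypothetical and are not taken from the simulations.

```python
def can_flip(pulse_start, pulse_width, input_change_time, min_flip_time):
    """Return True if the latch has enough pull-down time to flip.

    Pull-down through M2 or M3 can only begin once the pulse is high and
    the inputs have settled, so a late input change shortens the usable
    part of the pulse.
    """
    pulse_end = pulse_start + pulse_width
    pull_down_start = max(pulse_start, input_change_time)
    available = max(0, pulse_end - pull_down_start)
    return available >= min_flip_time


if __name__ == "__main__":
    # Pulse width equals the minimum flip time, as described in the text.
    print(can_flip(0, 100, -20, 100))  # input constant during the pulse
    print(can_flip(0, 100, 30, 100))   # input changes inside the pulse
```

Because the pulse width is chosen as the minimum flip time, any input change inside the pulse makes the check fail, which is why each latch needs a constant input during its own pulse.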

This shift register consists of sub shift registers, a pulsed clock generator, and a clock. Each sub shift register consists of five latches: four of them (Q1-Q4) are used for shifting, and Q5 is an additional temporary storage latch. This additional latch stores the data shifted out of the Q1-Q4 latches.



Figure 5. Schematic of 8-bit shift register using latch

The proposed shift register may be divided into M sub shift registers to reduce the number of delayed pulsed clock signals. A 4-bit sub shift register consists of five latches and performs the shift operation with four non-overlap delayed pulsed clock signals; another delayed pulsed clock signal is used for the temporary storage latch. After the bits are shifted, the result is stored in the temporary storage latch, which feeds the input of the second sub shift register.
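The organization into sub shift registers can be sketched as a behavioural Python model. This is a sketch under stated assumptions, not the implemented circuit: the class and attribute names are hypothetical, latches are idealized, and the pulse order (temporary latch first, then Q4 down to Q1) is assumed so that every latch reads its neighbour's previous value.

```python
class SubShiftRegister:
    """Four shifting latches (Q1-Q4) plus one temporary storage latch."""

    def __init__(self, bits=4):
        self.q = [0] * bits   # shifting latches, q[0] is Q1
        self.t = 0            # temporary storage latch


class ShiftRegister:
    """N-bit shift register built from equal sub shift registers that
    all share the same set of delayed pulsed clock signals."""

    def __init__(self, total_bits=8, sub_bits=4):
        if total_bits % sub_bits != 0:
            raise ValueError("total_bits must be a multiple of sub_bits")
        self.subs = [SubShiftRegister(sub_bits)
                     for _ in range(total_bits // sub_bits)]
        self.pulsed_clocks = sub_bits + 1   # shifting pulses + temporary pulse

    @property
    def latch_count(self):
        return sum(len(sub.q) + 1 for sub in self.subs)

    def bits(self):
        return [b for sub in self.subs for b in sub.q]

    def clock_cycle(self, data_in):
        # First pulse: every temporary latch captures its own Q4.
        for sub in self.subs:
            sub.t = sub.q[-1]
        # Remaining pulses (assumed order Q4 first, Q1 last). Each sub
        # register is fed by the previous temporary latch.
        source = data_in
        for sub in self.subs:
            for i in reversed(range(len(sub.q))):
                sub.q[i] = sub.q[i - 1] if i > 0 else source
            source = sub.t
        return self.bits()


if __name__ == "__main__":
    reg = ShiftRegister(total_bits=8, sub_bits=4)
    print(reg.latch_count, reg.pulsed_clocks)
    print(reg.clock_cycle(1))
```

For the 8-bit case this gives ten latches driven by five pulsed clock signals, instead of one delayed pulsed clock signal per bit.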

Figure 6 shows the schematic of the delayed pulsed clock generator, which generates five non-overlap delayed pulsed clock signals. It works as follows. Initially, the pulsed clock signal cp5 updates the latch data T1 from Q4, and then the remaining four pulsed clock signals update the four latch data (Q1-Q4) sequentially. The latches Q2-Q4 receive data from the latches Q1-Q3, while the first latch Q1 receives data from the input of the shift register.

All digital delay generators measure time intervals by counting cycles of a fast clock (typically 100 MHz). Most digital delay generators also have short programmable analog delays to achieve time intervals with finer resolution than the clock period. Unfortunately, one clock cycle of timing indeterminacy (typically 10 ns) can occur if the trigger is not in phase with the clock.



Figure 6. Schematic of delayed pulsed clock generator.

## III. PROPOSED SYSTEM

This section presents a latch design that reduces the area and power consumption of the shift register. The basic idea behind the proposed design is to replace the transmission gate logic of the conventional 8T design with pass transistor logic. The proposed 6-transistor design uses pass transistor logic to transmit data. The drain of the first transistor, PMOS\_1, is connected to the data input, and this data is passed through the transistor only when the clock is low (Figure 7). Since PMOS transistors pass a weak logic 0, a small threshold loss is observed when the data is zero. The output of this transistor is connected to the input of the first inverter. This inverter inverts the data and also compensates for the threshold loss, though not completely, so a small residual loss is observed at its output. The next inverter inverts the data again and produces the output Q. This output is fed back through the transistor NMOS\_1. The overall performance of the device is almost unaffected because of the presence of the inverters. NMOS\_1 passes the output according to the delayed version of the clock. Thus, whenever the clock is high, data does not pass through PMOS\_1; instead, the output is fed back through the circuit and remains the same. The proposed clocked latch therefore acts as a negative-level-triggered latch: whenever the clock is low, the output follows the data, and it remains constant when the clock goes high.

The conventional shift register using flip-flops was implemented with PPCFFs. Two types of the proposed shift register were implemented, one using pulsed latches and one using the proposed 6T latches.



Figure 7. Proposed 6-transistor latch
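The transparent and hold phases of the proposed latch can be summarized in a logic-level Python sketch. It models only the ideal logic behaviour: the threshold loss, the inverter restoration and the delayed clock on NMOS\_1 are not modelled, and the class and method names are hypothetical.

```python
class SixTLatch:
    """Logic-level model of the proposed 6T latch (ideal levels only)."""

    def __init__(self):
        self.q = 0

    def step(self, clk, d):
        if clk == 0:
            # PMOS_1 conducts: D passes through the two inverters to Q.
            self.q = d
        # clk == 1: PMOS_1 is off and NMOS_1 feeds Q back, so Q is held.
        return self.q


if __name__ == "__main__":
    latch = SixTLatch()
    for clk, d in [(0, 1), (1, 0), (1, 1), (0, 0)]:
        print(clk, d, latch.step(clk, d))
```

The output changes only in the steps where the clock is low, which is the negative-level-triggered behaviour described for the proposed design.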

The proposed shift register achieves a small area and low power consumption compared to the conventional shift register.
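As a rough indication of the area saving, the storage-latch transistor count can be computed for both latch types. The Python sketch below assumes that both versions use the same organization of 4-bit sub shift registers with one temporary latch each, and it excludes the delayed pulsed clock generator; the function name is hypothetical and the result is a transistor count, not a measured area.

```python
def latch_transistors(total_bits, sub_bits, transistors_per_latch):
    """Transistors in the storage latches only (clock generator excluded)."""
    sub_registers = total_bits // sub_bits
    latches = sub_registers * (sub_bits + 1)   # one temporary latch per sub register
    return latches * transistors_per_latch


if __name__ == "__main__":
    for name, count in [("7T SSASPL", 7), ("proposed 6T latch", 6)]:
        print(name, latch_transistors(8, 4, count))
```

Under these assumptions the 8-bit register needs 70 latch transistors with the 7T SSASPL and 60 with the proposed 6T latch.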

## IV. EXPERIMENTAL RESULTS

All simulations are performed in Microwind 3.5 and DSCH 3.5. The main focus of this work is to meet the challenges faced in designing a shift register circuit with pulsed latches and with the proposed latch. The shift register reduces area and power consumption by using pulsed latches. The timing problem between pulsed latches is solved using multiple non-overlap delayed pulsed clock signals instead of a single pulsed clock signal. Only a small number of pulsed clock signals is needed, because the latches are grouped into several sub shift registers with additional temporary storage latches. The layouts and simulation results for the 7T and 6T latch designs are shown in the figures below.



Figure 8. Layout of 8-bit shift register using 7T pulsed latch



Figure 9. Simulation of 8-bit shift register using 7T pulsed latch



Figure 10. Layout of 8-bit shift register using 6T pulsed latch



Figure 11. Simulation of 8-bit shift register using 6T pulsed latch

## **V. CONCLUSION**

The simulation results of the conventional and proposed designs show that the proposed 6-transistor latch is better in terms of power consumption and delay. Since its transistor count is lower, the proposed design is also area efficient. This paper proposed a low-power and area-efficient shift register using pulsed latches. The shift register reduces area and power consumption by replacing flip-flops with pulsed latches. The timing problem between pulsed latches is solved using multiple non-overlap delayed pulsed clock signals instead of a single pulsed clock signal. Only a small number of pulsed clock signals is needed, because the latches are grouped into several sub shift registers with additional temporary storage latches. A 128-bit shift register was fabricated using a 65 nm CMOS process with VDD = 1.0 V. The proposed shift register saves area and power compared to the conventional shift register with flip-flops.

## REFERENCES

[1] P. Reyes, P. Reviriego, J. A. Maestro, and O. Ruano, "New protection techniques against SEUs for moving average filters in a radiation environment," IEEE Trans. Nucl. Sci., vol. 54, no. 4, pp. 957–964, Aug. 2007.

[2] M. Hatamian et al., "Design considerations for gigabit ethernet 1000 base-T twisted pair transceivers," Proc. IEEE Custom Integr. Circuits Conf., pp. 335–342, 1998.

[3] H. Yamasaki and T. Shibata, "A real-time image-feature-extraction and vector-generation VLSI employing arrayed-shift-register architecture," IEEE J. Solid-State Circuits, vol. 42, no. 9, pp. 2046–2053, Sep. 2007.

[4] H.-S. Kim, J.-H. Yang, S.-H. Park, S.-T. Ryu, and G.-H. Cho, "A 10-bit column-driver IC with parasitic-insensitive iterative charge-sharing based capacitor-string interpolation for mobile active-matrix LCDs," IEEE J. Solid-State Circuits, vol. 49, no. 3, pp. 766–782, Mar. 2014.

[5] S.-H. W. Chiang and S. Kleinfelder, "Scaling and design of a 16-megapixel CMOS image sensor for electron microscopy," in Proc. IEEE Nucl. Sci. Symp. Conf. Record (NSS/MIC), 2009, pp. 1249–1256.

[6] S. Heo, R. Krashinsky, and K. Asanovic, "Activity-sensitive flip-flop and latch selection for reduced energy," IEEE Trans. Very Large Scale Integr. (VLSI) Syst., vol. 15, no. 9, pp. 1060–1064, Sep. 2007.

[7] S. Naffziger and G. Hammond, "The implementation of the next-generation 64 b Itanium microprocessor," in IEEE Int. Solid-State Circuits Conf. (ISSCC) Dig. Tech. Papers, Feb. 2002, pp. 276–504.

[8] H. Partovi et al., "Flow-through latch and edge-triggered flip-flop hybrid elements," IEEE Int. Solid-State Circuits Conf. (ISSCC) Dig. Tech. Papers, pp. 138–139, Feb. 1996.

[9] E. Consoli, M. Alioto, G. Palumbo, and J. Rabaey, "Conditional push-pull pulsed latch with 726 fJ·ps energy delay product in 65 nm CMOS," in IEEE Int. Solid-State Circuits Conf. (ISSCC) Dig. Tech. Papers, Feb. 2012, pp. 482–483.

[10] V. Stojanovic and V. Oklobdzija, "Comparative analysis of master-slave latches and flip-flops for high-performance and low-power systems," IEEE J. Solid-State Circuits, vol. 34, no. 4, pp. 536–548, Apr. 1999.

[11] J. Montanaro et al., "A 160-MHz, 32-b, 0.5-W CMOS RISC microprocessor," IEEE J. Solid-State Circuits, vol. 31, no. 11, pp. 1703–1714, Nov. 1996.

[12] S. Nomura et al., "A 9.7 mW AAC-decoding, 620 mW H.264 720p 60fps decoding, 8-core media processor with embedded forward-body-biasing and power-gating circuit in 65 nm CMOS technology," in IEEE Int. Solid-State Circuits Conf. (ISSCC) Dig. Tech. Papers, Feb. 2008, pp. 262–264.

[13] Y. Ueda et al., "6.33 mW MPEG audio decoding on a multimedia processor," in IEEE Int. Solid-State Circuits Conf. (ISSCC) Dig. Tech. Papers, Feb. 2006, pp. 1636–1637.

[14] B.-S. Kong, S.-S. Kim, and Y.-H. Jun, "Conditional-capture flip-flop for statistical power reduction," IEEE J. Solid-State Circuits, vol. 36, pp. 1263–1271, Aug. 2001.

[15] C. K. Teh, T. Fujita, H. Hara, and M. Hamada, "A 77% energy-saving 22-transistor single-phase-clocking D-flip-flop with adaptive-coupling configuration in 40 nm CMOS," in IEEE Int. Solid-State Circuits Conf. (ISSCC) Dig. Tech. Papers, Feb. 2011, pp. 338–339.