CMOS-Based Power-Efficient Digital Comparator with Parallel Prefix Tree Structure

A 128-bit digital comparator is designed in digital complementary metal oxide semiconductor (CMOS) logic using the parallel prefix tree structure technique [1]. The comparison proceeds from the most significant bit (MSB) to the least significant bit (LSB), and the lower-order bits are compared only when the higher-order bits are equal. This technique reduces power consumption and improves the speed of operation. To keep the circuit regular, the design uses only CMOS logic gates: the transmission gates used in the existing design are replaced with simple AND gates. The 128-bit comparator is designed in Cadence using TSMC 0.18 µm technology and achieves a power dissipation of 0.28 mW with a delay of 0.87 μs.


Introduction
A comparator is a logic circuit that compares the magnitudes of two numbers. Comparators are key design elements in many scientific and mathematical applications, and they are widely used in test circuits and signature analysis. Earlier comparators were designed using adder architectures, which suffer from speed and area issues. Multiplexer-based comparator architectures were proposed as an alternative to the adder-based designs. A multiplexer-based comparator divides the n-bit input into two n/2-bit halves and feeds the results of the two n/2-bit comparators to a multiplexer that provides the result of the comparison, but its power consumption is not optimized [9]-[13]. Comparators have also been designed using All N Transistor (ANT) circuits, but all the series-connected NMOS transistors enter saturation mode during operation, which increases the overall conductive resistance [13]. Some designs use priority encoder architectures. These split the n-bit input into two n/2-bit halves, and the results of the two n/2-bit comparators are taken as input by a priority encoder so that the MSB is given priority [11].
In this work, we use a parallel prefix structure to develop the architecture. The n bits are divided into n/4 modules, and each module compares 4 bits. The comparison is carried out from the MSB to the LSB and continues to the lower-order bits only when the higher-order bits are equal. If the decision can be made in an initial module, the following modules do not perform the comparison operation, which saves power. The 128-bit comparator is separated into two submodules, the comparison module and the decision module, as shown in Fig. 1.1. The comparison is performed bit by bit by the 128-bit comparison module. The input operands A and B are denoted A127, A126, A125, ..., A2, A1, A0 and B127, B126, ..., B2, B1, B0. The bitwise comparison advances from the MSB to the LSB, so that the comparison of a bit is triggered only when the more significant bits are equal.
The comparison module encodes the comparison results onto two buses, the right bus and the left bus, each of which stores an intermediate result [2]. Each bit pair is encoded as follows:
If An > Bn, then right n = 1 and left n = 0.
If An < Bn, then right n = 0 and left n = 1.
If An = Bn, then right n = 0 and left n = 0.
The decision module uses an OR network to reach a conclusion based on the bits stored on the buses.
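The encoding rule and the OR-based decision can be illustrated with a small behavioral model. This is only a sketch, not the circuit itself: the function names `encode_buses` and `decide` are hypothetical, the second encoding case is assumed to be An < Bn, and the buses are modeled as Python lists with index 0 holding the MSB.

```python
def encode_buses(a, b, width=128):
    """Behavioral sketch: encode A and B onto the right and left buses.

    Bits are visited from MSB to LSB. A bit position can set a bus bit
    only while all more significant bit pairs are equal (assumption
    taken from the text: comparison is triggered only when MSBs match).
    """
    right, left = [], []
    msbs_equal = True
    for n in range(width - 1, -1, -1):
        a_n = (a >> n) & 1
        b_n = (b >> n) & 1
        right.append(int(msbs_equal and a_n > b_n))  # An > Bn -> right n = 1
        left.append(int(msbs_equal and a_n < b_n))   # An < Bn -> left n = 1
        msbs_equal = msbs_equal and a_n == b_n
    return right, left


def decide(right, left):
    """Behavioral sketch of the decision module: OR each bus, then decide."""
    a_greater = any(right)  # OR network over the right bus
    a_less = any(left)      # OR network over the left bus
    if a_greater:
        return "A > B"
    if a_less:
        return "A < B"
    return "A = B"
```

In this model at most one bus bit is ever set, because every position below the first differing bit pair is masked to zero.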

Architecture
The comparator architecture comprises the decision module and the comparison module. The comparison module performs the comparison on the given input bits. Because we are designing a 128-bit comparator, the 128 bits are divided into 32 groups of 4 bits, and each group is compared in its own 4-bit module, giving 32 modules in total. Each 4-bit comparator module takes two 4-bit input operands and one signal from the previous comparator module that enables its comparison. Each 4-bit comparator module has 4 outputs, which the decision module uses to make the decision at the output, and one enable output, which acts as the enable input of the next comparator module and triggers its comparison. The decision module gets its input from the 4-bit comparison modules. Each comparison module gives 4 outputs for its 4 input bits, so the comparison module produces a total of 128 outputs, one for each of the 128 input bit positions. Groups of four outputs are combined into a single output [14]-[22], and this procedure is repeated until we get our last 3 outputs. Each 4-bit comparator segment of the comparison module is further partitioned into five hierarchical prefixing sets that perform the comparison operation in a specified manner.
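The chaining of the 32 modules through the enable signal can be sketched behaviorally as follows. The names `compare_4bit` and `compare_128` are hypothetical, and each module output is modeled here as a (right, left) bit pair, which is an assumption about how the four outputs per module map onto the two buses.

```python
def compare_4bit(a4, b4, enable_in):
    """Sketch of one 4-bit module.

    Returns (right, left, enable_out), bits listed MSB first.
    enable_out stays 1 only if the module was enabled and all four
    bit pairs were equal, so the next module is triggered.
    """
    right, left = [0] * 4, [0] * 4
    enable = enable_in
    for i in range(4):
        shift = 3 - i
        a_bit = (a4 >> shift) & 1
        b_bit = (b4 >> shift) & 1
        if enable and a_bit != b_bit:
            right[i] = int(a_bit > b_bit)
            left[i] = int(a_bit < b_bit)
            enable = 0  # decision reached: disable all lower-order bits
    return right, left, enable


def compare_128(a, b):
    """Sketch: chain 32 modules from the most significant group down."""
    right_bus, left_bus = [], []
    enable = 1  # the most significant module is always enabled
    for group in range(31, -1, -1):
        a4 = (a >> (4 * group)) & 0xF
        b4 = (b >> (4 * group)) & 0xF
        r, l, enable = compare_4bit(a4, b4, enable)
        right_bus += r
        left_bus += l
    return right_bus, left_bus
```

Once a module clears its enable output, every following module returns all-zero outputs, which mirrors the power-saving behavior described above.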

Comparator Design
The comparison module relates the individual bits using a tree structure. In the comparison module, we use five sets of elements. Set 1 performs the basic comparison of two individual bits of A and B. The output of set 1 acts as the input of set 2: a set-2 element combines four set-1 outputs to give a decision regarding the four individual input bits. Set 3 gets its input from set 2, and set 4 gets its input from set 3. The output of a set-4 element acts as the enable input to a set-5 element. The outputs of the set-5 elements form the left and right bus bits. The right and left bus bits from the comparison module are given to the decision module, which performs an OR operation and makes the final decision. Set 1 compares A and B bitwise. Each set-1 element provides an output Di to the elements in set 2 and set 4, which use it to continue or abort the operation: if it is high, the operation proceeds; otherwise, the operation is aborted. This amounts to computing an XOR operation, as represented in equation (1).
Set 3 consists of elements that resemble the set-2 elements in functionality but have more logic levels. Set-3 elements do not perform comparison. Their prime function is to bound the fan-in and fan-out irrespective of the number of input bits. Set 3 uses AND gates to continue or abort the operation. If the operation is aborted, the output from set 3 makes the following elements set the left bus bit to '0' and the right bus bit to '0' for all lower-order bits.
Set 5 accomplishes the task of a multiplexer. It indicates whether the particular bit of A is greater or the particular bit of B is greater, and it provides a 2-bit output. The selection input relies on the output of set 4. We denote the 2-bit output as the right bit and left bit, r_i l_i. The output F denotes greater, less, or equal as the final output.
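A gate-level reading of a set-5 element can be sketched with AND and NOT operations only. The gate arrangement below is an assumption made for illustration, not the authors' schematic: `set5_element` is a hypothetical name, and the right/left convention follows the bus encoding stated earlier (right bit set when the A bit is greater).

```python
def set5_element(a_bit, b_bit, enable):
    """Sketch of a set-5 element built from AND/NOT gates only.

    enable models the select input derived from set 4 (assumed to be 1
    when all more significant bit pairs are equal).
    Returns (r_i, l_i).
    """
    not_a = a_bit ^ 1  # inverter on A
    not_b = b_bit ^ 1  # inverter on B
    r_i = enable & a_bit & not_b  # A bit is 1, B bit is 0: A greater
    l_i = enable & not_a & b_bit  # A bit is 0, B bit is 1: B greater
    return r_i, l_i


def truth_table():
    """Enumerate all input combinations as (enable, a, b, r_i, l_i)."""
    rows = []
    for enable in (0, 1):
        for a_bit in (0, 1):
            for b_bit in (0, 1):
                rows.append((enable, a_bit, b_bit, *set5_element(a_bit, b_bit, enable)))
    return rows
```

With the enable input low, both outputs are forced to 0 regardless of the operand bits, which is the behavior expected for lower-order bits after the operation is aborted.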
In the existing design, the multiplexer function is implemented with transmission gates in order to reduce the static power dissipation [1]. However, static power dissipation accounts for only a small part of the total power dissipation: it is on the order of microwatts, whereas the dynamic power dissipation is on the order of milliwatts. Decreasing the static power therefore contributes little to the circuit. Instead, it makes the design difficult and irregular, because the first four sets are designed at the gate level and only the set-5 elements are designed at the transistor level. With transmission gates, it is not possible to design the entire circuit at the gate level. Therefore, instead of using transmission gates for the set-5 elements, we use a gate-level design.
To evaluate the performance of the comparator, we simulated the complete design with various inputs in Cadence using 0.18 μm TSMC digital CMOS technology. The worst-case delay occurs when all the most significant bits are logic low and the least significant bit is logic high, because in this condition almost all the cells are activated and the advantage of the parallel tree structure is lost.
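The effect of the input pattern on circuit activity can be illustrated by counting how many 4-bit modules are enabled before a decision is reached. This is a rough behavioral sketch under the assumption that a module is active only if all more significant groups are equal; `active_modules` is a hypothetical helper, and the count is not a substitute for the simulated power and delay figures.

```python
def active_modules(a, b, width=128, group=4):
    """Count the 4-bit modules that are enabled, scanning from the MSB group.

    Scanning stops at the first group whose bits differ, because the
    following modules are not enabled (behavioral assumption).
    """
    mask = (1 << group) - 1
    count = 0
    for g in range(width // group - 1, -1, -1):
        count += 1  # this module is enabled and performs its comparison
        if ((a >> (g * group)) & mask) != ((b >> (g * group)) & mask):
            break  # decision reached, lower-order modules stay idle
    return count
```

For operands that differ only in the least significant bit, for example `active_modules(1, 0)`, all 32 modules are enabled, whereas operands that differ in the most significant bit enable only one module.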

Fig.4.7 Power dissipation and delay results
Type 3 cells have a maximum fan-in of five and a maximum fan-out of four. We analyzed the performance of the comparator with different numbers of input bits and evaluated the leakage power and delay. As the number of bits increases, the delay and power dissipation rise linearly, as discussed in [1], [2], [3]. The results for our 8-bit, 16-bit, 32-bit, 64-bit, and 128-bit comparators and the reported results are tabulated in Table 4.1.

Conclusion
We have designed a 128-bit digital CMOS comparator using digital CMOS cells. The architecture has a parallel structure that makes the design easy to replicate and supports VLSI reconfigurable topology. It yields a power-efficient comparator compared with previous comparator structures that use fast adders and multiplexers. The comparator dissipates 0.28 mW of power and has a delay of 0.087 ms.