# **Design of Low-Power Content Addressable Memory Cell**

KUO-HSING CHENG, CHIA-HUNG WEI, JIANN-CHYI RAU Department of Electrical Engineering Tamkang University Tamsui, Taipei County, TAIWAN 251, R.O.C.

Abstract: - In a Content Addressable Memory (CAM), a large amount of energy is generally expended charging and discharging most of the match lines on most cycles. In this paper, a new low-power CAM cell design is proposed to reduce the comparison power of the CAM cell. Moreover, in the CAM word circuit design, a static pseudo nMOS logic structure with a precomputation approach is used to effectively avoid frequent switching of the match lines. The HSPICE simulation results are based on the TSMC 0.25  $\mu m$  CMOS process with a 2.5 V supply voltage. The power consumption of the proposed CAM is 16.38 mW at an operating frequency of 300 MHz. Moreover, the power-performance metric is 13.33 fJ/bit/search for random inputs.

Key-Words: - low-power, static pseudo nMOS, fully parallel CAM

## **1** Introduction

Content addressable memory (CAM), or associative memory, is a storage device that can be addressed by its own contents. Each CAM cell includes comparison logic for its bit. A data value input to the CAM is simultaneously compared with all the stored data, and the match result is the corresponding address. CAM has a performance advantage over other memory search algorithms because the desired information is compared simultaneously against the entire list of prestored entries. CAM, especially fully parallel CAM, provides a highly efficient hardware architecture for high-speed data searching. CAM is used in a wide range of applications such as lookup tables, databases, and data compression [1][2]. Recently, in the era of high-speed network computing, leading-edge applications such as gigabit Ethernet, asynchronous transfer mode (ATM) switches, and high-speed lookup tables require higher speeds and capacities.

One major problem of a CAM design compared to an SRAM design is its complexity. Each cell requires extra transistors and extra wiring for the search capability. Another problem is power consumption: all the elements in the CAM are accessed on every access, whereas in a RAM only the addressed portion is accessed. Hence, this paper focuses on reducing the power consumption of the CAM structure.

This paper presents a new CAM cell design and adopts the static pseudo nMOS CAM word circuit architecture [3] to achieve low power and high reliability. The concept of the static pseudo nMOS CAM word circuit is described in Section 2. Section 3 introduces the conventional CAM cells and the proposed CAM cell design. Simulation results and conclusions are presented in Sections 4 and 5, respectively.

## **2** The Concept of the Static Pseudo nMOS CAM Word Circuit

In the traditional CAM architecture, the CAM word structure is implemented as a dynamic CMOS circuit [4]-[8] to improve overall system performance and hardware cost. Fig. 1 shows the traditional dynamic CAM word circuit. It operates in two phases. First, in the precharge phase, the output node (Match Sense Node) is precharged to  $V_{dd}$  by the MP1 transistor. Second, in the evaluate phase, the output node is conditionally discharged to  $V_{ss}$  through the MN1 and MN2 transistors. Consequently, the output node is charged and discharged every cycle in every word except the matching one.



Fig. 1. Traditional dynamic CAM word circuit.

However, the dynamic circuit design used in the CAM word has some drawbacks. 1) Extra precharge time is needed to charge the output node to  $V_{dd}$ . 2) Reliability is low because of the charge sharing and noise margin problems that occur in dynamic circuits in general. 3) The clock loading of the overall circuit is heavy. 4) Power consumption is high. In the traditional CAM architecture with a CAM of *m* words, (m-1) words undergo a charge/discharge operation per data operation. 5) Sense amplifiers are needed. The output node voltage of the dynamic circuit is easily disturbed by factors such as noise, leakage current, and charge sharing, so a sense amplifier is needed to correct and amplify the output voltage signal.

In order to eliminate the above drawbacks in the traditional CAM word structure, a static pseudo nMOS CAM word circuit is adopted in this paper as shown in Fig. 2.



Fig. 2. The static pseudo nMOS CAM word circuit.

The output node (V) of the valid bit is connected to the gates of both the MPP and MN1 transistors to indicate whether the word is valid or invalid. If the word is invalid (V = 1), the MN1 transistor is turned on and the MPP transistor is turned off; therefore, the output signal (Data Match Line) equals zero. Otherwise (V = 0), the MN1 transistor is turned off and the MPP transistor is turned on, and the output signal is determined by the comparison result of the CAM cells.
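The valid-bit behavior described above can be sketched at the logic level. This is only an illustrative model, not the circuit itself: the function name `data_match_line` and the per-cell `mismatches` flags are assumptions introduced here, and the precomputation pull-up chain is deliberately left out.

```python
from typing import Sequence


def data_match_line(valid_bit: int, mismatches: Sequence[bool]) -> int:
    """Logic-level model of the Data Match Line of one CAM word.

    valid_bit  -- V in the text: 1 marks the word invalid, 0 marks it valid.
    mismatches -- hypothetical per-cell flags, True when a cell's pull-down
                  transistor is turned on by a mismatch.
    """
    if valid_bit == 1:
        # MN1 on, MPP off: the output is forced to zero.
        return 0
    # MN1 off, MPP on: the output follows the comparison result.
    # Any conducting pull-down drags the line low (mismatch).
    return 0 if any(mismatches) else 1


# A valid word with no mismatching cell reports a match.
print(data_match_line(0, [False, False, False]))
```

The model treats the line as an ideal digital signal; ratioed pseudo nMOS levels and static current are ignored.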

The main problem associated with the static pseudo nMOS CAM word circuit is static power dissipation. In general, for a CAM of m words, (m-1) words mismatch the input data. In this circuit design, however, static power is dissipated only while all the pMOS transistors of the pMOS pull-up chain are turned on and at least one of the MN2 transistors is also turned on. Therefore, for this CAM word circuit with random input signals, the expected number of word circuits dissipating static power is  $(m/2^{P}) - 1$ , where *P* is the number of partial bits. For example, for a 128-word by 32-bit CAM with six partial bits (P = 6), only one word circuit dissipates static power.
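The example above can be checked with a few lines of arithmetic. The helper name `static_word_count` is an assumption made for this sketch; the expression itself is the one given in the text.

```python
def static_word_count(m: int, p: int) -> float:
    """Evaluate (m / 2**p) - 1 from the text for m words and p partial bits."""
    return m / 2**p - 1


# 128-word CAM with six partial bits, as in the example.
print(static_word_count(128, 6))
```

The expression is only meaningful when m is at least 2^P; for smaller arrays it goes negative.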

Compared with the traditional dynamic CAM word design, the static pseudo nMOS CAM word circuit has several advantages. 1) No extra precharge phase is needed. 2) Circuit reliability problems such as noise margin and charge sharing are effectively avoided. 3) No clock signal is needed, so clock design problems such as clock skew and clock distribution do not arise. 4) The power consumption of the CAM structure is greatly reduced. For a CAM of m words, the number of CAM word circuits consuming static power is  $(m/2^{P}) - 1$ . In addition, no buffer with large driving capability is needed to drive a clock signal across the overall circuit. 5) No sense amplifier is needed. Because the output swing of the static circuit is wide enough to drive the succeeding logic, the sense amplifier can be replaced by an output buffer.

## **3** The CAM Cell Design

#### 3.1 Basic CAM Cell

The basic CAM cell is shown in Fig. 3. It comprises a standard six-transistor SRAM cell, two XOR transistors used for comparison, and an nMOS pull-down device to drive the word MatchLine.



Fig. 3. Basic nine-transistor CAM cell.

In the write operation, the two pass transistors, MN1 and MN2, are turned on, and the input data is stored in the standard SRAM cell. In the search operation, only one of the transistors MN3 and MN4 is activated at a time, since their gates are connected to opposite sides of the SRAM cell. If the input data does not match the value stored in the SRAM cell, MN5 is turned on, pulling the MatchLine down to ground. If MN5 remains off in every cell of the word, the MatchLine remains high, indicating a match. Otherwise, the MatchLine is discharged to low. All bits in a word share a MatchLine, creating a wired AND. However, the basic CAM cell design has the following drawbacks: 1) The input circuits are two heavily loaded complementary bit lines. 2) The frequent switching of the two complementary bit lines is one of the major sources of power consumption in the basic CAM.
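The search behavior of a word built from basic CAM cells can be illustrated with a small behavioral model. This is a sketch only: `match_line` and `search_cam` are hypothetical names, and the model captures just the per-bit XOR comparison and the wired AND, not timing or electrical effects.

```python
from typing import List, Sequence


def match_line(stored: Sequence[int], search: Sequence[int]) -> int:
    """Return 1 if the MatchLine stays high (match), 0 if it is pulled low."""
    # A cell's pull-down (MN5 in the text) turns on when its stored bit
    # differs from the search bit, i.e. when the XOR of the two is 1.
    pull_downs = [s ^ d for s, d in zip(stored, search)]
    # Wired AND: one conducting pull-down is enough to discharge the line.
    return 0 if any(pull_downs) else 1


def search_cam(words: Sequence[Sequence[int]], key: Sequence[int]) -> List[int]:
    """Compare the key with every stored word and return matching addresses."""
    return [addr for addr, word in enumerate(words) if match_line(word, key)]


words = [[1, 0, 1, 1], [0, 0, 1, 1], [1, 0, 1, 1]]
print(search_cam(words, [1, 0, 1, 1]))
```

In hardware all words are compared simultaneously; the loop here only mimics that result sequentially.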

#### **3.2** Ten-transistor CAM Cell [3]

Fig. 4 shows the ten-transistor CAM cell design. This cell incorporates a standard five-transistor D-latch to store a data bit, four transistors used for comparison, and one pull-down transistor to drive the word MatchLine.



Fig. 4. The ten-transistor CAM cell.

For reasons of power consumption and circuit reliability, this CAM cell uses a single bit line. This design not only removes almost half of the heavily loaded lines but also avoids the problems of two complementary bit lines, such as phase skew and the two heavy loads. In addition, the other difference between the basic CAM cell and this CAM cell is that the comparison circuit is realized by a CMOS-type XOR gate instead of a PTL-type XOR gate to achieve full swing.

#### 3.3 Proposed CAM Cell

In this paper, a new nine-transistor CAM cell, shown in Fig. 5, is presented that reduces the power consumption and the hardware cost.



Fig. 5. The proposed nine-transistor CAM cell.

Like the ten-transistor CAM cell, the proposed CAM cell is based on the single bit line design. It incorporates a standard five-transistor D-latch and a comparison circuit, with one pull-down transistor to drive the word MatchLine. The difference between the ten-transistor CAM cell and the proposed CAM cell is that the comparison circuit is composed of three transistors instead of four. The truth table of the proposed CAM cell is shown in Table 1.

Table 1. The function of the proposed CAM cell (where  $V_{TN}$  is the threshold voltage of the nMOS transistor).

| $BL_i$ | $Q_i$ | $C_i$             |
|--------|-------|-------------------|
| 0      | 0     | 0                 |
| 0      | 1     | 1                 |
| 1      | 0     | $V_{DD} - V_{TN}$ |
| 1      | 1     | 0                 |

In this paper, the proposed CAM uses the static pseudo nMOS CAM word circuit structure together with the proposed CAM cell. In the pMOS pull-up chain, the gate of each pMOS transistor can take one of four states,  $(0, 0, 1, V_{DD} - V_{TN})$ , corresponding to the four rows of Table 1. A pMOS transistor is turned on when its gate is at 0 or at  $V_{DD} - V_{TN}$ . Consequently, the probability of static current is  $(3/4)^P$ . For example, with six partial bits the probability of static current is 0.178. In the ten-transistor CAM cell design, under the same structure, the probability of static current is  $(1/2)^{P}$ . The added probability of static current,  $(3/4)^P - (1/2)^P$ , is caused by those gates of the pMOS pull-up chain that are at  $V_{DD} - V_{TN}$ . However, the static power dissipated in those additional cases is due only to sub-threshold current, so the effect on the total power consumption is small. Moreover, for a CAM of m words by n bits, the proposed CAM uses  $m \times n$  fewer transistors than the ten-transistor CAM and saves their power consumption. Therefore, the structure that combines the static pseudo nMOS CAM word circuit with the proposed CAM cell is expected to consume less power than designs based on the other conventional CAM cells.
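The two probabilities can be reproduced by enumerating the four input combinations per partial bit. The sketch below assumes that the four (BL, Q) combinations are equally likely and independent across bits, and that the ten-transistor cell produces a full-swing XOR; the dictionary and function names are introduced only for illustration.

```python
from fractions import Fraction

# Comparison-node level for each (BL, Q) pair.
# PROPOSED follows Table 1; TEN_T assumes a full-swing CMOS XOR.
PROPOSED = {(0, 0): "0", (0, 1): "VDD", (1, 0): "VDD-VTN", (1, 1): "0"}
TEN_T = {(0, 0): "0", (0, 1): "VDD", (1, 0): "VDD", (1, 1): "0"}


def conducting_fraction(levels: dict) -> Fraction:
    """Fraction of input pairs for which the pMOS gate is not at full VDD."""
    # The text counts a gate at 0 or at VDD - VTN as turning the pMOS on.
    on = sum(1 for level in levels.values() if level != "VDD")
    return Fraction(on, len(levels))


def static_probability(levels: dict, p: int) -> Fraction:
    """Probability that all p pMOS transistors of the chain conduct."""
    return conducting_fraction(levels) ** p


p = 6
proposed = static_probability(PROPOSED, p)
ten_t = static_probability(TEN_T, p)
print(float(proposed), float(ten_t), float(proposed - ten_t))
```

Exact fractions are used so that the per-bit values 3/4 and 1/2 stay visible before conversion to floating point.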

## **4** Simulation Results

The simulation results of the three CAM cells are based on the TSMC 0.25 µm CMOS process with a 2.5 V supply voltage. In a search operation, the search speed depends on the difference between the two compared data words, because more similar data incur a longer search latency. Consequently, to measure the maximum speed of these CAM structures, all data stored in the CAMs are one-bit misses. Table 2 shows the simulation results of the conventional CAM cells and the proposed CAM cell with different numbers of partial bits. The memory capacity of these CAMs is 32 words by 32 bits and the operating frequency is 200 MHz. The simulation results show that the proposed CAM consumes less power than the other designs. The power-delay products of the conventional CAM cells and the proposed CAM cell are shown in Fig. 6. The power-delay product of the proposed CAM cell is over 45% lower than that of the basic CAM cell and about 10% lower than that of the ten-transistor CAM cell. Moreover, the hardware cost of the proposed CAM cell is about 10% lower than that of the ten-transistor CAM cell. Fig. 7 shows the power-delay product and the power-performance of the proposed CAM with a size of 128 words by 32 bits. The results indicate that this design works up to 300 MHz at a 2.5 V supply voltage. A summary of the proposed CAM is given in Table 3.

Table 2. The simulated power consumption (mW) at 200 MHz with different partial bits.

| Partial bits            | 0    | 1    | 2    | 3    | 4    | 5    |
|-------------------------|------|------|------|------|------|------|
| Basic CAM cell          | 16.2 | 10.4 | 8.01 | 6.70 | 5.98 | 5.59 |
| Ten-transistor cell [3] | 15.5 | 9.6  | 6.82 | 5.47 | 4.83 | 4.54 |
| Proposed CAM cell       | 14.8 | 9.0  | 6.35 | 4.97 | 4.26 | 3.89 |
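For readers who want relative figures, the power values in Table 2 can be turned into percentage savings. The sketch below only does arithmetic on the tabulated numbers; the resulting percentages are derived here and are not reported in the paper, and the names `POWER_MW` and `saving_percent` are hypothetical.

```python
PARTIAL_BITS = [0, 1, 2, 3, 4, 5]

# Power consumption in mW, copied from Table 2.
POWER_MW = {
    "basic": [16.2, 10.4, 8.01, 6.70, 5.98, 5.59],
    "ten_transistor": [15.5, 9.6, 6.82, 5.47, 4.83, 4.54],
    "proposed": [14.8, 9.0, 6.35, 4.97, 4.26, 3.89],
}


def saving_percent(reference: str, partial_bits: int) -> float:
    """Power saving of the proposed cell relative to a reference cell."""
    idx = PARTIAL_BITS.index(partial_bits)
    ref = POWER_MW[reference][idx]
    return 100.0 * (ref - POWER_MW["proposed"][idx]) / ref


for p in PARTIAL_BITS:
    print(p,
          round(saving_percent("basic", p), 1),
          round(saving_percent("ten_transistor", p), 1))
```

These are power savings only; the power-delay comparison quoted in the text also depends on the delays plotted in Fig. 6.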



Fig. 6. Simulated power-delay product at 2.5V with different partial bits. (The memory capability is 32 words by 32 bits.)



Fig. 7. The power-delay product and the power-performance of the proposed CAM. (The memory capability is 128 words by 32 bits.)

Table 3. Summary of the proposed CAM.

| Parameter           | Value               |
|---------------------|---------------------|
| Chip configuration  | 128 x 32            |
| Process technology  | 0.25 µm CMOS        |
| Supply voltage      | 2.5 V               |
| Speed               | 300 MHz             |
| Power consumption   | 16.38 mW @ 2.5 V    |
| Power-delay product | 39.8 mW-ns          |
| Power-performance   | 13.33 fJ/bit/search |
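The power-performance figure is consistent with dividing the energy per search by the number of stored bits. The sketch below assumes one search per clock cycle and that the metric is power / (frequency × words × bits); this definition is an assumption, since the paper does not spell it out, and the function name is hypothetical.

```python
def energy_per_bit_per_search_fj(power_mw: float, freq_mhz: float,
                                 words: int, bits: int) -> float:
    """Energy per bit per search in femtojoules.

    Assumes one search operation per clock cycle.
    """
    energy_per_search_j = (power_mw * 1e-3) / (freq_mhz * 1e6)
    return energy_per_search_j / (words * bits) * 1e15


# Values from Table 3: 16.38 mW at 300 MHz for a 128 x 32 CAM.
print(round(energy_per_bit_per_search_fj(16.38, 300, 128, 32), 2))
```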

## **5** Conclusions

A new low-power and low-cost CAM cell design for binary CAM is proposed in this paper. In the CAM word structure, the static pseudo nMOS CAM word circuit is used to avoid the drawbacks of the traditional dynamic CAM word circuit; moreover, the precomputation approach is used to reduce the static power. This paper also shows that the proposed CAM cell consumes less power than the other conventional CAM cells. The simulation results show that the proposed CAM circuit operates at up to 300 MHz with a power consumption of 16.38 mW at a 2.5 V supply voltage. Moreover, the power-performance metric of this circuit is 13.33 fJ/bit/search.

## References

- [1] T. Ikenaga, T. Ogura, "A Fully Parallel 1-Mb CAM LSI for Real-Time Pixel-Parallel Image Processing," *IEEE J. Solid-State Circuits*, vol. 35, pp. 536-544, Apr. 2000.
- [2] K. J. Lin and C. W. Wu, "A low-power CAM design for LZ data compression," *IEEE Trans. Computers*, vol. 49, pp. 1139-1145, Oct. 2000.
- [3] C. S. Lin, K. H. Chen, B. D. Liu, "Low-Power and Low-Voltage Fully Parallel Content-Addressable Memory," presented at ISCAS 2003, Bangkok, Thailand.
- [4] F. Shafai, K. J. Schultz, G. F. R. Gibson, A. G. Bluschke, and D. E. Somppi, "Fully parallel 30-MHz, 2.5Mb CAM," *IEEE J. Solid-State Circuits*, vol. 33, no. 11, pp. 1690-1696, Nov. 1998.
- [5] H. Miyatake, M. Tanaka, and Y. Mori, "A design for high-speed low-power CMOS fully parallel content-addressable memory macros," *IEEE J. Solid-State Circuits*, vol. 36, no. 6, pp. 956-968, June 2001.
- [6] G. Thirugnanam, N. Vijaykrishnan, and M. J. Irwin, "A novel low power CAM design," in *Proc. IEEE ASIC/SOC Conf.*, Sept. 2001, pp. 198-202.
- [7] C. A. Zukowski and S. Y. Wang, "Use of selective precharge for low-power on the match lines of content-addressable memories," in *Proc. IEEE Memory Technology, Design and Testing Int. Workshop,* Aug. 1997, pp. 64–68.
- [8] N. H. E. Weste and K. Eshraghian, Principles of CMOS VLSI Design: A Systems Perspective. Reading MA: Addison-Wesley, 2nd edition, 1993.