# Design of a New Chip Architecture for a Home Gateway

Kwang-Soon Choi, Kwang-Mo Jung, Myung-Hyun Yoon High Speed Network Research Center Korea Electronics Technology Institute(KETI) 270-2, Seohyun-Dong, Bundang-Gu, Seongnam-Si, Kyunggi-Do, 463-771 Korea

Abstract – As the Internet has become widely used, demand for higher-quality services such as VOD and home networking has been increasing. In particular, a home networking system can interconnect and control, via the Internet, home appliances that use different protocols. This requires a common protocol through which the appliances can communicate with each other and a new system architecture that implements it. In this paper, we propose a common protocol and a novel chip architecture with a memory management scheme for a home gateway system.

Key-Words: Common Protocol, Packet Conversion, Home Gateway, Home Network, Shared Memory

# 1 Introduction

The Internet has recently become so widespread that demand for accessing home appliances via the Internet, and for communication between them, is increasing. As a result, new research efforts on the home gateway (HG) and the residential gateway (RG) have been made. In these systems, the most important function is the interconnection between the WAN and the LAN in the home. xDSL and cable modems can serve as WAN interfaces, and wireless LAN, IEEE 1394, LonTalk, Bluetooth, HomePNA and HomeRF can serve as LAN interfaces in the home. IEEE 1394 is a protocol for controlling audio and video devices and for real-time data transfer between them [1]. Bluetooth, like IEEE 1394, supports both access to home appliances and data transfer between them, but at a low rate [2]. In contrast, LonTalk is intended only for the control of home appliances [3].

All these protocols target their own application areas, such as high-speed data transfer, low-speed data transfer or control data transfer. Thus, the differences between these protocols and their individual characteristics must be considered in the design of a HG or RG system so that the interfaces can communicate with each other without failure.

In this paper, we propose a new chip architecture for a HG, a common protocol (CP) and a memory management scheme for packet-switching. Section 2 describes a recommended system architecture for a HG. Section 3 describes the new chip architecture and a CP for packet conversion. Section 4 describes a memory management scheme and a common bus architecture for packet-switching.

# 2 Architecture for a HG

Fig. 1 shows the general architecture for a HG proposed by ISO/IEC JTC1/SC25 WG1 [4]. This architecture consists of three main parts: the Gateway Internal Protocol (GIP), the WAN Gateway Interface (WGI) and the LAN Gateway Interface (LGI). Both the WGI and the LGI convert incoming packets of a specific protocol into CP packets. The GIP then switches each CP packet from its source interface to the target interface, which converts the CP packet into a packet of its own protocol. Together, these operations make communication possible between interfaces that use different protocols.



Fig. 1. General Architecture for a HG

In designing this system, the important points to consider are the following:

- design of an efficient CP
- algorithm for converting to the CP
- management of traffic with a different speed
- algorithm for QoS
- algorithm for memory management
- algorithm for security

# 3 New Chip Architecture

### 3.1 Chip Architecture

Fig. 2 shows the proposed chip architecture. In this chip, incoming packets pass through two packet converters: an Anything-to-CP converter and a CP-to-Anything converter. Each consists of a Header Converter and a Payload Converter. The Header Converter looks up the original address of a packet in the Address Table and obtains the new address, which was set in the table when the chip started.
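As a minimal sketch of the Header Converter's lookup, the Address Table can be pictured as a mapping that is filled once at start-up and then only read. The class name, the key layout (protocol name plus native address) and all addresses below are hypothetical; the paper does not specify the table format.

```python
# Hypothetical sketch of the Header Converter's address lookup.
# Table layout and all addresses are invented for illustration only.


class AddressTable:
    """Maps a (protocol, native address) pair to a CP address."""

    def __init__(self, entries):
        # Filled once when the chip starts, as described in the text.
        self._table = dict(entries)

    def translate(self, protocol, native_address):
        """Return the CP address, or None if the source is not registered."""
        return self._table.get((protocol, native_address))


table = AddressTable({
    ("IP", "192.168.0.10"): 0x01,
    ("IEEE1394", 0xFFC0): 0x02,
    ("LonTalk", 17): 0x03,
})

cp_address = table.translate("IEEE1394", 0xFFC0)
```

Returning `None` for an unknown source is one possible policy; a real converter might instead discard the packet or raise an error.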

The Payload Converter is similar to a RISC processor. It is driven by conversion codes that are micro-coded in advance in a non-volatile memory such as a ROM or PLA, and it converts an incoming payload into a CP payload or a target-protocol payload. It processes an incoming packet simply at the layer 2-4 level, and at the layer 5-7 level for more complex operations such as packet conversion [5]. It has its own local memory or register files. The exact conversion algorithm is still being researched.



Fig. 2. Chip Architecture

In this system, the memory management scheme is important for completing the full cycle from reception to transmission of a packet. The considerations related to memory management are the following:

- position of buffers (internal or external)
- depth of buffers
- method for fast transmission of a packet between two packet-processing blocks

Each Interface block connects to external PHY/LINK chips. When processing incoming packets, it buffers eight serial bits and sends one byte to an Input Buffer. When processing outgoing packets, it buffers one byte and sends it to the external PHY/LINK chips as a serial bit stream. The Input and Output Buffers can each hold a whole incoming or outgoing packet of maximum payload size. A Traffic Management block directs a Buffer Controller to pass or discard a buffered packet according to the traffic management algorithm. A Segmentation block segments an incoming CP packet into several 256-byte packets, which helps the Switch block access the shared memories. Using fixed-size packets makes the work of the QoS/Priority Controller and the Scheduler easier and faster, and also gives higher memory utilization. A Reassembly block assembles incoming switched packets and, according to the information in the CP header, sends one assembled CP packet to a CP-to-Anything Converter.
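The segmentation and reassembly steps can be sketched as follows. This is only an illustration under two assumptions that the paper does not state: the last segment is zero-padded to 256 bytes, and the original length is available to the Reassembly block (for example from the CP header). The function names are hypothetical.

```python
SEGMENT_SIZE = 256  # bytes per segmented packet, as stated in the text


def segment(cp_packet, size=SEGMENT_SIZE):
    """Split a packet into fixed-size segments.

    Zero-padding the final segment is an assumption made for this sketch.
    """
    segments = []
    for offset in range(0, len(cp_packet), size):
        chunk = cp_packet[offset:offset + size]
        segments.append(chunk.ljust(size, b"\x00"))
    return segments


def reassemble(segments, original_length):
    """Join switched segments and strip the padding added by segment()."""
    return b"".join(segments)[:original_length]


packet = bytes(range(256)) * 5 + bytes(220)  # 1,500 bytes of sample data
segments = segment(packet)
restored = reassemble(segments, len(packet))
```

A 1,500-byte packet yields six 256-byte segments, so every slot in the shared memory has the same size regardless of the source protocol.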

The architecture of the Switch Block and the shared memory is described in the next section.

### 3.2 Common Protocol Packet

Fig. 3 shows the CP packet format, which can be modified according to variations in chip architecture, conversion algorithms, traffic management, QoS, security and so on. Since the CP packet is not the main focus of this paper, only the basic CP packet format is described.



Fig. 3. Common Protocol Packet Format

Table 1 lists the maximum payload size of each protocol that this chip supports. The payload size of the CP packet is important: for the performance of the chip, the CP packet should be able to hold at least a packet of average size among all the supported protocols.

Table 1. Maximum Payload Size

| Protocol  | Max. Payload Size (bytes)                                         |
|-----------|-------------------------------------------------------------------|
| IP        | 1,500 (header included)                                           |
| IEEE 1394 | 16,384 @ 3.2Gbps (asynchronous)<br>32,768 @ 3.2Gbps (isochronous) |
| LonTalk   | 229 (APDU)                                                        |
| Bluetooth | 1691 (B-PAN)                                                      |

The CP packet consists of a 16-byte header and a payload of at most 2,048 bytes. It can therefore carry any datagram of IP, LonTalk or Bluetooth and, in the case of IEEE 1394, an asynchronous packet of 2,048 bytes at 400 Mbps or an isochronous packet of 2,048 bytes at 200 Mbps. In addition, the Sequence and Last fields in the CP packet make it possible to carry a packet longer than 2,048 bytes by fragmentation.
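The role of the Sequence and Last fields can be illustrated with a short sketch. The field widths and their position in the 16-byte header are not given here, so the fragment is modeled as a plain record; `CPFragment`, `fragment` and `defragment` are hypothetical names, and numbering fragments from zero is an assumption.

```python
from dataclasses import dataclass

CP_MAX_PAYLOAD = 2048  # bytes, from the text


@dataclass
class CPFragment:
    sequence: int    # position of this fragment within the original packet
    last: bool       # True only for the final fragment
    payload: bytes


def fragment(data, max_payload=CP_MAX_PAYLOAD):
    """Split data into CP-sized fragments tagged with Sequence and Last."""
    count = max(1, -(-len(data) // max_payload))  # ceiling division
    return [
        CPFragment(i, i == count - 1, data[i * max_payload:(i + 1) * max_payload])
        for i in range(count)
    ]


def defragment(fragments):
    """Rebuild the original data, tolerating out-of-order arrival."""
    ordered = sorted(fragments, key=lambda f: f.sequence)
    if not ordered or not ordered[-1].last:
        raise ValueError("final fragment missing")
    if [f.sequence for f in ordered] != list(range(len(ordered))):
        raise ValueError("fragment missing or duplicated")
    return b"".join(f.payload for f in ordered)


long_packet = bytes(16384)  # e.g. a maximum-size IEEE 1394 asynchronous payload
fragments = fragment(long_packet)
```

Under these assumptions a 16,384-byte payload from Table 1 is carried in eight CP packets, and only the eighth has Last set.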

# 4 Memory Management Scheme

The proposed architecture requires two external memories: one for storing CP packets and one for the linked lists, QoS and priority. The two memories share an address bus and a data bus, and each is accessed through /CSn, /WEn and /OEn (n = 0 or 1).

### 4.1 CP Packet Memory

Fig. 4 shows the CP packet memory. N segmented CP packets can be stored in the CP packet memory, and the Scheduler makes the Reassembly block read them according to the adaptive-QoS algorithm, which requires external QoS and Priority Buffers. The QoS Buffer, the Priority Buffer and the Linked List Buffer, which indexes the CP packet memory, share the same external memory, which is partitioned into three buffers.



Fig. 4. CP Packet Memory

### 4.2 Linked List Buffer

The Linked List Buffer contains N linked lists, that is, N list entries. Each of them is composed of three fields: Previous Linked List Address (PLA), Packet Memory Address (PMA) and Next Linked List Address (NLA). Fig. 5 shows these fields. Each field is a four-byte address, so 32-bit addressing is used. The three fields hold the address of the previous linked list, the address of a segmented CP packet and the address of the next linked list, respectively.

| 4 bytes               | 4 bytes               | 4 bytes           |
|-----------------------|-----------------------|-------------------|
| Previous List Address | Packet Memory Address | Next List Address |
| (PLA)                 | (PMA)                 | (NLA)             |

Fig. 5. The Format of Linked Lists
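The 12-byte layout of Fig. 5 can be expressed with Python's `struct` module, as a sketch. Big-endian byte order is an assumption (the byte order is not specified), and the helper names are hypothetical.

```python
import struct

# PLA, PMA and NLA as three unsigned 32-bit fields.
# Big-endian (">") is an assumption; the byte order is not stated in the text.
ENTRY_FORMAT = ">III"
ENTRY_SIZE = struct.calcsize(ENTRY_FORMAT)


def pack_entry(pla, pma, nla):
    """Encode one linked list as it would sit in the Linked List Buffer."""
    return struct.pack(ENTRY_FORMAT, pla, pma, nla)


def unpack_entry(raw):
    """Decode 12 bytes back into the (PLA, PMA, NLA) triple."""
    return struct.unpack(ENTRY_FORMAT, raw)


raw = pack_entry(0, 256, 24)
```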

These linked lists are divided into two groups: the Free-Space Linked Lists and the Used-Space Linked Lists. Two header pointers and two tail pointers point to the head and the tail of each group. Every list in the Free-Space Linked Lists points to free space in the CP packet memory, and every list in the Used-Space Linked Lists points to used space in the CP packet memory.

Fig. 6 shows a CP packet memory and linked list buffers.



Fig. 6. The Format of Linked Lists

| List | PLA      | PMA       | NLA  |
|------|----------|-----------|------|
| 1    | N/A      | 0         | 12   |
| 2    | 0        | 256       | 12*2 |
| 3    | 12       | 256*2     | 12*3 |
| ...  | ...      | ...       | ...  |
| N    | 12*(N-2) | 256*(N-1) | N/A  |

(Each list is 12 bytes wide; there are N lists in total.)

Fig. 7. Initial State of Linked Lists

Before these linked lists can be used, they must be initialized as shown in Fig. 7. Initially, all lists are linked together as Free-Space Linked Lists.
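The initial state of Fig. 7 follows directly from the 12-byte list size and the 256-byte segment size, and can be generated as in the sketch below. The sentinel `NIL` that stands for "N/A" is an assumption, chosen because address 0 is itself a valid list address; the function name is hypothetical.

```python
ENTRY_SIZE = 12      # bytes per linked list (PLA + PMA + NLA)
SEGMENT_SIZE = 256   # bytes per segmented CP packet
NIL = 0xFFFFFFFF     # assumed encoding of "N/A"; not specified in the text


def initial_free_lists(n):
    """Return the (PLA, PMA, NLA) triples of Fig. 7 for n lists."""
    lists = []
    for i in range(n):
        pla = ENTRY_SIZE * (i - 1) if i > 0 else NIL
        pma = SEGMENT_SIZE * i
        nla = ENTRY_SIZE * (i + 1) if i < n - 1 else NIL
        lists.append((pla, pma, nla))
    return lists


state = initial_free_lists(4)
```

With four lists, the header pointer of the Free-Space Linked Lists would start at address 0 and the tail pointer at address 36, while both pointers of the Used-Space Linked Lists would be empty.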

To store a packet in the CP packet memory, the header of the Free-Space Linked Lists is read first; it indicates the location where the incoming segmented CP packet is to be stored. Once the packet has been stored, the first list of the Free-Space Linked Lists is removed and attached to the end of the Used-Space Linked Lists, as shown in Fig. 8.



Fig. 8. Updating Linked Lists (After Storing a Segmented CP Packet)

To read a packet from the CP packet memory, the corresponding list is removed from the Used-Space Linked Lists and attached to the end of the Free-Space Linked Lists, as shown in Fig. 9.



Fig. 9. Updating Linked Lists (After Reading a Segmented CP Packet)
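The two update operations of Fig. 8 and Fig. 9 can be sketched with index-based doubly linked lists. This is an illustrative model rather than the chip's implementation: list indices are used in place of byte addresses (list i would sit at address 12*i and point to segment address 256*i), and the class and method names are hypothetical.

```python
NIL = -1  # stands for "N/A" in this index-based model


class SegmentStore:
    """Free-Space and Used-Space lists over n fixed-size segment slots."""

    def __init__(self, n):
        self.prev = [i - 1 for i in range(n)]  # prev[0] is NIL
        self.next = [i + 1 if i < n - 1 else NIL for i in range(n)]
        self.head = {"free": 0 if n else NIL, "used": NIL}
        self.tail = {"free": n - 1 if n else NIL, "used": NIL}
        self.in_use = [False] * n

    def _unlink(self, name, i):
        p, nx = self.prev[i], self.next[i]
        if p == NIL:
            self.head[name] = nx
        else:
            self.next[p] = nx
        if nx == NIL:
            self.tail[name] = p
        else:
            self.prev[nx] = p

    def _append(self, name, i):
        t = self.tail[name]
        self.prev[i], self.next[i] = t, NIL
        if t == NIL:
            self.head[name] = i
        else:
            self.next[t] = i
        self.tail[name] = i

    def store(self):
        """Fig. 8: take the head of the free lists, append it to the used lists."""
        i = self.head["free"]
        if i == NIL:
            raise MemoryError("CP packet memory is full")
        self._unlink("free", i)
        self._append("used", i)
        self.in_use[i] = True
        return i

    def release(self, i):
        """Fig. 9: after a read, move list i from the used lists to the free tail."""
        if not self.in_use[i]:
            raise ValueError("list is not in the Used-Space Linked Lists")
        self._unlink("used", i)
        self._append("free", i)
        self.in_use[i] = False

    def walk(self, name):
        """Return the list indices from head to tail."""
        out, i = [], self.head[name]
        while i != NIL:
            out.append(i)
            i = self.next[i]
        return out
```

Because both operations touch only a head, a tail and the neighbours of one list, their cost does not depend on N, which suits a fixed per-access time budget.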

### 4.3 QoS and Priority Buffers

This chip uses an adaptive-QoS algorithm proposed specifically for HG systems. Since this paper does not focus on the QoS algorithm, only a brief introduction follows.

All incoming packets, whatever their protocol format, are classified into three classes: class 1 (control data), class 2 (multimedia data) and class 3 (Internet data). The classified packets are not switched directly in the Switch Block but are reordered in the Priority Buffer according to the line state. Fig. 10 shows the QoS Buffer, the Priority Buffer and the format of each entry.



Fig. 10. QoS/Priority Buffers and the Format of Each Entry

The format of a QoS/Priority entry is similar to the format of the linked lists. Once a segmented CP packet is stored, the corresponding linked list is updated. The information and the address of this linked list are then stored in the QoS Buffer according to the QoS field in the CP packet header, and reordered in the Priority Buffer according to the priority, which is decided by the line state.
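The reordering step can be pictured with a simple priority queue. The sketch below is not the adaptive-QoS algorithm, which is not detailed in this paper: it assumes a fixed strict priority (class 1 before class 2 before class 3, first-in first-out within a class) and ignores the line state. All names are hypothetical.

```python
import heapq
from enum import IntEnum
from itertools import count


class TrafficClass(IntEnum):
    CONTROL = 1      # class 1: control data
    MULTIMEDIA = 2   # class 2: multimedia data
    INTERNET = 3     # class 3: Internet data


class PriorityBuffer:
    """Strict-priority stand-in for the Priority Buffer (an assumption)."""

    def __init__(self):
        self._heap = []
        self._arrival = count()  # tie-breaker that keeps FIFO order per class

    def push(self, traffic_class, list_address):
        heapq.heappush(
            self._heap, (int(traffic_class), next(self._arrival), list_address)
        )

    def pop(self):
        """Return the linked-list address of the next segment to switch."""
        return heapq.heappop(self._heap)[2]

    def __len__(self):
        return len(self._heap)


buffer = PriorityBuffer()
buffer.push(TrafficClass.INTERNET, 0)
buffer.push(TrafficClass.MULTIMEDIA, 12)
buffer.push(TrafficClass.CONTROL, 24)
buffer.push(TrafficClass.MULTIMEDIA, 36)
order = [buffer.pop() for _ in range(len(buffer))]
```

An adaptive scheme would replace the fixed class number in the sort key with a priority recomputed from the line state.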

The Scheduler reads the PMA field of one entry in the Priority Buffer and finds the destination address in the CP packet header. The Bus Controller then reads the Bus Status register and decides whether to send a /BGn to the Reassembly block of the destination node.

### 4.4 Memory Controller

Fig. 11 shows the timing diagrams for accessing the two external SRAMs. Each access requires 780 ns and consists of four steps: storing (or reading) a segmented CP packet, updating the linked lists, updating the QoS Buffer and updating the Priority Buffer.

The Memory Controller interfaces with the two external SRAMs via a 32-bit address bus, a 32-bit data bus and control signals: chip select (/CSn), write enable (/WEn) and output enable (/OEn). It generates the control signals and the addresses according to the contents of an internal ring counter. Here n selects the CP packet memory (n = 0) or the other memory (n = 1).



Fig. 11. Timing Diagrams for Accessing External SRAMs

### 4.5 Common Bus and Bus Controller

A Common Bus provides a common path between the SARs and the Switch Block. Converted packets flow from a SAR to the Switch Block through this bus and are then returned to the SAR of the destination node. The bus consists of a 32-bit data bus and control signals such as /BRn and /BGn. Fig. 12 shows the Common Bus Controller and the Memory Controller.



Fig. 12. Common Bus Controller and Memory Controller

### 4.6 Performance Estimation

To estimate chip performance, we assumed that packets of maximum payload size arrive at full line speed on each interface, and that the RISC CPUs in the Anything-to-CP (or CP-to-Anything) Converters can convert a payload in 512 machine cycles. The internal chip architecture can be divided into four logical blocks: Interface, Anything-to-CP (or CP-to-Anything) Converter, SAR and Switch Block. Table 2 shows the packet-processing performance of each internal logical block.

|                          | Interface<br>(P1)                 | xx-to-CP<br>(CP-to-xx)<br>(P2)                                   | SAR<br>(P3)                                                      | Switch<br>(P4)                                                                                   |
|--------------------------|-----------------------------------|------------------------------------------------------------------|------------------------------------------------------------------|--------------------------------------------------------------------------------------------------|
| IEEE<br>1394             | 400 Mbps<br>= <u>6,400 pkts/s</u> | Using 4 MHz<br>RISC CPU<br>512 cycles/pkt<br><u>7,812 pkts/s</u> | 1 pkt<br>= 32 Seg. Pkts<br>4 us/Seg. Pkt<br><u>7,812 pkts/s</u>  | Total # of incoming<br>segmented packets<br>= 382,788 Seg. Pkts/s                                |
| IP<br>(LAN<br>or<br>WAN) | 100 Mbps<br>= <u>8,738 pkts/s</u> | Using 5 MHz<br>RISC CPU<br>512 cycles/pkt<br><u>9,765 pkts/s</u> | 1 pkt<br>= 6 Seg. Pkts<br>17 us/Seg. Pkt<br><u>9,765 pkts/s</u>  |                                                                                                  |
| Blue-<br>tooth           | 2 Mbps<br>= <u>155 pkts/s</u>     | Using 1 MHz<br>RISC CPU<br>512 cycles/pkt<br><u>1,953 pkts/s</u> | 1 pkt<br>= 7 Seg. Pkts<br>73 us/Seg. Pkt<br><u>1,953 pkts/s</u>  | Switch performance<br>= 1.56 us/Seg. Pkt<br>(write & read)<br>= <u>641,025</u> Seg. Pkts/s       |
| Lon-<br>Talk             | 2 Mbps<br>= <u>155 pkts/s</u>     | Using 1 MHz<br>RISC CPU<br>512 cycles/pkt<br><u>1,953 pkts/s</u> | 1 pkt<br>= 1 Seg. Pkt<br>512 us/Seg. Pkt<br><u>1,953 pkts/s</u>  |                                                                                                  |

Table 2. Performance(P) of Internal Logical Block

In order to process incoming packets without failure, it is important to estimate the packet-processing performance (Pn) of each internal block and to make sure that P1 ≤ P2 ≤ P3 ≤ P4, so that no block receives packets faster than it can process them.
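The arithmetic behind Table 2 can be reproduced with the sketch below. Several inputs are inferred from the table rather than stated in the text and should be read as assumptions: 1 Mbps is taken as 2^20 bit/s, the IEEE 1394 packet is taken as 32 segments of 256 bytes, the IP converter is counted twice (LAN and WAN), and one switch operation is taken as two 780 ns accesses (write and read). The LonTalk interface figure is copied from the table, not derived.

```python
MBIT = 2 ** 20        # inferred: Table 2 matches 1 Mbps = 2**20 bit/s
SEGMENT_BYTES = 256
ACCESS_NS = 780       # one SRAM access, from Section 4.4


def line_rate(mbps, packet_bytes):
    """Packets per second at full line speed (P1), rounded down."""
    return mbps * MBIT // (8 * packet_bytes)


def cpu_rate(mhz, cycles_per_packet=512):
    """Packets per second a converter CPU can handle (P2), rounded down."""
    return mhz * 1_000_000 // cycles_per_packet


p1 = {
    "IEEE 1394": line_rate(400, 32 * SEGMENT_BYTES),  # packet size inferred
    "IP": line_rate(100, 1500),
    "Bluetooth": line_rate(2, 1691),
    "LonTalk": 155,                                   # copied from Table 2
}
p2 = {
    "IEEE 1394": cpu_rate(4),
    "IP": cpu_rate(5),
    "Bluetooth": cpu_rate(1),
    "LonTalk": cpu_rate(1),
}
segments_per_packet = {"IEEE 1394": 32, "IP": 6, "Bluetooth": 7, "LonTalk": 1}
converters = {"IEEE 1394": 1, "IP": 2, "Bluetooth": 1, "LonTalk": 1}  # IP: LAN + WAN

# Worst-case load offered to the switch, in segmented packets per second.
offered = sum(p2[k] * segments_per_packet[k] * converters[k] for k in p2)

# One switched segment costs a write access plus a read access.
switch_capacity = 1_000_000_000 // (2 * ACCESS_NS)

headroom_ok = all(p1[k] <= p2[k] for k in p1) and offered <= switch_capacity
```

Under these assumptions the offered load of 382,788 segmented packets per second stays below the switch capacity of 641,025, which is the condition that Table 2 is meant to demonstrate.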

# 5 Conclusion

In this paper, we have proposed a new chip architecture, a CP, a shared memory architecture for packet-switching and a common bus architecture for a HG system.

This chip uses IP for the WAN interface, and IP, IEEE 1394, LonTalk and Bluetooth for the LAN interfaces. It contains two packet converters, each of which has a header converter and a payload converter. These converters have the architecture of a RISC microprocessor operated by micro-coded conversion algorithms. The CP packet format has been designed but is not yet fixed. The management of the shared memory by means of linked lists and the adaptive-QoS algorithm has also been designed.

We conclude that the packet-conversion time, the switching time and the packet-forwarding time can meet the conditions required by the proposed chip architecture. In addition, the complexity and the cost of the chip can be reduced by using external SRAMs for switching packets.

The algorithms for packet conversion, traffic management and security, as well as the common protocol and the implementation of the packet converters, are still being studied.

References:

- [1] D. Anderson, FireWire System Architecture, Addison Wesley, 1999
- [2] J. Bray and C. Sturman, Bluetooth Connect Without Cables, Prentice Hall, 2001
- [3] Echelon Corp., "LonTalk Protocol Specification", Version 3.0
- [4] ISO/IEC JTC 1/SC 25/WG 1, CD1 15045-01, "Information technology – Interconnection of Information technology equipment – Architecture for HomeGate, the residential gateway(AHRG)", May 1999.
- [5] L. Geppert, "The New Chips on the Block[Network Processors]", IEEE Spectrum, January 2001
- [6] Dou, S. J. Jiang and K. C. Leu, "A Novel CAM/RAM Based Buffer Manager for Next Generation IP routers", in Proc. of the First IEEE International Workshop on Electronic Design, Test and Applications, 2002
- [7] URL : http://www.webchiponline.com
- [8] The Open Services Gateway Initiative "OSGI Service Platform", October 2001
- [9] Noboru Endo, Takahiko Kozaki, Toshiya Ohuchi, Hiroshi Kuwahara, Shinobu Gohara, "Shared Buffer Memory Switch for an ATM Exchange", IEEE Transactions on Communications, Vol. 41, No. 1, January 1993