A packet-processing device usually needs to buffer packets in a packet memory (PM) while the device processes them. The size of some packets (for example, Ethernet packets) is not known in advance, so the device needs to start storing a packet into the packet buffer without knowing how large the packet is. Moreover, packets arrive at the device in an interleaved fashion, so the device is simultaneously storing several incoming packets into the packet buffer.
The state-of-the-art solution for storing the packet in the device's memory is to assign multiple chunks (called pages) of packet memory to each packet, rather than a single big chunk. With this scheme, the packet is not stored consecutively in the packet memory; rather, the pages of the packet are scattered throughout the packet memory. Therefore, a memory manager or buffer manager (BM) needs to maintain a linked list of all the pages that a particular packet uses in the packet memory. This linked list is traversed when the packet is read out of the packet memory for transmission. Each page has associated state that contains information about the page, mainly:
- A pointer to the next page;
- Whether the page contains the start of packet (SOP) and/or the end of packet (EOP) attribute of the packet, the end of header (EOH), and possibly other attributes;
- The valid bytes stored in the page (usually only relevant for the last page, where not all of the page contains valid data);
- . . .
The state of all the pages in the packet-processing device is maintained by the memory manager. A packet has an associated descriptor that, in its basic form, is the pointer to the first page. With this initial pointer, all the pages used by the packet can be retrieved in the same order they were used by traversing the linked list built from the next-page pointers in the different page states. The memory manager is also responsible for providing available pages (free pages) to the engines that receive the packet data and store this data into the packet memory, and for eventually reclaiming the used pages once the packets using those pages have been transmitted out.
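The per-page state and the descriptor-driven traversal described above can be sketched as follows. This is a minimal illustrative model, not the disclosure's implementation: the field names, the `INVALID` sentinel, and the table size are assumptions.

```python
from dataclasses import dataclass

INVALID = -1  # hypothetical "null" next-page pointer

@dataclass
class PageState:
    """Per-page state, following the fields listed above (names are illustrative)."""
    next: int = INVALID   # pointer to the next page in the linked list
    sop: bool = False     # page holds the start of packet (SOP)
    eop: bool = False     # page holds the end of packet (EOP)
    eoh: bool = False     # page holds the end of header (EOH)
    valid_bytes: int = 0  # valid bytes stored in the page

# A toy page-state table indexed by page number.
pages = [PageState() for _ in range(8)]

def walk_packet(first_page):
    """Traverse the linked list from the descriptor's first-page pointer,
    returning the pages visited in order and the total packet length."""
    order, total, p = [], 0, first_page
    while p != INVALID:
        order.append(p)
        total += pages[p].valid_bytes
        if pages[p].eop:
            break
        p = pages[p].next
    return order, total

# Example: a 3-page packet scattered across pages 2 -> 5 -> 1.
pages[2] = PageState(next=5, sop=True, valid_bytes=256)
pages[5] = PageState(next=1, eoh=True, valid_bytes=256)
pages[1] = PageState(eop=True, valid_bytes=100)
```

In a real device, `pages` would be the page-state storage maintained by the BM, and the traversal mirrors what the TDMA performs when it reads a packet out of the PM for transmission.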
In packet-processing devices such as switches, packets can broadly be classified into two categories: unicast and multicast/broadcast. Unicast packets are sent to a single egress port, while multicast/broadcast packets are sent to several ports. In switches with packet memories implemented with page link lists, the same page can be used by different packets. In the context of this disclosure, such a page is said to have a reference count (ref cnt) equal to the number of packets that use the page. Pages are also classified as header or payload pages: a header page stores part of or the entire header of the packet, while a payload page stores part of or the entire payload and none of the header. Note that the entire packet may have been stored in one or more header pages, in which case no payload pages exist. In the context of this disclosure, the header size of a packet is configurable and comprises the initial portion of the packet that is of interest to the packet-processing device for performing the processing (modifying the packet and determining the egress port).
FIGS. 1A-1F show examples of page link lists for packets that have dedicated pages or that share pages with other packets. The number 100 inside a page 102 indicates the number of references to that page, i.e. how many packets use that page. In FIG. 1A a unicast packet has header and payload pages. All pages have a single reference count.
In FIG. 1B two packets share the same payload; therefore, all of the payload pages have 2 references. The headers of the two packets differ in size and/or content. Ph B's header fits in a single page.
In FIG. 1C 10 different packets all have the same content; therefore, all pages have the same reference count of 10. These are different packets that may be sent to the same or different egress ports.
FIG. 1D is a mixture of the examples of FIGS. 1B and 1C. FIG. 1E is a unicast packet with no payload. FIG. 1F corresponds to the example of FIG. 1C, but there is no payload.
In the nomenclature of this disclosure, unicast packets are defined as packets that have not been created by partially or totally using other packets. Thus, FIGS. 1A and 1E illustrate unicast packets. A packet that is received at an ingress port, gets processed (e.g., its header is modified), and is sent to a single egress port is considered a unicast packet. In this disclosure, it is assumed that if the header or the payload is shared, all the pages composing the header or payload are shared and therefore all of them have the same reference count.
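The sharing illustrated in FIG. 1B can be modeled with a simple reference-count table. This is a hedged sketch: the `allocate` helper, the dictionary layout, and the page numbers are illustrative assumptions, not taken from the disclosure.

```python
# Reference-count table indexed by page number; a page used by N packets
# has ref_cnt[page] == N (as in the numbers drawn inside the pages of FIG. 1).
ref_cnt = {}

def allocate(page, copies=1):
    """Record that `copies` packets reference this page."""
    ref_cnt[page] = ref_cnt.get(page, 0) + copies

# FIG. 1B-style example: two packets with distinct one-page headers
# (pages 0 and 1) sharing a two-page payload (pages 2 and 3).
allocate(0)           # header of the first packet
allocate(1)           # header of the second packet
for p in (2, 3):      # shared payload pages: both packets reference them
    allocate(p, copies=2)
```

Under the assumption stated above (all pages of a shared header or payload are shared alike), every payload page carries the same count, here 2.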
In a state-of-the-art packet-processing device, incoming packets are stored into the packet memory (PM) by a specialized direct memory access (DMA) block (henceforth named Receive DMA or RDMA), and outgoing packets are retrieved from the PM by another DMA block (henceforth named Transmit DMA or TDMA).
FIG. 2 depicts the basic interaction between RDMA, PM and BM. In particular, write page clients 200 are RDMA clients that write data to packet memory 202. Read page clients 204 are TDMA clients that read data from packet memory 202.
FIG. 2 also shows another main component of such a device, broadly labeled as the Control block 208. The main functions of the Control block are to perform any necessary modification of the header of the packet, store the header of the packet to the PM, decide to which port or ports to send the packet, and perform any traffic management. For the purpose of this disclosure, the Control block provides the packet descriptor to the TDMA, which is then responsible from that point on for reading the packet data from packet memory and sending it out to the egress port.
The sequence of events for a given packet is the following:
- The RDMA or write page client 200 receives the packet data from the ingress port and stores its payload into the PM 202 using the pages that the memory manager or buffer manager (BM) 206 provides;
- The RDMA allocates the pages for the header (alternatively, Control could allocate these pages from the BM 206);
- The RDMA generates a packet descriptor containing, among other information, the pointer of the first header page. In one embodiment, a packet descriptor corresponds to a single packet;
- Control 208 processes the header of the packet and stores it into the PM 202 using the pages that the RDMA allocated;
- For each page used, Control 208 sends the proper reference count update to the BM 206, since Control has processed the packet and knows whether the header and/or payload has been replicated and multiple packets now share the same header and/or payload; this is shown in FIG. 2 as "RefCnt Update for All Pages";
- Read page clients of the TDMA 204 are ready to send on a particular egress port, so the TDMA requests from the Control block a descriptor for a packet to be sent to that egress port;
- Control sends the descriptor if it has one;
- The TDMA 204 reads the page states from the BM 206 and, for each page state, performs one or more requests to the PM 202 to obtain the packet data;
- The TDMA 204 sends the packet data to the egress port;
- The TDMA 204 sends a notification to the BM 206 that all the pages used by the packet can potentially be reclaimed; and
- The BM 206 reclaims the pages, which are then allocated to the write clients of the RDMA 200.
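The BM-side bookkeeping implied by this sequence (providing free pages, applying Control's reference-count updates, and reclaiming pages after the TDMA's notification) might be sketched as follows. The class and method names are assumptions for illustration only.

```python
class BufferManager:
    """Toy model of the BM: provides free pages, tracks reference counts,
    and reclaims pages once their count drops to zero (illustrative only)."""

    def __init__(self, num_pages):
        self.free = list(range(num_pages))   # free-page pool
        self.ref_cnt = [0] * num_pages

    def alloc_page(self):
        """Provide a free page to an RDMA write client."""
        page = self.free.pop(0)
        self.ref_cnt[page] = 1               # initially used by one packet
        return page

    def refcnt_update(self, page, delta):
        """Control's "RefCnt Update for All Pages", e.g. +1 per page when a
        second packet is made to share the page."""
        self.ref_cnt[page] += delta

    def packet_transmitted(self, packet_pages):
        """TDMA's notification: decrement each page's count and reclaim the
        page once no packet references it any longer."""
        for p in packet_pages:
            self.ref_cnt[p] -= 1
            if self.ref_cnt[p] == 0:
                self.free.append(p)          # page ready for reuse by RDMA
```

For example, two packets sharing a two-page payload would allocate those pages once, apply a +1 update per page, and the pages would return to the free pool only after the second transmission.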
Note that in this baseline approach the reference count storage (RefCnt Storage) is accessed every time a page is used by an incoming packet. This access can be a write or a read-modify-write (if the update is an increment/decrement of the previously stored value).
Similarly, when a packet is transmitted, the reference counts for all the pages involved in the packet need to be read from the RefCnt Storage and decremented. If a page's reference count reaches 0 after the decrement, that page can be reclaimed and reused for another packet; otherwise, the decremented reference count is written back into the RefCnt Storage.
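As a rough illustration of the resulting access rate, the baseline cost per packet can be approximated as one RefCnt Storage update per page on ingress, plus one read-modify-write per page for each transmitted copy. The function below and its parameters are assumptions made for this back-of-the-envelope model, not figures from the disclosure.

```python
def refcnt_accesses(pages_per_packet, copies):
    """Approximate baseline RefCnt Storage accesses over a packet's
    lifetime: one update per page when the packet is stored, plus one
    read-modify-write per page each time a copy is transmitted."""
    ingress = pages_per_packet            # RefCnt update for every page used
    egress = pages_per_packet * copies    # decrement on every transmission
    return ingress + egress
```

Under this model, a 10-copy multicast packet occupying 4 pages generates 44 accesses, versus 8 for a unicast packet of the same size, which is why the access rate to the RefCnt Storage scales poorly.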
Therefore, this baseline implementation of the reference counts of the pages in a page-link-list-based packet buffer requires a high access rate to the reference count storage. Since the number of pages is usually large in high-performance packet-processing devices, the reference count storage can be costly in terms of area (due to the number of access ports to the storage) and/or power consumption. Consequently, it is desirable to devise techniques to reduce the cost of this approach.