The phenomenal growth of network-based electronic commerce depends on the efficient exchange of information between various parts of a network. Use of widely accepted protocols to exchange information simplifies the task of application developers, while operating systems and services are constantly improved to support the protocols in use. The growth of electronic commerce is further aided by improved hardware, which makes possible network bandwidths approaching the memory bandwidth available on a typical desktop workstation. Consequently, efficient handling of packets is assuming increasing importance for fully utilizing the network bandwidth.
Efficient handling of packets requires that incoming packets be classified to determine how each one should be processed. This classification reflects type information associated with each packet, which identifies the flow or path to which the packet belongs. Moreover, in view of the large number of packets handled in a network and the comparable network and CPU/memory bandwidths, it is preferable to minimize the memory access operations undertaken while classifying or processing a packet. This aim becomes even more significant because memory access operations are much slower than processor operations, so that the processor idles while a memory access operation is being completed. Therefore, improved implementations of the protocols specifying packet structure should reflect the aforementioned considerations.
Packets typically conform to a handful of protocols. These protocols offer naming schemes for nodes and interfaces in the network, error-free delivery of packets, encryption and authentication of packets and the like. Some of the common protocols are described hereinafter along with new developments to expand the protocols to meet anticipated needs in the near future.
The backbone of the biggest network, the Internet, is the TCP/IP suite of protocols, comprising the Transmission Control Protocol (TCP) and the Internet Protocol (IP) suite of modules providing various services in the network. IP provides a mechanism for addressing packets to network nodes while TCP, operating at a higher level in the network stack, ensures error-free delivery of packets. In addition, the User Datagram Protocol (UDP), included in the TCP/IP package, enables sending and receiving data packets without the overhead of the delivery guarantees provided by TCP.
IP version 4 assigns a 32-bit address to a machine on a network. Revisions to IP version 4 to meet the needs of a larger network resulted in the IP version 6 specification (hereinafter “IPv6”), which provides 128-bit addresses for interfaces and sets of interfaces. Further details on IPv6 are available in the RFC 2373 document that is herein incorporated by reference in its entirety.
Network addresses enable the network stack to receive packets targeted to a specific address and forward or deliver the packet accordingly. Network addresses have additional properties such as the “type” information corresponding to a particular network address and its processing. Such type information includes details such as whether the packet is local, broadcast, multicast, remote, remote broadcast, remote multicast, subnet broadcast and the like. The precise definition of the type is implementation specific so that different network stack vendors employ different type definitions.
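As a rough illustration of such type information, the following sketch classifies an IPv4 address into a small set of type codes. The enumeration, the function name classify_ipv4, and the choice of categories are hypothetical, since each network stack defines its own types; only the tests for the limited broadcast address 255.255.255.255 and the multicast range 224.0.0.0/4 reflect standard IPv4 conventions.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical type codes; a real network stack defines its own set. */
enum addr_type {
    ADDR_LOCAL,
    ADDR_BROADCAST,
    ADDR_MULTICAST,
    ADDR_SUBNET_BROADCAST,
    ADDR_REMOTE
};

/* The codes above need only a few bits of a machine word. */
static_assert(ADDR_REMOTE < 16, "type codes fit in four bits");

/* Classify an IPv4 address. All arguments are in host byte order;
 * local_addr and netmask describe the receiving interface. */
enum addr_type classify_ipv4(uint32_t addr, uint32_t local_addr,
                             uint32_t netmask)
{
    if (addr == 0xFFFFFFFFu)                    /* 255.255.255.255 */
        return ADDR_BROADCAST;
    if ((addr & 0xF0000000u) == 0xE0000000u)    /* 224.0.0.0/4 */
        return ADDR_MULTICAST;
    if (addr == local_addr)
        return ADDR_LOCAL;
    /* Same subnet as the interface, with every host bit set. */
    if ((addr & netmask) == (local_addr & netmask) &&
        (addr | netmask) == 0xFFFFFFFFu)
        return ADDR_SUBNET_BROADCAST;
    return ADDR_REMOTE;
}
```

For an interface at 192.168.1.10 with netmask 255.255.255.0, this sketch treats 192.168.1.255 as a subnet broadcast and 8.8.8.8 as remote.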
Storing the type information with its corresponding IP version 4 compliant 32-bit network address requires more than one machine word on a 32-bit machine. Since a network address uses at least one machine word of 32 bits for IP version 4 and higher, the type information has to be stored in another machine word. Type information requires only a few bits (typically fewer than four bits of a machine word) but is assigned at least one machine word due to the addressing convention used in modern computers. The two machine words encoding the address and its corresponding type should be read as one atomic unit so that intervening write operations do not result in subtle errors due to mismatches between the network address and its corresponding type.
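The two-word layout described above can be sketched as a C structure. The names addr_entry and entry_words are hypothetical, and the size check assumes a conventional platform on which a structure of two 32-bit fields carries no padding.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical cache entry: one word for the address, one for the type. */
struct addr_entry {
    uint32_t addr;  /* IPv4 address fills an entire 32-bit word */
    uint32_t type;  /* only a few bits are used, yet a full word is spent */
};

/* Assumption: no padding between or after the two 32-bit fields. */
static_assert(sizeof(struct addr_entry) == 2 * sizeof(uint32_t),
              "entry spans two 32-bit machine words");

/* Number of 32-bit words one entry occupies. */
unsigned entry_words(void)
{
    return (unsigned)(sizeof(struct addr_entry) / sizeof(uint32_t));
}
```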
As discussed hereinafter, the various choices for network address formats are relevant to the manner in which computing environments store, recall and use addresses along with their associated type information. Computers have a smallest unit of memory termed a machine word that can be directly addressed. The contents of a machine word are interpreted in a context dependent manner. Thus, whether a particular machine word represents an address pointing to another machine word or a network node's address or an integer depends on the particular context.
Computer operations, such as a read from or a write to a memory location, are performed on a machine word rather than a single bit. In a 32-bit machine the smallest unit that can be directly addressed is a 32-bit machine word. A read operation on such a 32-bit word results in all 32 bits being copied to the processor's register in one operation. In other words, the read operation is an atomic operation.
Reading two machine words requires execution of two read operations. It is possible that following the first read operation, but before the second read operation by a first thread, another thread, process or processor may overwrite the memory contents to be read during the second read. This problem becomes more acute in multiprocessor systems.
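The hazard can be reproduced deterministically with a single-threaded sketch in which a callback stands in for the other thread, process or processor that runs between the two reads. All names here (addr_entry, read_entry, writer_replaces_entry, demo_torn_read) and the sample values are hypothetical; a real system would exhibit the same mismatch only under an unlucky interleaving.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct addr_entry {
    uint32_t addr;
    uint32_t type;
};

/* Read an entry with two separate word-sized loads. The callback, if
 * not NULL, simulates another thread running between the two loads. */
struct addr_entry read_entry(const struct addr_entry *src,
                             void (*between_reads)(void))
{
    struct addr_entry out;
    assert(src != NULL);
    out.addr = src->addr;          /* first read */
    if (between_reads != NULL)
        between_reads();           /* intervening activity */
    out.type = src->type;          /* second read */
    return out;
}

static struct addr_entry shared;

/* Simulated writer: replaces both words of the shared entry. */
void writer_replaces_entry(void)
{
    shared.addr = 0xE0000001u;     /* new address */
    shared.type = 2u;              /* new type, valid only for it */
}

/* Returns 1 when the reader ends up with the OLD address paired
 * with the NEW type, i.e. a mismatched (torn) entry. */
int demo_torn_read(void)
{
    struct addr_entry seen;
    shared.addr = 0xC0A8010Au;     /* old address */
    shared.type = 0u;              /* old type */
    seen = read_entry(&shared, writer_replaces_entry);
    return seen.addr == 0xC0A8010Au && seen.type == 2u;
}
```

Each individual read is atomic, yet the pair is not: the reader returns an address and a type that never existed together in memory.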
The problem is not limited to multiprocessor systems and also arises in multitasking systems. For instance, in multitasking computing systems, the operating system allocates limited time slices to each thread on a processor. If the time slice allocated to the first thread expires after the first read but before the second read operation, then the next thread executes several instructions in its time slice. These instructions can include modifications to the location to be read in the second read operation by the first thread, unless the first thread requests the operating system to prevent such access by “locking” the memory. Implementing locks incurs significant overhead and does not scale well as the number of processors in a computing environment increases.
If the two read operations by the first thread are close together, then the probability of an intervening write operation at the location to be read by the second read operation is small, and the second read operation is called a “volatile” read. The volatile read operation can be made more certain by placing a “lock” on the memory location to be read by the second read operation to prevent any other thread from accessing the memory location. However, the overhead for implementing such a lock adversely affects performance.
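A lock-protected read might look like the following sketch, which uses a POSIX mutex as one possible locking mechanism; the text does not prescribe a particular one. The structure and function names are hypothetical. Every reader and writer pays for acquiring and releasing the mutex, which is the overhead referred to above.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>

struct addr_entry {
    uint32_t addr;
    uint32_t type;
};

/* Hypothetical entry guarded by a POSIX mutex. */
struct locked_entry {
    pthread_mutex_t lock;
    struct addr_entry entry;
};

/* Initialize the mutex and the entry; returns 0 on success. */
int locked_entry_init(struct locked_entry *e, uint32_t addr, uint32_t type)
{
    e->entry.addr = addr;
    e->entry.type = type;
    return pthread_mutex_init(&e->lock, NULL);
}

/* Update both words while holding the lock. */
void locked_write(struct locked_entry *e, uint32_t addr, uint32_t type)
{
    int rc = pthread_mutex_lock(&e->lock);
    assert(rc == 0);
    e->entry.addr = addr;
    e->entry.type = type;
    rc = pthread_mutex_unlock(&e->lock);
    assert(rc == 0);
    (void)rc;
}

/* Copy both words while holding the lock, so that no writer using
 * locked_write can run between the two reads. */
struct addr_entry locked_read(struct locked_entry *e)
{
    struct addr_entry out;
    int rc = pthread_mutex_lock(&e->lock);
    assert(rc == 0);
    out = e->entry;
    rc = pthread_mutex_unlock(&e->lock);
    assert(rc == 0);
    (void)rc;
    return out;
}
```

The protection holds only if every access to the entry goes through these functions; a writer that bypasses the mutex can still produce a mismatched pair.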
As mentioned earlier, reading information from a memory location remote from the processor chip is significantly slower than the operating speed of modern processors. Thus, reading two machine words back to back may result in the processor idling for a few cycles in the period between the read operations, since the machine words are retrieved separately from the remote memory. On the other hand, not storing the type information in a cache requires deducing the type information when needed, with several read operations that add to the overhead. In view of the large number of network addresses handled by the network stack, small efficiencies at the level of a single network address, such as using a cache to get the type information corresponding to an address of interest, result in significant savings.
It should be noted that the term cache is used to denote a variety of stores. There are fast hardware cache memories such as the L1 cache and the L2 cache, both associated with the processor and termed CPU caches. These caches represent expensive and fast memories that help bridge the gap between the processor and the basic system memory. In contrast to the CPU caches there are caches implemented as data structures to provide frequently used information without the need to repeat lengthy computations. Type information is an example of information that can be deduced from the context and the network address and may be cached. Accordingly, further improvements in managing a cache of addresses and corresponding type information are needed to make the caching of type information more effective.
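A cache of the second kind, implemented as a data structure, can be sketched as a small direct-mapped table keyed by IPv4 address. This is only an illustration under stated assumptions: the table size, the hash in slot_index, and the names type_cache, cached_type and deduce_multicast are hypothetical, and the deduction function is a placeholder for whatever computation a given stack performs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CACHE_SLOTS 256u   /* hypothetical size; must be a power of two */

struct cache_slot {
    uint32_t addr;   /* address this slot currently describes */
    uint8_t  type;   /* cached type code for that address */
    uint8_t  valid;  /* 0 until the slot has been filled */
};

struct type_cache {
    struct cache_slot slots[CACHE_SLOTS];
    unsigned long misses;  /* number of times the type was recomputed */
};

/* Function that deduces a type code from an address (the slow path). */
typedef uint8_t (*deduce_fn)(uint32_t addr);

/* Fold the four address bytes together to pick a slot. */
static unsigned slot_index(uint32_t addr)
{
    return (addr ^ (addr >> 8) ^ (addr >> 16) ^ (addr >> 24))
           & (CACHE_SLOTS - 1u);
}

/* Return the type for addr, deducing and storing it only on a miss. */
uint8_t cached_type(struct type_cache *c, uint32_t addr, deduce_fn deduce)
{
    struct cache_slot *s;
    assert(c != NULL && deduce != NULL);
    s = &c->slots[slot_index(addr)];
    if (!s->valid || s->addr != addr) {
        s->addr = addr;
        s->type = deduce(addr);
        s->valid = 1;
        c->misses++;
    }
    return s->type;
}

/* Placeholder deduction: 1 for IPv4 multicast (224.0.0.0/4), else 0. */
uint8_t deduce_multicast(uint32_t addr)
{
    return (uint8_t)((addr & 0xF0000000u) == 0xE0000000u);
}
```

A zero-initialized type_cache starts empty. Repeated lookups of the same address reach the deduction function only once, unless a colliding address evicts the slot in between.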