1. Field of the Invention
The present invention relates to network devices and, more particularly, to a method and apparatus for enabling multiple threads and processes to share a stack on a network device.
2. Description of the Related Art
Data communication networks may include various computers, servers, nodes, routers, switches, hubs, proxies, and other devices coupled to and configured to pass data to one another. These devices will be referred to herein as “network devices.” Data is communicated through the data communication network by passing data packets (or cells, frames, or segments) between the network devices by utilizing one or more communication links. A particular packet may be handled by multiple network devices and cross multiple communication links as it travels between its source and its destination over the network.
A network device, like a computer, has a basic set of instructions collectively referred to as the operating system. This operating system provides a number of features to allow application programs to share the resources of the network device. Applications running on a network device will share access to the CPU. Information associated with the application is stored in three areas of memory: the code space, the data space, and the stack space. The code space includes the instructions that make up the process or thread as loaded into memory by the operating system. The data space is the area of memory set aside for the process or thread for temporary storage during execution. The stack space is the portion of the memory that stores the state of execution of the program in the event that execution is interrupted. As used herein, a “process” does not normally share data with any other process, whereas a “thread” may share data with a set of related threads.
To enable multiple processes/threads to execute simultaneously on a single CPU, the operating system will allocate a certain percentage of the total available CPU time to each of the various competing processes/threads. The job of assigning time to application programs is called scheduling. There are numerous methods for scheduling programs, the simplest of which is a round robin approach in which the total CPU time is divided equally among the contending applications. Other more elaborate schemes assign a number, or priority level, to each application so that certain applications obtain increased amounts of CPU time.
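By way of illustration only, the two scheduling approaches described above may be sketched as follows. The function names, the application names, and the priority values are hypothetical, and dividing CPU time in proportion to priority level is merely one assumed way of giving higher-priority applications increased CPU time; it is not asserted to be the policy of any particular operating system.

```python
def round_robin_shares(names):
    """Divide the total CPU time equally among the contending applications."""
    share = 1.0 / len(names)
    return {name: share for name in names}


def priority_shares(priorities):
    """Divide CPU time in proportion to each application's priority level.

    Assumption: a larger number means a higher priority.
    """
    total = sum(priorities.values())
    return {name: level / total for name, level in priorities.items()}


# Hypothetical applications contending for a single CPU.
equal = round_robin_shares(["routing", "management", "logging"])
weighted = priority_shares({"routing": 5, "management": 3, "logging": 2})

print(equal)
print(weighted)
```

In this sketch each application receives one third of the CPU time under the round robin approach, whereas under the assumed proportional scheme the application with priority level 5 out of a total of 10 receives half of the CPU time.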
The operating system implements the scheduling scheme by interrupting the application that is currently running, storing the state of that application in a dedicated area of the stack, retrieving the state of the next application from a separate area of the stack, and restoring the retrieved state so that the next application can resume executing where it last left off. The process of interrupting the current process and restoring the next process will be referred to herein as a context switch.
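The save-and-restore sequence of a context switch may be sketched, purely for illustration, with a toy model. The `Cpu` class, the `saved_states` table (standing in for the dedicated per-application save areas), and the task names "A" and "B" are all hypothetical; an actual context switch is performed by the operating system on hardware registers rather than in application-level code.

```python
class Cpu:
    """Toy CPU whose entire execution state is a program counter and registers."""

    def __init__(self):
        self.pc = 0
        self.registers = [0, 0, 0, 0]


def context_switch(cpu, saved_states, current, nxt):
    """Save the state of `current`, then restore the saved state of `nxt`."""
    # Store the interrupted task's state in its own dedicated save area.
    saved_states[current] = (cpu.pc, list(cpu.registers))
    # Retrieve the next task's state from a separate save area and restore it.
    pc, registers = saved_states[nxt]
    cpu.pc = pc
    cpu.registers = list(registers)
    return nxt


cpu = Cpu()
# Hypothetical save areas: task B was previously interrupted at pc=100.
saved_states = {"A": (0, [0, 0, 0, 0]), "B": (100, [7, 0, 0, 0])}

running = "A"
cpu.pc, cpu.registers[0] = 12, 42      # task A makes some progress
running = context_switch(cpu, saved_states, running, "B")
pc_seen_by_b = cpu.pc                  # B resumes where it last left off
cpu.pc = 104                           # task B makes some progress
running = context_switch(cpu, saved_states, running, "A")
```

After the second switch in this sketch, task A again sees the program counter and register values it had when it was interrupted, which is the property the context switch is intended to guarantee.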
The stack is used to store program register values at the point of each function call and also to provide temporary storage for the called functions. The stack space typically grows as a process or thread recursively calls deeper and deeper into the program. Temporary, incomplete results from one level are kept on hold in the stack while sub-results are computed and returned. In order for the CPU to perform a context switch at an arbitrary point during execution of the process, the operating system must ensure that the state of the interrupted process can be preserved at any such point. This requires the CPU or process to reserve a worst-case stack. In the context of a network device, each process may require from 10 Kbytes to many Mbytes of stack space.
A network device may, at any given point, have dozens or hundreds of processes enabled, each of which is contending for access to the CPU. Since each process will allocate a worst-case stack, a system executing thousands of processes/threads may require between several hundred megabytes and several gigabytes of stack space. Thus, availability of physical memory may start to limit the number of processes/threads that can be run on the network device.
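The arithmetic behind this estimate may be illustrated as follows. The process count of 2,000 and the per-process worst-case stack sizes of 256 Kbytes and 2 Mbytes are assumed values chosen only for the example; they are not measurements of any particular network device.

```python
KIB = 1024
MIB = 1024 * KIB


def total_stack_bytes(process_count, worst_case_stack_bytes):
    """Total memory reserved when every process allocates a worst-case stack."""
    return process_count * worst_case_stack_bytes


# Assumed figures for illustration only.
low = total_stack_bytes(2000, 256 * KIB)
high = total_stack_bytes(2000, 2 * MIB)

print(low // MIB, "MiB")    # total in MiB for the smaller assumed stack size
print(high // MIB, "MiB")   # total in MiB for the larger assumed stack size
```

Under these assumptions the reserved stack space ranges from 500 Mbytes to roughly 4 Gbytes, even though most processes would use only a small fraction of their worst-case allocation at any given time.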
Network devices, such as edge routers and other routers running on the network, are increasingly being required to run more and more processes to provide enhanced services on the network. For example, to increase network security, it is desirable to isolate networks from each other. Complete isolation would ensure complete privacy, but would also require each network to have its own network components such as routers and dedicated lines. As it is economically unfeasible for most users to build an entire network, the concept of virtual routing has been developed. Specifically, instead of providing each network user with a separate network, a virtual router (VR) is provisioned in the network device serving that user. The virtual router, since it does not share data, code or stack space with other virtual routers, is virtually isolated even though the actual physical network device may be used by many other networks and users. A Virtual Private Network (VPN), formed by encapsulating and/or encrypting transmissions passed through the virtual router, may be provisioned through the virtual router to provide a private link through an otherwise public network.
As VPNs and virtual routers become more popular, Internet access routers and other edge network devices are being required to be capable of hosting large numbers of virtual routers. Likewise, routers within the network may be required to handle multiple flows, each of which may be handled by a separate virtual router for redundancy reasons or to facilitate setup of virtual channels or other logical constructs through the network. Accordingly, a network device may need to run hundreds or thousands of processes or threads, each of which requires dedicated stack space. As the number of processes increases, the amount of physical memory required to accommodate the stack allocations of these processes may become excessive.
One attempt to address this issue was to use a common process and handle all instances (via events) in the same process. Since events are able to share stack space, the amount of stack space required to implement the events is much lower than the amount required to implement the same number of processes. One problem with this is that, since events share memory space, a virtual router, if instantiated as a set of events within a common process, is not totally isolated from other virtual routers on the network device. Thus, this solution compromises security. Another problem with this proposed solution is that an incorrect memory access in any one of the events may cause the process to terminate, thus terminating all other events as well. This interdependency enables a minor failure to cause the network device to interrupt service to many end users by effectively requiring the re-instantiation of multiple virtual routers and, potentially, the VPNs configured through them.
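The interdependency described above may be illustrated with a minimal sketch in which several virtual router instances are handled as events within one common loop. The instance names, the handlers, and the use of a Python `MemoryError` as a stand-in for an incorrect memory access are all assumptions made for the example; on an actual device such a fault would typically terminate the process rather than raise a catchable exception.

```python
def run_shared_process(events, handlers, served):
    """Handle the events of every instance inside one common process."""
    for instance, payload in events:
        # An uncaught fault in any handler ends the loop for all instances.
        handlers[instance](payload)
        served.append(instance)


def healthy(payload):
    return payload


def faulty(payload):
    # Stand-in for an incorrect memory access (assumption for illustration).
    raise MemoryError("simulated incorrect memory access")


# Hypothetical virtual router instances sharing one process.
handlers = {"vr1": healthy, "vr2": faulty, "vr3": healthy}
events = [("vr1", "pkt"), ("vr2", "pkt"), ("vr3", "pkt")]

served = []
try:
    run_shared_process(events, handlers, served)
    crashed = False
except MemoryError:
    crashed = True

print(served, crashed)
```

In this sketch the fault in the second instance prevents the third instance from ever being served, although the third instance itself contains no error.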
Another attempt to reduce memory requirements is to construct the applications such that two or more processes can share stack space. While this is possible in certain circumstances, doing so is difficult because the event loops from both applications must be carefully integrated into a single loop which selects and dispatches the union of all events in the system. This labor-intensive exercise, while potentially feasible when a small number of processes are involved, becomes prohibitively complex as the number of processes increases.
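A single loop that selects and dispatches the union of all events may be sketched as follows. The application names, the event names, and the `dispatch_table` structure are hypothetical and are shown only to indicate why every event of every application must be known to the one merged loop.

```python
from collections import deque


def merged_event_loop(queue, dispatch_table):
    """Select and dispatch the union of all events from all applications."""
    log = []
    while queue:
        app, event = queue.popleft()
        # Every (application, event) pair needs an entry in the one table.
        handler = dispatch_table[(app, event)]
        log.append(handler())
    return log


# Hypothetical handlers taken from two formerly separate event loops.
dispatch_table = {
    ("app_a", "timer"): lambda: "app_a handled timer",
    ("app_a", "packet"): lambda: "app_a handled packet",
    ("app_b", "packet"): lambda: "app_b handled packet",
}

queue = deque([("app_a", "timer"), ("app_b", "packet"), ("app_a", "packet")])
log = merged_event_loop(queue, dispatch_table)
print(log)
```

Each additional application adds its own entries to the shared table and its own events to the shared queue, which suggests why the integration effort grows with the number of processes being combined.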