1. Technical Field
This invention relates to an operating system kernel configured with independent isolated environments that appear as virtual machines within the operating system. More specifically, the invention relates to modifying the kernel to support sharing of resources between the isolated environments.
2. Description of the Prior Art
An operating system is a collection of system programs that control the overall operation of a computer system. An operating system that supports containers may run on a physical computer, a logical or physical partition, or a virtual machine hypervisor. In such an operating system, containers effectively partition the resources managed by a single operating system into isolated groups to balance conflicting demands on resource usage between the isolated groups. Containers can run instructions native to the core CPU without any special interpretation mechanisms. By providing a way to create and enter containers, an operating system gives applications the illusion of running on a separate machine while at the same time sharing many of the underlying resources. For example, the page cache of common files may effectively be shared because all containers use the same kernel and, depending on the container configuration, frequently the same libraries. This sharing can often extend to other files in directories that do not need to be written to. The savings realized by sharing these resources, while also providing isolation, mean that containers have significantly lower overhead than true virtualization.
A container is built around the concept of a namespace, which is a feature of the kernel that allows different processes to have different views of the file system mounts, network, processes, inter-process communication (shared memory, message queues, pipes), or other subsystems. A subsystem object identifier is present and searchable only in a particular instance of the namespace, so the same identifier may be used in another namespace without conflict. In effect, a namespace allows multiple instances of the same subsystem identifier to exist on the same operating system. Each type of subsystem identifier therefore defines a separate namespace type. This enables a task to be associated with specific namespaces, thereby confining its access to specific objects, while other tasks may similarly exist in other namespaces. The namespace encapsulates kernel variables to ensure that they will not interfere with commands and variables of other namespaces. In practice, namespaces are dynamic in that objects can be added and deleted at any time. Namespaces are arranged in a hierarchy, including parent namespaces, child namespaces, and so on. The separation and isolation of namespaces prevents sharing of resources. In other words, in order for processes to share resources, they must be in the same namespace. At the same time, the isolation of the namespace prevents efficient communication. Processes in different namespaces must communicate over the network to break the isolation.
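The namespace behavior described above can be illustrated with a toy model. The following Python sketch is not kernel code; the `Namespace` class, its methods, and the identifiers used are hypothetical names introduced only for illustration. It assumes that each namespace instance keeps a private table of identifiers and that a lookup consults only that table.

```python
class Namespace:
    """Toy model of one namespace instance: a private table of identifiers."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent      # parent namespace, if any
        self.children = []        # child namespaces
        self._objects = {}        # identifier -> object, visible only here
        if parent is not None:
            parent.children.append(self)

    def add(self, identifier, obj):
        # Identifiers must be unique only within this instance.
        if identifier in self._objects:
            raise KeyError(f"{identifier!r} already exists in {self.name}")
        self._objects[identifier] = obj

    def remove(self, identifier):
        # Namespaces are dynamic: objects can be deleted at any time.
        del self._objects[identifier]

    def lookup(self, identifier):
        # The search is confined to this instance; no other namespace is consulted.
        return self._objects.get(identifier)


root = Namespace("root")
ns_a = Namespace("container-a", parent=root)
ns_b = Namespace("container-b", parent=root)

# The same identifier (here, process ID 1) exists in both namespaces without conflict.
ns_a.add(1, "init of container A")
ns_b.add(1, "init of container B")

# An object added to one namespace is not visible from the other.
ns_a.add("shm-42", "shared memory segment")
```

In this model, `ns_a.lookup(1)` and `ns_b.lookup(1)` return different objects, while `ns_b.lookup("shm-42")` returns `None`, which mirrors why tasks in different namespaces cannot share a resource by its identifier.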
There is a parallel jobs scheduling system in the art that allows users to run more jobs in less time by matching each job's processing needs and priority with the available resources, thereby maximizing resource utilization. However, for an operating system employing containers, a task within a container is bound within the associated namespace. FIG. 1 is a flow chart (100) illustrating a prior art process for establishing a communication connection across containers. As shown, a first container is created with a first isolated namespace (102), and a first socket, in listening mode to accept connections, is created in that namespace (104). A socket is a software object that connects an application to a network protocol. For example, in a UNIX operating system environment, a program can send and receive TCP/IP messages by opening a socket and reading and writing data to and from the socket. Following the creation of the socket at step (104), a second container is created with a second isolated namespace (106). A second socket is created in the second namespace (108) and the second container requests a connection to the first socket (110) created at step (104). As shown, a socket exists on both sides of the connection, with one socket configured in a LISTEN mode to accept a connection, and the second socket configured to solicit the connection. However, the operating system blocks the second container from discovering the previously created socket because this socket is in the namespace of the first container. Accordingly, the connection request of the second container fails (112) due to the separation of the network namespaces.
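The flow of FIG. 1 can be sketched as a small simulation. The following Python example is a model under stated assumptions, not an implementation of any operating system: the `Container` class, the address `10.0.0.1`, and the port `5000` are hypothetical, and the use of `ConnectionRefusedError` is an assumption about how the failure might surface to the requesting program. Each container is modeled as holding a private table of listening sockets that stands in for its isolated network namespace.

```python
class Container:
    """Toy container that owns one isolated network namespace."""

    def __init__(self, name):
        self.name = name
        self.listeners = {}   # address -> listening socket, private to this namespace

    def listen(self, address):
        # Step (104): create a socket in LISTEN mode inside this namespace.
        self.listeners[address] = f"listening socket of {self.name}"
        return self.listeners[address]

    def connect(self, address):
        # Step (110): the lookup consults only the caller's own namespace.
        if address not in self.listeners:
            raise ConnectionRefusedError(
                f"{self.name}: no listener at {address} in this namespace"
            )
        return self.listeners[address]


first = Container("container-1")        # step (102)
first.listen(("10.0.0.1", 5000))        # step (104)
second = Container("container-2")       # step (106); the client socket (108) is implicit

try:
    second.connect(("10.0.0.1", 5000))  # step (110)
    outcome = "connected"
except ConnectionRefusedError:
    outcome = "failed"                  # step (112)
```

Under these assumptions, `outcome` is `"failed"` because the listening socket is registered only in the first container's table, whereas a connection request issued from within the first container would find it.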
As demonstrated in FIG. 1, different containers having different isolated namespaces cannot share a socket. More specifically, there is no cross-container communication support in the prior art, which reduces the efficiency of resource utilization. Accordingly, there is a need for a mechanism that supports parallel jobs scheduling and enables cross-container communication.