A conventional computer system or other form of processing system can include multiple virtual address domains. A “domain” is defined herein as a protected address space. By “protected”, what is meant is that unauthorized writes to the address space by any source other than the entity that owns the address space are not allowed. Every domain is owned by a separate processing entity of some form. Such a processing entity can be, for example, a virtual machine (VM) in a virtualization environment, or a process or thread in a traditional operating system context.
Efficient communication between domains, or inter-domain communication (IDC), is an important feature in a processing system in which domains cooperate to create a cohesive, high-throughput, I/O-sensitive server application. An example of such an application is a network storage server. The system's functionality might be partitioned into domains for the purpose of fault isolation between components, for example. In this scenario, by design, the domains are likely to communicate extensively with each other. Shortcomings in IDC performance would therefore tend to result in poor performance of the overall system.
Current IDC implementations are usually based on some form of shared-memory scheme. Shared-memory mechanisms are an advantageous way to implement IDC, since they need not involve creating extra copies of the data being communicated and can be implemented with low overhead in the critical path of data motion. The protocol used to implement the communication usually involves an exchange of messages containing pointers to the shared region of memory, coupled with a signaling mechanism. Since the messages can contain pointers to the shared data, the bulk of the data transfer can be implemented in a zero-copy fashion, thereby improving performance. The actual exchange of messages can be implemented using some form of asynchronous communication utilizing shared producer-consumer queues between the domains.
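The message exchange described above can be illustrated with a minimal single-process sketch. It is not taken from any particular IDC implementation: the names `SharedRegion`, `Descriptor`, and `DescriptorQueue` are hypothetical, a `bytearray` stands in for the shared memory, and the signaling mechanism and the memory-ordering guarantees that a real cross-domain queue would need are omitted.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Descriptor:
    """Message exchanged between domains: it points at data, it does not carry it."""
    offset: int  # start of the payload within the shared region
    length: int  # payload size in bytes


class SharedRegion:
    """Stand-in for a memory region mapped into both domains (assumption)."""

    def __init__(self, size: int) -> None:
        self.buf = bytearray(size)
        self.next_free = 0

    def write(self, payload: bytes) -> Descriptor:
        """Producer side: place the payload once and return a descriptor for it."""
        if self.next_free + len(payload) > len(self.buf):
            raise MemoryError("shared region exhausted")
        offset = self.next_free
        self.buf[offset:offset + len(payload)] = payload
        self.next_free += len(payload)
        return Descriptor(offset, len(payload))

    def view(self, desc: Descriptor) -> memoryview:
        """Consumer side: access the payload in place, without copying it."""
        return memoryview(self.buf)[desc.offset:desc.offset + desc.length]


class DescriptorQueue:
    """Bounded producer-consumer ring of descriptors (single-threaded sketch)."""

    def __init__(self, capacity: int) -> None:
        self.slots: list = [None] * capacity
        self.head = 0  # count of descriptors consumed so far
        self.tail = 0  # count of descriptors produced so far

    def try_enqueue(self, desc: Descriptor) -> bool:
        if self.tail - self.head == len(self.slots):
            return False  # queue full; the producer must retry later
        self.slots[self.tail % len(self.slots)] = desc
        self.tail += 1
        return True

    def try_dequeue(self) -> Optional[Descriptor]:
        if self.head == self.tail:
            return None  # queue empty
        desc = self.slots[self.head % len(self.slots)]
        self.head += 1
        return desc
```

In this sketch only the small descriptor passes through the queue; the payload is written once by the producer and read in place by the consumer through `view`, which is the sense in which the transfer is zero-copy.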
Shared memory mechanisms entail importing and exporting of address spaces between different domains. Each domain typically has a virtual-to-physical address translation table hierarchy (or simply “translation table” herein), which the domain uses to translate between virtual addresses and physical addresses. The number of levels in the translation table is an architecture-specific value that depends on the addressable range (e.g., 32-bit or 64-bit). Importing an address space generally implies that the translation table at the target domain needs to be populated with the translation table entries from the source domain at an appropriate offset in its virtual address space, in addition to the translation entries for its own memory.
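As a rough illustration of the import operation just described, the following sketch models a translation table as a flat mapping from virtual page number to physical frame number. This is a deliberate simplification of the multi-level hierarchy, and the names `import_region`, `translate`, and `PAGE_SIZE` are hypothetical rather than part of any real memory-management interface.

```python
PAGE_SIZE = 4096  # assumed page size for this sketch


def import_region(target_table: dict, source_table: dict,
                  src_base_vpn: int, n_pages: int, dst_base_vpn: int) -> None:
    """Copy n_pages translation entries from the source domain's table into
    the target domain's table, starting at dst_base_vpn in the target's
    virtual address space. The target's own entries are left untouched."""
    # Refuse the import up front if it would overwrite an existing mapping.
    for i in range(n_pages):
        if dst_base_vpn + i in target_table:
            raise ValueError("target virtual page already mapped")
    for i in range(n_pages):
        target_table[dst_base_vpn + i] = source_table[src_base_vpn + i]


def translate(table: dict, vaddr: int) -> int:
    """Translate a virtual address to a physical address using one table."""
    vpn, offset = divmod(vaddr, PAGE_SIZE)
    return table[vpn] * PAGE_SIZE + offset
```

After an import, the same physical byte is reachable from both domains, but generally at different virtual addresses, because the offset chosen in the target's virtual address space need not match the source's.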
If the imported data is mapped to a different memory region in the target domain relative to the source domain, then there are ramifications for the protocol/messages sent as part of IDC. The messages that constitute IDC will contain either relative pointers (i.e., offsets from the base of the shared region) or absolute pointers that need to be translated into the appropriate target context.
To be effective, memory-sharing-based IDC assumes the use of a low-overhead address translation mechanism between the domains. Yet both of the above approaches (i.e., relative pointers and absolute pointers) have shortcomings. In the relative pointer approach, the pointers obtained by the target domain from a source domain can be passed transparently to a third domain. This ability, i.e., transitivity across multiple domains, is critical in certain network storage systems. However, a relative pointer must still be converted to an address in the local context before data access, and this additional cost makes the approach inefficient. Absolute pointers, on the other hand, require translation to the target context before dereferencing (accessing the data), because the mapped memory regions differ from one domain to another. This translation is commonly referred to as pointer “swizzling” and adds overhead in the critical path of data motion.
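The trade-off between the two pointer forms can be made concrete with a small sketch. The helper names (`to_relative`, `resolve_relative`, `swizzle`) and the example base addresses are assumptions chosen for illustration only; each domain is assumed to map the same shared region at a different base virtual address.

```python
def to_relative(abs_ptr: int, base: int) -> int:
    """Convert an absolute pointer into an offset from the shared region's base."""
    return abs_ptr - base


def resolve_relative(rel_ptr: int, local_base: int) -> int:
    """Per-access cost of relative pointers: add the local base before use."""
    return local_base + rel_ptr


def swizzle(abs_ptr: int, src_base: int, dst_base: int) -> int:
    """Per-hop cost of absolute pointers: rebase from the source mapping
    to the destination mapping before the pointer can be dereferenced."""
    return abs_ptr - src_base + dst_base


# Hypothetical bases at which domains A, B, and C map the shared region.
BASES = {"A": 0x1000, "B": 0x8000, "C": 0x20000}

# Relative form: the same value travels A -> B -> C unchanged, and each
# domain resolves it against its own base when it accesses the data.
rel = to_relative(0x1040, BASES["A"])
addr_in_c_via_relative = resolve_relative(rel, BASES["C"])

# Absolute form: the value must be rewritten at every hop.
abs_in_b = swizzle(0x1040, BASES["A"], BASES["B"])
addr_in_c_via_absolute = swizzle(abs_in_b, BASES["B"], BASES["C"])
```

Both paths arrive at the same address in domain C. They differ in where the arithmetic is paid: on every data access for relative pointers, and on every transfer between domains for absolute pointers.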
Hence, it is believed that the prior art does not provide an efficient way to perform zero-copy transitive communication of data between multiple domains in a processing system.