1. Field of the Invention
The present invention relates to logging events from different contexts, and more particularly, to maintaining logs of the events occurring in virtual environments, such as Virtual Machines. The present invention is also related to organizing interprocess communication as it relates to concurrent use of physical memory, and more particularly, to managing concurrent threads and processes of virtual machines working in different contexts, where it is undesirable for these contexts to block each other. More generally, the present invention is applicable to computer programs that work on multiple-processor or multi-core processor architectures, where different contexts need to buffer their data into the same address space at the same time, independently of each other.
2. Description of the Related Art
With Virtual Machine (VM) technology, a user can create and run multiple operating environments on a host server at the same time. Each virtual environment, or Virtual Machine, requires its own operating system (OS) and can run execution contexts independently. The VM software provides a layer between the computing, storage, and networking hardware and the software that runs on it.
Each VM acts as a separate execution environment, which reduces risk and allows developers to quickly re-create different operating system (OS) configurations or compare versions of applications designed for different OSs, as long as the integrity of the data used by each of the VMs is maintained. Generally, a Virtual Machine is an environment that is launched on a particular processor (a client machine) that is running a host operating system (HOS) and is connected to a data storage located on a server that stores VM data.
Each VM can have several execution contexts with events that need to be logged. The contexts can be Virtual Machines (VMs) and various applications. Currently, there are a number of conventional methods that relate to event logging. However, in terms of virtualization, logging of the events executed within different contexts presents considerable challenges. Recording the event logs from different contexts can trigger stopping (locking) of the contexts when the logs of the events occurring in different contexts (i.e., in different VMs) are recorded in parallel. Conventionally, a context that writes data into a common log needs to receive a notification that a file is available for writes. Writing into a file blocks the context, so the context log cannot be recorded without locking the context. Keeping separate logs within different contexts raises a log synchronization problem, because the logs have different addresses within different contexts.
Modern computer architectures experience certain difficulties when it comes to increasing their performance and guaranteeing, through software mechanisms, optimal management of concurrently executed threads. Standard synchronization primitives assume blocking of a resource: when one process addresses a particular memory location or space, or more generally executes some code that requires a particular resource, other processes are switched to a waiting mode.
First, blocking, in and of itself, is already a form of slowdown of the process being executed, since the processes need to execute sequentially, waiting for their turn once a particular process “grabs” a resource (for example, by setting a “busy” flag for the right to address a particular memory area). Some of the popular synchronization primitives are semaphores, mutexes, and monitors. If a computer has a single processor, then a queue of processes with blocking is the only method for ordering access to the memory when different processes attempt access at the same time. If the computer has a multi-core architecture, or has multiple processors, then such a process queue is no longer the optimal solution, since as many processes can execute as there are cores or processors in the system.
Second, the blocking approach cannot always be applied. For example, when different contexts compete with each other, this approach is problematic. Thus, if the basic spinlock synchronization primitive is used, then those contexts that compete with the owner of the spinlock have to remain idle, since they also require the spinlock.
Third, the blocking algorithm can sometimes produce deadlock, a form of dead-end situation in which each process of a group awaits an event that only another process from the same group can generate. In this case, when there is a problem with a process that grabs the spinlock, all competing processes “hang”.
Fourth, there are difficulties that are particularly relevant to multi-core and multiprocessor architectures. For example, a process running on one core cannot affect, or put into a queue, the scheduler of another core. The scheduler, in this case, has higher priority. As a result, the scheduler is likely to damage the process of a neighboring core, even if that process uses standard synchronization primitives.
With virtual machines, restrictions on the use of standard blocking methods for synchronizing processes are even more strict. The primary reason for this is that virtual machines all work in different contexts. In a virtual machine, there is a particular problem regarding logs of events, and in particular, a log of events of competing processes generally. Specifically, it is not possible to enforce mutual exclusion among writers from different contexts using standard methods, since a context can be the host operating system, the guest operating system, or a hypervisor (which might “live” in a separate context), possibly with interrupts blocked.
Thus, any non-blocking synchronization algorithm has to have three levels of guarantees, listed here from the weakest to the strongest:
Obstruction-free writing: a process or thread, launched at any time, finishes its work in a specified number of steps, given that execution of competing threads is on hold. Synchronization using mutexes, for example, fails to satisfy even this weakest requirement.
Lock-free operation: a thread can theoretically run in an infinite loop, but each iteration means that some other thread has completed some discrete action. In other words, the work of the system as a whole has not stopped.
Wait-free operation: each operation is performed in a finite number of steps, which does not depend on other threads. This is the strictest guarantee of progress.
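The difference between the lock-free and wait-free levels can be illustrated with a minimal sketch. Everything in it is hypothetical and is not taken from the present application: the `AtomicInt` class only simulates an atomic memory cell, and its internal lock merely stands in for a single hardware compare-and-swap or fetch-and-add instruction, which a real implementation would use instead.

```python
import threading


class AtomicInt:
    """Simulated atomic integer (hypothetical helper).

    The internal lock only models the indivisibility of one hardware
    instruction; it is not part of the technique being illustrated.
    """

    def __init__(self, value=0):
        self._value = value
        self._guard = threading.Lock()

    def load(self):
        with self._guard:
            return self._value

    def compare_and_swap(self, expected, new):
        # Store `new` only if the cell still holds `expected`.
        with self._guard:
            if self._value == expected:
                self._value = new
                return True
            return False

    def fetch_and_add(self, delta):
        # Add `delta` and return the previous value in one step.
        with self._guard:
            old = self._value
            self._value = old + delta
            return old


def lock_free_increment(counter):
    """Lock-free style: the loop may repeat, but every failed
    compare-and-swap means some other thread's update succeeded."""
    retries = 0
    while True:
        seen = counter.load()
        if counter.compare_and_swap(seen, seen + 1):
            return retries
        retries += 1


def wait_free_increment(counter):
    """Wait-free style: one step, regardless of other threads."""
    return counter.fetch_and_add(1)


def run(worker, threads=4, per_thread=1000):
    """Run `worker` concurrently and return the final counter value."""
    counter = AtomicInt()

    def body():
        for _ in range(per_thread):
            worker(counter)

    pool = [threading.Thread(target=body) for _ in range(threads)]
    for t in pool:
        t.start()
    for t in pool:
        t.join()
    return counter.load()
```

In both variants no increment is lost. The lock-free variant may retry an unbounded number of times under contention, whereas the wait-free variant completes each increment in a fixed number of steps.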
An important aspect of the present invention is guaranteeing that, even where a user's writing of data into a buffer is spread over time, other users do not need to be idle during that time, but can also write their data into the buffer.
The algorithm for working with competing contexts described in the present application satisfies all these requirements.
An important characteristic of the ring buffer, which is a key ingredient for non-blocking implementation, is the fact that when reaching the last element of the array of memory cells of a buffer, both the writer and the reader, independently of each other, return back to the first element of the array. Most practical uses of the ring buffer focus on the situation of a single writer and a single reader. However, these conventional algorithms cannot be scaled to a situation with multiple writers and/or multiple readers. A number of first in first out (FIFO) solutions exist for buffers located in physical memory. For example, U.S. Pat. No. 7,925,804 addresses the slowdown of data transmission from one bus to another by accumulating requests in an intermediate FIFO buffer, with a subsequent transmission of all the data for all requests as if it were a single request.
U.S. Pat. No. 8,015,367 describes working with memory given different contexts, by translating the address space of the context (i.e., of each virtual machine) into the host OS memory, and using a count buffer to store information about the number of entries from each context.
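The wrap-around behavior described above, for the conventional case of a single writer and a single reader, can be sketched as follows. This is a minimal illustration under stated assumptions, not code from any of the cited patents: the class name `RingBuffer`, its fields, and the choice to reject writes when the buffer is full (rather than overwrite) are all hypothetical.

```python
class RingBuffer:
    """Single-writer, single-reader ring buffer (illustrative sketch).

    `write` and `read` count the total number of items written and read;
    the cell actually used is the counter modulo the capacity, so both
    sides return to the first cell after passing the last one.
    """

    def __init__(self, capacity):
        self.cells = [None] * capacity
        self.capacity = capacity
        self.write = 0  # advanced only by the writer
        self.read = 0   # advanced only by the reader

    def put(self, item):
        """Store an item; return False if the buffer is full."""
        if self.write - self.read == self.capacity:
            return False
        self.cells[self.write % self.capacity] = item
        self.write += 1
        return True

    def get(self):
        """Return the oldest item, or None if the buffer is empty."""
        if self.read == self.write:
            return None
        item = self.cells[self.read % self.capacity]
        self.read += 1
        return item
```

Because each counter is modified by exactly one side, this arrangement needs no lock for one writer and one reader. With several writers, two of them could claim the same cell, which is why the scheme does not scale as is.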
U.S. Pat. No. 6,904,475 contemplates the use of a FIFO buffer to temporarily store a data stream from the IEEE 1394 bus intended for IO devices, and for processing these streams prior to outputting them, based on instructions received in real time from a user application.
U.S. Pat. No. 7,945,761 describes a method and a system for maintaining validity of cached mappings corresponding to virtual addresses in guest page tables. When creating the FIFO buffer, memory mapping is used, where a region in a virtual memory of a virtual machine is made to correspond to a region in a memory of the host.
U.S. Pat. No. 7,117,481 describes a system of concurrent access by different processes to the same resource, where the processes belong to different domains. In this patent, the semaphore synchronization primitive is used, with mutual blocking of competing processes.
U.S. Pat. No. 8,099,546 describes a mechanism for a lockless ring buffer in an overwrite mode, which includes aligning the addresses in a memory for each page of the ring buffer, in order to mask bits in the addresses, which are used as a flag representing the state of the page, and using the two least significant bits of the addresses to show the state of the flag of the page. The state can be one of three possibilities: header, update and normal. The described method includes a combined action: (a) moving the head page pointer to the head page pointer of the ring buffer, with cropping of the head page and the page being read; (b) changing the state of the flag of the head page into the normal state; (c) changing the state of the flag of the next page, after the head page, to the header state; and (d) moving the head and tail pages of the buffer, which means resetting the flags representing the states of one or more of the pointers of the buffer pages, associated with the head and tail pages.
U.S. Pat. No. 8,127,074 describes a mechanism for a reader page for a ring buffer, where a block of information from storage is separated from ring buffer storage in the form of a page, for a reader of the buffer. The ring buffer is located in physical memory, and the copying is done so that the readers' page becomes part of the ring buffer, and the head page no longer belongs to the buffer.
U.S. Pat. No. 8,271,996 describes a method of event notifications, generated by writers, for those readers who subscribe to the notifications, and without the use of kernel space. Everything is performed in user space by creating a ring buffer in shared memory. Each event can be executed in its own address space.
U.S. Patent Publication No. 2009/0204755 describes a multi-reader, multi-writer lock free ring buffer, and also describes a non-blocking algorithm for working with the ring buffer. The algorithm uses indices of writers and readers. Each writer and reader has its own reserved index value and done index value. These represent a cell that is reserved for some action, and a cell upon which the action has already been performed. The algorithm constantly compares them, and based on the comparison, moves the position pointer for writing or reading.
In this publication, the algorithm contemplates only a relatively short time for writing into the buffer by a single writer. Therefore, it does not permit writing large amounts of data on each iteration, since this can lead to blocking, where other writers will have to be idle. That means that each successive writer, in this algorithm, waits for the previous writer to finish. What is needed is a more universal algorithm, where a large amount of data being written by one writer does not prevent others from writing as well.
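The limitation described above can be illustrated with a deliberately simplified sketch. It is not the algorithm of the cited publication: the class `ReserveCommitLog`, its single shared `reserved` and `done` counters, and the rule that cells become visible strictly in reservation order are assumptions made only to show why a slow writer can hold up later ones. The sketch is single-threaded and omits the atomic operations a concurrent version would need.

```python
class ReserveCommitLog:
    """Reserve-then-commit buffer with in-order commit (hypothetical)."""

    def __init__(self, capacity):
        self.cells = [None] * capacity
        self.reserved = 0  # next cell to hand out to a writer
        self.done = 0      # every cell below this index is readable

    def reserve(self):
        """Claim the next cell for a writer."""
        slot = self.reserved
        self.reserved += 1
        return slot

    def write(self, slot, data):
        """Fill a previously reserved cell; this may take a long time."""
        self.cells[slot % len(self.cells)] = data

    def try_commit(self, slot):
        """Publish a cell, but only once all earlier cells are published."""
        if self.done != slot:
            return False  # an earlier writer has not finished yet
        self.done = slot + 1
        return True


log = ReserveCommitLog(4)
slow = log.reserve()            # first writer reserves, then stalls
fast = log.reserve()            # second writer reserves the next cell
log.write(fast, "fast entry")
blocked = not log.try_commit(fast)  # second writer must wait for the first
log.write(slow, "slow entry")
log.try_commit(slow)            # first writer finishes
log.try_commit(fast)            # only now can the second writer publish
```

In this sketch the second writer finishes copying its data first, yet cannot publish until the first writer commits, which is the kind of idling the text above seeks to avoid.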
U.S. Patent Publication No. 2010/0332755 describes a hardware and software solution to improve synchronization between threads in a system having a multi-core processor. The hardware part includes a first processor and a second processor, and a common ring buffer stored in a memory, for data storage. Also, the memory stores global variables associated with accessing the ring buffer. The first processor core launches a first thread, and has a first cache associated with it. The first cache stores a first set of local variables associated with the first processor core. The first thread controls writing of the data into the shared ring buffer, using one global variable and a first set of local variables. The second processor core launches a second thread, and has a second cache associated with it. The second cache stores a second set of local variables associated with the second processor core. The second thread controls reading of the data from the shared buffer, using at least one global variable and a second set of local variables.
U.S. Patent Publication No. 2011/0131352 describes a method for writing into a limited ring buffer. A network adapter can determine that the data is ready for writing into the ring buffer, and after that, once the network adapter determines that the read index is not equal to the write index, then this data is ready for writing into the buffer. The network adapter writes the data into the memory, which is pointed to by the write index on a physical storage medium. The memory that is pointed to by the index has an offset, and the memory includes the data itself and a validity bit. The network adapter writes the time of the index entry into the validity bit, and then adds one to the entry after writing the data into the memory.
Accordingly, a method and system for recording common logs of the context events without stopping or slowing down (locking) the context execution is desired.