Digital electronic devices often rely on shared access to a single computational resource, for example a mathematical calculation unit (e.g. to calculate trigonometric functions, perform fast multiplication, etc.), a search unit (e.g. a special purpose hash function, a binary tree search, etc.), and the like. The main reasons for relying on a shared resource are that it is generally too expensive to duplicate a complex resource, and that even where duplication is possible it may itself cause coherency issues, especially if multiple devices attempt to update a resource at the same time (e.g. deadlock issues, stale data, etc.).
In a typical digital electronic device, access to a shared resource is via a common bus, which is managed by a set of bus protocols. These protocols regulate when data (in the form of a service request) can be written to the resource, and provide an acknowledgement of the request once it has been accepted. A problem arises, however, when multiple devices require very fast access to a single computational resource.
Some digital electronic devices include multiple digital components which require fast, efficient access to a shared resource. In this situation, the standard prior art bus protocol schemes are often inadequate. Such schemes typically prioritize requests and make one or more devices wait until a first access by a first device is completed.
Other digital electronic devices are specifically designed to perform digital processing operations in parallel by using parallel execution units. In many instances, it is advantageous that such units share a single resource in order to access a common function, operation or data structure. Prior art bus protocol and access schemes defeat the objective of performing digital processing operations in parallel, because they serialize access to the shared resource: each access by one requester blocks all other accesses until it completes.
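The cost of serialized access can be illustrated with a minimal sketch. It assumes a bus that serves one access at a time, a constant service time per access, and a fixed-priority tie-break by requester index. The function name `serial_bus_schedule` and its parameters are hypothetical and are not taken from any particular bus protocol.

```python
def serial_bus_schedule(arrivals, service_cycles):
    """Model a bus on which each access blocks all others until it completes.

    arrivals: clock cycle at which each requester issues its request.
    service_cycles: cycles the resource is held per access (assumed constant).
    Returns {requester_index: (start_cycle, finish_cycle, stall_cycles)}.
    """
    # Grant in order of arrival; ties go to the lowest requester index
    # (an assumed fixed-priority scheme).
    order = sorted(range(len(arrivals)), key=lambda i: (arrivals[i], i))
    bus_free_at = 0
    schedule = {}
    for i in order:
        start = max(arrivals[i], bus_free_at)   # wait until the bus is released
        finish = start + service_cycles
        schedule[i] = (start, finish, start - arrivals[i])
        bus_free_at = finish
    return schedule


if __name__ == "__main__":
    # Four parallel execution units all request in cycle 0; each access takes 4 cycles.
    for unit, (start, finish, stall) in serial_bus_schedule([0, 0, 0, 0], 4).items():
        print(f"unit {unit}: start={start} finish={finish} stalled={stall}")
```

Under these assumptions, four units requesting in the same cycle are stalled for 0, 4, 8 and 12 cycles respectively, so the units do not proceed in parallel while they wait for the resource.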
Thus, there exists a problem with respect to the sharing of a single resource between multiple accessors (or requesters). Provided the shared resource has sufficient bandwidth to handle multiple requests from multiple devices within a given time frame, it is desirable that requesters are not stalled waiting for the single resource to fulfill a request. The requesters are likely to have other operations to perform, and it is inefficient to stall a requester while it waits for its request to be accepted (which may take a number of clock cycles, depending on the number of other simultaneous requesters). It is even more inefficient to stall a requester while it waits for a result from the resource (which may take many clock cycles). These conditions regularly arise when the requesters are asynchronous, such that each requester can generate a request at any time.
One prior art solution to the problem is to implement a buffer, for example a FIFO, on the input to the shared resource such that requests are temporarily held until they can be processed. However, if the buffer can only accept one request per clock cycle, then the system is still forced to use an arbitration process and an acknowledgement protocol from the resource to each requester, which may again result in the stalling of requesters until the request can be stored in the buffer. This solution also adds complexity to each requester.
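The limitation of a single-write-port buffer can be sketched in a short cycle-by-cycle simulation. This is an illustrative model only: it assumes a fixed-priority arbiter (lowest requester index wins), a FIFO that accepts at most one request per clock cycle, and a resource that removes one buffered request every `drain_period` cycles. The function name `simulate_single_port_fifo` and its parameters are hypothetical.

```python
from collections import deque


def simulate_single_port_fifo(request_cycles, depth, drain_period):
    """Return the stall (cycles from request to acknowledgement) per requester.

    request_cycles: cycle at which each requester raises its request.
    depth: FIFO capacity in entries (must be >= 1).
    drain_period: the resource pops one entry every drain_period cycles (>= 1).
    """
    if depth < 1 or drain_period < 1:
        raise ValueError("depth and drain_period must be at least 1")
    n = len(request_cycles)
    fifo = deque()
    stalls = [None] * n          # None means "not yet acknowledged"
    cycle = 0
    while any(s is None for s in stalls):
        # The shared resource consumes one buffered request on its drain cycles.
        if fifo and cycle % drain_period == 0:
            fifo.popleft()
        # Requesters that have raised a request and are still waiting for an ack.
        waiting = [i for i in range(n)
                   if stalls[i] is None and request_cycles[i] <= cycle]
        # Single write port: at most one request is accepted in this cycle.
        if waiting and len(fifo) < depth:
            winner = waiting[0]  # assumed fixed priority: lowest index wins
            fifo.append(winner)
            stalls[winner] = cycle - request_cycles[winner]
        cycle += 1
    return stalls


if __name__ == "__main__":
    # Four requesters raise requests in the same cycle; the FIFO has ample depth.
    print(simulate_single_port_fifo([0, 0, 0, 0], depth=8, drain_period=1))
```

Even with a FIFO deep enough to hold every request, the four simultaneous requesters in this model are acknowledged in successive cycles (stalls of 0, 1, 2 and 3 cycles), because only one request can be written into the buffer per clock cycle.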
Thus the prior art is problematic in that systems are often constrained in situations where parallel execution units require access to a shared resource. Additionally, the prior art is problematic in that even when requests from multiple devices are buffered, such requests can only be buffered one at a time, still forcing the multiple devices to wait their turn as requests from other devices are buffered. Both of these situations act to unnecessarily stall requesters, thus causing system inefficiencies.