The most common application of shared digital hardware is the time-shared computer. This well-known method allows a single CPU to be shared among multiple processes (the terms "users", "channels" or "tasks" may be used interchangeably for "processes"). In most applications, the contents of the CPU registers are "swapped" out to RAM during a context switch. Typically, this switch takes many clock cycles, i.e., several clock cycles for each register being saved, and there may be 20 or more registers to save.
More specific to the present invention are applications involving the sharing of pipelined parallel digital hardware, as shown in FIG. 4, which illustrates alternating processing devices P and registers R (e.g., flip-flops). In these applications, RAM is attached directly to each register in the device to allow more rapid context switching. Originally, these systems are presumed to have used two clock cycles for switching: one for writing intermediate results out to RAM and one for reading results back from RAM before continuing processing. In this case, it was possible to process a single sample from one user before switching, but the associated switching overhead was quite significant (200%, i.e., two switching cycles for each cycle of useful processing). Systems that could supply multiple samples from a single user before requiring a switch benefited from much lower overhead, because the same switching cost was spread over more samples.
In the pending PCT patent application PCT/US97/16349 filed Sep. 19, 1997, entitled "Demodulation of Asynchronously Sampled Data by Means of Detection-Transition Sample Estimation in a Shared Multi-Carrier Environment" by James R. Thomas and Soheil I. Sayegh, a method and device are described for sharing registered digital hardware with only one clock cycle time required for switching overhead. This method can also process one or more samples from a single user before switching. In the case of a single sample, the overhead is then 100%.
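The overhead figures quoted above (200% for a two-cycle switch and 100% for a one-cycle switch, each with one sample per user) can be reproduced with a short calculation. The following Python sketch is illustrative only: the function name `switching_overhead_percent` is hypothetical, and the sketch assumes that each sample occupies exactly one clock cycle of useful processing, which is what the quoted percentages imply but is not stated explicitly.

```python
def switching_overhead_percent(switch_cycles: int, samples_per_switch: int) -> float:
    """Return switching overhead as a percentage of useful processing cycles.

    Assumption (not stated in the text): one clock cycle of useful
    processing per sample, so useful cycles == samples_per_switch.
    """
    if samples_per_switch < 1:
        raise ValueError("at least one sample must be processed per switch")
    if switch_cycles < 0:
        raise ValueError("switch_cycles cannot be negative")
    return 100.0 * switch_cycles / samples_per_switch


if __name__ == "__main__":
    # Compare the two-cycle RAM swap, the one-cycle method, and the
    # zero-overhead goal for several group sizes.
    schemes = (("two-cycle switch", 2), ("one-cycle switch", 1), ("zero-overhead", 0))
    for label, cycles in schemes:
        for samples in (1, 4, 16):
            pct = switching_overhead_percent(cycles, samples)
            print(f"{label:>16}, {samples:2d} sample(s) per switch: {pct:6.2f}% overhead")
```

Under this assumption, grouping more samples per switch lowers the percentage, but only a zero-cycle switch removes the overhead when each user supplies a single sample.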
However, none of the known conventional methods achieves both zero-overhead switching and one sample per user between switches. This combination is of special interest in digital communications because of the desire to use so-called "polyphase filters" to demultiplex communications channels for processing by digital demodulators. Polyphase filters inherently put out one sample per channel before switching. Other means of demultiplexing, specifically FFT-IFFT processors, naturally produce many samples per channel and are thus well suited to the prior means of sharing, although reducing overhead is still desirable. Alternatively, memory buffers may be used ahead of the equipment to be shared in order to group samples for processing, but they add significant size and power requirements to the overall processing system. In addition, zero-overhead switching has significant application in modern microprocessors, where it could replace the first method described above and greatly speed context switching.
Another conventional alternative to support this type of sharing is the use of a register bank wherein a register is provided for every user, channel, or process at every stage of processing. This approach is simple to implement and supports both zero-overhead switching and one sample per user. However, for large numbers of users, it is very inefficient in terms of hardware (e.g., in an ASIC) because as many as 10 gates per bit are required for implementation. For example, in the case of 100 users, an 8-bit data path and 20 stages in the pipeline, the register bank would require 160,000 gates = (20 stages) × (10 gates/bit) × (8 bits) × (100 users).
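The gate count for the 100-user register bank example can be checked with a brief calculation. The Python sketch below is illustrative only; the function name `register_bank_gates` and its parameter names are hypothetical, and the figure of 10 gates per bit is the upper estimate given above rather than a measured value.

```python
def register_bank_gates(stages: int, gates_per_bit: int, data_bits: int, users: int) -> int:
    """Estimate the gates needed when every pipeline stage holds one
    register per user (the register bank approach)."""
    return stages * gates_per_bit * data_bits * users


if __name__ == "__main__":
    # Values from the example: 20 stages, up to 10 gates/bit, 8-bit path, 100 users.
    total = register_bank_gates(stages=20, gates_per_bit=10, data_bits=8, users=100)
    print(f"register bank estimate: {total:,} gates")
```

Because the count grows linearly with the number of users, doubling the user count doubles the register bank, which is the inefficiency noted above.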