In a conventional (non-distributed) processor architecture, an executing instruction sends its result to a centralized register file and also broadcasts it on a bypass bus so that any instructions waiting on that result may use it immediately.
In a distributed processor architecture, which may consist of multiple processing cores interconnected via an operand network, an instruction's encoding typically includes an identifier that indicates one or more consuming instructions that need the value. A distributed processor architecture is described, for example, in U.S. Patent Application Publication No. 2005/0005084.
When an instruction executes in a distributed processor architecture, it typically sends the resulting value only to those consuming instructions awaiting the value. This type of instruction encoding may be well-matched to a distributed architecture implementation in which the producing and consuming instructions lie on different processing cores, although certain challenges may arise when an instruction result must be sent to many consuming instructions in such an implementation. Similar challenges may arise in standard cache-coherent on-chip multi-core systems.
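The targeted delivery described above may be illustrated with a minimal sketch. The sketch is not taken from the referenced publication. The `Instruction` record, its `targets` and `core` fields, and the `send_result` helper are hypothetical names chosen for illustration, and the model assumes that a delivery costs one operand-network message whenever the producing and consuming instructions lie on different cores.

```python
from dataclasses import dataclass, field


@dataclass
class Instruction:
    """Hypothetical instruction record; all field names are illustrative."""
    name: str
    core: int                                     # core the instruction is mapped to
    targets: list = field(default_factory=list)   # names of consuming instructions
    received: list = field(default_factory=list)  # operand values delivered so far


def send_result(producer, value, window):
    """Deliver `value` only to the consumers named in the producer's encoding.

    Returns the number of deliveries that cross from one core to another,
    which this sketch treats as operand-network messages (an assumption).
    """
    network_messages = 0
    for target_name in producer.targets:
        consumer = window[target_name]
        consumer.received.append(value)
        if consumer.core != producer.core:
            network_messages += 1
    return network_messages


# One producer ("add") with three consumers spread over three cores.
window = {
    "add": Instruction("add", core=0, targets=["mul", "sub", "st"]),
    "mul": Instruction("mul", core=0),
    "sub": Instruction("sub", core=1),
    "st": Instruction("st", core=2),
    "ld": Instruction("ld", core=3),  # not a target, so it receives nothing
}
messages = send_result(window["add"], 42, window)
```

In this sketch the number of cross-core messages grows with the number of consuming instructions mapped to other cores, which reflects the fan-out challenge noted above.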
all arranged in accordance with at least some implementations of the present disclosure.