1. Field of the Invention
The present invention generally relates to the field of multi-port arrays within integrated circuit (IC) chips of computer systems. More particularly, the present invention relates to methods and arrangements for repairing ports such as read ports and write ports of multi-port arrays that fail as a result of, e.g., hard, uncorrectable failures.
2. Description of the Related Art
The competitive nature of industries has increased reliance on computer systems to perform daily operations, increasing the demand for fast and reliable computer systems with reasonable size and space requirements. The demand for greater speed, or processing power, in the same or smaller packages has led contemporary computer designs toward smaller IC chips that operate at higher frequencies, inherently increasing power densities within the IC chips. However, the higher frequencies and increased power densities also decrease reliability.
The decreased reliability has led many manufacturers toward autonomic computing designs. Autonomic computing refers to computer systems that reconfigure themselves in response to changing conditions and are self-healing in the event of failure. For instance, if one server in a rack of servers fails, the workload for the failed server may be shifted to another server in the rack, allowing operations to continue, albeit possibly with lower processing capability and, potentially, at a slower processing rate. Nonetheless, fewer failures are catastrophic and less human intervention is required for routine operation.
Autonomic designs may also be implemented at the IC chip level by providing redundant subcomponents for subcomponents that tend to fail, such as ports of arrays like register files. For example, ports of registers or register files within instruction pipelines of processors tend to fail more often with increases in power densities and frequencies.
Register files typically refer to combinations of registers that store instructions for execution by execution units within a processor. Register files include multiple ports to allow other devices to write to and read from the registers or array locations. When an instruction is written to write ports of a register file, logic within the register file interprets part of the instruction, referred to as an operand, to determine what to do with the remainder of the instruction. For instance, an operand of an instruction may include instructions for the register file to initiate logic to store data of another operand to an address within the register file and/or to retrieve data from an address within the register file.
Instructions may also include operands for execution by other devices such as the execution units. Arrays such as register files communicate with the execution units via read ports. Read ports typically include a buffer to maintain operands, and logic of the register file may direct operands of an instruction from the write ports to the read ports for execution by execution units. For example, an instruction having multiple operands may be written to write ports of a register file. In response, logic of the register file may forward one or more of the operands to read ports to transmit the operands to an execution unit. After the execution unit processes the operands, the resulting data may be written back to the register file at the same time that a target address is written to a write port of the register file. The target address may indicate a location within the register file to store the resulting data so the resulting data may be accessed by instructions subsequently received by the register file.
However, when a port such as a write port or a read port fails, the failed port may erroneously interpret operands or provide erroneous operands to execution units. As a result, instructions transmitted to the register file will execute improperly, if at all. To avoid such situations, processor designs implement error detection logic to check the parity of instructions received via write ports and to incorporate parity bits in operands forwarded to read ports. A mismatch between an expected parity bit and a received parity bit may indicate an erroneous data transmission, possibly caused by a failed port. Instruction pipelines that utilize a failed port can then be turned off, which significantly impacts processing capacity and capabilities.
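The parity check described above can be illustrated in software. The following is a minimal sketch only, assuming even parity over a single word; the function names `parity_bit` and `port_transfer_ok` are hypothetical and do not correspond to any particular hardware design.

```python
def parity_bit(word: int) -> int:
    """Even-parity bit: 1 if the word has an odd number of set bits, else 0."""
    return bin(word).count("1") % 2


def port_transfer_ok(received_word: int, received_parity: int) -> bool:
    """Compare the parity recomputed from the received word with the parity bit sent."""
    return parity_bit(received_word) == received_parity


# Hypothetical operand forwarded through a port together with its parity bit.
operand = 0b10110010
sent_parity = parity_bit(operand)

# Simulate a faulty port that flips a single bit of the operand in transit.
corrupted = operand ^ 0b00000100

print(port_transfer_ok(operand, sent_parity))    # intact transfer
print(port_transfer_ok(corrupted, sent_parity))  # single-bit error
```

A single parity bit detects any odd number of flipped bits but cannot detect an even number, and it does not by itself identify which port failed.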
An alternative solution, which avoids such a significant impact on processing capacity and capabilities, involves provision of redundant ports through which operands or data may be routed. Redundant ports are ports that sit idle until a port fails. When a port fails, each redundant port substitutes for one failed port of the same type. In particular, one redundant write port can take over the functionality of one failed write port and one redundant read port can take over the functionality of one failed read port. Thus, the processor can continue to route instructions through a pipeline until the number of failed read ports or write ports exceeds the number of the redundant read ports or write ports, respectively.
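The one-for-one substitution described above can be modeled as a simple remapping table. This is an illustrative sketch under stated assumptions: the `PortPool` class, its method names, and the port labels are hypothetical, and a separate pool is kept per port type because a redundant port only replaces a failed port of the same type.

```python
class PortPool:
    """Tracks ports of one type (read or write) and their idle redundant spares."""

    def __init__(self, active, spares):
        self.active = list(active)
        self.spares = list(spares)  # redundant ports sit idle until needed
        self.remap = {}             # failed port -> substituting redundant port

    def report_failure(self, port) -> bool:
        """Assign a spare to a failed port; return False when no spare remains."""
        if port in self.remap:
            return True  # already repaired
        if not self.spares:
            return False  # failures now exceed the available redundancy
        self.remap[port] = self.spares.pop(0)
        return True

    def resolve(self, port):
        """Return the port that traffic should actually be routed through."""
        return self.remap.get(port, port)


# Hypothetical configuration: two write ports backed by one redundant write port.
write_ports = PortPool(active=["w0", "w1"], spares=["w_spare"])
print(write_ports.report_failure("w0"), write_ports.resolve("w0"))
print(write_ports.report_failure("w1"), write_ports.resolve("w1"))
```

In this model the pipeline remains usable for as long as `report_failure` returns `True`; the second failure in the usage example has no spare left to draw on.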
Adding redundant ports in an IC chip such as a processor, however, can significantly impact costs of manufacture and performance by adding a significant amount of wire to the IC chip. Depending upon the number of metallization layers available within, e.g., a processor, adding redundant ports can involve a linear expansion of silicon area, which significantly impacts the costs of manufacturing the processor and the speed with which instructions can be processed by the processor. For example, increasing the number of ports in an array from three ports to four ports may increase the silicon area utilized by a register file by a factor of approximately four-thirds and slow down the corresponding pipeline's processing speed by approximately the square root of that ratio, or 15 percent. As a further illustration, adding two additional read ports and one additional write port would increase a three-port array to a six-port array, substantially doubling the area consumed and reducing the processor's performance by approximately 30 percent.
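The figures quoted above can be reproduced with a short calculation. The sketch below assumes, as the passage states, that area grows linearly with the number of ports and that delay grows with the square root of the area ratio; the function names are hypothetical, and the model is a rough estimate and not a layout-accurate prediction.

```python
import math


def area_ratio(old_ports: int, new_ports: int) -> float:
    """Area scaling under the assumed linear growth with port count."""
    return new_ports / old_ports


def delay_factor(old_ports: int, new_ports: int) -> float:
    """Delay scaling, assumed to be the square root of the area ratio."""
    return math.sqrt(area_ratio(old_ports, new_ports))


for old, new in [(3, 4), (3, 6)]:
    area = area_ratio(old, new)
    delay = delay_factor(old, new)
    print(f"{old} -> {new} ports: area x{area:.2f}, "
          f"delay x{delay:.3f}, throughput x{1 / delay:.3f}")
```

Under these assumptions, the three-to-four-port case gives a delay factor of about 1.15, and the three-to-six-port case gives a throughput factor of about 0.71, in line with the approximate 15 percent and 30 percent figures.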
Therefore, there is a need for methods and arrangements for repairing ports with reduced impact on performance and area requirements.