The present disclosure relates generally to optimizing a communication software stack to provide error checking, and more specifically, to a common communication interface available to all layers of the communication software stack for intercommunication that provides distributed error checking using specific reply and source fields.
In general, central processing unit (“CPU”) firmware of a software stack communicates with input/output (“I/O”) device firmware of the same software stack to perform enqueue and dequeue operations in response to messages from an application program. Further, the I/O device firmware inspects each message for content validity, and if any invalid content is found, the I/O device firmware generates an error reply message with an error code. However, the I/O device firmware may not have all of the information required to check for every possible error code, in which case it cannot, on its own, check for all possible error codes and generate an error reply message with the appropriate error code.
To account for this issue, one mechanism is for the CPU firmware to store all the needed information in each message's message information block (“MIB”) in a hardware system area (“HSA”), which the I/O device firmware then uses to check all the error codes. Yet, because HSA space is limited, this mechanism requires HSA space that may not be available to the CPU firmware, a problem that is compounded as the number of MIBs needed to keep track of the I/O device elements grows. A second mechanism is to allow the CPU firmware to generate error reply messages with the error codes; however, because this mechanism requires the CPU firmware to generate adapter interrupts, the size, complexity, and firmware overlap of the CPU firmware code increase. Further, with this second mechanism, the CPU firmware checks all the higher priority error codes before the lower priority error codes to avoid breaking the error code priority. Then, if none of the error code checks by the CPU firmware fail, the I/O device firmware repeats the same higher priority error code checks that the CPU firmware already performed, which causes performance degradation. A third mechanism is to introduce additional systems architecture; however, this would also require more work by the CPU and/or I/O device firmware to generate the appropriate error reply, as well as more work by the application to add support for the new architecture.
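The duplicated checking in the second mechanism can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration only: the `Message` fields, the error code values, the individual checks, and the function names are hypothetical and are not taken from any actual firmware. The sketch simply counts how many checks are evaluated when the CPU firmware runs the higher priority checks first and the I/O device firmware then runs its full, priority-ordered list.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class Message:
    # Hypothetical message layout, for illustration only.
    queue_id: int
    length: int
    payload: bytes


# Each check is (error_code, predicate that returns True when the message is invalid).
# Codes and predicates are invented; lists are ordered highest priority first.
Check = Tuple[int, Callable[[Message], bool]]

HIGH_PRIORITY_CHECKS: List[Check] = [
    (0x01, lambda m: m.queue_id < 0),
    (0x02, lambda m: m.length != len(m.payload)),
]
LOW_PRIORITY_CHECKS: List[Check] = [
    (0x10, lambda m: len(m.payload) == 0),
]


def first_error(msg: Message, checks: List[Check]) -> Tuple[Optional[int], int]:
    """Return (first failing error code or None, number of checks evaluated)."""
    evaluated = 0
    for code, is_invalid in checks:
        evaluated += 1
        if is_invalid(msg):
            return code, evaluated
    return None, evaluated


def second_mechanism(msg: Message) -> Tuple[Optional[int], int]:
    """Model the second mechanism: CPU firmware checks first, then I/O firmware."""
    # The CPU firmware evaluates the higher priority checks to preserve priority order.
    code, total = first_error(msg, HIGH_PRIORITY_CHECKS)
    if code is not None:
        return code, total
    # The I/O device firmware then walks its full list, so the higher priority
    # checks that already passed are evaluated a second time.
    code, io_evaluated = first_error(msg, HIGH_PRIORITY_CHECKS + LOW_PRIORITY_CHECKS)
    return code, total + io_evaluated
```

In this sketch, a fully valid message passes through five check evaluations even though only three distinct checks exist, which is the redundancy the paragraph above attributes to the second mechanism.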