1. Field of the Invention
The present invention relates, generally, to communications between tightly coupled processors and, more particularly, to a queue register configuration structure for enabling cooperative communications between tightly coupled processors.
2. Description of Related Art
Host bus adapters (HBAs) are well-known peripheral devices that handle data input/output (I/O) operations for host devices and systems (e.g., servers). An HBA provides I/O processing and physical connectivity between a host device and external data storage devices. The storage may be connected using a variety of known “direct attached” or storage networking protocols, including Fibre Channel (FC) and Internet Small Computer System Interface (iSCSI). HBAs provide critical server central processing unit (CPU) off-load, freeing servers to perform application processing. HBAs also provide a critical link between storage area networks (SANs) and the operating system and application software residing within the server. In this role, the HBA enables a range of high-availability and storage management capabilities, including load balancing, SAN administration, and storage management.
FIG. 1 illustrates a block diagram of a conventional host system 100 including an HBA 102. The host system 100 includes a conventional host server 104 that executes application programs 106 in accordance with an operating system program 108. The server 104 also includes necessary driver software 110 for communicating with peripheral devices. The server 104 further includes conventional hardware components 112 such as a CPU and host memory such as read-only memory (ROM), hard disk storage, random access memory (RAM), cache and the like, which are well known in the art. The server 104 communicates via a host bus 114 (such as a peripheral component interconnect (PCI or PCIX) bus) with the HBA 102, which handles the I/O operations for transmitting and receiving data to and from remote storage devices 116 via a storage area network (SAN) 118.
To meet the increasing demands of I/O processing applications, multi-processor HBA architectures have been developed to provide multi-channel and/or parallel processing capability, thereby increasing the processing power and speed of HBAs. These multiple processors may be located within the controller chip. FIG. 2 illustrates an exemplary block diagram of an HBA 200 including a multi-processor interface controller chip 202. The interface controller chip 202 controls the transfer of data between devices connected to a host bus 204 and one or more storage devices in one or more SANs. In the example embodiment illustrated in FIG. 2, the controller chip 202 supports up to two channels A and B, and is divided into three general areas, one area 232 for channel A specific logic, another area 206 for channel B specific logic, and a third area 208 for logic common to both channels.
Each channel on the controller chip 202 includes a serializer/deserializer (SerDes) 210 and a protocol core engine (PCENG) 212 coupled to the SerDes 210. Each SerDes 210 provides a port or link 214 to a storage area network. These links may be connected to the same or different storage area networks. The PCENG 212 may be specific to a particular protocol (e.g., FC), and is controlled by a processor 216, which is coupled to tightly coupled memory (TCM) 218 and cache 220. Interface circuitry 222 specific to each channel and interface circuitry 224 common to both channels couple the processor 216 to the host (e.g., PCI/PCIX) bus 204 and to devices external to the controller chip 202 such as flash memory 226 or quad data rate (QDR) SRAM 228.
When data is transferred from a device on the host bus 204 to a storage device on the link 214, the data is first placed in the QDR SRAM 228 under the control of the processor 216 that controls the link. Next, the data is transferred from the QDR SRAM 228 to the link 214 via the common interface circuitry 224 and channel-specific interface circuitry 222, PCENG 212 and SerDes 210 under the control of the processor 216. Similarly, when data is transferred from the link to the device on the host bus 204, the data is first transferred into the QDR SRAM 228 before being transferred to the device on the host bus.
In the example of FIG. 2, devices coupled to the host bus 204 can communicate with the interface controller chip 202 in three different modes. A first mode supports either channel A or channel B communications, a second mode supports both channel A and channel B communications working independently, and a third mode supports channel A and channel B communications working cooperatively. The interface controller chip 202 can be configured using straps or pins external or internal to the interface controller chip 202 to operate in any of these modes.
In the first mode (single channel operation), channel A may be operable with channel B completely reset. In this mode, only the channel A link is available. From the perspective of devices connected to the host bus 204, only a single channel A function is seen. Alternatively, the first mode can be configured such that channel B is operable with channel A completely reset. In this mode, only the channel B link is available. From the perspective of devices connected to the host bus 204, only a single channel B function is seen.
In the second mode (dual channel operation), both channel A and B are operable, both links are active, and devices connected to the host bus 204 see the HBA 200 as a multifunction device. From the perspective of devices connected to the host bus 204, both channel A and channel B functions can be seen. In this mode, channel A and B operate completely independently of each other with regard to programs residing in memory. For instance, channel A could be controlled by one program in firmware and channel B could be controlled by another program in firmware. Alternatively, both channel A and B could be controlled by the same program. In either case, there is no communication between channel A and channel B. Each channel can be reset independently and restarted independently, with no effect on the other channel or the common circuitry.
In the third mode (combined channel operation), both channel A and B are operable, both links are active, but the two channels cooperate as a single function. Although both channels are active and have completely different programs, the programs cooperate with each other to divide up and carry out activity. However, the actual control of each external link is private to its associated channel. For example, in FIG. 2 channel A (i.e., processor A) controls link A. Therefore, before channel B (i.e., processor B) can send data out over link A, channel B must ask channel A to do so on its behalf. Each channel can make requests of the other, and therefore the channels are not truly separate channels. From the perspective of devices connected to the host bus 204, only a single multilink channel is seen.
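The three modes described above can be summarized in a short C sketch. This is only an illustration: the type and function names are hypothetical and do not come from the controller chip, and the sketch models only what the text states (which channels are active, how many functions the host sees, and whether the channels exchange requests).

```c
#include <assert.h>
#include <stdbool.h>

/* Operating modes as described in the text; all names are hypothetical. */
typedef enum {
    MODE_SINGLE_A,  /* first mode: channel A operable, channel B held in reset */
    MODE_SINGLE_B,  /* first mode: channel B operable, channel A held in reset */
    MODE_DUAL,      /* second mode: both channels active and independent */
    MODE_COMBINED   /* third mode: both channels cooperate as one function */
} hba_mode_t;

typedef enum { CHANNEL_A = 0, CHANNEL_B = 1 } channel_t;

/* True if the given channel (and therefore its link) is active in this mode. */
bool channel_active(hba_mode_t mode, channel_t ch)
{
    assert(ch == CHANNEL_A || ch == CHANNEL_B);
    switch (mode) {
    case MODE_SINGLE_A: return ch == CHANNEL_A;
    case MODE_SINGLE_B: return ch == CHANNEL_B;
    case MODE_DUAL:
    case MODE_COMBINED: return true;
    }
    return false;
}

/* Number of functions that devices on the host bus see in this mode. */
int host_visible_functions(hba_mode_t mode)
{
    return mode == MODE_DUAL ? 2 : 1;
}

/* True if the channels may make requests of each other in this mode. */
bool channels_cooperate(hba_mode_t mode)
{
    return mode == MODE_COMBINED;
}
```

Only the dual channel mode presents two functions to the host. The combined channel mode keeps both links active but presents a single function, which is why it alone requires communication between the channels.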
As noted above, in combined channel operation, cooperative internal processing is required to carry out a single combined effort. As illustrated in FIG. 2, the circuitry dedicated to a particular channel includes a CPU 216, cache 220 and TCM 218 connected to the CPU 216. Each CPU 216 (e.g., a RISC processor) communicates with dedicated interface circuitry 222 through a processor private bus such as, for example, an ARM Advanced High-performance Bus (AHB) 230.
The third (combined channel) mode is sometimes referred to as the “supercharged” mode because twice the processing power can be obtained for a single set of commands. However, the supercharged mode creates other issues because communications are no longer purely “vertical” (limited to communications via the link 214 or through the common interface circuitry 224). In the combined channel mode, the two channels must also communicate “horizontally” (communicate with each other and request services), and must cooperate with minimum interference with each other when utilizing common circuitry.
For example, one area of conflict is utilization of the QDR SRAM 228. When the QDR SRAM 228 is accessed, two memory words are processed every system clock, one on each clock edge. The quad data rate is achieved because there is a separate read port and write port. When transferring data from the link to a device connected to the host bus 204, the data is first transferred from the link to the QDR SRAM 228, and then from the QDR SRAM 228 to the device. Thus, every data transfer has a write phase followed by a read phase, and the number of QDR SRAM read and write operations is therefore equal. The interface controller chip 202 actually handles two streams of data for each link. For each of the two links 214, receive data is transferred from the link into the QDR SRAM 228, and then the receive data is transferred from the QDR SRAM 228 to the target device connected to the host bus 204. Also, for each of the two links 214, transmit data is transferred from the device to the QDR SRAM 228, and then the transmit data is transmitted from the QDR SRAM 228 to the link. In addition, each processor 216 may perform other operations such as reading instructions from the QDR SRAM 228 or monitoring certain data to track the status of the QDR SRAM 228. Although most data is processed out of the cache 220, when cache misses occur then the data must be accessed from the QDR SRAM 228. Because both processors 216 must share the same QDR SRAM 228 for the above-described operations, conflicts can arise.
Multi-channel conflicts with the QDR SRAM 228 may be reduced by utilizing an addressing scheme that includes address and channel, so that even if both processors use the same address, each processor will be accessing a different part of the QDR SRAM memory assigned to that channel. However, in the combined channel mode a different addressing scheme may be utilized in which each processor sees the entire QDR SRAM address space as a single address space. In this mode, the fact that the two processors can share the same QDR SRAM 228 and use the same address to get to the same memory location means that unless the two channels or processors have some means of communication, conflicts will arise.
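The difference between the two addressing schemes can be sketched in C as follows. The memory size, the even split between channels, and the function names are assumptions made for illustration; the text does not specify how the channel-qualified address is formed.

```c
#include <assert.h>
#include <stdint.h>

#define QDR_WORDS           0x100000u         /* hypothetical total size in words */
#define QDR_PARTITION_WORDS (QDR_WORDS / 2u)  /* hypothetical per-channel share */

typedef enum { CHANNEL_A = 0, CHANNEL_B = 1 } channel_t;

/* Channel-qualified scheme: the channel selects a private region, so the
 * same address from each processor lands in a different location. */
uint32_t qdr_index_partitioned(channel_t ch, uint32_t addr)
{
    assert(addr < QDR_PARTITION_WORDS);
    return (uint32_t)ch * QDR_PARTITION_WORDS + addr;
}

/* Combined-mode scheme: both processors see one flat address space, so the
 * same address from each processor lands in the same location. */
uint32_t qdr_index_flat(channel_t ch, uint32_t addr)
{
    (void)ch;  /* the channel does not influence the location */
    assert(addr < QDR_WORDS);
    return addr;
}
```

With the partitioned mapping, address 0x40 from channel A and address 0x40 from channel B resolve to different words. With the flat mapping they resolve to the same word, which is exactly the case in which the processors need some way to coordinate.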
A second area of conflict is the flash memory 226. The flash memory 226 is a normally read-only memory that can be updated to store newer versions of programs or data. Each processor 216 has access to this flash memory 226 for functions such as reading boot code at wake-up time, or reading maintenance routines for updating the flash. Thus, there may be a number of requesting processors 216 for the flash 226, whether the interface controller chip 202 is operating in a single channel, dual channel, or combined channel mode.
As long as each requestor performs read operations only, no conflicts arise. However, if one requestor wants to update the flash 226 or read characteristics of the flash 226, rather than read the actual content of the flash 226, then the flash 226 goes into a mode that provides status information, not memory information. When this occurs, other requests for reading the flash must be blocked off, otherwise erroneous information may be provided.
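The blocking requirement described above resembles a reader/exclusive-owner rule, which the following C sketch models. It is a single-threaded illustration under stated assumptions: all names are hypothetical, and the rule that a requestor waits for in-progress reads to finish before switching the flash into status mode is an assumption added here, not something the text specifies. A real implementation would need the checks and updates to be atomic with respect to both processors.

```c
#include <assert.h>
#include <stdbool.h>

/* What the flash returns when read; names are hypothetical. */
typedef enum {
    FLASH_ARRAY_READ,  /* normal mode: reads return memory contents */
    FLASH_STATUS       /* update or query mode: reads return status information */
} flash_mode_t;

typedef struct {
    flash_mode_t mode;
    int active_readers;  /* content reads currently in progress */
} flash_arbiter_t;

void flash_arbiter_init(flash_arbiter_t *f)
{
    f->mode = FLASH_ARRAY_READ;
    f->active_readers = 0;
}

/* Any number of content reads may proceed, but only in array-read mode. */
bool flash_begin_read(flash_arbiter_t *f)
{
    if (f->mode == FLASH_STATUS)
        return false;  /* blocked: a read would return status, not content */
    f->active_readers++;
    return true;
}

void flash_end_read(flash_arbiter_t *f)
{
    assert(f->active_readers > 0);
    f->active_readers--;
}

/* Entering status mode requires sole use of the flash (assumed policy). */
bool flash_begin_status(flash_arbiter_t *f)
{
    if (f->mode == FLASH_STATUS || f->active_readers > 0)
        return false;
    f->mode = FLASH_STATUS;
    return true;
}

void flash_end_status(flash_arbiter_t *f)
{
    assert(f->mode == FLASH_STATUS);
    f->mode = FLASH_ARRAY_READ;
}
```

A requestor that is refused would retry later or wait to be notified, depending on the mechanism the processors use to communicate.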
Thus, a need exists for an apparatus and method that enables two or more tightly coupled processors to access and/or update the same QDR SRAM or flash or other resource, minimize conflicts with shared resources, request services that each processor cannot perform independently, or request sole use of a resource.