1. Field of the Invention
The present invention is related to a controller used in a storage field, and more particularly, to a controller capable of self-monitoring, a redundant storage system having the same, and a method thereof.
2. Description of Related Art
A redundant system is a system that includes two or more instances of a particular sub-system that is important to the system. For example, a redundant RAID (redundant array of independent disks) system, often seen in the storage field, usually has two redundant controllers. The controllers can be arranged in one of two configurations. One is Active-Standby, also called Active-Passive, while the other is Active-Active.
Reference is made to FIG. 1, which is a schematic diagram showing the redundant controller pair in Active-Passive configuration. The configuration includes a host 11, controllers 121, 122, and a physical storage device array (PSD array) 13. Examples of a PSD are a hard disk drive and an optical disc. The controller 121 is called the primary controller while the controller 122 is called the secondary controller. The controllers 121, 122 are connected to at least one host 11. Thus, the host 11 can issue access requests to the controller 121 or 122.
Normally, the host 11 sends an access request to the controller 121 so as to access data stored in the PSD array 13 via the controller 121. Before the controller 121 accesses the PSD array 13, it informs the controller 122 of what it is going to do. For example, the controller 121 may inform the controller 122 that it is going to write some data into the PSD array 13. After that, the controller 122 backs up the data and records the action the controller 121 is going to perform.
Once the controller 121 fails or performs an erroneous action, the controller 122 takes over the task of the controller 121 and writes the data into the PSD array 13. Hence, when the controller 121 is broken, the controller 122 temporarily serves as the primary controller. The host 11 then sends access requests to the controller 122 instead, until the controller 121 is restored or replaced by a new one.
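The announce-then-write sequence and the subsequent takeover described above can be sketched as follows. This is a minimal illustration only, not the implementation of any actual controller: the class names `PSDArray` and `Controller`, the helper `host_write`, and the completion notice `note_done` are hypothetical, and the PSD array is modeled as an in-memory dictionary.

```python
from dataclasses import dataclass, field


@dataclass
class PSDArray:
    """Stand-in for the physical storage device array (hypothetical model)."""
    blocks: dict = field(default_factory=dict)


@dataclass
class Controller:
    name: str
    array: PSDArray
    peer: "Controller | None" = None
    failed: bool = False
    # Writes the peer has announced but not yet confirmed as complete.
    pending: dict = field(default_factory=dict)

    def note_intent(self, block, data):
        """Back up the data and record the action the peer is about to perform."""
        self.pending[block] = data

    def note_done(self, block):
        """Assumed completion notice: the backup copy can be dropped."""
        self.pending.pop(block, None)

    def write(self, block, data):
        """Announce the write to the peer, then perform it."""
        if self.peer is not None:
            self.peer.note_intent(block, data)
        if self.failed:
            raise RuntimeError(f"{self.name} failed before completing the write")
        self.array.blocks[block] = data
        if self.peer is not None:
            self.peer.note_done(block)

    def take_over(self):
        """Complete every write the failed peer announced but did not finish."""
        for block, data in list(self.pending.items()):
            self.array.blocks[block] = data
        self.pending.clear()


def host_write(primary, secondary, block, data):
    """Host-side view: fall back to the secondary when the primary fails."""
    try:
        primary.write(block, data)
        return primary.name
    except RuntimeError:
        secondary.take_over()
        return secondary.name
```

In this sketch the secondary holds a copy of the data only between the announcement and the completion notice, so a failure of the primary at any point in between leaves the secondary with everything it needs to finish the write.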
Reference is made to FIG. 2, which is a schematic diagram showing the redundant controller pair in Active-Active configuration. The configuration includes a host 21, controllers 221, 222, and a PSD array 23. The controller 221 is called the primary controller while the controller 222 is called the secondary controller. The controllers 221, 222 are connected to at least one host 21. Thus, the host 21 can issue access requests to both of the controllers 221 and 222. The controllers 221 and 222 each access the PSD array 23 according to the access requests they respectively receive.
Before the controller 221 accesses the PSD array 23, it informs the controller 222 of what it is going to do. Similarly, before the controller 222 accesses the PSD array 23, it also informs the controller 221 of what it is going to do. Hence, if one of the controllers 221 and 222 fails or performs an erroneous action, the other temporarily takes over its task and completes the access action.
In either configuration mentioned above, i.e., Active-Active configuration or Active-Standby configuration, there must be a monitoring mechanism between the redundant controllers so that any one of the controllers can detect whether the other one operates abnormally. Reference is made to FIG. 3, which shows a schematic diagram for illustrating the monitoring mechanism between the redundant controller pair. It includes controllers 31, 32, a PSD array 33 and a common transmission interconnect 34. The controller 31 is called the primary controller while the controller 32 is called the secondary controller.
If the common transmission interconnect 34 is a small computer system interface (SCSI) interconnect, its transmission cable has many pins, for example, 68 pins, a portion of which are generally unused. The controllers 31, 32 can employ these unused pins of the SCSI transmission cable to send monitoring signals to each other. In this way, either one of the controllers 31, 32 can detect whether the other one operates abnormally.
For example, if the controller 31 malfunctions, the monitoring signals sent from the controller 32 receive no reply. The controller 32 then notifies the host (not shown) that the controller 31 is malfunctioning and temporarily takes over the functions of the controller 31.
In addition, the controllers 31, 32 access the PSD array 33 through the common transmission interconnect 34. If one of the controllers 31, 32 operates abnormally, it may abnormally access the data of the PSD array 33 and affect other normal access operations. Hence, for example, when the controller 31 finds that the controller 32 operates abnormally, it sends a reset signal to the controller 32 to reset the same so as to prevent the whole storage system from being affected by the abnormal operation.
Furthermore, if the controller 31 malfunctions and the other controller 32 does not take over its functions immediately, the host may still send access requests to the controller 31 and not receive any response. Thus, the controllers 31, 32 have to monitor each other via the common transmission interconnect 34, and the interval between any two monitoring signals sent from the controller 31 or 32 must be very short, such as several milliseconds.
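The mutual monitoring described above can be illustrated with a small sketch. It is an assumption-laden model rather than a description of any actual controller firmware: the class `PeerMonitor`, the threshold `max_missed`, the default 5 ms interval, and the order of the three recovery actions are all hypothetical choices made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class PeerMonitor:
    """Tracks replies from the peer controller and flags it once it goes quiet."""
    interval_s: float = 0.005   # assumed spacing between monitoring signals
    max_missed: int = 3         # assumed number of unanswered signals tolerated
    missed: int = 0
    peer_failed: bool = False
    actions: list = field(default_factory=list)

    def poll(self, peer_replied: bool) -> None:
        """Record the outcome of one monitoring signal."""
        if self.peer_failed:
            return
        if peer_replied:
            self.missed = 0
            return
        self.missed += 1
        if self.missed >= self.max_missed:
            self.peer_failed = True
            # Order is an assumption: isolate the peer first, then take over.
            self.actions += ["reset_peer", "notify_host", "take_over"]

    def worst_case_detection_s(self) -> float:
        """Approximate upper bound on failure-to-detection time,
        ignoring how long each signal waits for its reply."""
        return self.interval_s * self.max_missed
```

Under these assumptions the detection delay grows in proportion to the monitoring interval, which is why the interval has to be kept short if the host is not to be left waiting on a controller that has already failed.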
A conventional monitoring signal includes multiple detecting signals. When the controller 31 or 32 receives the detecting signals, it performs a hand-shaking action. This action occupies a portion of the transmission bandwidth of the common transmission interconnect 34 and degrades the efficiency of the access operations of the controllers 31 and 32. To resolve this problem, an additional transmission interface can be provided for conveying the monitoring signals of the controllers 31 and 32. However, this increases the cost and overall hardware complexity.
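The bandwidth cost of such hand-shaking can be estimated with a simple ratio. The function below is only a back-of-the-envelope sketch: the name `monitoring_overhead` and all three parameters are hypothetical, and it assumes that the interconnect is unavailable for data access for the full duration of every detecting signal.

```python
def monitoring_overhead(signals_per_handshake: int,
                        signal_time_s: float,
                        interval_s: float) -> float:
    """Fraction of interconnect time spent on monitoring rather than data access."""
    if interval_s <= 0:
        raise ValueError("interval_s must be positive")
    # Time the interconnect is occupied by one complete hand-shake.
    busy_s = signals_per_handshake * signal_time_s
    # The fraction cannot exceed 1.0 even if hand-shakes overlap the interval.
    return min(1.0, busy_s / interval_s)
```

With purely illustrative inputs of four detecting signals of 100 microseconds each, repeated every 5 milliseconds, the function returns about 0.08, that is, roughly 8% of the interconnect time. Shortening the interval to speed up failure detection raises this fraction proportionally, which is the trade-off the paragraph above describes.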