A well-known problem in electrical systems is the possibility of electrical power outages. Some electrical systems use batteries for additional power back-up.
An example of a system that can use a battery for back-up in case of a power outage is a RAID system. RAID stands for Redundant Arrays of Inexpensive Disks. RAID systems have two or more disk drives that cooperate to increase performance and fault tolerance. Typically, RAID systems include a host computer, an array controller, and disk drives. The array controller serves as an interface between the host computer and the disk drives connected to the array controller. The host computer writes data to and reads data from any of the disk drives via the array controller.
When, for example, the host computer needs to write data to disk drives, the array controller receives that data from the host computer and stores the data in a disk RAM (Random Access Memory). The array controller then takes this data from the disk RAM and writes it to a single disk drive, or even to multiple disk drives.
The entire RAID system is powered by a main power supply. Since that power supply can fail, a RAID system may have a battery for back-up to power the RAID system while the main power supply is down. An example of a battery back-up is a UPS (Uninterruptible Power Supply).
Generally, during normal operation, a host computer will, for instance, write streams of data to the array controller for storing the data on disk drives. The array controller, having received a data stream, acknowledges receipt thereof to the host computer by sending a "COMMAND COMPLETE" message to the host computer. However, typically, instead of immediately writing the data to the disk drives, initially the array controller stores the data stream in disk RAM. A disk drive to which data is to be written may be busy storing other data sent to it previously. Consequently, data in disk RAM may be held there by the array controller until a particular disk drive is available for being written to. Later, when a disk drive is available, the array controller takes the data from the disk RAM to distribute it appropriately to disk drives.
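The early-acknowledgment behavior described above can be illustrated with a minimal sketch. The class name WriteBackController, its methods, and the block numbers are hypothetical names chosen for illustration and do not come from any particular RAID product; the sketch only assumes that the controller buffers data in volatile disk RAM and acknowledges before the data reaches a disk drive.

```python
from collections import deque


class WriteBackController:
    """Toy model of an array controller that acknowledges writes early."""

    def __init__(self):
        self.disk_ram = deque()  # volatile buffer (assumed lost on power failure)
        self.disk = {}           # stands in for the disk drives

    def write(self, block, data):
        # Buffer the data and acknowledge at once, before it is on disk.
        self.disk_ram.append((block, data))
        return "COMMAND COMPLETE"

    def flush(self):
        # Later, when a drive is available, move buffered data to disk.
        while self.disk_ram:
            block, data = self.disk_ram.popleft()
            self.disk[block] = data


controller = WriteBackController()
ack = controller.write(7, b"payload")
pending_before_flush = len(controller.disk_ram)  # data still only in disk RAM
on_disk_before_flush = 7 in controller.disk
controller.flush()
on_disk_after_flush = 7 in controller.disk
```

In this sketch the host computer already holds the acknowledgment while the data exists only in disk RAM, which is the window in which a power failure can lose it.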
Some RAID systems notify the host computer of a write having been completed only once the data actually has been written to a disk drive. This delayed confirmation of writes avoids the possibility that a host computer may record as completed a write transaction that actually was never completed. A write transaction may never be completed because a power failure (or even a UPS failure) occurs while the array controller is writing the data to the disk drive or, for example, while the disk drive is writing the data to a disk.
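A delayed-confirmation scheme of this kind might be sketched as follows. The class WriteThroughController and the power_fails flag are hypothetical devices used only to simulate an outage during a write; they are not taken from any actual controller interface.

```python
class WriteThroughController:
    """Toy model of a controller that confirms only after data is on disk."""

    def __init__(self):
        self.disk = {}  # stands in for the disk drives

    def write(self, block, data, power_fails=False):
        if power_fails:
            # Simulated outage before the data reaches the disk:
            # nothing is stored and no confirmation is sent.
            return None
        self.disk[block] = data
        return "COMMAND COMPLETE"


controller = WriteThroughController()
host_log = []  # blocks the host computer records as written
for block, data, fails in [(1, b"a", False), (2, b"b", True)]:
    if controller.write(block, data, power_fails=fails) == "COMMAND COMPLETE":
        host_log.append(block)
```

Because the confirmation is sent only after the data is stored, the host computer's record in this sketch never lists a write that is missing from the disk drives.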
The following paragraphs illustrate the problem that a power outage can create for a database that relies on a RAID system. For instance, an accounting system may use several databases that are stored on disk drives. One database may contain a parts inventory. Another database may contain customer records. A third database may contain invoices.
An accounting application computer program stored on the host computer could use these databases as follows. A customer orders a part, and the part is shipped to the customer. Then the accounting system needs to update the parts inventory, the customer records, and the invoices databases to reflect the changes. Consequently, the application program sends data reflective of a change in inventory to the array controller for storage on the disk drives, so that the inventory database reflects that there is one fewer part available, because the part was shipped to the customer. Similarly, the application program would write data to the array controller for updating the customer records database to show that the customer now owes an increased amount, i.e., the increase would be the cost of the parts shipped to the customer. Finally, the application program would direct the host computer to send data to the array controller reflecting that an invoice has been sent to the customer, if that actually had occurred.
However, a power failure may have occurred after the customer records database and the parts inventory database were updated, i.e., after that data was written to and stored by the disk drives, but before the invoices database was updated. The data to be written to the invoices database may have been in disk RAM at the time that the power failure occurred and consequently be lost. Sometime later, when power to the RAID system is re-established, the application program may find that the invoices database had not been updated and may therefore cause the invoice to be sent to the customer twice.
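The accounting scenario can be sketched as a small simulation. The Controller class, its method names, and the record strings are hypothetical and serve only to show how an acknowledged update can be lost from disk RAM; the sketch assumes that disk RAM contents do not survive a power failure.

```python
class Controller:
    """Toy array controller with a volatile disk RAM buffer."""

    def __init__(self):
        self.disk_ram = []  # acknowledged writes not yet on disk
        self.disk = {}      # database name -> last stored record

    def write(self, database, record):
        self.disk_ram.append((database, record))
        return "COMMAND COMPLETE"

    def flush_one(self):
        # Move the oldest buffered write to the disk drives.
        database, record = self.disk_ram.pop(0)
        self.disk[database] = record

    def power_failure(self):
        # Everything still in disk RAM is lost.
        lost = [database for database, _ in self.disk_ram]
        self.disk_ram.clear()
        return lost


controller = Controller()
acks = [
    controller.write(database, record)
    for database, record in [
        ("inventory", "one fewer part"),
        ("customers", "balance increased"),
        ("invoices", "invoice sent"),
    ]
]
controller.flush_one()  # inventory update reaches disk
controller.flush_one()  # customer records update reaches disk
lost = controller.power_failure()  # invoices update was still in disk RAM
```

All three writes were acknowledged to the host computer, yet after the simulated outage only the inventory and customer records updates are on disk.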
To avoid loss of data, some RAID systems back up the disk RAM with a battery all of the time, in addition to using a UPS. Backing up disk RAM with a battery permits the array controller to complete transactions which would otherwise be interrupted by a power outage if the UPS fails without warning, as further explained below. However, backing up disk RAM with a battery can be expensive. Disk RAMs typically have four to eight megabytes of storage capacity. After a power failure occurs, continuously powering such disk RAMs with a separate battery for all reads and writes of data can require a battery of significant size, depending on the length of time for which the back-up battery power is required.
A UPS can notify a system that power has shut down and that power is available from the UPS only for a finite amount of time. That finite time is a function of the size of the battery of the UPS. However, the time estimate provided by the UPS may be incorrect. Consequently, the UPS may run out of power before the estimated time provided by the UPS. So, data being written to disk drives may be lost without additional battery back-up for the disk RAM, because when the UPS fails, the array controller may have been writing data to the disk drives. But, as stated above, back-up batteries for disk RAM can be expensive.
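The effect of an incorrect UPS time estimate can be illustrated with simple arithmetic. All figures below (the number of pending blocks, the time per block, and both runtimes) are hypothetical values chosen for illustration, and the function name blocks_flushed is likewise an assumption; the sketch also assumes that blocks are written one after another at a constant rate.

```python
def blocks_flushed(pending_blocks, seconds_per_block, runtime_seconds):
    """Return how many buffered blocks reach disk before UPS power ends."""
    return min(pending_blocks, int(runtime_seconds // seconds_per_block))


pending = 100             # blocks waiting in disk RAM (hypothetical)
seconds_per_block = 0.5   # time to write one block (hypothetical)
estimated_runtime = 60.0  # runtime reported by the UPS (hypothetical)
actual_runtime = 30.0     # runtime the UPS really delivers (hypothetical)

planned = blocks_flushed(pending, seconds_per_block, estimated_runtime)
actual = blocks_flushed(pending, seconds_per_block, actual_runtime)
lost = pending - actual   # blocks still in disk RAM when power ends
```

Under these assumed figures the estimate suggests that every pending block can be written in time, while the shorter actual runtime leaves part of the data in disk RAM when the UPS runs out of power.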