Within a data storage system there is typically provided an input/output system or subsystem, such as an input/output (I/O) module, connected to one side of an interface such as a midplane. On the other side of the midplane are typically connected one or more storage devices such as hard disks or other such storage media. The I/O module serves to write data to and read data from the one or more storage devices. An AC mains power supply, the primary power source, is typically connected to the storage system to provide the necessary power for the system to operate.
There are of course other examples of data storage systems such as a processor system, e.g. an ATX server, specific silicon such as a RAID ASIC or a switch device as might be provided in a Serial Attached SCSI (SAS) expander.
The description from here on will be largely with reference to a storage system including one or more input/output modules, but it will be appreciated that the principles discussed herein apply to all data storage systems having input/output systems or subsystems.
Some I/O modules store received data in cache memory, responding to the external host system to inform it that the data has been written to the storage media, when in fact it is still held in volatile memory within the I/O module. This data will typically be subject to some form of processing before being written to the storage media. Any cache data in volatile storage is vulnerable to loss and thus in the event of an AC mains power failure, control of the elements of the system is required to ensure that the retention of the data is consistent and reliable.
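By way of illustration only, the write-back behaviour described above can be sketched as follows. This is a minimal model and not the implementation of any particular I/O module; the class name `WriteBackCache`, its methods and the use of dictionaries to stand in for volatile memory and storage media are all assumptions made for the example.

```python
class WriteBackCache:
    """Hypothetical model of an I/O module that acknowledges writes early."""

    def __init__(self):
        self.volatile = {}  # address -> data held in volatile cache memory
        self.media = {}     # address -> data actually on the storage media

    def write(self, address, data):
        # The host is told the write is complete although the data is
        # still only held in volatile memory.
        self.volatile[address] = data
        return "ACK"

    def flush(self):
        # Later, the cached data is written to the storage media.
        self.media.update(self.volatile)
        self.volatile.clear()

    def power_failure_without_backup(self):
        # With no back-up power, everything still in the cache is lost.
        lost = sorted(self.volatile)
        self.volatile.clear()
        return lost


cache = WriteBackCache()
ack = cache.write(0x10, b"payload")   # host receives "ACK" immediately
on_media_before_flush = 0x10 in cache.media
lost_addresses = cache.power_failure_without_backup()
```

In this sketch the host holds an acknowledgement for address `0x10` even though the data never reached the media, which is the exposure that a back-up power arrangement is intended to cover.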
Typically, in current systems, that function is provided by batteries mounted within the I/O module. In the event of an AC mains power failure the batteries mounted within the I/O module serve to maintain RAM within the module in a self-refresh mode for a period of time. The normal period of retention is approximately 72 hours, which is sufficient to retain the data over a weekend's power outage (Friday evening to Monday morning). The retention period is of course dependent on the type and amount of RAM to be maintained.
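The dependence of the retention period on the amount of RAM can be illustrated with a simple calculation. The sketch below assumes, purely for the purposes of the example, that self-refresh power scales linearly with RAM capacity; the function name `holdup_hours` and the figures used (a 36 Wh battery and 0.125 W per GiB) are hypothetical and are not taken from any real module.

```python
def holdup_hours(battery_wh, ram_gib, watts_per_gib):
    """Estimate how long a battery can keep RAM in self-refresh.

    battery_wh    -- usable battery energy in watt-hours (assumed value)
    ram_gib       -- amount of RAM to be maintained, in GiB
    watts_per_gib -- assumed self-refresh power per GiB for the RAM type
    """
    load_watts = ram_gib * watts_per_gib
    if load_watts <= 0:
        raise ValueError("self-refresh load must be positive")
    return battery_wh / load_watts


# Hypothetical figures chosen so that 4 GiB is retained for 72 hours.
hours_4_gib = holdup_hours(battery_wh=36.0, ram_gib=4, watts_per_gib=0.125)
# Doubling the RAM with the same battery halves the retention period.
hours_8_gib = holdup_hours(battery_wh=36.0, ram_gib=8, watts_per_gib=0.125)
```

Under these assumed figures, maintaining the same retention period with twice the memory would require a battery of twice the capacity.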
Typically, on notification of primary power loss this type of system switches the RAM to self-refresh mode and then lets the rest of the system fail due to the loss of primary AC mains power. On restoration of power the system recognises that it has a ‘dirty cache’ and ensures that data cached in RAM is written to disk.
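The sequence described above can be sketched as a small state model. This is an illustrative outline only; the names `IoModule`, `RamMode`, `on_primary_power_loss` and `on_power_restored` are hypothetical, and treating any non-empty cache as the 'dirty cache' indication is an assumption of the sketch.

```python
from enum import Enum


class RamMode(Enum):
    NORMAL = "normal"
    SELF_REFRESH = "self-refresh"


class IoModule:
    """Hypothetical I/O module with battery-maintained RAM."""

    def __init__(self):
        self.ram_mode = RamMode.NORMAL
        self.running = True
        self.cache = {}  # data cached in RAM, not yet on disk
        self.disk = {}   # data written to disk

    def on_primary_power_loss(self):
        # Switch RAM to self-refresh first, then let the rest of the
        # system fail; the battery maintains only the RAM contents.
        self.ram_mode = RamMode.SELF_REFRESH
        self.running = False

    def on_power_restored(self):
        # On restart, detect the dirty cache and write it to disk.
        self.running = True
        self.ram_mode = RamMode.NORMAL
        flushed = len(self.cache)
        if flushed:
            self.disk.update(self.cache)
            self.cache.clear()
        return flushed


module = IoModule()
module.cache["block-7"] = b"cached data"
module.on_primary_power_loss()
mode_during_outage = module.ram_mode
blocks_flushed = module.on_power_restored()
```

The sketch omits the case in which the battery is exhausted before power returns, in which event the cached data would not survive to be written to disk.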
FIG. 1A shows a schematic representation of a conventional data storage system 2. The system 2 comprises plural, n in this case, disk drives 4 connected via a midplane 6 to two I/O modules or controllers 8. Each of the I/O modules 8 contains onboard processing capability 10 together with memory 12. The I/O modules or controllers 8 serve to provide control of data transfer between one or more hosts (not shown) and the plural disk drives 4.
A battery unit 14, containing one or more batteries, is provided on each of the I/O modules or controllers 8 to provide back-up power in the case of an interruption in power from the AC power supply units 16. As processing capacity and data capacity increase, so does the power requirement and hence the required battery size or capacity for providing back-up.
FIG. 1B shows a schematic representation of a typical ATX server system as might be used to implement a RAID system. The server system 3 comprises a number of processors 5 and power supplies 7 provided on an ATX motherboard 9. A PCI RAID card or RAID controller 11 is provided. Plural disks or drives 4 are provided to which data is written in accordance with the RAID type being implemented.
In normal use the ATX server system and PCI RAID card 11 are mains powered via the power supplies 7. The PCI RAID card 11 includes a battery 13 which provides a back-up power supply to the ATX server system in the event of failure of the mains AC power supply. As in the example described above with reference to FIG. 1A, as the processing capacity and data capacity of the RAID system increase, the required battery size or capacity for providing back-up increases too.
As an alternative to battery power, some current power back-up systems use “supercapacitors” to maintain the RAM in self-refresh mode for the necessary period of time. The use of supercapacitors can be advantageous as a maintenance-free alternative to internal batteries.
In addition, some manufacturers use external Uninterruptible Power Supply (UPS) systems to maintain the whole system, either for the period of outage, or until the essential data in RAM can be written to disk or non-volatile storage.
There are a number of problems with currently available systems. Due to customer requirements for increased performance, RAID 6 support and the like, I/O module power consumption is increasing and modules contain more memory. The increasing memory requirements mean that back-up batteries need to be larger to provide the same period of retention. However, the increasing performance and power of modules means that the I/O module is becoming very dense, with little space for any battery. Furthermore, as the performance and power of modules increase, the temperature of the environment can increase, which is not particularly good for a battery. Indeed, in some situations, it is possible that the temperature can reach levels that are detrimental to the battery.
Since the battery is only providing hold-up power to the memory, if the battery becomes exhausted, the back-up fails and data is lost. Degradation of battery capacity due to temperature stress can significantly reduce the hold-up time.
External UPS systems provide a similar function to a system-mounted battery, but by definition are less integrated with the enclosure. This makes their control of the system's shutdown less efficient and the system's knowledge of the UPS state less reliable. Any external UPS is likely to have to provide hold-up for the total system including I/O modules, fans and disk drives or whatever other such storage media are included in the system.
Any external UPS would provide AC to the enclosure which would add further conversion inefficiency to the system, thus requiring even more power.
U.S. Pat. No. 5,799,200 discloses a method and apparatus for preserving data in a system having Dynamic Random Access memories (DRAM). A Flash RAM and a small auxiliary power source are utilised by a controller independently of the system to transfer the stored contents of the DRAM to the Flash RAM immediately upon loss of primary system power.
US-A-2005/0121979 discloses battery packs with a plurality of rechargeable batteries connected in series so as to obtain a voltage required by a load device. The battery packs are for the provision of an uninterruptible power supply within a computer. The battery packs are detachably accommodated in a case. The battery packs are connected in parallel and each output thereof is modified to a predetermined voltage by a discharge control section.