There is great demand for high-speed stable storage. Disks provide stable storage, but their latency and transfer times can be high.
Non-volatile random-access memory (NVRAM) can be used in a number of ways to improve response time and data reliability in server appliances. NVRAM may consist of random-access memory that does not require power to retain data, or of Dynamic Random-Access Memory (DRAM) or Synchronous DRAM (SDRAM) that has a secondary power source such as a battery or an external uninterruptible power supply (UPS).
One such prior-art application is shown in FIG. 1. The host computer 11 may write important data to disks 17. When time is critical, it may instead store data to the faster NVRAM device 12. The DMA memory controller 18 manages the NVRAM 19 and provides direct memory access (DMA) services. DMA is used to transfer data in either direction between host memory 15 and NVRAM 19 across an industry-standard peripheral component interconnect (PCI) bus 13. DMA performs transfers while the host computer 11 performs other operations, relieving the host computer 11 of those duties. The data stored in NVRAM 19 may be a cache of data that will eventually be written to disks 17, a journal of changes to the disks 17 that may be replayed to recover from a system failure but that never needs to be written to disks 17, or other information about transactions that may eventually be processed, causing related data to be written to disks 17.
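The journal use of NVRAM described above can be illustrated with a minimal sketch. It is not taken from the prior-art system itself: the names `Nvram`, `Host`, and `recover` are hypothetical, the disks 17 are modeled as a plain dictionary, and the NVRAM 19 is modeled as an object that simply outlives the host object.

```python
from dataclasses import dataclass, field


@dataclass
class Nvram:
    """Hypothetical stand-in for NVRAM 19; its contents survive a host failure."""
    journal: list = field(default_factory=list)


class Host:
    """Hypothetical host that journals each change to NVRAM before touching disk."""

    def __init__(self, nvram, disk):
        self.nvram = nvram
        self.disk = disk  # dict of block -> data, standing in for disks 17

    def write(self, block, data):
        # Fast path: record the change in NVRAM only.
        self.nvram.journal.append((block, data))

    def flush(self):
        # Slow path: apply the journaled changes to disk, then drop the journal.
        for block, data in self.nvram.journal:
            self.disk[block] = data
        self.nvram.journal.clear()


def recover(nvram, disk):
    """Replay the journal in order so the disk reflects every recorded change."""
    for block, data in nvram.journal:
        disk[block] = data
    nvram.journal.clear()
    return disk
```

In this sketch a host failure before `flush` loses nothing, because `recover` can replay the surviving journal against the same disk contents.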
This application allows the host computer 11 to directly control the NVRAM device 12, but it does not allow the NVRAM 19 to be used together efficiently with the disks 17. Data moving from NVRAM to disk must pass through the primary bus 13. This can reduce performance because the bus must be shared with other device transactions. Another disadvantage of this scheme is that NVRAM device 12 requires its own location on the primary bus 13 rather than sharing one with the controller for the disks 17. Locations on the bus often are not easily made available.
FIG. 2A shows a prior-art implementation in which NVRAM is attached to a storage device. The host computer 100 is attached to a disk controller 101 by an interface 104, possibly a PCI bus. The disk controller is attached to a disk or other storage device 102 by an interface 105, which may be a local bus such as Small Computer System Interface (SCSI) or AT Attachment (ATA). The disk 102 may also be replaced by an intelligent storage device such as network-attached storage (NAS) or a storage area network (SAN) device. In this case interface 105 may be a network or Fibre Channel connection. The NVRAM 103 is under complete control of the disk or storage device 102. The host computer 100 has no way to access the NVRAM contents using interface 105.
FIG. 2B is similar to FIG. 2A except that the NVRAM 203 has moved to the disk controller 201. The disk controller may manage disks 202 as a JBOD (Just a Bunch of Disks) or a RAID (Redundant Array of Independent Disks) system. When the host computer 200 makes a request to the disk controller 201, the controller may choose to cache data in the NVRAM 203. Management of the NVRAM is the responsibility of the disk controller. This includes algorithms for deciding when data cached in NVRAM will be transferred to disk and when it will be discarded.
The solutions in FIGS. 2A and 2B solve the problem of keeping the NVRAM data close to the disks, but they take control of the NVRAM away from the host computer. Usually the host computer has a much better idea of how data is being used than does the disk or the disk controller. The host can know if data is temporary in nature and never needs to be copied to disk. The host can know if the data is likely to be modified again soon and thus disk accesses can be reduced if the data is not immediately copied to disk. The host can know if data will no longer be needed and can be removed from cache once it is on disk.
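The three kinds of host knowledge listed above can be sketched as hints attached to cached data. This is only an illustration of the argument, not a description of any prior-art controller: the `Hint` values, the `HostManagedCache` class, and its dictionary-based NVRAM and disk are all hypothetical.

```python
from enum import Enum, auto


class Hint(Enum):
    """Hypothetical host-supplied hints about how cached data will be used."""
    TEMPORARY = auto()  # never needs to be copied to disk
    HOT = auto()        # likely to be modified again soon; defer the disk write
    DONE = auto()       # not needed again; evict once it is on disk


class HostManagedCache:
    """Hypothetical NVRAM cache whose write-back policy follows host hints."""

    def __init__(self):
        self.nvram = {}       # block -> (data, hint)
        self.disk = {}        # block -> data
        self.disk_writes = 0  # counts disk accesses, for illustration only

    def put(self, block, data, hint):
        self.nvram[block] = (data, hint)

    def writeback(self):
        # Iterate over a snapshot because entries may be evicted in the loop.
        for block, (data, hint) in list(self.nvram.items()):
            if hint in (Hint.TEMPORARY, Hint.HOT):
                continue  # the host said a disk write is unnecessary for now
            self.disk[block] = data
            self.disk_writes += 1
            del self.nvram[block]  # DONE: remove from cache once on disk
```

A disk or disk controller that manages the NVRAM by itself has no access to such hints and would have to treat all three cases alike.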
In all three of the above prior-art systems, when the original device that contains the NVRAM (the host computer, the disk, or the disk controller) fails, the NVRAM preserves its data. When the original device is restored to use, it can retrieve the data from the NVRAM. Until the original device is restored to use, the data remains unavailable. It is often desirable to have a replacement device in a system (such as a network cluster) take over when a similar original device fails. Because the NVRAM data is not available while its original device is not available, the replacement device cannot take over the function of the original device, since it cannot get access to the necessary data.
There are other prior-art applications that utilize bus bridges. These bus bridges often include local memory that is a subset of the bridge. FIG. 3 illustrates a host computer 250 that connects to one or more devices 252 through a PCI bus bridge 251. Information on PCI bus 254 is forwarded by the bridge 251 to PCI bus 255 as necessary to reach the target device 252. Information on PCI bus 255 is forwarded by the bridge 251 to PCI bus 254 as necessary to reach the host computer 250. The PCI bridge 251 may use local bridge memory 253 to temporarily store the data that flows through the bridge. Data coming from bus 254, for example, may be stored in the bridge's memory until bus 255 is available and device 252 is ready to receive the data. This memory is used by the PCI bridge 251 to make its routing function more efficient. There is no way for the host computer 250 to directly control this memory, specifically where the bridge 251 puts this data or when it is removed from memory 253. From the perspective of the host computer 250, it is writing the data directly to the device 252, apart from a time delay before the data reaches the device. While the present invention utilizes some of these same bus bridge devices with associated local memory, it should be noted that the local bus bridge memory 253 is a subset of the bridge that is transparent to the host computer. This is unlike NVRAM 19 in FIG. 1 or NVRAM 309 in FIG. 4, which are endpoint devices that can be directly controlled by the host computer.
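The transparent buffering described above can be pictured with a small sketch. It is a simplified model under stated assumptions, not an implementation of a PCI bridge: the classes `Bridge` and `Device`, the `ready` flag, and the `tick` method are hypothetical, and a simple queue stands in for bridge memory 253.

```python
from collections import deque


class Device:
    """Hypothetical target device 252 that is not always ready to accept data."""

    def __init__(self):
        self.ready = False
        self.received = []


class Bridge:
    """Hypothetical bridge 251 with a private store-and-forward buffer."""

    def __init__(self, device):
        self._device = device
        self._buffer = deque()  # stands in for bridge memory 253; hidden from the host

    def host_write(self, data):
        # The host addresses the device; it cannot say where or how long data is buffered.
        self._buffer.append(data)
        self.tick()

    def tick(self):
        # Forward buffered data in arrival order whenever the device can take it.
        while self._buffer and self._device.ready:
            self._device.received.append(self._buffer.popleft())

    def pending(self):
        return len(self._buffer)
```

The host interface consists only of `host_write`; nothing in it lets the host place data in the buffer, inspect it, or decide when it is removed, which is the distinction drawn above between bridge memory and an endpoint NVRAM device.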
Accordingly, it is an object of the present invention to provide NVRAM that may be fully controlled by the host computer.
Another object of the present invention is to provide NVRAM on the host computer that can still be accessed by other host computers after a failure of said host computer. In particular, the NVRAM and its associated communication devices and processors must remain functioning when the host computer is unavailable due to internal failure, scheduled shutdown, power failure, or other errors.
Another object of the present invention is to provide NVRAM in a highly available system that can protect components of the system from failure of other components of the system. Failure of components on the NVRAM device must be detected and must be capable of being isolated to protect the host computer. Failure of components of the host computer must be capable of being isolated to allow the NVRAM device to communicate data despite such failures.
Another object of the present invention is to provide NVRAM in the host computer that can share a primary bus connection to the host computer with another device needed in the host computer.
Another object of the present invention is to provide NVRAM that can be connected to disk controllers by private data paths.