1. Field of the Invention
This invention relates to a "plug and play", high speed, microcomputer based, Fibre Channel compatible and fault tolerant mass storage information server system. In particular, the present invention relates to a device and method for providing an enterprise-wide information server system which incorporates a dual loop arbitrated, Fibre Channel capable, multiple-fault tolerant, hot-swappable disk array requiring no host computer.
2. Description of the Prior Art
Efforts have been made in the past to provide a mass storage file server capable of delivering information throughout an enterprise with high speed data throughput and scalable data storage capability, in a convenient, easily configurable enclosure using well known, industry standard operating software. However, such systems have typically experienced many shortcomings and problems associated with the inability of presently available computer and communications hardware to sustain performance and survive failure of component devices. Such shortcomings have included the inability to scale to accommodate increased storage requirements without completely powering down the information server system to install additional disk storage devices or communications capability. One of the solutions presently available requires that the entire information server system be taken off-line and powered down before any additional disk storage devices can be added to the disk storage array. This and other comparable or more capable server systems require a significant amount of system administrator personnel time and server system resources to facilitate system maintenance for expansion of storage space, repair, and routine maintenance such as optimization and system health monitoring.
Other problems existing in the presently available information server systems include a general inability of the present technology to provide continuous information server capability after some predetermined number of failures have occurred in the server system. Some prior art information server systems have provided a limited fault tolerance capability. Such systems typically employ a disk array server which incorporates extra disk space substantially in excess of that needed by the enterprise serviced by the information server system. The extra disk space is incorporated into such systems by adding physical disk storage devices which are configured with a particular logical disk drive configuration tailored to meet the storage, fault tolerance and server requirements of the user.
To accomplish a desired level of fault tolerance, the disk array subsystem is configured with various types of operating system software to create duplicate and multiple copies of the data stored on the information server system across various different combinations of the physical and logical disk drives. Any of a large array of fault tolerant disk array architectures are typically employed including, for example, various implementations of what is known in the trade as the redundant array of independent disks or "RAID" topologies and protocols. Upon detection of a complete or partial failure of a particular disk storage device, the operating system software program notifies the server of the failure, and either marks the portion of the disk which failed or logically removes the completely failed disk storage device from the disk array. Next, the operating system typically reallocates the remaining available physical disk space into a modified configuration of logical disk drives.
The operating system then reconstructs new duplicate, multiple copies of the data stored on the server system within the constraints of the newly reduced amount of free disk space. The server system also alerts the server system administrator that a failure has occurred so that corrective action may be taken.
Depending on the exact nature of the failure, the necessary corrective action can include removal and replacement of the completely or partially failed disk storage device. The remove and replace system maintenance operation requires, in many systems, that the entire server system be taken off-line and powered down before the physical replacement operation can be performed. Some information server systems permit removal and replacement of the defective disk storage device while the server system remains on-line and powered on. This process is commonly referred to by the trade as "hot-swapping" of devices. Such systems, however, require considerable hands-on intervention of the system administrator personnel to manually manipulate the hardware interfaces and operating system software for purposes of physically and logically reintegrating the newly replaced disk device into the disk array subsystem. Also, the systems capable of hot-swapping experience severe degradation of performance resulting from the process of taking corrective action.
The process of logical reintegration requires a significant portion of the server system central processing unit and memory resources. These resources are needed to accomplish the reallocation of the newly available free disk space into the logical disk configuration of the server and the redistribution of the multiple, duplicate copies of the information stored on the server across the new and remaining physical and logical disk drives. This need for server system resources, although temporary, results in a severe decrease in the performance of the information server system.
Thus, users of presently available information server technology have generally two types of information server system options available. The first type of server system is completely unavailable for the duration of the system maintenance operation. The second type of system is, in effect, unavailable to the users due to the seriously degraded performance experienced by the information server system during the system maintenance operation. The following U.S. Patents, which are hereby incorporated by reference in their entirety, appear to disclose various types and components of the above described information server systems: U.S. Pat. Nos. 5,402,428; 5,471,099; 5,479,653; 5,502,836; 5,517,632; 5,518,418; 5,522,031; 5,530,831; 5,544,339; 5,548,712; 5,615,352; 5,651,132; 5,659,677; 5,664,119; 5,666,337; 5,680,538; 5,694,581; and 5,701,406.
As a result of the problems and shortcomings of the technology incorporated into the presently available information server systems, users are left without a satisfactory server system which is capable of, among other features, continuous uninterrupted availability, nondegraded performance and simplified, quick and easy storage space expansion, reconfiguration, repair and routine maintenance. None of the previous devices have adequately met these needs. Thus, it is apparent that a need exists for a system which not only reduces or eliminates the shortcomings and problems associated with the currently available information server systems and related technology, but also provides an efficient and cost-effective solution to such concerns.
The present invention is an information server system with a "plug and play", scalable, modular, fault tolerant, multi-loop, hot swappable architecture incorporating a central processing unit, a storage device controller connected to the central processing unit for controlling at least one storage device array and a communications interface system connected to the central processing unit for communicating with other systems. More particularly, the invention represents a plug and play storage system for information storage and retrieval applications and incorporates an on board computer server for the storage system, thus eliminating the requirement for resources from a host computer. The computer controls and communicates with a storage device controller and a communications interface with other systems external to the storage system. The storage device controller operates via a high speed interface to control an array of storage devices through their individual hot swap interface cards.
In one presently preferred embodiment, the invention provides a self contained plug and play information server system which incorporates a high speed, microcomputer based, server running industry standard operating system software enhanced to include functionality directed to operation of an array controller for a storage device such as a magnetic disk array, optical device array, solid state memory or the like, and which controls the physically independent or integral storage device array, and a communications interface. In a presently preferred form, the array controller subsystem controls and communicates with the storage device array with a Fibre Channel protocol and topology compatible 1.0625 gigabit per second copper compact PCI and/or a fibre optic interface bus and an Intelligent Input/Output ("I2O") bus for control of and communication with the disk storage device array.
The storage device array incorporates a plurality of storage devices with a corresponding number of bypass, or "bridging", interface cards configured to facilitate the on-line addition, removal and replacement of storage devices. In addition to incorporating the above described buses and Fibre Channel capability, the storage device array further incorporates a physically independent Fibre Channel compatible optical bus for high speed communication between storage device array subsystem components, including the internal storage devices, independent from the information server. The problems encountered with previously available information server systems are solved by the present invention, which can be manufactured relatively inexpensively from a variety of off-the-shelf hardware and software, either in standard configurations or on a custom configured basis. In either configuration, a wide array of user reconfigurable options are available as well as a scalable expansion capability.
The present invention accordingly provides for an information server system with a scalable, modular, fault tolerant, hot swappable architecture, that comprises a central processing unit; an array controller subsystem connected to the central processing unit for controlling at least one storage device array; and a communications interface subsystem connected to the central processing unit for communicating with other subsystems of the information server and for controlling the array controller subsystem.
For convenience, and not by way of limitation, the invention will be described below in the context of magnetic disk storage devices as they represent readily available and compatible types of storage devices for information server applications. However, those skilled in the art will recognize that other storage devices and media such as optical disks, solid state memories or magnetic storage media would be applicable for various applications, depending on the state of development of the storage media technology and the application to which the system is to be put. Similarly, the invention will be described in the context of a Fibre Channel protocol and topology compatible 1.0625 gigabit per second copper compact PCI and/or fibre optic interface bus and Intelligent Input/Output I2O bus as the communications link between the storage device controller and the storage device array, although other communication links may be used, depending on the array architecture, communications speed requirements and available technology for data links.
In one presently preferred embodiment, the information server system further comprises a midplane connector for connecting interface cards for components. In a currently preferred aspect of the invention, each disk storage device array comprises a plurality of disk storage devices and a corresponding number of bypass interface cards, all of which communicate with one another and the information server. In another presently preferred aspect of the invention, each disk storage device array comprises a predetermined number of bypass interface cards which populate the entire information server system, whether or not the entire information server system is fully populated with a corresponding number of disk devices. Each disk storage device is preferably hot-swappable, and each disk storage device is mounted on a bypass interface card that connects to the midplane connector.
In a presently preferred embodiment, the disk array controller subsystem controls and communicates with one or more disk storage device arrays with an arbitrated dual channel Fibre Channel system, and each of the disk storage devices is connected to the arbitrated dual channel Fibre Channel architecture, whereby each disk storage device may perform simultaneous reads and writes of data in response to any requests from the outside world through the information server. In another presently preferred aspect of the invention, the disk storage devices include electronic device registration devices, and the disk array controller subsystem monitors identification numbers of the electronic device registration devices. The disk array controller subsystem can thus monitor when a component in the information server system is removed or added. The electronic device registration devices are preferably integrated into the electronic circuitry of each of the disk storage devices such that engagement or disengagement of a disk storage device with the disk storage device array triggers its electronic device registration device to generate and transmit a serial number signal unique to that disk storage device to the disk array controller subsystem. Upon receiving the signal, the disk array controller subsystem immediately initiates logical connection or disconnection of the disk storage device to or from the array, depending on whether the disk storage device has been engaged or disengaged, respectively. The disk array controller subsystem can thus accomplish the electrical and logical connection and disconnection of the disk storage device by control of the bypass interface cards.
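By way of illustration only, the registration behavior described above may be sketched in Python as follows. The names BypassCard, ArrayController and on_registration_signal are hypothetical and do not appear in this disclosure, and the assumption that an unoccupied slot is held in the bypassed (bridged) state is made only for the sketch. The sketch shows a controller that, on receiving a serial number signal, opens or closes the corresponding bypass interface card and updates the logical membership of the array.

```python
from dataclasses import dataclass, field


@dataclass
class BypassCard:
    """Hypothetical model of one bypass ("bridging") interface card."""
    slot: int
    bypassed: bool = True  # assumption: an empty slot is bridged


@dataclass
class ArrayController:
    """Hypothetical controller tracking which serial numbers are in the array."""
    cards: dict = field(default_factory=dict)    # slot -> BypassCard
    members: dict = field(default_factory=dict)  # serial number -> slot

    def on_registration_signal(self, slot, serial, engaged):
        """Handle a serial number signal sent when a device is engaged or disengaged."""
        card = self.cards.setdefault(slot, BypassCard(slot))
        if engaged:
            card.bypassed = False          # electrically connect the device
            self.members[serial] = slot    # logically connect it to the array
        else:
            card.bypassed = True           # bridge the now-empty slot
            self.members.pop(serial, None) # logically disconnect it
        return sorted(self.members)


controller = ArrayController()
controller.on_registration_signal(slot=0, serial="SN-A", engaged=True)
controller.on_registration_signal(slot=1, serial="SN-B", engaged=True)
controller.on_registration_signal(slot=0, serial="SN-A", engaged=False)
print(sorted(controller.members))
```

In this sketch the same handler serves both insertion and removal, mirroring the single serial number signal described above; a practical controller would additionally start any rebuild or reallocation required by the selected array configuration.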
In another presently preferred aspect, the bypass interface cards comprise an independent but logically integrated optical bus for communication within the Fibre Channel topology and protocol between disk drives.
In another presently preferred embodiment, the disk array controller subsystem is adapted to configure one or more of the disk storage devices for a configuration selected from the group consisting of RAID 0, RAID 1, RAID 3, RAID 5, RAID 10 and an XOR RAID configuration.
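As a non-limiting illustration of the XOR principle underlying parity-based configurations such as RAID 3 and RAID 5, the following Python sketch computes a parity block for one stripe and rebuilds a single missing data block from the survivors. The function names xor_blocks, make_parity and rebuild_missing are hypothetical, and the sketch is not asserted to be the method used by the disk array controller subsystem; striping, parity rotation and error handling are omitted.

```python
from functools import reduce


def xor_blocks(blocks):
    """Return the byte-wise XOR of a list of equal-length blocks."""
    if not blocks:
        raise ValueError("at least one block is required")
    length = len(blocks[0])
    if any(len(block) != length for block in blocks):
        raise ValueError("all blocks must have the same length")
    # zip(*blocks) yields one tuple of byte values per byte position.
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def make_parity(data_blocks):
    """Parity block for one stripe: the XOR of all data blocks."""
    return xor_blocks(list(data_blocks))


def rebuild_missing(surviving_blocks, parity):
    """Recover the single missing data block from the survivors plus parity."""
    return xor_blocks(list(surviving_blocks) + [parity])


stripe = [b"disk", b"arra", b"y!!!"]
parity = make_parity(stripe)
# Simulate loss of the second device and rebuild its block.
recovered = rebuild_missing([stripe[0], stripe[2]], parity)
print(recovered == stripe[1])
```

Because XOR is its own inverse, the same routine serves both to generate parity and to reconstruct a lost block, which is why a single failed device in such a stripe can be tolerated.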
In another presently preferred aspect of this invention, an operator may activate a so-called "hot button" which will disable the write function to the array, thus preventing the writing of suspect data on the storage array after a system fault has been detected.
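The effect of the hot button may be illustrated, purely as a sketch, by a write gate placed in front of the array. The class GuardedArray, the exception WriteDisabledError and the method names are hypothetical; the in-memory dictionary merely stands in for the storage device array, and the assumption that reads remain available after the button is pressed is made only for the sketch.

```python
class WriteDisabledError(RuntimeError):
    """Raised when a write is attempted after the hot button is activated."""


class GuardedArray:
    """Hypothetical stand-in for a storage array with a write-disable control."""

    def __init__(self):
        self._blocks = {}          # block address -> stored bytes
        self.write_enabled = True

    def press_hot_button(self):
        """Disable the write function to the array."""
        self.write_enabled = False

    def write(self, address, data):
        if not self.write_enabled:
            raise WriteDisabledError("writes to the array are disabled")
        self._blocks[address] = bytes(data)

    def read(self, address):
        # Assumption: reads are still served after writes are disabled.
        return self._blocks[address]


array = GuardedArray()
array.write(0, b"good")
array.press_hot_button()
try:
    array.write(0, b"suspect")
except WriteDisabledError:
    pass  # the suspect data never reaches the array
print(array.read(0) == b"good")
```

Rejecting the write at a single gate leaves previously stored data untouched, which is the protective effect attributed to the hot button above.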
From the above, it may be seen that the present invention provides a novel, plug and play, high speed scalable and modular fault tolerant information server architecture which offers many benefits over prior art systems. Other features and advantages of the present invention will become apparent from the following detailed description of the invention, taken in conjunction with the accompanying drawings, which illustrate, by way of example, the principles of the invention.