1. Technical Field
The present invention relates generally to systems and methods for data storage and retrieval and, more particularly, to data storage controllers employing lossless and/or lossy data compression and decompression to provide accelerated data storage and retrieval bandwidth.
2. Description of the Related Art
Modern computers utilize a hierarchy of memory devices. To achieve maximum performance levels, modern processors utilize onboard memory and onboard cache to obtain high bandwidth access to both program and data. Limitations in process technologies currently prohibit placing a sufficient quantity of onboard memory for most applications. Thus, in order to offer sufficient memory for the operating system(s), application programs, and user data, computers often use various forms of popular off-processor high speed memory including static random access memory (SRAM), synchronous dynamic random access memory (SDRAM), and synchronous burst static RAM (SBSRAM). Due to the prohibitive cost of high-speed random access memory, coupled with its power volatility, a third lower level of the hierarchy exists for non-volatile mass storage devices.
Furthermore, mass storage devices offer increased capacity and fairly economical data storage. Mass storage devices (such as a "hard disk") typically store the operating system of a computer system, as well as applications and data, and rapid access to such data is critical to system performance. The data storage and retrieval bandwidth of mass storage devices, however, is typically much lower than the bandwidth of other elements of a computing system. Indeed, over the last decade, although computer processor performance has improved by at least a factor of 50, magnetic disk storage performance has only improved by a factor of 5. Consequently, memory storage devices severely limit the performance of consumer, entertainment, office, workstation, server, and mainframe computers for all disk and memory intensive operations.
The ubiquitous Internet combined with new multimedia applications has put tremendous emphasis on storage volumetric density, storage mass density, storewidth, and power consumption. Specifically, storage volumetric density is defined as the number of bits that are encoded in a mass storage device per unit volume. Similarly, storage mass density is defined as storage bits per unit mass. Storewidth is the data rate at which the data may be accessed. There are various ways of categorizing storewidth; several of the more prevalent metrics include sustained continuous storewidth, burst storewidth, and random access storewidth, all typically measured in megabytes/sec. Power consumption is canonically defined in terms of power consumption per bit and may be specified under a number of operating modes including active mode (while data is being accessed and transmitted) and standby mode. Hence one fairly obvious limitation within the current art is the need for even more volume, mass, and power efficient data storage.
Magnetic disk mass storage devices currently employed in a variety of home, business, and scientific computing applications suffer from significant seek-time access delays along with profound read/write data rate limitations. Currently the fastest available disk drives support only a sustained output data rate in the tens of megabytes per second (MB/sec). This is in stark contrast to the modern Personal Computer's Peripheral Component Interconnect (PCI) Bus's low end 32 bit/33 MHz input/output capability of 264 MB/sec and the PC's internal local bus capability of 800 MB/sec.
Another problem within the current art is that emergent high performance disk interface standards such as the Small Computer Systems Interface (SCSI-3), Fibre Channel, AT Attachment UltraDMA/66/100, Serial Storage Architecture, and Universal Serial Bus offer higher data transfer rates only through intermediate data buffering in random access memory. These interconnect strategies do not address the fundamental problem that all modern magnetic disk storage devices for the personal computer marketplace are still limited by the same typical physical media restrictions. In practice, faster disk access data rates are only achieved by the high cost solution of simultaneously accessing multiple disk drives with a technique known within the art as data striping and redundant array of independent disks (RAID).
RAID systems often afford the user the benefit of increased data bandwidth for data storage and retrieval. By simultaneously accessing two or more disk drives, data bandwidth may be increased at a maximum rate that is linear and directly proportional to the number of disks employed. Thus another problem with modern data storage systems utilizing RAID systems is that a linear increase in data bandwidth requires a proportional number of added disk storage devices.
Another problem with most modern mass storage devices is their inherent unreliability. Many modern mass storage devices utilize rotating assemblies and other types of electromechanical components that possess failure rates one or more orders of magnitude higher than equivalent solid-state devices. RAID systems employ data redundancy distributed across multiple disks to enhance data storage and retrieval reliability. In the simplest case, data may be explicitly repeated on multiple places on a single disk drive, or on multiple places on two or more independent disk drives. More complex techniques are also employed that support various trade-offs between data bandwidth and data reliability.
Standard types of RAID systems currently available include RAID Levels 0, 1, and 5. The configuration selected depends on the goals to be achieved. Specifically, data reliability, data validation, data storage/retrieval bandwidth, and cost all play a role in defining the appropriate RAID data storage solution. RAID Level 0 entails pure data striping across multiple disk drives. This increases data bandwidth at best linearly with the number of disk drives utilized. Data reliability and validation capability are decreased. A failure of a single drive results in a complete loss of all data. Thus another problem with RAID systems is that low cost improved bandwidth requires a significant decrease in reliability.
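The striping behavior described for RAID Level 0 can be illustrated with a minimal sketch. The function names `stripe` and `unstripe`, the in-memory `bytearray` "disks", and the stripe size are all hypothetical and chosen only for illustration; they are not taken from any particular RAID implementation.

```python
def stripe(data: bytes, num_disks: int, stripe_size: int) -> list[bytearray]:
    """Distribute data round-robin across num_disks in stripe_size chunks."""
    disks = [bytearray() for _ in range(num_disks)]
    for i in range(0, len(data), stripe_size):
        # Chunk k goes to disk k mod num_disks.
        disks[(i // stripe_size) % num_disks] += data[i:i + stripe_size]
    return disks


def unstripe(disks: list[bytearray], stripe_size: int, total_len: int) -> bytes:
    """Reassemble the original byte stream; every disk must be intact."""
    out = bytearray()
    offsets = [0] * len(disks)
    disk = 0
    while len(out) < total_len:
        chunk = disks[disk][offsets[disk]:offsets[disk] + stripe_size]
        if not chunk:
            # No redundancy exists, so a missing chunk cannot be recovered.
            raise ValueError("stripe missing: data cannot be reassembled")
        out += chunk
        offsets[disk] += stripe_size
        disk = (disk + 1) % len(disks)
    return bytes(out)
```

Because consecutive chunks live on different disks, all disks can in principle be read in parallel, which is the source of the at-best-linear bandwidth gain. The same layout also shows why losing any one disk makes the whole data set unrecoverable.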
RAID Level 1 utilizes disk mirroring where data is duplicated on an independent disk subsystem. Validation of data amongst the two independent drives is possible if the data is simultaneously accessed on both disks and subsequently compared. This tends to decrease data bandwidth to below even that of a single comparable disk drive. In systems that offer hot swap capability, a failed drive is removed and a replacement drive is inserted. The data on the failed drive is then copied in the background while the entire system continues to operate in a performance degraded but fully operational mode. Once the data rebuild is complete, normal operation resumes. Hence, another problem with RAID systems is the high cost of increased reliability and associated decrease in performance.
RAID Level 5 employs disk data striping and parity error detection to increase both data bandwidth and reliability simultaneously. A minimum of three disk drives is required for this technique. In the event of a single disk drive failure, that drive may be rebuilt from parity and other data encoded on the remaining disk drives. In systems that offer hot swap capability, the failed drive is removed and a replacement drive is inserted. The data on the failed drive is then rebuilt in the background while the entire system continues to operate in a performance degraded but fully operational mode. Once the data rebuild is complete, normal operation resumes.
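The rebuild step described above relies on parity, which in RAID Level 5 is conventionally a bytewise XOR across the blocks of a stripe. The following is a minimal sketch for a single stripe only; the helper names `xor_blocks` and `rebuild` are hypothetical, and the rotation of parity across drives that a real array performs is omitted.

```python
from functools import reduce


def xor_blocks(blocks: list[bytes]) -> bytes:
    """Bytewise XOR of equal-length blocks; used to compute the parity block."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def rebuild(surviving_blocks: list[bytes]) -> bytes:
    """Recover the one missing block of a stripe from all surviving blocks.

    XOR-ing the remaining data blocks with the parity block cancels every
    known block and leaves the missing one.
    """
    return xor_blocks(surviving_blocks)
```

For example, with data blocks `d0` and `d1` and `parity = xor_blocks([d0, d1])`, calling `rebuild([d0, parity])` returns `d1`. This works for exactly one missing block per stripe, which matches the single-drive-failure case in the text.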
Thus another problem with redundant modern mass storage devices is the degradation of data bandwidth when a storage device fails. Additional problems with bandwidth limitations and reliability similarly occur within the art with all other forms of sequential, pseudo-random, and random access mass storage devices. These and other limitations within the current art are addressed by the present invention.
The present invention is directed to data storage controllers employing lossless or lossy data compression and decompression to provide accelerated data storage and retrieval bandwidth. In one aspect, a data storage controller comprises a digital signal processor (DSP) comprising a data compression engine (DCE) for compressing data stored to the data storage device and for decompressing data retrieved from the data storage device; a programmable logic device, wherein the programmable logic device is programmed by the digital signal processor to instantiate a first interface for operatively interfacing the data storage controller to the data storage device and to instantiate a second interface for operatively interfacing the data storage controller to a host; and a non-volatile memory device, for storing logic code associated with the DSP, the first interface and the second interface. The data storage controller further comprises a cache memory device for temporarily storing data that is processed by or transmitted through the data storage controller.
The data storage controller may comprise an expansion bus card that operatively attaches to a host system bus. The data storage controller may be embedded within a motherboard of the host system.
In another aspect, the data storage controller comprises a DMA (direct memory access) controller that controls access of the cache memory device by the DCE, the first interface and the second interface. The DMA controller is implemented by the programmable logic device or the DSP or by both.
In yet another aspect, the data storage controller comprises a bandwidth allocation controller that controls and allocates bandwidth access to the cache memory device by the DCE, the first interface, and the second interface based on either the anticipated or actual compression ratio of the DCE.
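One way to picture allocation by compression ratio is the sketch below. It is not the controller's actual policy, which this section does not specify. It assumes a simple hypothetical model in which, for every unit of uncompressed data that the host interface moves through the cache, the disk interface moves 1/R units of compressed data (R being the compression ratio), and the DCE touches both the uncompressed and the compressed stream. The function name `allocate_cache_bandwidth` and the share keys are illustrative only.

```python
def allocate_cache_bandwidth(compression_ratio: float) -> dict[str, float]:
    """Split cache bandwidth in proportion to each unit's assumed traffic.

    Assumed model (hypothetical): per unit of uncompressed data, the host
    interface moves 1 unit, the disk interface moves 1/R units, and the
    DCE reads one stream and writes the other (1 + 1/R units).
    """
    if compression_ratio <= 0:
        raise ValueError("compression ratio must be positive")
    host = 1.0
    disk = 1.0 / compression_ratio
    dce = host + disk
    total = host + disk + dce
    return {
        "host_interface": host / total,
        "dce": dce / total,
        "disk_interface": disk / total,
    }
```

Under this model a higher anticipated or measured ratio shifts cache bandwidth away from the disk-side interface and toward the host-side interface, since less compressed data needs to cross the disk side.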
In another aspect, the DSP of the data storage controller comprises external Input/Output ports that may be used for transmitting data (compressed or uncompressed) from the data storage to a remote location and for receiving data (compressed or uncompressed) transmitted from a remote location. Data transmitted from a remote system can be processed by the data storage controller on behalf of the remote system and transmitted back to the remote system via the DSP I/O ports. Further, the I/O ports of the DSP may be dynamically reconfigured for programming the programmable logic device during initialization of the data storage controller.
In another aspect, the DSP of the data storage controller comprises a dedicated bus, operatively connected to the programmable logic device, for programming the programmable logic device during initialization of the data storage controller.
In yet another aspect, the DSP of the data storage controller is configured to sense the host system environment upon initialization of the data storage controller and select the program code stored in the non-volatile memory device to instantiate a first and second interface that corresponds to the host system environment.
In another aspect, the data storage controller is programmed to preload boot data into local cache memory in advance of receiving and servicing requests by the host system for the boot data during system boot-up.
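The preloading idea can be sketched as follows. The class `BootCache`, the dictionary standing in for the disk, and the list of boot block addresses are all hypothetical; how the controller determines which blocks constitute boot data is not described in this section and is simply assumed to be known in advance.

```python
class BootCache:
    """Toy model of a controller cache that is filled before the host asks."""

    def __init__(self, disk: dict[int, bytes], boot_blocks: list[int]):
        self.disk = disk                # block address -> block contents
        self.boot_blocks = boot_blocks  # assumed known list of boot block addresses
        self.cache: dict[int, bytes] = {}
        self.hits = 0
        self.misses = 0

    def preload(self) -> None:
        """Copy the boot blocks into cache ahead of any host request."""
        for lba in self.boot_blocks:
            if lba in self.disk:
                self.cache[lba] = self.disk[lba]

    def read(self, lba: int) -> bytes:
        """Serve a host read from cache when possible, else from disk."""
        if lba in self.cache:
            self.hits += 1
            return self.cache[lba]
        self.misses += 1
        return self.disk[lba]
```

If `preload()` runs during controller initialization, host reads of those blocks during boot-up are served from cache memory rather than waiting on the slower storage device.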
The present invention is realized due to recent improvements in processing speed, inclusive of dedicated analog and digital hardware circuits, central processing units (and any hybrid combinations thereof), that, coupled with advanced data compression and decompression algorithms, enable ultra high bandwidth data compression and decompression methods that in turn enable improved data storage and retrieval bandwidth.
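The relationship between compression speed and storage bandwidth can be made concrete with a small sketch. It rests on an assumed first-order model that is not stated in the text: the host-visible rate is the raw device rate multiplied by the compression ratio, capped by the throughput of the compression engine. The function name and parameters are hypothetical, and any numbers passed to it are purely illustrative.

```python
def effective_bandwidth(disk_mb_s: float, compression_ratio: float,
                        engine_mb_s: float) -> float:
    """Estimate host-visible bandwidth in MB/sec under the assumed model.

    disk_mb_s:         raw sustained rate of the storage device
    compression_ratio: uncompressed size divided by compressed size
    engine_mb_s:       rate at which the engine handles uncompressed data
    """
    if disk_mb_s <= 0 or compression_ratio <= 0 or engine_mb_s <= 0:
        raise ValueError("all arguments must be positive")
    # The device moves compressed bytes, so each byte it transfers stands
    # for compression_ratio bytes of host data; the engine caps the result.
    return min(disk_mb_s * compression_ratio, engine_mb_s)
```

The `min` term shows why fast compression hardware matters: if the engine is slower than the device rate times the ratio, the engine itself becomes the bottleneck and the gain from compression is lost.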