1. Field of the Invention
The present invention is generally related to persistent storage devices, and, more specifically, to a system and method for enabling the centralized storage and maintenance of persistent storage device data images.
2. Discussion of the Background
In general, the ability to store and access data is critical for computers. For example, when turned on, a computer (e.g., a personal computer ("PC")) accesses and prepares (or "boots") the operating system from its local persistent storage device (e.g., a "hard disk"). Once the booting is finished, the contents of the hard disk are accessible and available to the user. The contents of the hard disk (also referred to as the hard disk's "disk image" or "data image") define the user's personalized environment: the operating system (such as Windows 98 SR-2, Linux, etc.), the software applications (such as word processors, spreadsheet programs, web browsers, etc.), the data files (such as documents, spreadsheets, images, or cookies), and any additional customization (such as whether a particular web browser, such as Netscape Navigator or Internet Explorer, is automatically launched when an HTML file is accessed).
A hard disk is but one example of a persistent storage device. A persistent storage device can be defined as follows:
(a) it is a physical device that is physically attached to a computer using a standard physical interface (e.g., a hard disk attached with an IDE cable). This physical interface provides the link between the persistent storage device and the computer. (If the persistent storage device under consideration is a hard disk, the physical interface is frequently called a connector and is typically attached to a hardware component on the computer called the disk adapter, which itself provides the logical link between the persistent storage device and the computer);
(b) it contains a local permanent medium (e.g., magnetic media) for storing a sequence of bits (i.e., data), typically organized according to a particular file structure. The bits are collectively called the persistent storage device data image (PSDDI), or data image (DI) for short. When the persistent storage device is a hard disk, the persistent storage device data image will frequently be called a "disk image." Typically, the local permanent medium is capable of storing a large amount of data (e.g., more than 10 Megabytes);
(c) it has the ability to selectively read and write any part of the data image; and
(d) it allows the computer to which the device is attached to selectively read and write any part of the data image through a standard set of interface protocols.
The scope of persistent storage devices includes all hard disk drives implementing interfaces such as ST506/412, ESDI, SCSI, IDE, ATA, ATAPI, ATA-E and EIDE, read/write CD ROM drives, ZIP drives, JAZ drives, floppy drives and the like. In addition, the present invention applies to embedded systems' persistent storage devices, such as Flash and DiskOnChip.
Any two "hardware-similar" PCs having the same data image would appear the same to the user. In contrast, if a user's data image is replaced by a significantly different data image, the user will most likely see an unfamiliar desktop displayed on the PC's display screen. What would be even more disturbing, and likely to make the PC unusable to the user, is the fact that the new data image would have different software and data files from the original data image. Thus, it is the data image that makes a user's PC the user's "Personal Computer," and it is the most valuable and essentially the only irreplaceable component of the PC.
The conventional PC is "governed" by the contents of its hard disk, and therefore the limits of the installed software become the limits of the user. Once the user's needs change, or grow beyond the capabilities of the installed software, the user has to deal with upgrading or installing a new OS or application software, a costly, time consuming, and frequently aggravating process even for a professional. Moreover, in environments such as offices within large companies or firms, this problem is compounded because the hard drive on each individual PC needs to be accessed in order to perform an upgrade. In addition, such upgrades may cause some existing software not to work properly, in effect corrupting the previously stable data image.
There are several computer architecture models that attempt to solve the above problem. These architecture models and their respective disadvantages are described below.
Network Computer: A network computer (NC) is a lightweight computer with a simple built-in operating system. After booting, it connects to a remote computer for file system access. Software programs reside on the remote computer. Once invoked, they are downloaded to the NC where they execute. The applications are typically based on Java or JavaScript. The problems with an NC are that existing applications have to be reengineered for this platform, and an NC has limited capability to perform computing operations when not connected to the network. If the software on the NC is badly corrupted, it may not be able to boot or access the network and therefore the NC will not be functional. Thus well functioning local software is required for operation. Further, NCs have no notion of providing a remote image to a local computer transparently to the operating system executing on the local computer.
Thin Client: The local computer, termed the thin client, is used mainly for display and user input. Applications execute on a server and the thin client opens a window to the server to interact with the applications. For the thin client to work, a continuous connection from the thin client to the server is needed. A thin client is typically running on a standard computer; however, the thin client technology does not provide any means for remotely administering or upgrading the computer's software. In addition, thin client technology requires that the data files (such as Word documents) be manipulated on the server, which requires that they not be encrypted during such manipulation. Also, well functioning local software is required for operation. Thin clients are also operating system specific.
Remote booting and Disk-less computers: Some operating systems, such as Unix, MacOS and Windows 95 allow computers to boot from an image on a remote computer. This feature is typically used for disk-less computers. However, even if the computers have a disk drive or other persistent storage device, it is only used as a swap space (runtime operating system scratch space), and the contents do not persist across boot sessions. Remote booting and diskless computers do not work off line.
Remote File System Technologies: They allow mounting of a remote file system to a local computer (e.g., NFS). Remote file systems can be provided by a remote computer or by a remote network disk. These technologies allow a computer to access data and programs stored on remote server(s). However, system software built into the operating system is required. Remote file technologies do not allow remote administration of the computer. They also require functioning software on the computer. In addition, remote file system technologies do not work off line whilst the present invention does work off line.
Automatic file propagation: Software tools such as Unix's rdist allow files to be synchronized across networked computers; however, such tools are operating system and file system specific, and require a functioning operating system for them to work.
What is desired is a system and/or method that overcomes these and other disadvantages of conventional computers and computer architectures.
The present invention provides a persistent storage device data image management system, or data image management system (DIMS) for short, that is able to solve the above described problems that users encounter when upgrading and/or maintaining their computers.
According to the present invention, the DIMS completely de-couples a persistent storage device data image "seen" by the computer from a persistent storage device attached to the computer (also referred to as the local persistent storage device (LPSD)). The DIMS includes a local data image manager (LDIM), which is required to be installed (either by the manufacturer, the distributor, a user, or a technician) on the user's computer, a remote data image manager (RDIM), and a remote persistent storage device (RPSD). The LDIM communicates with the RDIM through a direct communication link or through a communication network (e.g., a local area network, a wide area network, the Internet, the public switched telephone network, a wireless network, etc.). The RDIM can store data on and retrieve data from the RPSD.
In an environment where an LDIM has been installed in a computer having a "local" persistent storage device (LPSD), the DIMS allows for the storing of the LPSD's data image on the RPSD, with the LPSD serving as a persistent, consistent cache of the data image. The data image stored on the RPSD is referred to as the "master data image" and the data image cached on the LPSD is referred to as the "local data image" or "cached data image." In general, there is no requirement that the LPSD and the RPSD be of the same type. For instance, the LPSD could be a DiskOnChip and the RPSD could be a hard disk. Also, the RPSD may be a specialized smart device rather than being installed in a general purpose computer.
The purpose of the LDIM is to imitate the LPSD. That is, the LDIM, from the computer's perspective, appears exactly like the LPSD. More specifically, the LDIM functions to intercept and process requests that are intended to be received by the LPSD, which may not in fact be installed in the computer. The most common such requests are read/write requests specifying an address (for example, in the case where the LPSD includes a hard disk, the read/write requests specify a sector of the hard disk).
Upon intercepting a read request, which specifies an address, the LDIM is programmed to determine whether the portion of the cached data image that is stored at the specified address is up-to-date (i.e., whether the portion of the cached data image reflects the latest changes made to the master data image). In one embodiment, this feature is implemented by having the LDIM request a "modified-list" from the RDIM each time the LDIM is powered on and (optionally) by having the RDIM provide to the LDIM updates to the modified list whenever a modification to the master data image occurs. The "modified-list" is a list of all the "parts" or "portions" of the master data image that have been modified since the last time the LDIM was informed of modifications to the master data image. (For example, if the master data image is a data image from a hard disk, the list of parts could be a list of the disk's sectors.) Thus, if the LDIM receives a read request specifying an address that is on the modified list, the LDIM will know that the portion of the cached data image stored at the specified address is not up-to-date.
If the LDIM determines that the cached data image has the most up-to-date version of the requested data, then the LDIM (1) retrieves the requested data from the LPSD by issuing a read request to the LPSD and (2) passes the retrieved data back to the component or device from which it received the request. If the cached data image does not have the most up-to-date version of the requested data, then that version is stored on the RPSD (i.e., in the master data image). In this case, the LDIM transmits to the RDIM a read request message, which may include the address specified in the intercepted read request. Upon receiving the read request message, the RDIM locates and reads the requested data from the master data image stored on the RPSD and then transmits the data back to the LDIM.
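The read path described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the claimed implementation: the class names `RemoteImage` and `LocalManager`, the use of dictionaries in place of the RPSD and LPSD, and the step that refreshes the cache after a remote read are all hypothetical choices made for the example.

```python
class RemoteImage:
    """Hypothetical stand-in for the RDIM plus RPSD: holds the master data image."""

    def __init__(self, sectors):
        self.sectors = dict(sectors)   # address -> bytes
        self.modified = set()          # addresses changed since the LDIM was last informed

    def update(self, address, data):
        # A change made to the master data image (e.g., by an administrator).
        self.sectors[address] = data
        self.modified.add(address)

    def take_modified_list(self):
        # Hand over the current modified list and start a fresh one.
        changed, self.modified = self.modified, set()
        return changed

    def read(self, address):
        return self.sectors[address]


class LocalManager:
    """Hypothetical stand-in for the LDIM, with a dict playing the LPSD cache."""

    def __init__(self, remote, cache):
        self.remote = remote
        self.cache = dict(cache)
        # Modified list requested at power-on, as in the embodiment above.
        self.stale = set(remote.take_modified_list())

    def read(self, address):
        if address in self.cache and address not in self.stale:
            return self.cache[address]      # cached copy is up to date
        data = self.remote.read(address)    # fetch from the master data image
        self.cache[address] = data          # assumption: refresh the cached copy
        self.stale.discard(address)
        return data
```

In this sketch an address on the modified list is served from the master data image once and from the cache thereafter, until the RDIM reports it as modified again.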
Upon intercepting a write request, the LDIM may write the data to the LPSD, if there is one, and transmits the data to the RDIM so that the RDIM can update the master data image thereby ensuring that the master data image is up to date. The LDIM may either transmit the data to the RDIM substantially concurrently with writing the data to the LPSD or wait until some later time (e.g., if the computer is not currently connected to the network or if the network is heavily loaded).
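The write path, including the option of deferring transmission, might look like the sketch below. The class name `WriteThroughManager`, the `send` callable, and the `online` flag are hypothetical names introduced for illustration. Keeping only the newest data per address while writes are queued is a design assumption of this example, not something the description requires.

```python
from collections import OrderedDict


class WriteThroughManager:
    """Hypothetical LDIM write path. `send` is an assumed callable that delivers
    one (address, data) pair to the RDIM and returns True on success."""

    def __init__(self, send, local=None):
        self.send = send
        self.local = local              # dict playing the LPSD, or None if absent
        self.pending = OrderedDict()    # address -> latest data not yet sent

    def write(self, address, data, online=True):
        if self.local is not None:
            self.local[address] = data  # write to the LPSD, if there is one
        # Keep only the newest data per address, ordered by most recent write.
        self.pending.pop(address, None)
        self.pending[address] = data
        if online:
            self.flush()

    def flush(self):
        # Send queued writes oldest first; stop at the first failure and retry later.
        while self.pending:
            address, data = next(iter(self.pending.items()))
            if not self.send(address, data):
                return False
            del self.pending[address]
        return True
```

Calling `write(..., online=False)` models a computer that is disconnected or a heavily loaded network; a later `flush()` brings the master data image up to date.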
For requests other than read or write requests, such as PND (Program Non Data request for IDE hard disks), the LDIM returns a response as required by the standard protocol for communicating with the LPSD.
It is contemplated that in some embodiments there will be no LPSD. In this case, there is no cache as described above. Instead, all read/write requests for data that are received by the LDIM are transmitted to the RDIM. In the case of a read request, the RDIM retrieves the requested data and transmits the data back to the LDIM. In this manner, a user of the computer has access to his or her personalized data image even when the computer is not equipped with a local hard disk or other persistent storage device. It is also contemplated that to gain the greatest benefit from the invention the computer in which an LDIM is installed should, as often as is possible, be connected to a network so that the LDIM can communicate with an RDIM.
From this point on, and without limiting the scope of the invention, the invention and its benefits will be described with respect to the particular embodiment where the LPSD is a hard disk. Once the DIMS is in place, and once the appropriate agreements have been set with the organization in charge of managing the master data images on RPSDs, the user need not concern him or herself with the task of upgrading his or her operating systems, application programs, data files, etc. This is because software patches and upgrades can be first performed on the master data image by an experienced system administrator. After the system administrator performs the upgrade on the master data image, the DIMS transparently ensures that these patches and upgrades are propagated to the local hard disk, as described above. Similarly, as described above, the DIMS automatically backs up all data files that are stored on the local hard disk that have been modified. This is accomplished by the LDIM transmitting the modified files (or sectors) to the RDIM so that the RDIM can update the master data image. In this manner, the master data image is kept up to date.
Additionally, the DIMS can cache multiple master data images on the local hard disk. This is advantageous where the computer has more than one user and each user has his or her own personalized data image. The DIMS uses a standard coherent caching algorithm (such as described in the standard textbook: Almasi and Gottlieb, Parallel Computing, 2nd edition, 1994, the entire contents of which are incorporated herein by this reference) and implementation to store the cached data images and maintain their coherency with the corresponding master data images. When the LDIM is unable to communicate with the RDIM, the computer in which it is installed can still operate to the extent that the required software and data are cached on the local hard disk.
Preferably, the DIMS provides this functionality below the operating system and all of its components (including device drivers) and BIOS routines specific to the hardware of the computer. Thus, the DIMS is completely transparent to the operating system and all applications of the computer. This allows the DIMS to be independent of any operating system or software installed on the computer. At the same time, it has the ability to provide the computer with any type of operating system compatible with the computer's hardware, software, and other data.
This enabling technology provides a rich gamut of functionality that completely changes the way computers are utilized. A user can use any "hardware compatible" computer as their "Personal Computer," as the new computer transparently, without the user's intervention, obtains over a network only those parts of the user's data image needed for the current execution, with the other parts following later. For instance, if the user wants to start execution by editing a document and then at a later point in time create and send an e-mail, a word processor will be downloaded before the e-mail program. A user's computer can be replaced by a new one with the ease of "plug and play," retaining all the user's desired previous software, data, and settings. The user is presented with practically unlimited disk space, as the size of the master data image is not constrained by the size of a local disk.
The software and data cached on the local disk provide instantaneously for the normal needs of the user, thus minimizing the network traffic between the location where the master copy is stored and an individual computer. As the user's software does not execute remotely, data files are kept private through encryption, which can be done even on the local hard disk, with the LDIM encrypting and decrypting the data as needed.
The DIMS is easy to integrate into existing platforms and provides, for the first time, a complete automated management and administration capability in LANs and over the Internet, while fully maintaining the rich computer functionality. The DIMS creates the following benefits: increased user satisfaction and productivity, removal of the need for users to spend time on administration or to waste time waiting for somebody to repair the damage they may have caused to their local disk contents, and in general tremendous savings in both the explicit and implicit parts of total cost of ownership.
In today's computers, to access a hard disk, the following sequence of steps is performed:
1. An application wishing to read or write a file issues a request to an operating system API for such action.
2. The operating system checks if the request can be serviced from its file cache, if such cache is maintained.
3. On a miss, or write through, the operating system directs the request to an appropriate device driver for the physical device to which the request was made.
4. Optionally, the device driver may issue the request to the computer's BIOS.
5. The device driver (or BIOS) issues the request to a disk adapter, which is a component typically installed on the motherboard of the PC.
6. The disk adapter uses a physical connection to send the request to the controller of the local hard disk.
It is an object of the present invention to implement the LDIM to intercept the request from the disk adapter so that the controller of the local persistent storage device will not receive it. Alternatively, it is an object of the invention to implement the LDIM to intercept the request from the device driver (or BIOS) so that the disk adapter does not receive it.
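The first interception point named above, between the disk adapter and the controller of the local persistent storage device, can be pictured with the toy model below. All function names and the `seen` trace list are hypothetical; real interception would take place in hardware or firmware, not in Python. The model only shows that the request is answered without ever reaching the controller.

```python
def controller(request, seen):
    # Step 6 target: the controller of the local hard disk.
    seen.append("controller")
    return "data from local disk"


def ldim(request, seen):
    # The LDIM answers in place of the controller (hypothetical behavior).
    seen.append("ldim")
    return "data from data image"


def disk_adapter(request, seen, next_hop):
    # Steps 5 and 6: the adapter forwards the request over its physical connection.
    seen.append("disk adapter")
    return next_hop(request, seen)


# Without the LDIM the request reaches the controller; with it, it does not.
trace_plain, trace_dims = [], []
disk_adapter({"op": "read", "sector": 7}, trace_plain, controller)
disk_adapter({"op": "read", "sector": 7}, trace_dims, ldim)
```

Because the adapter calls whatever sits at the other end of the connection in the same way, neither it nor anything above it needs to change when the LDIM takes the controller's place.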
There are a number of ways in which this interception can be done including system management mode, a PC card, building it with new chips on the motherboard or the persistent storage device, or building the functionality into the adapter chip. The Alternative Embodiments section of this document elaborates on these embodiments.
It is another object of the present invention to allow the DIMS to encrypt writes and decrypt reads using a password, a pass-phrase, or a key supplied by system software or the user. As is commonly done, the key itself could be kept encrypted and only be decrypted after a password or a pass-phrase is supplied. The obvious advantage of encryption is to secure the local data, the remote data, and the data transferred over the network. Furthermore, this functionality is transparent to the user, the operating system, and software applications.
It is another object of the present invention to allow the DIMS to work with any operating system presently available on the market (including DOS, Windows 98/NT/2000/ME, Unix, Novell, OS/2, BeOS) or any future operating system that supports any of the standard persistent storage device interfaces.
It is another object of the present invention to allow the DIMS to work with any standard hard drive interface.
It is another object of the present invention to make the above functionalities available for any existing or future computer hardware that uses any of the current or future persistent storage device interfaces.
It is another object of the present invention to provide a transparent mechanism for upgrading software and hardware drivers by means of pulling changes as needed when the computer is connected to the network. This is as opposed to current operating system specific push mechanisms, such as Unix's rdist.
It is another object of the present invention to provide a mechanism for allowing users to roam from computer to computer and receive their personal operating system, personalization customizations, applications, and data files, no matter what operating system, applications, and data files were previously being used on that computer.
It is another object of the present invention to provide a mechanism allowing the DIMS to transparently present a large amount of storage space (bound only by the addressing size limitations), regardless of the amount of space available on the local physical persistent storage device.
Still other objects and advantages of the invention will in part be obvious and will in part be apparent from the specification.
Further features and advantages of the present invention, as well as the structure and operation of various embodiments of the present invention, are described in detail below with reference to the accompanying drawings.