The present invention relates in general to data backups and, in particular, to a system and method for performing data backups with a stochastic scheduler in a distributed computing environment.
Maintaining complete and accurate data backups is an important part of modern information technology practices. Data backups must be performed on a regular basis to be most effective in ensuring that critical data is not lost. Historically, data backups were performed predominantly using removable forms of storage media, such as streaming tapes, removable drum and disk drives, floppy diskettes, and similar forms of recordable media. Although data backups are still performed using these types of media, contemporary enterprise, workgroup and mobile computing practices present a broader set of needs than can be met by these traditional backup methodologies alone.
A typical workgroup computing environment contains individual client systems interconnected over a local area network or "intranetwork" located in a departmental setting. An enterprise computing environment generally contains individual intranetworks interconnected over a wide area network or "internetwork" located in a company-wide setting. Often, the internetwork includes geographically distributed resources which, when taken as a whole, comprise a unified set of loosely associated computers. The Internet is an example of a widely used public internetwork that can form part of an enterprise computing environment.
Data backups in these environments are generally directed to a centralized server that requests or polls individual clients for data backups on a regularly scheduled basis. However, backup schedules must be coordinated to occur during periods of inactivity for each client without causing client-against-client contention, and differences between individual systems, such as hardware configurations and operating systems, must be addressed.
By comparison, mobile computing environments generally consist of conventional notebook and portable computers, but have increasingly included thin clients (diskless workstations), handheld personal data assistants, and other forms of portable and highly portable computing devices. Unlike non-mobile client systems, devices operating in a mobile computing environment, by definition, often function in isolation from other systems. Consequently, data backups must often be deferred until a connection with a data backup server is established.
Independent of computing environment, successfully performing and managing data backups can be difficult and time-consuming. For instance, contention between competing clients concurrently vying to perform data backups must be reconciled. As well, any client-installed software must be consistently configured and kept up-to-date. Furthermore, in the event of corruption, configuration parameters must be readily recoverable without requiring manual reinstallation at each client.
Several prior art data backup applications attempt to provide solutions to these concerns, including Net Backup, licensed by Veritas, Mountain View, Calif.; Networker, licensed by Legato, Sunnyvale, Calif.; and Quick Backup, licensed by McAfee, Santa Clara, Calif. To varying degrees, these solutions fail to satisfactorily address the data backup needs for enterprise, workgroup and mobile computing environments. For instance, the Net Backup application does not allow for automated backup of Registry files. The Legato and Quick Backup applications are primarily directed to performing data backups in non-mobile computing environments. All of these solutions offer limited customizability, particularly in terms of offering flexible feature sets and in their ability to scale between different environments.
Therefore, there is a need for an approach to performing data backups in enterprise, workgroup and mobile computing environments that avoids conflicts in backup schedules, yet can accommodate the needs of infrequently connected clients. Preferably, such an approach offers a client-based solution that "pushes" data sets to a backup server.
There is a further need for a data backup approach that encapsulates configuration parameters in a monolithic application package, thereby facilitating self-configuration and updates.
The present invention provides a backup session application that includes a stochastic scheduler for generating a random start time. A data set including individual files with tracked file attributes is regularly copied into a backup data set stored on a centralized server. A data backup session commences only upon a successful connection, and failed data backup sessions can be resumed later without duplicating previous file copying. Configuration parameters are encapsulated within the backup session application. Deleted files are maintained on the centralized server until a waiting period has expired.
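By way of illustration only, the selective copying and deferred deletion summarized above might be sketched as follows. The function names, the choice of (modification time, size) as the tracked file attributes, and the numeric time units are assumptions made for this sketch and are not specified by the description.

```python
def files_to_copy(tracked, backed_up):
    """Return the paths that still need copying to the server.

    Both arguments map a file path to its tracked attributes, assumed
    here to be a (modification_time, size) tuple. A file is copied when
    it is absent from the backup data set or its attributes differ, so
    a session resumed after a failure skips files already copied.
    """
    return sorted(p for p, attrs in tracked.items() if backed_up.get(p) != attrs)


def purge_expired(deleted_at, now, waiting_period):
    """Return the deleted paths whose waiting period has expired.

    deleted_at maps a path to the time at which the client deleted it;
    times and the waiting period share one unit (assumed: days).
    """
    return sorted(p for p, t in deleted_at.items() if now - t >= waiting_period)


tracked = {"a.txt": (100, 10), "b.txt": (200, 20), "c.txt": (300, 30)}
# State left on the server by an earlier, interrupted session.
backed_up = {"a.txt": (100, 10), "b.txt": (150, 20)}
pending = files_to_copy(tracked, backed_up)
expired = purge_expired({"old.txt": 0, "new.txt": 25}, now=30, waiting_period=30)
```

In this sketch, an unchanged file that was already copied is skipped on resumption, while a file deleted on the client remains recoverable from the server until its waiting period elapses.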
An embodiment of the present invention is a system and method for performing a data backup with a stochastic scheduler in a distributed computing environment. A data set on a client is tracked. The data set is to be maintained with a backup data set on a centralized server within the distributed computing environment. A time period during which to initiate a data backup session for the tracked data set is selected. An instance of a backup session application on the client is periodically executed. The client attempts to initiate a connection with the centralized server beginning at a random start time during the selected time period. The client regularly reattempts the connection initiation following each failed connection initiation attempt. The tracked data set is selectively copied into the backup data set upon a successful connection initiation. Upon each successful data backup session for the tracked data set, a new random start time within the selected time period for a next data backup session is generated.
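The scheduling behavior described in this embodiment can be illustrated with a minimal sketch. The uniform distribution, the fixed retry interval, the decision to stop retrying when the selected time period ends, and all function and parameter names are assumptions made for illustration; the description itself states only that the start time is random and that failed attempts are regularly retried.

```python
import random
from datetime import datetime, timedelta


def random_start_time(window_start, window_end, rng=random):
    """Pick a start time uniformly at random within the selected period."""
    span = (window_end - window_start).total_seconds()
    return window_start + timedelta(seconds=rng.uniform(0, span))


def run_backup_session(connect, copy_changed_files, start, window_end, retry_interval):
    """Attempt a connection at `start`, retrying at a fixed interval.

    `connect(at)` is a hypothetical callable that returns True when the
    server accepts the connection attempted at time `at`. Time is
    simulated rather than slept so the sketch stays easy to test.
    Returns the time of the successful attempt, or None if the period
    ends first (an assumed stopping rule).
    """
    attempt = start
    while attempt <= window_end:
        if connect(attempt):
            copy_changed_files()  # copying begins only after a successful connection
            return attempt
        attempt += retry_interval
    return None


window_start = datetime(2000, 1, 1, 1, 0)
window_end = datetime(2000, 1, 1, 5, 0)
start = random_start_time(window_start, window_end, random.Random(0))
```

Because each client draws its own start time, clients sharing the same time period are unlikely to contact the server simultaneously. After a successful session, calling `random_start_time` again yields the start time for the next session.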