Patent Application of Liam Scanlan and Cory Bear METHOD FOR EXTRACTING AND STORING HISTORICAL RECORDS OF DATA BACKUP ACTIVITY FROM A PLURALITY OF BACKUP DEVICES
Patent Application of Liam Scanlan and Cory Bear METHOD FOR VISUALIZING DATA BACKUP ACTIVITY FROM A PLURALITY OF BACKUP DEVICES
The present invention is related generally to electronic/software data backup and more particularly to simultaneous and seamless examination of such backup activity performed across a plurality of backup software devices.
No federally sponsored research was involved in the creation of this invention.
No microfiche has been submitted with this patent application.
Most data backup software devices in use today provide for the repeated, regular electronic transfer, over a network, of data from the point at which it is in regular use to a medium, such as magnetic tape, for the purpose of securing a fall-back position should damage occur to the original data.
Such software programs include those that work on relatively small amounts of data, sometimes on a one-computer-to-one-tape-drive basis, and others that work on very large amounts of data, with banks of tape drives that are used to back up data from potentially thousands of computers connected to a network.
Mostly, these data backup software products use what is known as a "client/server" model. In the context of data backup, this means that there is one computer (the "server") that controls and manages the actual data backup activity, and other computers (the "clients") that get backed up by the "server". In this scenario, the data backup tape drives are usually connected directly to the backup "server". There is also usually more than one backup server, each of which is responsible for the backup of data of numerous clients.
A central function of the activity of data backup is the ability to "restore" data in the case of damage to the data that is in use. The backup server computer usually controls this restore process. Understandably, the time it takes to recover data, and the confidence that the data recovery process will succeed, are two critical aspects of the data backup and restore function as a whole.
Disk drive capacities and data volumes, and consequently the volumes of data to be backed up, have historically increased at a greater rate than the backup server speed, tape drive capacity and network bandwidth needed to handle them.
Accordingly, new technologies have been added to help. Such new technologies include fiber-optic cables (for fast data transfer across the network), faster chips, tape drives that handle more tapes, faster tape drives, "Storage Area Networks" and so on.
The activity of data backup has become more and more critical, as the importance of the data has increased. At the advent of the desktop "revolution", that is, when people first started using personal computers (PCs), almost every piece of important data was still stored on one, single computer, possibly a mainframe or a minicomputer. As the numbers and types of computers proliferated, particularly on the desktop, and as the purposes for which these desktops were being used made the data on them increasingly valuable, many different products designed to back up data were created and put into the marketplace. Now, there are some 50 or more data backup products in use by organizations and private individuals.
Generally, but not always, such data backup software devices (products) have a reputation for being difficult to use. When there is an exception to this, the data backup software product often has other, perhaps related, limitations (e.g. the amount of data it can back up is small).
Not all data backup software devices perform the same function. Thus, it is frequently necessary to have two or more different types of data backup software programs in use within the same organization, especially in large organizations. Anecdotally, one company has as many as 17 different data backup software devices in use somewhere in their organization. This is referred to as fragmentation.
In large organizations, it has become necessary to hire expensive expertise to manage such large data backup and restore services. The more varied their data backup devices, the more expensive this becomes. Also, for large organizations, it has become increasingly likely that scheduled data backup activities will fail. Because of the extra complexity of running a variety of data backup software devices, and because of the sheer number of data backup activities that need to take place regularly, failed data backups often go unnoticed in a sea of less-relevant data backup information.
An additional problem is that if identifying a failed data backup takes longer than a certain number of hours, perhaps minutes, it often becomes too late for meaningful corrective action to be taken. As a result, large organizations often take an expensive "best guess" approach. Anecdotally, the level of confidence that large organizations live with regarding data backup success is said to be about 80%. In other words, it is expected that no more than 4 out of 5 data backups will be successful. Almost every large organization will relate experiences where data was lost because they mistakenly believed the data had been backed up.
Also, a problem that is of increasing significance is the fact that there is currently no practicable means of charging 3rd parties for data backup services rendered via most backup products, even though the sharp increase in organizations providing that service for pay is expected to continue.
Accordingly, what is needed is a means for quickly sifting through large numbers of data backup activities, in particular, across the activity of a plurality of data backup software programs, and for providing a uniform view of those data backup activities, regardless of what data backup software product actually performed, or failed to perform, each backup.
Some backup products include reporting functionality that allows the administrative user to view historical records of backup activity. However, as each data backup product uses a notation that is dissimilar from that of other data backup products, it is difficult or impossible to cross-reference or consolidate historical records of backup activity across a plurality of data backup products.
The consolidation of historical records of backup errors across a plurality of backup products is possible, to some extent, by using a general-purpose network management framework, like Computer Associates Unicenter. This type of product is typically designed to use the simple network management protocol (SNMP) to obtain errors from an arbitrary variety of computer programs across a network, including data backup products. However, while general-purpose network management frameworks can consolidate errors, they do not provide a method to obtain historical records of data backup activity across a plurality of data backup products.
Accordingly, what is needed is a method for obtaining, from a variety of different data backup software devices, an historical record of data backup activity suitable for the cross-referencing, consolidation and comparison of this data. An important aspect of this method is that it must include a lingua franca, or common notation, for expressing an historical record of backup activity (and errors), and a convenient method or framework for combining software components that translate data obtained from a plurality of application programming interfaces (APIs) to the common notation. Another important aspect of this method is that it must be extensible, so that it can be made to support additional backup software devices as the need arises by adding new modules, but without requiring modification of the invention. The invention described in this document fulfills these requirements. It can then be used as an important component of software that analyzes backup successes and failures, generates billing reports, and serves other applications.
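As a purely illustrative sketch of what a common notation could look like, the Python below defines a product-neutral record and merges records that originated from different backup products into one time-ordered history. The type name `CanonicalBackupRecord`, its fields, and the helper functions are hypothetical and are not taken from this application.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Iterable, List


@dataclass(frozen=True)
class CanonicalBackupRecord:
    """One backup attempt in a product-neutral notation (all fields hypothetical)."""
    product: str        # backup product that performed the job
    server: str         # backup server that controlled the job
    client: str         # computer whose data was backed up
    started: datetime   # when the backup attempt began
    succeeded: bool     # True if the backup completed successfully
    error_text: str = ""  # error description, empty on success


def consolidate(*sources: Iterable[CanonicalBackupRecord]) -> List[CanonicalBackupRecord]:
    """Merge records from any number of products into one list ordered by start time."""
    merged = [record for source in sources for record in source]
    return sorted(merged, key=lambda record: record.started)


def failures(records: Iterable[CanonicalBackupRecord]) -> List[CanonicalBackupRecord]:
    """Return only the failed backups, regardless of which product ran them."""
    return [record for record in records if not record.succeeded]
```

Because every record shares the same fields, a failed backup from one product can be listed next to a successful backup from another without any product-specific handling.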
In accordance with the present invention, an extensible software component is provided with the ability to interface to a plurality of backup engines for the purpose of obtaining historical records of data backup activity in proprietary notations and translating same to a canonical backup activity log and canonical backup error log. Those aspects of this ability that are entirely specific to a particular backup engine are derived from the use, by the invention, of a backup engine plug-in. Therefore, the interface between the invention and a backup engine plug-in is also described in this document.
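The plug-in arrangement described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the interface defined by the invention: the class names `BackupEnginePlugin`, `PipeDelimitedPlugin` and `Extractor`, the `translate` method, and the pipe-delimited log format are all hypothetical.

```python
from abc import ABC, abstractmethod
from typing import Dict, List, Tuple

Entry = Dict[str, str]  # one canonical log entry (hypothetical shape)


class BackupEnginePlugin(ABC):
    """Hypothetical interface holding everything specific to one backup engine."""
    name: str

    @abstractmethod
    def translate(self, raw_lines: List[str]) -> Tuple[List[Entry], List[Entry]]:
        """Convert proprietary records to (canonical activity log, canonical error log)."""


class PipeDelimitedPlugin(BackupEnginePlugin):
    """Plug-in for an imaginary engine whose log lines look like 'client|status|message'."""
    name = "pipe-engine"

    def translate(self, raw_lines):
        activity, errors = [], []
        for line in raw_lines:
            client, status, message = line.split("|", 2)
            activity.append({"engine": self.name, "client": client, "status": status})
            if status != "OK":
                errors.append({"engine": self.name, "client": client, "message": message})
        return activity, errors


class Extractor:
    """Engine-neutral core: new engines are supported by registering new plug-ins."""

    def __init__(self) -> None:
        self._plugins: Dict[str, BackupEnginePlugin] = {}

    def register(self, plugin: BackupEnginePlugin) -> None:
        self._plugins[plugin.name] = plugin

    def extract(self, engine_name: str, raw_lines: List[str]) -> Tuple[List[Entry], List[Entry]]:
        # Delegate all engine-specific work to the matching plug-in.
        return self._plugins[engine_name].translate(raw_lines)
```

In this sketch, supporting a further backup engine means writing one more `BackupEnginePlugin` subclass and registering it; the `Extractor` class itself is left unmodified.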