The invention relates generally to configuration management systems and methods and, more particularly, to systems and methods for reconstructing prior versions of software configurations created in the context of parallel development.
A portion of the disclosure of this patent document contains material that is subject to copyright protection. The copyright owner has no objection to the facsimile reproduction by anyone of the patent document or the patent disclosure as it appears in the Patent and Trademark Office patent file or records, but otherwise reserves all copyright rights whatsoever. The following notice applies to the software and data as described below and in the drawing hereto: Copyright © 1998, Microsoft Corporation, All Rights Reserved.
Many software and Web site applications are created in the context of parallel development, where multiple individuals create and modify numerous files produced by the development language, authoring tool, or application. These projects require processes for versioning source files and for managing changes to the files that comprise the project. Configuration management systems are commonly used for version control and managing changes to file content.
Some configuration management systems merely store copies of the project files at various times. For example, a configuration management system may store the project files for a product release. These complete content configuration copies are then retained in a database for later access, if needed.
One disadvantage to merely copying configurations is that, in a large-scale project, the configuration copies can take up large amounts of disk space. Consequently, a project administrator typically may store relatively few versions of the project (e.g., only particular releases). In some cases, however, it is desirable to have the configuration management system keep track of each project build, where a build could be made as frequently as daily, although builds could be made more or less frequently, as well. A configuration management system that merely copies the database each night would be highly impractical, given the amount of disk space and copying time that would be needed.
More advanced configuration management systems are capable of reconstructing previous project configurations, rather than simply referring to archived copies of specific project configurations. These systems keep track of changes made to the project files, and determine which versions of those files apply to the desired project configuration.
Prior art configuration management systems that can reconstruct previous project configurations typically operate on the individual file level by maintaining a version graph for each file of the project. When a new version of a file is created, the version graph is updated to reflect the new version.
Prior art FIG. 1 illustrates an example of version graphs for two files, file 1 and file 2, of a multiple file project in accordance with the prior art. As shown in the Figure, the project has been changed across three project configurations: Configuration A, Configuration B, and Configuration C. For ease of illustration, each version of a file is indicated by a circle 101, 102, 103, 104, 105, 106, 107, 151, 152, 153, 154, 155, 156, and 157, and the number within the circle indicates where that particular version fits within the sequence of file versions. Thus, changes were made to file 1 sequentially from 101-107, and circles 101-107 represent the first seven versions of file 1.
The labels “Configuration_” indicate which file versions were created in the context of which project configuration. The version graph 100 for file 1 indicates that three file versions 101-103 include changes associated with Configuration A of the project. File versions 104-106 include changes associated with Configuration B, and file version 107 includes changes associated with Configuration C. Similarly, the version graph 150 for file 2 indicates that four file versions 151-153 and 155 include changes associated with Configuration A. File versions 154 and 156 include changes associated with Configuration B, and file version 157 includes changes associated with Configuration C.
At times, the system may identify certain points along each file's version graph as being part of a specific release. These identifiers are known as labels or configurations. Although multiple versions might have been created in the context of a particular configuration, only those files that are specifically identified as part of a release can later be automatically assembled into a version of the configuration.
Although a version graph may be an effective way to track changes to a particular file, FIG. 1 illustrates that version graphs fail to establish relationships between the files. Some systems do attempt to manage a collection of related files, but they typically do so in an ad-hoc fashion that is complicated to manage and time consuming to implement. Consider a standard file-oriented configuration management system (such as the UNIX utility RCS), which essentially is a collection of tools that operate on individual files, controlling file access and updates and comparing previous versions. To access and modify a group of files controlled by the system, an individual would need to write a special batch file or specify wildcards on the command line. Thus, the process of accessing the file versions for a particular configuration is a manual one, requiring each individual to have an in-depth knowledge of the project file structure.
Another disadvantage to prior art systems is that, although many systems effectively manage content changes to files, many do not effectively manage namespace changes, such as renaming files and moving files from one drive or folder to another. This inability to effectively manage namespace changes occurs because the filename of each file is generally used as a primary identifier in prior art systems. Thus, if a filename is changed or the file is moved from one drive or folder to another, the configuration management system would behave as if the file was deleted, and a new file was added to the system. In general, no historical link would exist between the previous version of the file, and the renamed or moved file. Even in a system where a historical link would exist, it would not typically be treated as a “first class” change, with the same ability to merge, move, and apply the change as if it were a change to the file's content. Thus, unless an individual knows the name or location of the original file, it would not be possible to trace back and find the original file. This inability of prior art systems to manage namespace changes is particularly problematic for Web site development projects, where namespaces are a primary element of the software system.
A few prior art configuration management systems are project-oriented. One such system is the “MICROSOFT”® “VISUAL SOURCESAFE”™ version control system. Using this system, when an individual retrieves, modifies, and checks a file back in, the system records information indicating that a change to the file content may have occurred. This information is stored in a project history file. The system also stores information in the history file each time a file is added, modified, shared, moved or deleted from a project. This historical record can be output as a report, and an individual can use the report to pinpoint bugs or to manually recreate previous versions of the project. In the context of VISUAL SOURCESAFE, however, there is limited support for projects that span more than one folder. In addition, in an ideal world, the process of recreating previous versions of a project would be automatic, and thus transparent to the individual.
Although prior art systems do provide version control for the actual files of a project, one other feature that is lacking from prior art systems is that many of these systems do not adequately version file properties. Thus, they provide no way of rebuilding properties associated with previous versions of files. An individual using a prior art system would need to know what properties applied to what files at what times, as well as the values of those properties. In some cases, this information may be stored by the system, but the individual would have to link the files and their properties manually. This lack of property support has forced prior art systems to express all their internal structure in terms of textual files, which is both inefficient and destroys the end user's ability to query.
In an era where parallel development is extensively used to create complex software applications and Web sites, version control is further complicated. In the context of parallel development, an individual may check out and modify a configuration of the project while the main product configuration also is being modified. The individual may later want to resynchronize his or her configuration with the main product configuration, pulling new changes made to the main product configuration into his or her copy of the configuration. This resynchronization requires the use of differencing and merging techniques, which are implemented separately from prior art configuration management systems. In some cases, information regarding the relationships between files in different configurations is essential to accurately merge these files together. However, many other prior art systems do not maintain and provide this type of information automatically to the differencing and merging processes.
Another deficiency of prior art configuration management systems is that they do not adequately provide access control for multiple file versions spread across multiple configurations. During development of a large multi-file project, it may be desirable to allow some users to have certain access privileges for the files associated with some configurations, but different privileges for the files of other configurations. Current configuration management systems are unable to provide different access privileges for files of different configurations.
Essentially, what is needed is a configuration management system and method that enables an individual to automatically and efficiently recreate any prior project configuration exactly as it was at any time in the past, without consuming undue amounts of memory or disk space, and without being adversely affected by namespace changes. Specifically, what is needed is a configuration management system and method that can automatically trace back to all previous versions of files, even when those files have been moved or renamed. What is also needed is a configuration management system that is able automatically to reconstruct file properties as they were at any previous time. What is further needed is a configuration management system and method that provides access control for multiple file versions associated with various configurations of a project.
A method for providing configuration management for a multiple-file project creates a configuration by assigning a configuration identifier to the configuration.
Historical data is tracked that pertains to changes to files that are associated with the configuration. This is done by storing information that associates the identities of new file versions with the configuration identifier, where the new file versions resulted from changes to the files. The configuration can be reconstructed as of a desired date by determining, from the historical data and the configuration identifier, a set of file versions that comprise the configuration as of the desired date.
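The tracking and reconstruction steps described above can be illustrated with a minimal sketch. It is not the claimed implementation: the record layout, the names `HistoryRecord` and `reconstruct`, the use of a timestamp per record, and the sample data are all assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class HistoryRecord:
    """One entry of historical data (hypothetical layout)."""
    config_id: str     # configuration identifier
    file_id: int       # stable identity of the file
    version: int       # identity of the new file version
    created: datetime  # when the new version was recorded (assumed field)


def reconstruct(history, config_id, as_of):
    """Return {file_id: version} for the configuration as of the given time."""
    result = {}
    for rec in sorted(history, key=lambda r: r.created):
        if rec.config_id == config_id and rec.created <= as_of:
            # A later record for the same file supersedes an earlier one.
            result[rec.file_id] = rec.version
    return result


# Illustrative data only.
history = [
    HistoryRecord("A", 1, 1, datetime(1998, 1, 1)),
    HistoryRecord("A", 2, 1, datetime(1998, 1, 2)),
    HistoryRecord("A", 1, 2, datetime(1998, 1, 5)),
    HistoryRecord("B", 1, 3, datetime(1998, 1, 6)),
    HistoryRecord("A", 1, 4, datetime(1998, 2, 1)),
]
snapshot = reconstruct(history, "A", datetime(1998, 1, 10))
```

In this sketch only the records tagged with the requested configuration identifier and dated on or before the desired date contribute, so no full copy of the project needs to be stored per build.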
When a change has been made to a property of a file that is associated with the configuration, historical information describing the change is stored. The historical information includes a property identifier that identifies the property, a value of the property, and a file identifier that identifies the file.
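As a rough illustration of property versioning, the sketch below stores each property change with the three fields named above (property identifier, value, file identifier). The `changed` timestamp, the class name `PropertyChange`, and the helper `properties_as_of` are assumptions added so that earlier property states can be rebuilt; the source does not specify them.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Any


@dataclass(frozen=True)
class PropertyChange:
    """Historical information for one property change (hypothetical layout)."""
    file_id: int       # identifies the file
    property_id: str   # identifies the property
    value: Any         # value of the property after the change
    changed: datetime  # assumed field: when the change was recorded


def properties_as_of(changes, file_id, as_of):
    """Rebuild the properties of one file as they were at the given time."""
    props = {}
    for ch in sorted(changes, key=lambda c: c.changed):
        if ch.file_id == file_id and ch.changed <= as_of:
            props[ch.property_id] = ch.value  # later changes overwrite earlier ones
    return props


# Illustrative data only.
changes = [
    PropertyChange(7, "author", "kim", datetime(1998, 3, 1)),
    PropertyChange(7, "read_only", False, datetime(1998, 3, 1)),
    PropertyChange(7, "read_only", True, datetime(1998, 4, 1)),
    PropertyChange(8, "author", "lee", datetime(1998, 3, 2)),
]
props_march = properties_as_of(changes, 7, datetime(1998, 3, 15))
props_april = properties_as_of(changes, 7, datetime(1998, 4, 15))
```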
Information describing the relationships between files of various configurations is also stored. When a request is received to incorporate changes from one configuration into another configuration, this information is modified to reflect new relationships between the files of those configurations.
Information is stored describing operations that copy a first version of a file from an originating configuration into a destination configuration. When a request is received to perform a merge operation that will merge, from the destination configuration into the originating configuration, a second version of the file that is a modified version of the first version, a determination is made whether the first version of the file should be included in the merge operation. If the information indicates that the first version is to be included in the merge operation, the first version is included.
To construct a desired configuration of the project as of a desired time, the configuration identifier for the desired configuration is determined. Versions of the multiple files that are to be included in the desired configuration are identified as a set of the versions that are associated with the configuration identifier at the desired time. The set of versions is then assembled.
A determination is made whether a user has access privileges to file versions of the desired configuration by first determining whether a record for the user exists in a security cache. The security cache includes user capabilities information for users who have requested access to files of the project.
If no record exists for the user, the user capabilities information is determined from an access token for the user and security descriptors for the system, wherein the access token and security descriptors are stored in a security table, which is separate from the security cache. A new record is then added to the security cache that includes the user capabilities for the user. From the user capabilities information and from information describing all versions of all files managed by the system, a determination is made whether the user has the access privileges to the file versions of the desired configuration.
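The cache-then-table lookup described above might be sketched as follows. This is a simplification under stated assumptions: an access token is modeled as a set of group names, a security descriptor as a per-configuration mapping from group to rights, and access is checked per configuration rather than per file version. The class names `SecurityTable` and `SecurityCache` and all method names are hypothetical.

```python
class SecurityTable:
    """Holds access tokens and security descriptors (assumed representation)."""

    def __init__(self, tokens, descriptors):
        self.tokens = tokens            # user -> set of group names
        self.descriptors = descriptors  # config_id -> {group: set of rights}
        self.lookups = 0                # counts how often the table is consulted

    def capabilities(self, user):
        """Derive user capabilities from the token and the descriptors."""
        self.lookups += 1
        groups = self.tokens.get(user, set())
        caps = {}
        for config_id, acl in self.descriptors.items():
            rights = set()
            for group in groups:
                rights |= acl.get(group, set())
            caps[config_id] = rights
        return caps


class SecurityCache:
    """Per-user capability records, kept separate from the security table."""

    def __init__(self, table):
        self.table = table
        self.records = {}

    def has_access(self, user, config_id, right):
        if user not in self.records:
            # No record yet: compute capabilities once and add a new record.
            self.records[user] = self.table.capabilities(user)
        return right in self.records[user].get(config_id, set())


# Illustrative data only.
table = SecurityTable(
    tokens={"alice": {"dev"}},
    descriptors={"A": {"dev": {"read", "write"}}, "B": {"dev": {"read"}}},
)
cache = SecurityCache(table)
can_write_a = cache.has_access("alice", "A", "write")
can_write_b = cache.has_access("alice", "B", "write")
```

Here the second check for the same user is answered from the cache, so the security table is consulted only once per user.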
Compressed versions of files within a version store are automatically reconstituted by monitoring a number of requests for a full content version of a file that is stored as a compressed version in the version store. If the number of requests exceeds a threshold, the file is reconstituted to a full content version of the file. The full content version is stored in the version store.
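A minimal sketch of threshold-based reconstitution follows. It assumes `zlib` as the compression scheme and a simple in-memory dictionary as the version store; the class `VersionStore`, its method names, and the default threshold are hypothetical and chosen only for illustration.

```python
import zlib


class VersionStore:
    """Toy version store that reconstitutes frequently requested versions."""

    def __init__(self, threshold=3):
        self.threshold = threshold
        self.blobs = {}     # key -> (is_compressed, bytes)
        self.requests = {}  # key -> full-content requests made while compressed

    def put_compressed(self, key, content):
        self.blobs[key] = (True, zlib.compress(content))
        self.requests[key] = 0

    def is_compressed(self, key):
        return self.blobs[key][0]

    def get(self, key):
        """Return the full content, reconstituting once requests exceed the threshold."""
        compressed, data = self.blobs[key]
        if not compressed:
            return data
        content = zlib.decompress(data)
        self.requests[key] += 1
        if self.requests[key] > self.threshold:
            # Store the full content version so later reads skip decompression.
            self.blobs[key] = (False, content)
        return content


store = VersionStore(threshold=2)
store.put_compressed(("file1", 3), b"hello world " * 50)
first = store.get(("file1", 3))
still_compressed_after_one = store.is_compressed(("file1", 3))
store.get(("file1", 3))
still_compressed_after_two = store.is_compressed(("file1", 3))
store.get(("file1", 3))
compressed_after_three = store.is_compressed(("file1", 3))
```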
Versions of files stored within a version store are automatically compressed by determining whether versions of a file that are earlier than a latest version are stored in a compressed state in the version store. If the versions are not stored in a compressed state, at least one of the versions is compressed and stored in the version store. In one embodiment, compression uses “lossless” compression techniques. In another embodiment, compression is achieved by comparing “deltas” of file changes from previous versions.
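The lossless embodiment could look roughly like the sketch below, which assumes `zlib` and a list of version records ordered oldest first. The function name `compress_older_versions` and the record layout are hypothetical. The text requires only that at least one earlier version be compressed; this sketch compresses every uncompressed earlier version and does not cover the delta-based embodiment.

```python
import zlib


def compress_older_versions(versions):
    """Compress every version earlier than the latest that is still uncompressed.

    `versions` is a list of dicts with keys 'data' (bytes) and 'compressed'
    (bool), ordered oldest first. Returns the number of versions compressed.
    """
    count = 0
    for version in versions[:-1]:  # the latest version is left as full content
        if not version["compressed"]:
            version["data"] = zlib.compress(version["data"])
            version["compressed"] = True
            count += 1
    return count


# Illustrative data only.
versions = [
    {"data": b"a" * 100, "compressed": False},
    {"data": b"b" * 100, "compressed": False},
    {"data": b"c" * 100, "compressed": False},
]
newly_compressed = compress_older_versions(versions)
```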
A computer-readable medium has computer-executable instructions for performing the above.
A configuration management system includes a processing unit, a system bus, and a computer-readable medium. The processing unit and the computer-readable medium are coupled through the system bus. The processing unit creates a configuration by assigning a configuration identifier to the configuration. The unit also tracks historical data pertaining to changes to files that are associated with the configuration by storing information associating the identities of new file versions with the configuration identifier. The unit reconstructs the configuration as of a desired date by determining, from the historical data and the configuration identifier, a set of file versions that comprise the configuration as of the desired date. The computer-readable medium stores the configuration identifier, the historical data, and the set of file versions.