It is common for commercial computer programs to be developed by a team of programmers over a relatively long period of time. During development, the current version of the program source code is generally stored in a code repository from which files or other portions can be extracted for modification and testing. It will frequently arise, in large teams, that code is modified by a person other than the original author.
Source code consists of functional code in a given programming language but also includes non-functional formatting aspects which are essentially a matter of author preference. Different developers have different code formatting preferences, for example on which lines curly braces “{” and “}” should be placed, or how many tabs or spaces should be used for indentation of new lines and how many blank lines should be left between lines of functional code.
Such options may run into the hundreds. Various development platforms, such as Eclipse, from the Eclipse Foundation, or Visual Studio (trademark of Microsoft Corporation), provide auto-formatting tools that can be configured to a user's formatting preferences. Provided the code is syntactically correct, invoking the tool formats the code in the current file according to the configuration specified by the user.
If a programmer opens a file formatted in a different style from their own, edits it and then applies a formatting tool configured differently from that used by the original author, the file can look very different afterwards, because different formatting options have been employed. The original author, or anyone else looking at the change history of the file, e.g. to determine how a particular fix was made or what has changed recently, may then have difficulty following the changes made, which can be a cause of inefficiency and frustration.
Also, using a so-called “diff” tool to highlight differences between successive instances of the same code will highlight not only the functional changes but also the formatting changes, which are likely to be numerous and to obscure the significant functional changes. For this reason and generally, it is good practice to attempt to minimise the number of changes made during development of a large computer program.
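By way of illustration only, the effect described above can be reproduced with a short Python sketch using the standard `difflib` module. The sample function, the names `ORIGINAL`, `EDITED` and `differing_spans`, and the whitespace-based token split are all assumptions made for this example; the token split is a crude stand-in for a formatting-insensitive comparison, not a real parser.

```python
import difflib

# Original file: opening braces on the same line, 4-space indentation.
ORIGINAL = """\
int total(int a, int b) {
    if (a > b) {
        return a - b;
    }
    return a + b;
}
"""

# Edited file: one functional change (a - b became a * b), after which the
# whole file was reformatted with braces on their own lines and 2-space indents.
EDITED = """\
int total(int a, int b)
{
  if (a > b)
  {
    return a * b;
  }
  return a + b;
}
"""


def differing_spans(old_items, new_items):
    """Return (tag, old_slice, new_slice) for every non-equal region."""
    matcher = difflib.SequenceMatcher(None, old_items, new_items, autojunk=False)
    return [
        (tag, old_items[i1:i2], new_items[j1:j2])
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]


# Line-based comparison, as performed by a conventional diff tool.
line_spans = differing_spans(ORIGINAL.splitlines(), EDITED.splitlines())
lines_touched = sum(len(old) + len(new) for _, old, new in line_spans)

# Whitespace-insensitive comparison (hypothetical simplification): compare the
# sequences of whitespace-separated tokens instead of the lines.
token_spans = differing_spans(ORIGINAL.split(), EDITED.split())

print("lines reported as changed:", lines_touched)
print("token-level changes:", token_spans)
```

In this example almost every line is reported as changed by the line-based comparison, whereas the token-based comparison isolates the single functional edit.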
Of course, for a completely newly written program, one solution is to enforce a given set of coding standards so that everyone uses the same formatting settings, thus ensuring consistency of file format across the code base. This is not possible where a long-lived product includes code from previous versions (“legacy code”), which may be written in a number of different programming languages and in a number of formats or styles.
Various approaches to the problems of mixed formats are known in the prior art.
In US Patent Application Publication 2004/0122791 A1 for a “Method and System for Automated Source Code Formatting” (Sea et al., assigned to Hewlett-Packard Company), source code files may be extracted from a store for editing. In the store, the files are held in a standard format. Once extracted for editing, however, they are reformatted to a programmer's preferred format. After editing, they are “re-reformatted” to the standard format and stored back in the repository.
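The general round trip described above (standard format in the store, preferred format while editing) can be sketched as follows. This is a simplified illustration and not the implementation of the cited application: the only formatting option modelled is indentation width, and the names `STANDARD_WIDTH`, `reindent`, `check_out` and `check_in`, as well as the dictionary standing in for the repository, are hypothetical.

```python
STANDARD_WIDTH = 4  # assumed indentation width of the stored, standard format


def reindent(source: str, from_width: int, to_width: int) -> str:
    """Convert leading-space indentation from one width to another."""
    out = []
    for line in source.splitlines():
        stripped = line.lstrip(" ")
        level = (len(line) - len(stripped)) // from_width
        out.append(" " * (level * to_width) + stripped)
    return "\n".join(out) + "\n"


def check_out(repo: dict, name: str, preferred_width: int) -> str:
    """Extract a file and reformat it to the programmer's preferred width."""
    return reindent(repo[name], STANDARD_WIDTH, preferred_width)


def check_in(repo: dict, name: str, text: str, preferred_width: int) -> None:
    """Reformat an edited file back to the standard width and store it."""
    repo[name] = reindent(text, preferred_width, STANDARD_WIDTH)


ORIGINAL = """\
int sign(int x) {
    if (x < 0) {
        return -1;
    }
    return 1;
}
"""

repo = {"sign.c": ORIGINAL}

# The programmer works with 2-space indentation and makes one functional edit.
working = check_out(repo, "sign.c", preferred_width=2)
working = working.replace("return 1;", "return x > 0;")
check_in(repo, "sign.c", working, preferred_width=2)

# The stored file differs from the original only by the functional edit.
print(repo["sign.c"])
```

Because the stored copy is always converted back to the standard width, a comparison of successive stored versions in this sketch shows only the functional edit.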
The prior art also includes a source code formatter, known as “Polystyle” (available on the Internet from polystyle.com), which reformats source code to a selected style. The selected style may be an existing style or may be determined from personal examples of the programmer's code. The style for reformatting is then chosen by the programmer.
None of the above prior art offers an automated solution to the problem of how best to format edited versions of earlier developed code in such a way as to minimise formatting changes and hence to make the edits easier for the original author to identify and review.