1. Field of the Invention
The present invention relates generally to spreadsheet software, and more particularly, to a method and system for detecting and selectively correcting cell reference errors in a spreadsheet.
2. Description of the Background Art
In recent years, spreadsheet programs such as Corel.RTM. Quattro.RTM. Pro have become ubiquitous in the office and are gaining widespread popularity in the home. Spreadsheet programs are computer software applications used for creating and manipulating spreadsheets. In general, spreadsheets are tables of values arranged in rows and columns, which are used for a variety of applications from financial accounting to scientific research. Traditionally, spreadsheets were kept on paper, but maintaining a spreadsheet by hand was time-consuming and prone to human error. With the advent of computer-implemented spreadsheets, however, spreadsheets are easier to maintain, and benefit from the speed and accuracy of electronic recalculation.
In a spreadsheet program, each value is stored in a virtual box called a "cell," which contains a single piece of data. Each cell has a unique name (often called an "address") derived from its row and column in the spreadsheet. Typically, the column is identified by a letter, and the row is identified by a number. For example, the address "C17" refers to the cell at column C and row 17. Ranges of cells may also be defined by identifying a starting and an ending cell, and concatenating the two references with a range symbol. For example, the range of cells comprising C17 through C20 is defined by the expression "C17 . . . C20."
Generally, the data stored in a cell is one of three types--numbers, text, or formulas. A formula defines a relationship between cells, usually in the form of a mathematical function or expression. The output of the formula is a value that is displayed in the spreadsheet, although the actual contents of the cell is the formula itself.
An example of a basic spreadsheet using a formula is one that calculates home mortgage payments. Normally, this spreadsheet includes cells for storing the cost of the home, the down payment, the mortgage rate, the mortgage term, and the monthly payment. The relationship between the cells is defined by a formula, which may be a pre-defined function as in some spreadsheet programs such as Corel.RTM. Quattro.RTM. Pro, or it may be a user-specified formula. After the user inserts values into the cells corresponding to the variables of the function, the spreadsheet automatically calculates and displays the mortgage payment.
Formulas refer to other cells by means of cell addresses, also known as cell references. Cell addresses are typically expressed in one of three ways: absolute, relative, or mixed. In general, the difference between these addressing modes is only relevant when the cell containing the reference is copied or moved. However, the difference is often a source of confusion for the uninitiated, as will be seen below.
In absolute addressing, a reference does not change when the cell containing the reference is copied. An absolute address is normally identified by a "$" symbol that precedes both the row and the column component the address. For example, cell "C20" may contain an absolute reference to cell "$C$17." If the contents of cell "C20" is subsequently copied into cell "C21," the reference is copied, unchanged, into cell "C21," so that both cell "C20" and cell "C21" contain the absolute reference to cell "$C$17."
However, sometimes it is advantageous to allow a reference to change when the cell containing it is copied. For example, a spreadsheet may include a number of columns corresponding to monthly sales figures. At the bottom of the first column, the user may enter a summation formula in order to add the cells in the column. However, instead of retyping the formula for each column, the user may prefer to simply copy the formula to the remaining columns. With absolute addressing, the results would not reflect the user's intent, since each copied formula would reference the cells in the first column rather than referencing the cells in the column corresponding to the location of the copied formula. Thus, another addressing mode, called relative addressing, has been developed to resolve this problem.
In relative addressing, the reference points to a position in the spreadsheet relative to the cell containing the reference, rather than to the absolute address. Internally, the reference is stored in terms of the row and column offset from the cell containing the reference (e.g., three rows up and two columns to the right). Thus, the cell pointed to by a relative reference changes when the containing cell is copied or moved. In most spreadsheet programs, relative addressing is the default mode. For example, if cell "C20" contains a reference to cell "C17," the reference internally points to any cell that is three cells above the one containing the reference. As a result, if cell "C20" is copied into cell "C21," cell "C21" will then refer to cell "C18," and not cell "C17," as was the case in absolute mode.
In mixed addressing, a cell reference contains both absolute and relative address components. For example, an address "$C17" is absolute for the column component, but relative for the row component. Thus, if the cell containing the reference is later copied, the column will not change, whereas the row will change if appropriate. Conversely, the address "C$17" is relative for the column, but absolute for the row. If the cell is later copied, the column will change if appropriate, whereas the row will not.
Many users are not familiar with the different addressing modes that exist in virtually every spreadsheet program. This is particularly true for many home users. As a result, inexperienced users sometimes rely on the default (relative) addressing mode when absolute addressing is required. Such mistakes often result in unexpected results or errors that are difficult for ordinary users to find and correct.
One example of the problem is shown below in connection with FIGS. 3A-B. FIG. 3A illustrates a simple spreadsheet 300 for calculating a salary, which includes a plurality of cells 301 for storing a commission rate, a base salary, and three months of sales results. Because the present description is in the context of cell copying operations, the spreadsheet 300 includes a source cell 302 and a destination cell 304. Referring to FIG. 3A, a user initially entered into the source cell 302 the formula "+B31+B30*B29," which adds the base salary to the product of the commission rate and the January sales results. The formula yields the expected monthly salary of $2,500.
Referring now to FIG. 3B, the user next copied the formula in the source cell 302 into the destination cell 304, instead of retyping the formula. However, the calculated result in the destination cell 304 is incorrect, at least from the standpoint of the user's expectations. Because the original formula the source cell 302 used relative addresses, as is typically the default mode, the cell references were changed when the formula was copied. Thus, the new formula in the destination cell 304 is "+C31+C32*C29." However, cell "C29" is empty, which the spreadsheet program interprets as a zero. Thus, the formula produces an unexpected and incorrect result. The problem lies in the fact that the user assumed that "B29" was an absolute reference, when it was actually relative.
Another example of the problem is shown in FIGS. 4A-B, which illustrate a spreadsheet 300 for calculating a monthly loan payment for a variety of interest rates. Referring first to FIG. 4A, the user entered into the source cell 302 a pre-defined payment function "@PMT(B4,B7/12,B5*12)" in order to calculate the payment at 7.00% interest. Next, in FIG. 4B, the user copied the formula into the destination cell 304, intending to calculate the payment at 7.25% interest. However, as in the preceding example, the relative cell references were changed during the copy such that the formula in the destination cell 304 became "@PMT(C4,C7/12,C5*12)." As before, some input variables of the new function, specifically "C4" and "C5", refer to empty cells. Here, the payment function generated an error ("ERR") because it could not tolerate having a zero for certain input values.
The problem shown in FIG. 3B is particularly insidious because inexperienced users may not detect the error before relying on inaccurate calculations. However, even if the error is obvious, as illustrated in FIG. 4B, an inexperienced user may not know how to solve the problem. What is needed, therefore, is a method and system for detecting cell reference errors caused by copying cells containing relative address references. What is also needed is a method and system for notifying a user as to potential cell reference errors and presenting the user with options for correcting those errors.