This invention relates generally to testing data retrieval systems and, more particularly, relates to testing data retrieval systems using verification codes embedded within a data set.
Testing a data retrieval system involves generating test data and submitting a preselected set of test queries. Present testing schemes verify the accuracy of query results by consulting an “oracle.” An oracle is a reference known to the tester to be correct. If stored electronically, oracle data is formatted so that a test program can access it. Every time an entry in the test data set changes, the tester must update the oracle accordingly. If the oracle is damaged or destroyed, a new one has to be created from the original test data set; otherwise, the test data set must be discarded. If the test data set is large and made up of random characters, as is often the case, creating a new oracle is nearly impossible, since testers won't be able to distinguish good data from bad. Thus, the use of oracles in data retrieval system testing is unwieldy and time-consuming.
The present invention provides an apparatus and method for testing a data retrieval system that eliminates the need for an oracle. Specifically, the invention embeds verification codes into a test data set and interprets the codes after they have been retrieved along with data entries by a data retrieval system. The verification codes are made up of characters that the data retrieval system can process. For example, if the data retrieval system can only process alphanumeric characters, then the codes are made up of alphanumeric characters. The invention also performs a checksum, cyclic redundancy check (CRC), or similar algorithm on retrieved data to verify accuracy.
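By way of illustration only, the embedding of an alphanumeric verification code alongside a data entry may be sketched as follows. The sketch assumes that the code is a CRC-32 of the data value rendered as eight hexadecimal characters, and that it is stored in a field of the same entry so that a query retrieves both together. The function names `make_entry` and `verify_entry` and the field names `data` and `check` are hypothetical and do not appear in the description above.

```python
import zlib


def make_entry(data: str) -> dict:
    """Build a test entry that carries its own verification code.

    Assumption: the code is the CRC-32 of the data, written as eight
    lowercase hexadecimal digits, so it contains only alphanumeric
    characters that the system under test can process.
    """
    crc = zlib.crc32(data.encode("utf-8"))
    return {"data": data, "check": format(crc, "08x")}


def verify_entry(entry: dict) -> bool:
    """Recompute the CRC-32 of the retrieved data and compare it with
    the code retrieved alongside it. No external oracle is consulted."""
    recomputed = format(zlib.crc32(entry["data"].encode("utf-8")), "08x")
    return recomputed == entry["check"]


# Example: an entry survives retrieval intact, then is corrupted.
entry = make_entry("q7Zr1kP0")
intact = verify_entry(entry)
corrupted = verify_entry({"data": "q7Zr1kPX", "check": entry["check"]})
```

In this sketch `intact` is true and `corrupted` is false, because the recomputed value no longer matches the embedded code once the data changes.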
The invention performs up to five levels of verification. These levels correspond to five basic questions asked by testers when gauging the success of a test data query: (1) Did the system retrieve a data set and is it the correct data set? (2) Is this data valid, or is it data that should have been deleted or updated? (3) Is this the kind of data that was supposed to be retrieved? (4) Was the data retrieved from the correct record? (5) Is the data accurate?
The invention need not use all of these verification levels in any particular test. The verification levels may be implemented in combinations, depending on the testing technique used and the desired level of certainty. The invention uses embedded codes for verification levels 1 through 4 and performs a checksum, CRC, or similar algorithm on the actual data values for level 5 verification. Test queries according to the invention are structured so that the data retrieval system retrieves the corresponding codes along with the data.
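One possible way of applying the verification levels in combination is sketched below. It is a minimal illustration under stated assumptions and not the claimed implementation: each retrieved record is assumed to carry one embedded code per level, the validity code is assumed to be "V" for live data, and the expected data set and data type are assumed to be known from the test query itself. The names `Retrieved`, `embed`, and `verify`, and all field names, are hypothetical.

```python
import zlib
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Retrieved:
    record_id: str    # record the retrieval system reports as the source
    data: str         # the actual data value
    set_code: str     # level 1: which test data set the entry belongs to
    valid_code: str   # level 2: "V" if live, anything else if stale
    type_code: str    # level 3: kind of data held in the entry
    record_code: str  # level 4: record the entry was written into
    crc_code: str     # level 5: CRC-32 of the data as 8 hex digits


def crc_hex(data: str) -> str:
    return format(zlib.crc32(data.encode("utf-8")), "08x")


def embed(record_id, data, set_code, type_code, valid=True) -> Retrieved:
    """Create a test entry with all five (assumed) codes embedded."""
    return Retrieved(record_id, data, set_code, "V" if valid else "D",
                     type_code, record_id, crc_hex(data))


def verify(r: Retrieved, expect_set: str, expect_type: str,
           levels=(1, 2, 3, 4, 5)) -> list:
    """Return the selected levels that failed; an empty list means pass."""
    checks = {
        1: r.set_code == expect_set,
        2: r.valid_code == "V",
        3: r.type_code == expect_type,
        4: r.record_code == r.record_id,
        5: crc_hex(r.data) == r.crc_code,
    }
    return [lvl for lvl in levels if not checks[lvl]]


good = embed("R0001", "xk29fjq", "SETA", "NAME")
damaged = replace(good, data="xk29fjX")  # data altered after embedding
all_levels = verify(damaged, "SETA", "NAME")
codes_only = verify(damaged, "SETA", "NAME", levels=(1, 3))
```

Here `all_levels` reports a failure at level 5 only, while `codes_only` reports none, which shows how a test that selects fewer levels trades certainty for speed.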