The present invention relates generally to methods and systems for automatically enabling the traceability of numerical, alphabetical, alphanumeric, character, or string entities such as critical constants or key calculations, equations, functions, and procedures, and more specifically to methods and systems for determining the provenance of such entities within a document, a worksheet, a spreadsheet, a table, a data file, a media file, or a software program.
Individuals engaged in scientific and engineering activities frequently employ critical values or constants derived from the principles of mechanics, electromagnetism, chemistry, and other disciplines to perform key calculations. Scientists and engineers in organizations of all sizes have traditionally performed such calculations using a variety of tools ranging from handheld calculators and spreadsheet programs to sophisticated calculation systems such as MATHCAD™, MATLAB™, and MATHEMATICA™ calculation systems. In such organizations, scientists, engineers, and/or management personnel are typically responsible for keeping track of the many software programs, worksheets, and data files required to drive these calculation tools, and for documenting and auditing the various assumptions, procedures, and results of the calculations performed using these tools. This situation has led to an ever-increasing need for improved techniques and systems for managing audits of critical information in scientific and engineering organizations.
Specifically, the auditing of information used by scientists and engineers typically involves the concept of traceability, i.e., the ability to trace back from a particular critical value or calculation result to the initial assumptions and data underlying that value or result. The traceability of critical information enables scientists and engineers to determine whether a group of calculations employs the same approved value for a given parameter, and to determine which calculations depend either directly or indirectly on that value or parameter. The traceability of critical information also allows scientists and engineers to verify, validate, and authenticate the information, thereby providing such critical information with increased levels of integrity.
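By way of illustration only, the two traceability questions described above (what a given result ultimately rests on, and which calculations depend directly or indirectly on a given value) can be sketched as traversals of a dependency graph. The following Python sketch is a minimal example under stated assumptions: the entity names, the `DEPENDS_ON` mapping, and the functions `trace_back` and `dependents_of` are hypothetical and are not drawn from any particular system.

```python
from collections import deque

# Hypothetical example data: each entity maps to the entities it is computed from.
DEPENDS_ON = {
    "beam_deflection": ["load", "youngs_modulus", "moment_of_inertia"],
    "moment_of_inertia": ["beam_width", "beam_height"],
    "safety_margin": ["beam_deflection", "max_deflection"],
}


def trace_back(entity, depends_on):
    """Return every input that `entity` depends on, directly or indirectly."""
    seen = set()
    queue = deque(depends_on.get(entity, []))
    while queue:
        current = queue.popleft()
        if current in seen:
            continue  # already visited; also guards against cycles
        seen.add(current)
        queue.extend(depends_on.get(current, []))
    return seen


def dependents_of(value, depends_on):
    """Return every calculation that depends on `value`, directly or indirectly."""
    return {e for e in depends_on if value in trace_back(e, depends_on)}


if __name__ == "__main__":
    print(sorted(trace_back("safety_margin", DEPENDS_ON)))
    print(sorted(dependents_of("beam_width", DEPENDS_ON)))
```

In this sketch, tracing back from `safety_margin` reaches the underlying inputs such as `beam_width` and `load`, while `dependents_of("beam_width", ...)` lists the calculations that would be affected if that value changed.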
In recent years, the need to manage the traceability of critical information has become more acute as many scientific and engineering organizations have evolved into highly collaborative environments, in which large numbers of scientists and engineers interact as members of centralized or decentralized project teams. Such environments are typically characterized by increased amounts of data, calculation, and resource sharing among the scientists and engineers, who may be widely dispersed in various geographic locations, and who may rely almost exclusively on public and/or private telecommunications networks to access information and resources and to communicate with management and colleagues. The ability to trace information in such highly collaborative environments is needed to ensure that scientists and engineers can trust the critical information provided to them by the various project teams, and that they understand how that information has been generated, modified, and used throughout the organization.
Conventional information management systems are generally capable of providing information relating to, e.g., the origin and history of data, how certain data analyses are performed, and what analysis results are obtained, by capturing annotations about this information. For example, a conventional information management system typically allows a user to generate annotations for selected subdivisions in a file or database, to convert the annotations to a structured form, and to store the annotations in that form along with connections to the corresponding subdivisions. The user can then access the various file/database subdivisions by providing the system with a query in a natural language or structured format, thereby causing the system to match the query against the stored annotations and to retrieve the file/database subdivisions connected to the matched annotations.
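For illustration, the annotation-based approach described above can be sketched as a small in-memory store. This is a minimal sketch under stated assumptions and not the implementation of any actual product: the `Annotation` and `AnnotationStore` names, the field names, and the subdivision identifiers are hypothetical, and only structured queries are shown (natural-language query handling is omitted).

```python
from dataclasses import dataclass


@dataclass
class Annotation:
    subdivision_id: str  # connection to the annotated file/database subdivision
    fields: dict         # structured form of the user-entered annotation


class AnnotationStore:
    """Hypothetical store that links structured annotations to subdivisions."""

    def __init__(self):
        self._annotations = []

    def annotate(self, subdivision_id, **fields):
        # The user supplies the annotation manually; it is stored in structured form.
        self._annotations.append(Annotation(subdivision_id, dict(fields)))

    def query(self, **criteria):
        """Return ids of subdivisions whose annotation matches every criterion."""
        return [
            a.subdivision_id
            for a in self._annotations
            if all(a.fields.get(k) == v for k, v in criteria.items())
        ]


# Hypothetical usage: two spreadsheet cells annotated by hand.
store = AnnotationStore()
store.annotate("sheet1!B2", origin="handbook", author="jdoe")
store.annotate("sheet1!C7", origin="measurement", author="jdoe")

if __name__ == "__main__":
    print(store.query(origin="handbook"))
    print(store.query(author="jdoe"))
```

Note that every annotation in this sketch is entered by the user; nothing in the store checks it against the underlying data.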
The conventional information management system described above has drawbacks, however, in that it is generally incapable of managing the traceability of information without requiring significant manual effort on the part of the user. Although the conventional information management system provides scientists and engineers with a way of capturing and storing annotations relating to, e.g., the origin and history of data, such annotations must normally be manually entered into the system and are therefore subject to omission, mistake, and falsification. Further, one or more documents containing such data may be incapable of being identified unambiguously, or may be misidentified, e.g., in the event that multiple versions of the document exist. Because user-generated annotations often fail to provide any meaningful verification or validation of data, they are generally incapable of providing scientists and engineers with a high level of confidence in the integrity of their critical information.
It would therefore be desirable to have an improved method and system for managing the traceability of information such as critical constants or key calculations, equations, functions, and procedures in scientific and engineering organizations. Such a method and system would be capable of managing the traceability of critical information automatically, without requiring significant manual effort on the part of a system user. It would also be desirable to have a method and system that automatically updates and propagates data relating to the traceability of critical information as that information is copied, modified, and/or re-used throughout a scientific or engineering organization.