1. Field of the Invention
The present invention relates generally to methods, software and systems for measuring and valuing the quality of information and data, where such measurements and values are made and processed by implementing objectively defined, measurable, comparable and repeatable dimensions using software and complex computers.
2. Description of Related Art
Commercial companies and government agencies purchase, generate, collect and acquire large amounts of data and information, both externally and internally, to run their businesses or governmental functions. This data is used to manufacture an information product designed to improve the outcome of a process, such as the decision to grant credit, the issuance of an insurance policy or the choice of a course of treatment for a patient. Yet the data and information relating to individuals, upon which multi-million dollar business decisions rely, have no data quality dimensions or metrics.
Today, companies and government agencies invest significant amounts in data and information acquisition, generation, collection, aggregation, storage and use. These companies and government agencies make decisions, incur expenses, generate revenue, make policy, engage in activities and regulate, all on the basis of their data and information.
Business managers, decision makers and business intelligence modelers rely on automated systems to sift through oceans of data to find just the right combination of data elements from multiple sources to make decisions. The data elements they extract and use may be wrong or incomplete or, worse yet, the information may be correct but not timely, or may lack sufficient coverage to support sound decisions. Companies which use large amounts of data in their business processes do not presently know the absolute and relative value of their data assets or the economic life of such assets, as measured and scored by the implementation of data metrics. These same companies do not presently know how best to use their data assets.
Further, the present state of the industry for information and data quality consists of processing information for entity resolution (i.e., identifying one individual known by various names), cleansing, deduplication, integration and standardization of information elements. While these functions are appropriate, they do not include any form of data assurance management.
Surveys have revealed that companies consider data quality important to their business and believe that data should be treated as a strategic asset. These same companies, however, rarely have a data optimization or data governance strategy. Further, there are no systems in the marketplace for predicting and systematically verifying data assurance. Instead, data quality software tool vendors typically focus on name deduplication and standardization of addresses to USPS standards. Thus, every piece of this data or information is currently treated with equal weight and value, with no distinction drawn among the data as to quality metrics such as accuracy, relevance, timeliness, completeness, coverage or provenance.
There exists a need for automated systems and methods for measuring and scoring dimensions of data to produce metrics. There also exists a need to understand and evaluate the true value of data in relation to business applications, to maximize its potential and to create and measure data value. These data assurance needs include: (1) the relative, compared contribution of data sources; (2) the absolute contribution of data sources to the data product being created; (3) the score or standardized measure of value of a data source in its application, data class or data use; (4) the optimization of data sources in order of functional use, such as the cascading of data sources in a priority-of-use order to obtain the best sequential use of the group of data sources; and (5) the determination of the intangible asset value of the data investment of a company.
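By way of illustration only, needs (1), (3) and (4) above may be sketched in Python as follows. The dimension names are those recited above. The weighted-average scoring rule, the weights, the source names and all score values are hypothetical assumptions made for this sketch and are not drawn from any described embodiment.

```python
from dataclasses import dataclass

# Dimension names come from the text; everything else here is assumed.
DIMENSIONS = ("accuracy", "relevance", "timeliness",
              "completeness", "coverage", "provenance")


@dataclass(frozen=True)
class DataSource:
    name: str
    scores: dict  # dimension name -> score in [0, 1] (hypothetical values)


def composite_score(source, weights):
    """Standardized score: weighted average of the per-dimension scores.

    The weighted average is an assumed scoring rule chosen for simplicity.
    """
    total_weight = sum(weights[d] for d in DIMENSIONS)
    weighted_sum = sum(weights[d] * source.scores[d] for d in DIMENSIONS)
    return weighted_sum / total_weight


def cascade_order(sources, weights):
    """Sort sources best-first, giving one possible priority-of-use order."""
    return sorted(sources,
                  key=lambda s: composite_score(s, weights),
                  reverse=True)


def relative_contribution(sources, weights):
    """Share of the summed composite score attributable to each source."""
    totals = {s.name: composite_score(s, weights) for s in sources}
    grand_total = sum(totals.values())
    return {name: value / grand_total for name, value in totals.items()}


# Hypothetical example: two sources, equal weighting of all dimensions.
weights = {d: 1.0 for d in DIMENSIONS}
sources = [
    DataSource("source_b", {d: 0.6 for d in DIMENSIONS}),
    DataSource("source_a", {d: 0.9 for d in DIMENSIONS}),
]

ordered = cascade_order(sources, weights)
shares = relative_contribution(sources, weights)
```

In this sketch, `ordered` places the higher-scoring source first, so that a process consuming the cascade would consult it before falling back to the next source, and `shares` expresses each source's score as a fraction of the group total. Other scoring rules or ordering criteria could be substituted without changing the overall structure.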