1. Field of the Invention
The invention relates generally to the field of data processing.
2. Background Information
Currently, thousands upon thousands of software programs are installed on millions of computers, and these programs cannot transfer meaning from one to another. For example, large companies with many branches or subsidiaries often find that the accounting or operating software programs used by one division or subsidiary are not compatible with the software used by other divisions or subsidiaries or with the central corporate programs. This requires substantial conversions of data and often results in a great deal of data reentry, along with the attendant costs and data integrity problems.
Because of the great variety of programs, operating systems and software standards currently used by software developers, there is also a great deal of incompatibility between suppliers and their customers. This likewise requires substantial conversions of data and often results in a great deal of data reentry, with the same costs and data integrity problems. The unstructured and undefined nature of the current computer software environment imposes great burdens and expense on regulatory organizations such as the SEC, FDIC, Federal and State tax authorities, banks, etc., and on the companies reporting to them.
To overcome this problem, many standards organizations have been formed, and more are being formed, to establish defined input/output vocabularies for use with the XML (eXtensible Markup Language) file format. XBRL (eXtensible Business Reporting Language) is one of the XML language formats being developed. It is expected to become a global standard for financial reporting. Throughout this disclosure we will use XBRL as the example of an XML language. This is not intended to limit the invention to XBRL or to XML languages. We find many similarities with the Semantic Web, where Information Labels are used to let computers communicate with other computers, make decisions and take action as a result of the communication. Other standards already exist, and more will be developed, that will benefit from the basic theory of this invention.
Virtually none of the existing software applications can automatically or semi-automatically convert conventional documents or data into outputs tagged with the standardized Information Labels called for by XML or other standards committees. In most cases the standards themselves are still in development. For XML and other data dictionaries or business vocabularies to take root, existing applications and data must be associated or tagged with these standard vocabularies. This harsh reality will long delay the widespread use of these standards, because it will take years for companies to migrate to new software products that are designed to output the appropriate Information Labels. In some cases that may never happen, because it is virtually impossible to replace legacy software systems. For example, retrofitting all the accounting software in current use would be a very complex task that could not be accomplished in the short term.
The recognized practical approach to standardizing the meaning of data is to attach defined Information Labels to the information being conveyed. In this way the meaning of the data can be determined by reviewing the definition of the label. It also means that computers can recognize the “meaning” of the tagged information and act on it based on that meaning. For example, data with the same “tag” can be added or compared without fear of adding or comparing apples and oranges.
Taxonomies and their extensions are used to define the Information Labels. For example, in a financial report, the label <Sales> followed by a numerical value indicates that the numerical value relates to the company's Sales. <Cost of Goods Sold> followed by a numerical value indicates that the value represents the company's Cost of Goods Sold. Since Gross Profit is Sales minus Cost of Goods Sold, computers could access third-party reports that show these values and easily calculate the Gross Profit with a simple rule that says <Sales><minus><Cost of Goods Sold>=<Gross Profit>.
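By way of illustration only, the Gross Profit rule above can be sketched in a few lines of Python. The dictionary stands in for labeled values already extracted from a tagged report. The figures and the function name gross_profit are hypothetical and are not taken from any XBRL taxonomy or report.

```python
from decimal import Decimal

# Hypothetical labeled values, standing in for data extracted from a
# tagged third-party report. The figures are illustrative only.
report = {
    "Sales": Decimal("1000.00"),
    "Cost of Goods Sold": Decimal("600.00"),
}


def gross_profit(labeled_values):
    """Apply the rule <Sales> minus <Cost of Goods Sold> = <Gross Profit>."""
    return labeled_values["Sales"] - labeled_values["Cost of Goods Sold"]


# The result can be stored under its own label, so that later rules
# can refer to <Gross Profit> in the same way.
report["Gross Profit"] = gross_profit(report)
print(report["Gross Profit"])
```

Because the rule refers only to labels, the same function could be applied to any report that carries the same two labels, whatever program produced it.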
Because not all companies use the same terminology, the taxonomies used by standards organizations also include synonyms and alternative phrases that have the same meaning. For example, synonyms for Sales could include “Revenues” or “Fees”. Cost of Goods Sold might be “Cost of Goods” or “Cost of Sales”. The Information Labels can also carry information regarding the organizational authority that defined the label. If the taxonomy were authored by the US Securities & Exchange Commission, the labels based on that taxonomy might be identified as USSEC, and so on.
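As a minimal sketch of how such synonyms might be resolved, the following Python maps a company's own term to a standard label together with the authority that defined it. The table contains only the synonyms mentioned above. The names TAXONOMY, build_lookup and standard_label are hypothetical, and the case-insensitive matching is an assumption made for the example rather than a requirement of any taxonomy.

```python
# Hypothetical taxonomy fragment: each standard label maps to the
# synonyms named in the text above.
TAXONOMY = {
    "Sales": ["Revenues", "Fees"],
    "Cost of Goods Sold": ["Cost of Goods", "Cost of Sales"],
}

# Identifier of the organization that authored the taxonomy.
AUTHORITY = "USSEC"


def build_lookup(taxonomy):
    """Index every standard label and synonym by its case-folded form."""
    lookup = {}
    for standard, synonyms in taxonomy.items():
        for term in [standard, *synonyms]:
            lookup[term.casefold()] = standard
    return lookup


LOOKUP = build_lookup(TAXONOMY)


def standard_label(term, authority=AUTHORITY):
    """Return (authority, standard label) for a term, or None if unknown."""
    standard = LOOKUP.get(term.strip().casefold())
    if standard is None:
        return None
    return (authority, standard)


print(standard_label("Revenues"))
print(standard_label("Cost of Sales"))
print(standard_label("Goodwill"))
```

A term that is absent from the taxonomy returns None, so an unrecognized term is not silently mapped to the wrong label.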
Accordingly, there is a need for methods and mechanisms to accurately and efficiently transform data into XML-compliant, and in particular XBRL-compliant, formats. The transformation would include, for example, adding appropriate labels to the data as defined in relevant XBRL taxonomies. There is also a need for methods and mechanisms to automate entry of XML-compliant and XBRL-compliant data into programs or applications that are not XML or XBRL compliant.
XBRL Essentials, authored by Charles Hoffman and Carolyn Strand, copyright 2001 by XBRL Solutions, Inc., ISBN 0-87051-353-2, is hereby incorporated by reference.