1. Field of the Invention
This invention relates to the automation of product and vendor data entry where the product and vendor data is provided by one or more product suppliers and can potentially be provided in many different formats. In particular, this invention relates to methods and systems to automatically import, analyze, and categorize data from different sources and in many possible different formats, and to output the processed data to on-line business-to-business service providers or to any other recipient with an interest in the cleansed data.
2. Description of Related Art
Computer networks such as the Internet have facilitated the transfer of information among computer users. Business-to-business ("B2B") service providers, for example on-line shopping service providers, have taken advantage of the networking technologies to more efficiently and economically conduct their business transactions. The use of computers to transfer data, however, does not put an end to human intervention in the data transfer process.
Current on-line shopping web sites that offer a variety of products for sale, for example, face the formidable task of having to input and keep an inventory of the data related to the products they sell. Products are supplied by different sources which may also provide the information for the product being supplied.
Although the product data may be provided in electronic form, the on-line shopping service provider may still have to enter the product information into its own databases manually. The reason is that no current data entry system converts product data formatted in an arbitrary manner into a standard format in which the data may be kept as part of the inventory database.
The data format problem is twofold. The first problem concerns the syntax of the data, which may differ according to the data supplier providing the data. A data supplier may, for example, use data transformation or conversion software such as Data Junction or InfoPump, both commercially available, to produce data with a given syntax or format.
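The syntax problem can be illustrated with a minimal Python sketch. It assumes two hypothetical suppliers that deliver the same product record, one as comma-separated text and one as JSON; the field names (sku, name, price), the PARSERS registry, and the load_feed helper are illustrative assumptions and are not taken from the invention or from the commercial tools named above.

```python
import csv
import io
import json


def parse_csv_feed(text):
    """Parse a comma-separated feed with a header row into a list of dicts."""
    return [dict(row) for row in csv.DictReader(io.StringIO(text))]


def parse_json_feed(text):
    """Parse a JSON array of objects into a list of dicts with string values."""
    return [{key: str(value) for key, value in item.items()}
            for item in json.loads(text)]


# Hypothetical registry: one parser per supplier syntax.
PARSERS = {"csv": parse_csv_feed, "json": parse_json_feed}


def load_feed(text, syntax):
    """Convert a supplier feed in a known syntax to a common list-of-dicts form."""
    return PARSERS[syntax](text)


# The same (made-up) product record expressed in two syntaxes.
csv_feed = "sku,name,price\nA100,Laptop,999.00\n"
json_feed = '[{"sku": "A100", "name": "Laptop", "price": "999.00"}]'
```

Under these assumptions both feeds load to the same in-memory record, so any later processing step only has to handle one representation.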
The second problem, which is harder to solve than the first one, concerns the use of different terminology (semantics) by different product data suppliers in order to describe the same product. For example, one product supplier may use the term "IBM" while another may use "International Business Machines" as part of the description of the same product. That is, the descriptions for the same product may vary widely. Like the data syntax problem, this problem is associated with data formatting.
Consequently, there is a need in the art for a system that automates the data entry operation for products supplied by different sources where the data may be found in many different formats. Further, there is a need in the art for a system that maps the different representations of a product into a common set of product information while preserving the original data sent by the different suppliers for use as a reference.
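The terminology problem and the need to preserve the original supplier data can be sketched as follows. This is only an illustration under stated assumptions: the ALIASES table, the NormalizedField type, and the normalize_manufacturer function are hypothetical names introduced here, and a simple lookup table stands in for whatever mapping technique an actual system would use.

```python
from dataclasses import dataclass

# Hypothetical alias table mapping supplier wording to one canonical term.
ALIASES = {
    "ibm": "IBM",
    "international business machines": "IBM",
}


@dataclass(frozen=True)
class NormalizedField:
    original: str   # the value exactly as the supplier sent it
    canonical: str  # the common term used in the product database


def normalize_manufacturer(value):
    """Map a supplier's manufacturer string to a canonical term.

    The original string is kept alongside the canonical one so that it
    remains available as a reference.
    """
    # Lowercase and collapse runs of whitespace before the lookup.
    key = " ".join(value.lower().split())
    # Unknown terms fall back to the supplier's own wording.
    return NormalizedField(original=value, canonical=ALIASES.get(key, value))
```

With this table, normalize_manufacturer("IBM") and normalize_manufacturer("International Business Machines") yield the same canonical term, while each result still carries the wording the supplier used.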
Automated data importation methods and systems are disclosed. Specifically, such methods and systems enable an on-line shopping service provider to import product and vendor data being provided in different formats by different suppliers into a single product database. The on-line service provider acquires product and vendor data from a plurality of suppliers. Each acquired data set of a given type from a given supplier is compared to a product data set of the same type from the same supplier that had previously been acquired and that resides in the product database. The results of the comparison are reviewed as part of a data import preprocessing analysis.
The acquired supplier-specific data set is then converted to a standard data format before being further compared to a previously acquired data set stored in the standard format. The second comparison results in the categorization of data. The categorized data is used by different processes in order to automatically update the product database.
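The comparison of a newly acquired data set against a previously stored one can be sketched in Python as shown below. The category names (new, changed, unchanged, removed), the keying of records by a sku field, and the categorize function are assumptions made for illustration; the text above does not specify which categories the comparison produces.

```python
def categorize(previous, current):
    """Compare two data sets already converted to a standard format.

    Both arguments are dicts mapping a product key (a hypothetical "sku")
    to a record dict. Returns a dict of category name -> list of keys.
    """
    result = {"new": [], "changed": [], "unchanged": [], "removed": []}
    for key, record in current.items():
        if key not in previous:
            result["new"].append(key)        # not seen in the stored set
        elif previous[key] != record:
            result["changed"].append(key)    # seen before, content differs
        else:
            result["unchanged"].append(key)  # identical to the stored record
    # Keys present in the stored set but absent from the new one.
    result["removed"] = [key for key in previous if key not in current]
    return result


# Made-up sample data: the stored set and a newly acquired set.
stored = {
    "A100": {"name": "Laptop", "price": "999.00"},
    "B200": {"name": "Mouse", "price": "19.00"},
    "C300": {"name": "Cable", "price": "5.00"},
}
acquired = {
    "A100": {"name": "Laptop", "price": "949.00"},
    "B200": {"name": "Mouse", "price": "19.00"},
    "D400": {"name": "Monitor", "price": "199.00"},
}
categories = categorize(stored, acquired)
```

Each category could then be handed to a separate update process, for example inserting the new records and flagging the removed ones for review, which is one way to read the statement that the categorized data drives the automatic update of the product database.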
An object of the present invention is to provide methods and systems that enable the entry of data into a database system where the data is provided by different sources in different formats and where the entry takes place in an automated fashion. Further, it is another object of the invention to provide methods and systems that map different representations of a product included in different datasets into a common set of product information while maintaining the original datasets. Further, it is another object of the present invention to provide on-line shopping service providers with the ability to maintain a retail database containing product information that is up-to-date. Still further, it is another object of the present invention to achieve the objects stated above by minimizing human intervention in the importation of data into the retail database.
With these and other objects, advantages and features of the invention that may become hereinafter apparent, the nature of the invention may be more clearly understood by reference to the following detailed description of the invention, the appended claims and to the several drawings attached herein.