Due to the ever-increasing popularity and accessibility of the Internet as a communication medium, the number of business transactions conducted using the Internet is also increasing, as are the numbers of buyers and sellers participating in electronic marketplaces that provide a forum for these transactions. The majority of electronic commerce (“e-commerce”) transactions occur when a buyer determines a need for a product, identifies a seller such as a supplier that provides the product, and accesses the supplier's web site to arrange a purchase of the product. If the buyer does not have a preferred supplier or is purchasing the product for the first time, the buyer will often perform a search for a number of suppliers that offer the product and then access numerous supplier web sites to determine which supplier offers certain desired product features at the best price and under the best terms for the buyer. The matching phase of e-commerce transactions (matching the buyer with a particular supplier) is often inefficient because of the large amount of searching involved in finding a product and because once a particular product is found, the various offerings of that product by different suppliers may not be easily compared.
In general, computer-implemented automatic classification involves using one or more software components to classify product-related content (e.g., product description information) received from buyers or sellers into an appropriate product classification schema. A schema can include a set of product classes (which can be referred to as a “taxonomy”) organized in a hierarchy, with each class being associated with a set of product features, characteristics, or other product attributes (which can be referred to as a “product ontology”). For example, writing pens can have different kinds of tips (e.g., ball point or felt tip), different tip sizes (e.g., fine, medium, or broad), and different ink colors (e.g., blue, black, or red). Accordingly, a schema can include a class corresponding to pens, which has a product ontology including tip type, tip size, ink color, or other appropriate attributes. Within a class, products may be defined by product attribute values (e.g., ball point, medium tip, blue ink). Product attribute values can include numbers, letters, figures, characters, symbols, or other suitable information for describing a product.
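By way of illustration only, the schema described above might be represented as in the following minimal sketch. The `ProductClass` type, its `path` and `validate` methods, and the "office supplies" and "writing instruments" parent classes are hypothetical names introduced for this example; only the pen attributes and their values are taken from the description above.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class ProductClass:
    """One class in the taxonomy, carrying its own product ontology."""
    name: str
    # Attribute name -> permitted values (the "product ontology" of this class).
    ontology: Dict[str, List[str]] = field(default_factory=dict)
    # Parent class in the hierarchy; None for the root.
    parent: Optional["ProductClass"] = None

    def path(self) -> List[str]:
        """Return class names from the root of the hierarchy down to this class."""
        node, names = self, []
        while node is not None:
            names.append(node.name)
            node = node.parent
        return list(reversed(names))

    def validate(self, values: Dict[str, str]) -> List[str]:
        """Return a list of problems; the list is empty if the values fit the ontology."""
        problems = []
        for attr, value in values.items():
            if attr not in self.ontology:
                problems.append(f"unknown attribute: {attr}")
            elif value not in self.ontology[attr]:
                problems.append(f"invalid value for {attr}: {value}")
        return problems


# Hypothetical hierarchy; the two parent classes are assumptions for illustration.
office_supplies = ProductClass("office supplies")
writing_instruments = ProductClass("writing instruments", parent=office_supplies)
pens = ProductClass(
    "pens",
    ontology={
        "tip type": ["ball point", "felt tip"],
        "tip size": ["fine", "medium", "broad"],
        "ink color": ["blue", "black", "red"],
    },
    parent=writing_instruments,
)

# A product within the "pens" class, defined by its attribute values.
product = {"tip type": "ball point", "tip size": "medium", "ink color": "blue"}
print(pens.path())
print(pens.validate(product))
```

In this sketch, each class stores its own attributes rather than inheriting them from its parent; an actual schema could equally well allow attributes to be inherited down the hierarchy.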
Previous techniques for automatically classifying product-related content into schema have relied on pattern-matching methods based on keyword comparisons. The performance of these techniques has typically been adequate for most needs. However, given the ever-increasing number of e-commerce transactions being conducted, and the ever-increasing number of product searches being performed, their performance is increasingly insufficient for certain needs.
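The particular keyword comparisons used by previous techniques are not detailed here. Purely as an assumed illustration, a keyword-based classifier of this general kind might resemble the following sketch, in which the `CLASS_KEYWORDS` table, the `tokenize` helper, and the `classify_by_keywords` function are all hypothetical.

```python
import re
from typing import Dict, Optional, Set

# Hypothetical keyword sets for two product classes (assumed for illustration).
CLASS_KEYWORDS: Dict[str, Set[str]] = {
    "pens": {"pen", "ink", "ballpoint", "felt"},
    "pencils": {"pencil", "graphite", "eraser", "lead"},
}


def tokenize(text: str) -> Set[str]:
    """Lower-case the text and split it into a set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def classify_by_keywords(description: str) -> Optional[str]:
    """Return the class sharing the most keywords with the description, or None."""
    tokens = tokenize(description)
    best_class, best_score = None, 0
    for class_name, keywords in CLASS_KEYWORDS.items():
        score = len(tokens & keywords)  # number of keywords found in the description
        if score > best_score:
            best_class, best_score = class_name, score
    return best_class


print(classify_by_keywords("Blue ink ballpoint pen, medium tip"))
print(classify_by_keywords("Mechanical pencil with eraser"))
print(classify_by_keywords("Stapler, heavy duty"))
```

A classifier of this sketched form assigns a class whenever any keyword overlaps, so a description such as "pen holder" would also be placed in the "pens" class.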