Transaction records often include information related to the identities of one or more parties to a transaction. For example, a transaction record may include the name of a vendor from which a customer purchased a product or service. Such a vendor may be referred to as a transaction counterparty relative to the customer, and may be identified within a transaction record that records the customer's purchase. The transaction record may be generated by the vendor, a financial institution associated with the customer, any combination thereof, or any other entity associated with the transaction. A transaction record may be provided to the customer after the transaction occurs, or to any entity related to managing and/or recording transactions (e.g., on behalf of a customer).
However, transaction records often also include a variety of additional information. Some of the additional information describes aspects of the transaction other than the parties, such as the location of the transaction, the method of payment, the amount of the sale, codes associated with a specific point-of-sale, etc. Furthermore, some of the additional information (e.g., codes related to products, sales, consumers, etc.) may be used by the entity causing the creation of the record (e.g., a vendor, a bank, etc.) for any purpose during record keeping activities.
The additional information included in a given transaction record is often difficult to recognize and/or parse when attempting to discover a counterparty to a transaction using the transaction record. The difficulty often arises from the unpredictable structure of the transaction record, which may change from transaction to transaction, from vendor to vendor, from financial institution to financial institution, etc. The unpredictable and varying structure of most transaction records leads to schemes for transaction counterparty identification that require significant levels of manual intervention, which may render a given scheme tedious and may demand significant amounts of time and effort. If the additional information (i.e., information other than the transaction counterparty) could be removed or reduced when processing transaction records, then transaction counterparty identification would be improved. However, methods and systems for automatically removing such additional information, which may be referred to as noise words or noise ngrams, from transaction records do not currently exist. Thus, it is difficult to use transaction records for a wide variety of purposes that benefit from proper transaction counterparty identification.