1. Field of the Invention
The present invention relates generally to linguistic applications or attribute matching in information retrieval and data processing and, more particularly, to systems or methods for measuring similarities between words or between representations of multiple attributes of products or items, especially attributes related to pharmacological products or items.
2. Related Art
Errors in the administration of medications, such as occur when the wrong drug or the wrong dosage is provided to a patient, represent a serious problem that has been much discussed by health professionals, patient welfare groups, academics, insurers, and others. Various causes for these errors have been identified, including the misunderstanding of physicians' orders due to illegible handwriting, similarity between drug names, confusing pharmaceutical packaging, poor design of devices for administering drugs, and other factors. An overview of some systemic causes of medication errors is provided in M. R. Cohen, “Drug product characteristics that foster drug-use-system errors,” 52 Am. J. Health-Syst Pharm (February 1995) pp. 395–399, hereafter referred to as “the Cohen article,” which is hereby incorporated by reference in its entirety. Another overview of the subject is found in M. R. Cohen (ed.), Medication errors, American Pharmaceutical Association, Washington, D.C. (1999).
A variety of groups and government agencies have programs designed to identify the sources of medication errors and to reduce the likelihood of their occurrence. For example, the American Society of Hospital Pharmacists has issued “ASHP guidelines on preventing medication errors in hospitals,” 50 Am. J. Hosp. Pharm. (1993) pp. 305–314; the U.S. Food and Drug Administration (FDA) has established a Subcommittee on Medication Errors; the National Coordinating Council for Medication Error Reporting and Prevention has information available on the Internet (www.usp.org/standard/9805/9805—08a.htm) and elsewhere; and a medication-error reporting network has been established by the nonprofit Institute for Safe Medication Practices and the Drug Product Problem Reporting Network of the U.S. Pharmacopeia, Inc. (USP).
One class of errors that has been identified and studied by these groups and agencies is related to the use of drug names that sound like, and/or look like, other drug names. Lists of these sound-alike or look-alike drugs have been published, as in N. M. Davis, et al., “Look-alike and sound-alike drug names: the problem and the solution,” 27 Hosp. Pharm. (1992) pp. 95–98, 102–105, 108–110; and N. M. Davis, “Drug names that look and sound alike,” in Hospital Pharmacy, vol. 32, pages 1558–70 (1997). Agencies such as the FDA, the United States Adopted Names Council (USAN), the International Nonproprietary Name (INN) Committee of the World Health Organization, the European Agency for the Evaluation of Medicinal Products (EMEA), and the U.S. Patent and Trademark Office (USPTO), have regulations and programs related to the possibility of confusion among drug names. Also, pharmaceutical companies typically expend significant effort in proposing and perfecting trademarks for new drugs.
Notwithstanding the activities of these, and other, organizations, new drugs continue to be given names that may be confused with those of existing drugs, not infrequently leading to serious or fatal consequences for patients. Confusion between Celebrex® and Celexa® is a recent example, as documented in USP Quality Review, May 1999, no. 66 (U.S. Pharmacopeia, Rockville, Md.). Also, existing look-alike or sound-alike drug names remain on the market. One reason for these continuing problems is the diverse, and sometimes conflicting, goals of the agencies and companies involved in the naming of drugs. For example, pharmaceutical companies seek trademarks based not just on the objective of distinguishing their drugs from the competition, but also on enhancing recognition and recall and creating brand loyalty. The USAN and the INN, although concerned with name confusion, are also interested in ensuring that drug names are useful to health care professionals, i.e., that drug names preferably convey some medical information rather than being merely arbitrary or fanciful. Similarly, the USP has an interest in encouraging the use of drug names that are consistent with the existing compendial nomenclature. In contrast, one element used by the courts and the USPTO to determine the likelihood of confusion between trademarks is the strength of a mark. A mark may be strong, and therefore entitled to broad protection, because it has a relatively remote relationship with the product, such as a mark that is arbitrary or fanciful.
Another reason for the continuing problem of drug name confusion is attributable simply to the large number of drugs available. For example, there are over 15,000 medications sold in the United States alone, and there are over 35,000 names in the U.S. Patent and Trademark Office's database of trademark registrations for pharmaceuticals. Approximately half a million pharmaceutical trademarks are registered in the major industrialized countries. Even agencies, such as the FDA, that are focused squarely on reducing medication errors due to name confusion are hard pressed to anticipate sources of name confusion due to the large number of pairs of proposed and existing names, proposed and proposed names, or existing and existing names.
Moreover, assessment of the likelihood of drug-name confusion often is limited by reliance on the subjective judgment of human experts. For example, the FDA employs panels of experts who are directed to make their evaluations based on guidelines that generally are open to subjective interpretation. The inevitable disagreements that arise result in what social scientists commonly refer to as “poor interrater reliability.” Similarly, practitioners before the USPTO, and the examiners and other officials of that agency, must apply complex guidelines (statutory, regulatory, and judicial) that call ultimately for the application of subjective judgments.
Efforts have been made to systematize the analysis of drug names by human experts. For example, the Cohen article refers to a system used by the pharmaceutical industry “for assessing proposed trademarks for possible medication-error problems.” 52 Am. J. Health-Syst Pharm (February 1995) p. 398. More generally, the same article refers to “a system for ranking pharmaceutical labeling and packaging for error potential.” Id. at p. 399. Both systems appear to be based on the participation of experts who, in accordance with an evaluation protocol, apply conventional social-science rating techniques to some factors that are considered to be relevant to the potential for errors. For example, experts pronounce product names read from handwritten drug orders from physicians, and rank the potential for confusion on a scale of one to ten. Other experts assign point values to each factor. These scores for each factor may then be combined to provide an overall rating between one and ten that is intended to be indicative of the potential for confusion. Id. at p. 398. Although a quantitative rating is thus produced, this approach relies on the subjective judgment of experts. As is evident, the judgment of any person may vary on the same subject from one trial to another, and the judgments of two people may vary on the same subject. Thus, the approach is not deterministic in the sense that a particular input (a handwritten drug order, for example) may produce one output (the quantitative rating) for one trial and the same input may produce another output for another trial. Also, the approach is not “automatic” in the sense that determination of the quantitative rating requires human involvement.
Some computer-implemented techniques have been employed to provide a more objective, and automated, analysis of drug-name similarity. For example, pharmaceutical companies typically screen potential new drug names by computerized searching, apparently based on similarity of spelling and/or sound of the new drug names as compared with existing drug names. Typically, however, regulatory agencies do not require results of these searches to be submitted as part of the evaluation process. Trademark attorneys and commercial trademark-searching firms similarly use computer-based searching techniques, including Internet searching, to assess the likelihood that a registration for a proposed drug name will be granted by the USPTO. The utility to public agencies and the public of these search techniques is limited, however, by the fact that the precise methods by which the searches are conducted generally are not publicly disclosed. Consequently, due to this lack of transparency, and due to uncertainty as to whether the same or similar standards are applied to one or more searches by one or more firms, comparisons of the likelihood of confusion across a wide population of drugs are problematic or impracticable.
Moreover, these automated conventional techniques have various characteristics that limit their efficacy in reducing confusion among drug names. For example, most of these techniques provide only an approximate relative measure. That is, the searching techniques may simply rank reference names in order of similarity to a target name. Thus, with respect to a target drug named “AAA,” reference drug “AA” may be ranked first, reference drug “A” second, reference drug “AB” third, and so on. Moreover, if additional quantitative information is provided, it may be limited to a simple score that is not tied to a benchmark. That is, according to some conventional techniques, one pair of target and reference drugs may have a score of “1.2” and another pair a score of “0.7,” but neither score provides any absolute measure of the likelihood of confusion. Rather, only a relative measure is provided by these techniques. That is, a score of 1.2 may indicate a higher likelihood of confusion, based on the names of the drugs, than a score of 0.7.
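By way of illustration only, the relative ranking described above may be sketched as follows. The sketch is not taken from any conventional search product, whose methods generally are not disclosed. It assumes a generic spelling-similarity measure, here the ratio computed by the SequenceMatcher class of the Python standard library, and the function names similarity and rank_references are hypothetical.

```python
from difflib import SequenceMatcher


def similarity(target: str, reference: str) -> float:
    """Return a spelling-similarity score between 0 and 1.

    Assumption: SequenceMatcher.ratio() stands in for whatever undisclosed
    measure a conventional search tool might apply.
    """
    return SequenceMatcher(None, target.lower(), reference.lower()).ratio()


def rank_references(target: str, references: list[str]) -> list[tuple[str, float]]:
    """Order reference names from most to least similar to the target name."""
    scored = [(ref, similarity(target, ref)) for ref in references]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# The target and reference names are the placeholders used in the text above.
ranking = rank_references("AAA", ["AB", "A", "AA"])
for name, score in ranking:
    print(f"{name}: {score:.2f}")
```

The scores in such a sketch serve only to order the reference names relative to one another. Nothing in the output ties a given score to a benchmark or to an absolute likelihood of confusion, which is the limitation noted above.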