The simple act of naming an entity (e.g., a biological entity) that is part of a large, complex classification or taxonomic system has potentially far-reaching and long-lived consequences. Names, especially those ascribed to organisms, serve as a primary entry point into the scientific, medical, and technical literature and figure prominently in countless laws and regulations governing various aspects of commerce, public safety, and public health. Biological names also serve as a primary entry point into many of the central databases on which the scientific community and the general public rely. While legalistic Codes of Nomenclature exist that govern the formation and assignment of names to proposed taxa, the process of biological classification is not governed by any formal mechanism. Taxonomies represent the scientific opinions of the individuals who create them and may be of varying quality or consistency. Hence, legitimate and valid names may be ascribed to poorly formed taxa, and illegitimate and invalid names may be assigned to well-formed and/or correctly identified taxa. Moreover, biological names are neither unique nor permanent. A single organism can bear multiple names (synonyms) that represent differing taxonomic opinions rendered either in sequence or in parallel. Homonymy also occurs, in which a single name refers to more than one group of organisms of markedly different evolutionary lineages (e.g., bacteria and insects). Orthographic variants may also occur, arising from correction of nomenclatural errors.
This disjunction between nomenclature and taxonomy leads to an accumulation of dubious names in the literature and databases. While experts in taxonomy and biological nomenclature may be able to recognize and correctly interpret such circumstances, few others have the requisite skills to do so, resulting in frequent misapplication of names and misinterpretation of the taxonomic record. In a practical, legal, or regulatory sense, incorrect nomenclature or errors in classification or identification can have significant and unintended consequences. For example, these errors may lead to the addition of biological species to, or their removal from, lists of tightly regulated organisms, such as those appearing on the CDC list of Restricted Select Agents, those governed by the USDA APHIS program, those covered by the Endangered Species Act, or those restricted by packaging and shipping regulations. The use of biological names as a means of information retrieval is therefore not reliable, as these names are neither unique nor persistent.
What is needed is a method of persistently disambiguating the relationship between names and biological taxa, so that information keyed on a given name will be retrievable in the future, across a networked environment, regardless of whether that name is still considered applicable by contemporary standards. Such a method should also retrieve, in a single query, all of the information regarding a given organism or group of organisms bearing multiple synonyms and orthographic variants.
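By way of illustration only, the requirement stated above can be pictured as a two-way index between name strings and persistent taxon identifiers. The following Python sketch is not the disclosed method; the class `NameIndex`, its methods, the identifiers of the form `taxon:1`, and the placeholder names "Aus bus", "Cus bus", and "Dus" are all hypothetical and chosen only to show synonyms, an orthographic variant, and a homonym resolving through one lookup.

```python
from collections import defaultdict


class NameIndex:
    """Hypothetical in-memory index linking names to persistent taxon identifiers."""

    def __init__(self):
        # name string -> set of taxon identifiers (a homonym maps to several)
        self._name_to_taxa = defaultdict(set)
        # taxon identifier -> every name ever applied to that taxon
        self._taxon_to_names = defaultdict(set)
        # name string -> information records keyed on that name
        self._records = defaultdict(list)

    def register(self, name, taxon_id):
        """Record that a name has been applied to a taxon (never deleted)."""
        self._name_to_taxa[name].add(taxon_id)
        self._taxon_to_names[taxon_id].add(name)

    def attach(self, name, record):
        """Attach a piece of information under the name it was published with."""
        self._records[name].append(record)

    def resolve(self, name):
        """Return the identifiers of all taxa to which the name has been applied."""
        return sorted(self._name_to_taxa.get(name, ()))

    def retrieve(self, name):
        """Return, per taxon, the records keyed on any name of that taxon."""
        result = {}
        for taxon_id in self.resolve(name):
            names = sorted(self._taxon_to_names[taxon_id])
            result[taxon_id] = [r for n in names for r in self._records.get(n, [])]
        return result


# Placeholder data: two synonyms and one orthographic variant of a single taxon,
# plus a homonym shared by two unrelated taxa.
index = NameIndex()
index.register("Aus bus", "taxon:1")
index.register("Cus bus", "taxon:1")    # synonym
index.register("Aus buss", "taxon:1")   # orthographic variant
index.register("Dus", "taxon:2")        # homonym, first lineage
index.register("Dus", "taxon:3")        # homonym, second lineage
index.attach("Aus bus", "report A")
index.attach("Cus bus", "report B")

print(index.retrieve("Cus bus"))
print(index.resolve("Dus"))
```

In this sketch a query on any one name returns the records filed under every name of the same taxon, and a homonym resolves to more than one identifier rather than silently to one. Because records are keyed on the name alone, information attached to a homonym would be returned for each taxon bearing it; a fuller design would need to key records on the name together with the taxon.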