The present invention is directed to a system and method for processing signal data for signature detection. More specifically, the system and method are directed to adaptively conformed imaging of work pieces which may be of disparate configuration. In various embodiments and applications, the system and method provide for such adaptively conformed imaging in automated manner to build an electronic database of specimens or other targeted items sufficient not only for quick and efficient indexing for reference purposes, but also for image based identification and taxonomic classification purposes, among others.
In certain illustrative applications, the system and method operate upon work pieces comprising pinned biological specimens such as insects and the like. One class of such specimens includes specimens from man-made collections which are maintained over many years, often being organized and re-organized during the period. While a typical range of configurations is observed, the consistency of these specimens is hardly precise by machine handling standards, and generally tends to vary widely over time, depending on the curation personnel and procedures at hand and the degree/frequency of user access.
Typically, collected specimens in these contexts are pinned work piece assemblies; that is, each is impaled on a pin support, usually along with corresponding labels bearing the specimen's identity and/or other descriptive information. It is a relatively easy task for a human user to intelligently examine such a work piece assembly, identifying which parts of it are of interest, which parts of it constitute support structure, and where (if anywhere) it is labeled. The same task is extremely difficult for an automated machine. In particular, if a human user wishes to measure, image, or otherwise collect data on such a specimen, they must intelligently assess the item, determine how they can orient it to see the subject of interest, and determine how they can orient it to read and record information from any label that may be attached.
As a practical example, consider FIG. 1A which shows examples of insects from storage drawers at a facility, within a typical, large institutional collection, that houses millions of such specimens. Note that although the drawers appear reasonably well organized from a human point of view, they are in fact highly disorganized from a machine processing point of view. Insect specimen pins are not located at regular intervals, wings and support structures lean and overlap, and the individual pinned insect configurations vary widely. Note also that the insect mounting structures vary significantly from one another. In some cases, the insect specimens are impaled on but a single pin; in others, they are impaled on a sub-pin (termed “minutin” in the art) coupled to a main pin. In other cases, the insect specimens are supported on a piece of foam or a paper tab. Generally, labels are situated under the insect specimen, but the format, size and angle of labels vary significantly, as does the writing or print found thereon, which may range, for example, from 17th century handwriting to laser printing. In some cases, multiple stacked labels are provided for the same specimen.
While a human handler can easily recognize where the insect specimen is mounted and where the label is located, it is an extremely challenging task for a machine system to make the same distinctions. There are no suitable measures known in the art for sufficiently automated handling of such specimen work pieces to automatically capture and record high-resolution images of the specimens and their label data. The task of recognizing a target of interest (the specimen) in an arbitrary supporting structure, and determining how to image the target within appropriate limits to preserve visual angles and avoid collisions with the work piece, remains a significant challenge in the art. In addition, the ability to recognize, isolate, and capture a label on a work piece within these limits, and to do so in a manner that optimizes post-facto readability (either by a human or by Optical Character Recognition (OCR)), remains another great challenge in the art.
Potential applications of data recovered from collected specimens such as those shown in FIG. 1A are significant as biological collections are largely untapped sources of valuable data. Each specimen records an anatomical variety of a species captured at a particular space and time. Large digital databases of insect specimens, vast in geographic, historical, and anatomical scope, may help answer previously intractable questions. For example, geo-location forensics is one emerging application in national intelligence and criminal justice. One publicized murder conviction in 2007 rested partly on the identity and geographic origin of insects found on the front grill of a rental car. Similarly, the identity of insects picked from the front grill of a vehicle intercepted at a terrorist facility may provide insight into where the vehicle had been and when, if the specimen can be quickly and accurately identified without other extraneous knowledge as to the insects or their indigenous characteristics. Forensic benefits may be realized also from insects or insect parts recovered from clothing, baggage, individuals, or even explosive devices.
Insects and related arthropods are also known to vector some of the most globally important and deadly diseases. Mosquitoes vector malaria, filariasis, dengue fever, St. Louis encephalitis, West Nile virus, Eastern and Western equine encephalitides, and yellow fever, which collectively result in millions of deaths each year. Ticks, sand flies, assassin bugs, and tsetse flies are but a few of the other well-known disease vectors. Electronic insect specimen databases established with a suitable measure of consistency and uniformity would provide epidemiologists knowledge of current and past distributions of disease vectors, and facilitate the modeling of potential ranges for both vector and disease, given weather disturbances, global climate change, or other likely perturbations to insect vector distributions. Moreover, the ability to detect anomalies may provide early warning of significant threats, whether natural or manmade.
In US agriculture, billions of dollars are lost annually to crop damage or spent on combating pests. Exotic invasives arrive on a regular basis. The soybean aphid, Asian longhorn beetle, Asian tiger mosquito, emerald ash borer, gypsy moth, Argentine fire ant, Japanese beetle, citrus psylla, and medfly are just a few of the many exotic pestiferous insect species causing significant economic damage. Electronic specimen databases may provide readily accessible information as to the likely origins of such exotic pests. This information may then be used to seek biological control agents in the country and locality of origin, and give immediate insight into the pest's biology, aiding the development of control methods.
For these and other reasons, accurately and consistently digitizing specimen records in standardized manner from natural history collections is a critical task. Such data would be crucial for documenting species declines or extirpations, the historical presence of particular species, changes in species distribution, or range expansion of invasive species. Ecological restoration, assessments of biodiversity, conservation, and the documentation and prediction of the effects of global climate change all require these historical and modern insect records.
The aforementioned applications all rely on accurate insect species identification. Presently, professional taxonomists are employed to identify specimens. One of the principal functions of the USDA Systematic Entomology Laboratory, for example, is to identify “urgents,” insects intercepted at US ports of entry whose identity may determine the fate of inspected cargo, which often may be worth millions of dollars. Standardized image capture and processing can support automated means of insect identification, modernizing and relieving systems now reliant on a shrinking and aging taxonomist workforce, and increasing the speed of identification where timely diagnoses are increasingly needed, such as in intelligence operation and port intercept applications.
To enable the advances discussed above, massive insect collections must be digitized; the Smithsonian Institution alone presently houses over 35 million specimens. These specimens are irregularly sized, mounted and labeled to a variety of standards, and housed in tightly packed drawers in compactor-cabinet rooms having a combined footprint that could be measured in acres.
Yet, current digitizing practices rely almost exclusively on human labor, with individuals manually typing in label contents. Typical data entry proceeds at a rate and cost wholly incapable of handling the enormous numbers of specimens. At 5 minutes per specimen for an experienced pair of curators to mount, illuminate, obtain clear photographs at several angles, and manually enter the textual information into a computer database, it would take over 300 years to digitize the existing Smithsonian collection. High-resolution photography often adds upwards of a half hour per specimen.
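The "over 300 years" figure can be checked with simple arithmetic. The following is a minimal sketch in Python, offered for illustration only. It takes the 35 million specimen count and the 5 minutes per specimen stated above, and assumes a single curation station. The function name and the 2,000 working hours per year are assumptions introduced here and are not part of the original disclosure.

```python
# Rough estimate of the time needed to digitize a collection by hand.
# Inputs taken from the text: 35 million specimens, 5 minutes per specimen.
# Assumptions (not from the text): a single station, and either
# round-the-clock operation or a 2,000-hour working year.

SPECIMENS = 35_000_000
MINUTES_PER_SPECIMEN = 5


def years_to_digitize(specimens: int, minutes_each: float, hours_per_year: float) -> float:
    """Return the number of years one station needs to process all specimens."""
    total_hours = specimens * minutes_each / 60
    return total_hours / hours_per_year


# Continuous operation, 24 hours a day, 365 days a year.
continuous_years = years_to_digitize(SPECIMENS, MINUTES_PER_SPECIMEN, 24 * 365)

# A hypothetical 2,000-hour working year (about 40 hours a week for 50 weeks).
working_years = years_to_digitize(SPECIMENS, MINUTES_PER_SPECIMEN, 2000)

print(f"continuous operation: {continuous_years:.0f} years")
print(f"2,000-hour work year: {working_years:.0f} years")
```

Under these assumptions, the stated figure corresponds to a single station operating without interruption. On an ordinary working schedule the estimate is several times larger.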
At present, there is no feasible way to rapidly digitize and catalog these specimens. Even new specimens cannot be photographed and recorded as fast as they arrive. Slide- or flatbed-scanners are sometimes employed in the rare circumstances where smaller, soft-bodied specimens are preserved on microscope slides. Certain known devices have been used for recording and cataloging pinned specimens; however, they record and catalog drawer contents en masse and do not provide for any high-quality individual specimen imagery, nor provide for any label data capture. Nor do such known devices provide any possibility at all of imaging the bulk of specimen surfaces that are simply not visible when packed in a box or drawer.
There is therefore a need for a system and method whereby non-standard, or irregularly configured work pieces such as these pinned specimens may individually be electronically imaged, recorded, and cataloged in standardized manner. There is a need for such system and method by which standardized imaging and recording of the specimen data may be accomplished in automatic yet adaptively conformed manner for disparately configured work pieces. There is a need for such system and method which may carry out the standardized imaging and recording quickly and efficiently.
Insects serve as just one example of specimen collections needing digitization in this regard. Other biological specimen collections have similar significant and important applications that can be enabled by large scale digital capture. Natural collections, archived object collections, and forensic examinations are other areas where massive digital capture would potentially enable watershed changes in the art.
Typical high-throughput robotic handling systems used in manufacturing and other settings rely largely on standardization of the work pieces. Parts handled are machine made or mounted in highly standardized handling units. Adaptive handling is employed for some items of moderate variation (such as fruit, for example), but these pieces are substantially similar in their geometric configuration, and handling is not typically required to be millimeter precise. Thus, there remains a need for adaptive, fine-scale manipulation of parts not inherently designed or labeled for mechanized handling.
A general need arises in any instance where it becomes desirable to adaptively manipulate a class of work pieces with variable configuration, and in particular, where such manipulation is dependent upon an automated identification of specific target features within a composite work piece (such as a specimen, a region of a specimen, a label, etc.) so that the target feature may be precisely imaged or otherwise treated while avoiding interference from the remainder of the work piece.