This invention relates in general to the field of image processing and, more particularly, to a multi-resolution label locator in an automated parcel sorting system.
Automated sorting of parcels is becoming very popular because it reduces labor costs while providing fast and reliable parcel delivery services. However, since parcels rarely have the same size and shape, automated parcel sorting that employs image processing to identify address labels becomes very complicated and may be prone to label reading errors.
To capture an image of an address label of a parcel with sufficient quality for a human operator to read and then to key-in the destination address, a camera must scan the surface of a parcel at a relatively high resolution. A high resolution image results in large parcel images and correspondingly large data storage requirements. One problem in the automatic sorting of parcels is processing high resolution parcel images at a rate equivalent to the output of the mechanical section or conveyor system of the automatic parcel sorting system.
In addition to large image processing time, another problem in high resolution image processing of parcels is locating the destination address label. Even with high resolution images, the human operator must still look up, down, or across a screen displaying the image to identify the location of the destination address label. Such eye scans significantly reduce the efficiency of an automatic parcel sorting system.
Other automated parcel sorting systems have attempted to improve efficiency by eliminating the need for a human operator to read and key-in destination addresses of a label. Such other automated parcel sorting systems include devices that employ fiduciary markings and systems that rely on the leading edge of packages having a known shape.
Automated parcel sorting systems that employ fiduciary marks use optical character recognition (OCR) to ascertain the location and orientation of an object or text affixed to an object. For example, an OCR reader system scans a parcel bearing a fiduciary mark and locates the fiduciary mark. In this manner, a fiduciary mark which is placed in a known relation to the destination address block can be used by the OCR system to locate the position of the destination address block. Similarly, an orientation specific fiduciary mark whose orientation is placed in a known relation to the orientation of the text within a destination address block can be used by an OCR system to ascertain the orientation of the text.
While fiduciary mark systems may improve efficiency, these systems require each parcel receiving site to have identical fiduciary markings so that each OCR system can recognize a particular fiduciary mark. Therefore, such systems generally require preprinted labels or parcels comprising the fiduciary mark and specifying a markable area for placing text. Preprinted labels and preprinted parcels are expensive and some percentage of customers will inevitably fail to use them.
For other systems that do not employ fiduciary marks and preprinted labels, the leading edge of a parcel with a known shape is utilized to determine the orientation and location of text on a parcel. However, similar to the fiduciary mark systems, these systems do not afford flexibility in the size and/or shape of parcels.
Accordingly, there is a need in the art for an automatic parcel sorting system that can readily identify destination address labels within a scanned image of a parcel, regardless of the size and/or shape of the parcel. There is a further need in the art for an automatic parcel sorting system that significantly decreases the amount of time required to process an image or to acquire destination address label data from a scanned image.
The present invention is a multi-resolution label locator that provides a list of one or more areas within a processed image of a parcel that may contain labels of interest. The multi-resolution label locator is typically part of an automatic parcel sorting system.
The automatic parcel sorting system typically includes a video camera mounted adjacent to a conveyor apparatus. The video camera is operatively linked to two video processors, which produce at least two different kinds of image signals of a parcel. The video processors produce a first decimated (low-resolution) image of the parcel and a second image that corresponds to edge-occurrences of indicia expected to appear on a label, such as text.
The two images produced by the video processor identify different characteristics of the original high resolution image. For example, the decimated-image hardware of the video processor may identify areas in the image that have characteristics typical of labels, whereas the edge-occurrence processor may identify areas that have characteristics typical of text.
The two images are fed into a separate microprocessor, which employs a multi-resolution label locator program to identify one or more areas on the parcel that may contain a label of interest. The multi-resolution label locator program then classifies these areas and compiles a list of these candidate areas based on data extracted from the first and second images produced by the video processor.
Generally stated, the invention is a multi-resolution label locator system for an automatic parcel sorting system. The multi-resolution label locator system obtains a video signal containing a plurality of pixels that define an input image of a substrate. The multi-resolution label locator system divides the input image into a plurality of multi-pixel cells. In subsequent computations, the multi-resolution label locator system extracts feature values corresponding to the preprocessed decimated image and edge-occurrence image.
The multi-resolution label locator system then creates the decimated image (low resolution image) corresponding to the input image in order to reduce the amount of data in the subsequent computations. This decimated image is generated by utilizing a common-characteristic value, such as a single pixel, that corresponds to each multi-pixel cell of the input image. Each common-characteristic value represents a decimated image of the pixels within the corresponding cell. For example, if the multi-resolution locator system is designed to locate labels on a package or parcel, then the system will look for large, relatively white contiguous areas (or areas having a different color depending on the operating environment of the present invention) on the package or parcel since labels generally have a different color or reflect light at a different intensity relative to the package or parcel. Those regions of the parcel or package having a higher light intensity or different color value are assigned a decimated-image value and this data is then mapped to an image space to create the decimated image.
With this decimated image, the feature extraction function implemented on the microprocessor can efficiently extract feature parameters of the label candidate areas. Some of the feature parameters may include: normalized dimensions and areas of the label candidates, aspect ratios, and the relative average light intensities of potential label candidate areas derived from the decimated image. These feature parameters become the input data for the classification function (also discussed infra).
While the first video processor of the multi-resolution locator system is generating the decimated image, the second video processor of the multi-resolution label locator system simultaneously creates an edge-occurrence image that corresponds to the input image. The edge-occurrence image includes an edge value that corresponds to each cell of the input image. Each edge value represents the number of occurrences of edges within the pixels of a corresponding cell of the input image. For example, if the multi-resolution locator system is designed to locate address labels on a package or parcel, the locator system will look for closely spaced black and white transitions, since text on address labels has such characteristics. Bar codes also have black and white transitions, but the transitions are aligned in a uniform orientation. On the other hand, transitions within handwritten or typed text on labels tend to have a random orientation. The multi-resolution locator system therefore utilizes these characteristics to distinguish an address label containing text from a bar code label.
After generating the edge-occurrence and decimated images, the multi-resolution label locator system identifies one or more candidate areas within these images that have decimated-image characteristics and edge-occurrence characteristics corresponding to the characteristics of interest. This identification includes further processing of the separate images. Specifically, the multi-resolution label locator program then classifies the candidate areas according to the likelihood of the input image containing indicia having the characteristics of interest. Based on these characteristics, the multi-resolution label locator module then compiles a list of one or more candidate areas that most likely contain indicia having the characteristics of interest.
The multi-resolution label locator system creates the decimated image by computing a histogram of pixel values occurring within each cell of the input image. For example, the common-characteristic value or pixel value may correspond to the approximated color for each pixel. The multi-resolution label locator system then selects from the histogram a mode value corresponding to the pixel value that most frequently occurs within a respective cell of the input image. The multi-resolution label locator system then sets a respective common-characteristic value in the decimated image for the cell to the mode value.
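By way of illustration only, the mode-based decimation described above can be sketched in software as follows. The function name `decimate_by_mode`, the square `cell_size` parameter, and the tie-breaking rule are assumptions made for this sketch and are not taken from the described hardware.

```python
from collections import Counter


def decimate_by_mode(image, cell_size):
    """Return one mode value per cell_size x cell_size cell of a 2-D pixel grid."""
    rows, cols = len(image), len(image[0])
    decimated = []
    for top in range(0, rows, cell_size):
        out_row = []
        for left in range(0, cols, cell_size):
            # Histogram of the pixel values that fall inside this cell.
            histogram = Counter(
                image[r][c]
                for r in range(top, min(top + cell_size, rows))
                for c in range(left, min(left + cell_size, cols))
            )
            # The mode is the most frequent value; ties go to the smaller
            # value (an assumption of this sketch).
            mode_value = min(histogram, key=lambda v: (-histogram[v], v))
            out_row.append(mode_value)
        decimated.append(out_row)
    return decimated


image = [
    [9, 9, 1, 1],
    [9, 2, 1, 3],
    [5, 5, 7, 7],
    [5, 6, 7, 7],
]
decimated = decimate_by_mode(image, 2)
print(decimated)  # [[9, 1], [5, 7]]
```

Each 2 x 2 cell of the example collapses to the single value that occurs most often within it, so a 4 x 4 input yields a 2 x 2 decimated image.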
To identify one or more candidate areas within the decimated image having characteristics corresponding to the expected characteristics of the indicia, the multi-resolution label locator system computes a common-characteristic histogram corresponding to the decimated image. The multi-resolution label locator system then smooths the common-characteristic histogram with both a low-pass filter and an adaptive-moving-window filter.
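As a minimal sketch of this step, the histogram of the decimated image can be computed and then smoothed with a fixed-width moving average, which is one simple kind of low-pass filter. The adaptive-moving-window filter is not specified in this passage and is deliberately omitted; the function names, the `levels` parameter, and the `radius` parameter are assumptions of the sketch.

```python
def common_characteristic_histogram(decimated, levels):
    """Count how many decimated-image elements take each value in 0..levels-1."""
    histogram = [0] * levels
    for row in decimated:
        for value in row:
            histogram[value] += 1
    return histogram


def moving_average(histogram, radius):
    """Smooth a histogram by averaging each bin with its neighbors.

    The window is clipped at both ends of the histogram, so edge bins are
    averaged over fewer neighbors.
    """
    smoothed = []
    for i in range(len(histogram)):
        lo = max(0, i - radius)
        hi = min(len(histogram), i + radius + 1)
        window = histogram[lo:hi]
        smoothed.append(sum(window) / len(window))
    return smoothed


decimated = [[0, 0, 3], [3, 3, 3]]
histogram = common_characteristic_histogram(decimated, levels=4)
smoothed = moving_average(histogram, radius=1)
print(histogram)  # [2, 0, 0, 4]
```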
To separate label candidates from a parcel background, the multi-resolution label locator system selects one or more peak values from the filtered common-characteristic histogram and isolates a peak region around each peak value by identifying upper and lower bounding valley values. The multi-resolution label locator system then creates a segmented image by mapping the pixels within each peak region into a blank image corresponding to the decimated image. Subsequently, the multi-resolution label locator system identifies one or more connected components within the segmented image that correspond to the characteristics of interest. This produces a segmented image in which blobs or candidate areas are circumscribed by a bounding window or box.
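The segmentation steps above might be sketched as follows, under simplifying assumptions: only the single tallest peak is isolated, the valleys are found by walking downhill from the peak until the histogram stops falling, and connected components use 4-connectivity. All function names are hypothetical.

```python
def peak_region(histogram):
    """Return (low, high) bin indices of the valleys bounding the tallest peak."""
    peak = max(range(len(histogram)), key=lambda i: histogram[i])
    low = peak
    while low > 0 and histogram[low - 1] < histogram[low]:
        low -= 1
    high = peak
    while high < len(histogram) - 1 and histogram[high + 1] < histogram[high]:
        high += 1
    return low, high


def segment(decimated, low, high):
    """Map elements whose value lies in the peak region into a blank image."""
    return [[1 if low <= v <= high else 0 for v in row] for row in decimated]


def bounding_boxes(segmented):
    """Return (top, left, bottom, right) for each 4-connected component."""
    rows, cols = len(segmented), len(segmented[0])
    seen = [[False] * cols for _ in range(rows)]
    boxes = []
    for r in range(rows):
        for c in range(cols):
            if not segmented[r][c] or seen[r][c]:
                continue
            # Flood-fill one component while tracking its extent.
            stack = [(r, c)]
            seen[r][c] = True
            top, left, bottom, right = r, c, r, c
            while stack:
                y, x = stack.pop()
                top, bottom = min(top, y), max(bottom, y)
                left, right = min(left, x), max(right, x)
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and segmented[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        stack.append((ny, nx))
            boxes.append((top, left, bottom, right))
    return boxes


filtered_histogram = [1, 5, 2, 0, 3, 9, 4, 1]
decimated = [
    [1, 5, 5, 1],
    [1, 5, 6, 1],
    [1, 1, 1, 1],
    [6, 1, 1, 5],
]
low, high = peak_region(filtered_histogram)
boxes = bounding_boxes(segment(decimated, low, high))
print((low, high), boxes)
```

In the example the peak at bin 5 is bounded by valleys at bins 3 and 7, and the three resulting blobs are each circumscribed by a bounding window.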
For each bounding window, the multi-resolution label locator module computes one or more feature values that can include geometrical characteristics of the bounding window and/or relative average-light-intensity values for cells within the bounding window. Other feature values can include normalized dimensions of the bounding windows, normalized areas for the bounding windows, and aspect ratios for the bounding windows. Typically, these feature values are invariant with respect to the orientation and lighting of the camera. In other words, these feature values do not change if the camera orientation is modified or if background lighting changes. After the feature values are obtained, the multi-resolution label locator module then assembles a feature vector including the bounding window feature values, and the feature values for the area within the bounding window.
To create the edge-occurrence image, a black/white threshold function of the second video processor of the multi-resolution label locator system binarizes the pixel values within each cell of the input image. To binarize pixel values within a cell of the input image, the multi-resolution label locator system applies an adaptive binarizing technique to the pixel values within the cell to select a threshold for binarizing the pixel values based on the identified background pixel values. The multi-resolution label locator system then identifies transitions in expected orientations among the binarized pixel values within each cell. The multi-resolution label locator system then computes a totalized edge-occurrence value for each cell based on transitions within the cell and sets the edge value for each cell to the totalized edge-occurrence value for the pixels within the cell.
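For illustration, per-cell binarization can be sketched with a threshold chosen separately for each cell. The adaptive technique itself is not detailed in this passage, so the sketch substitutes a simple stand-in: the threshold is the midpoint between the darkest and brightest pixel in the cell. The function name `binarize_cell` and that midpoint rule are assumptions.

```python
def binarize_cell(cell):
    """Binarize one cell using a threshold derived from the cell's own pixels.

    Assumption of this sketch: the threshold is the midpoint between the
    minimum and maximum pixel value in the cell.
    """
    flat = [pixel for row in cell for pixel in row]
    threshold = (min(flat) + max(flat)) / 2
    # Pixels brighter than the threshold become 1 (white), others 0 (black).
    return [[1 if pixel > threshold else 0 for pixel in row] for row in cell]


cell = [
    [200, 210, 40],
    [190, 35, 30],
]
binary = binarize_cell(cell)
print(binary)  # [[1, 1, 0], [1, 0, 0]]
```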
The multi-resolution label locator system identifies these transitions in a particular cell by comparing the pixel values within the cell to a plurality of templates that define pixel patterns that are among the characteristics of interest. The multi-resolution label locator system then totalizes transitions in expected orientations among the binarized pixel values within the cell by also defining counters for each orientation. For each template, the multi-resolution label locator system compares instances of each template to non-overlapping, contiguous portions of the cell having the same size as the template such that each pixel of the cell is compared to at least one instance of the template. The multi-resolution label locator system then identifies one or more matching pixel patterns within the cell that correspond to a pixel pattern defined by the template. The multi-resolution label locator system identifies an orientation associated with the pixel pattern and increments one or more of the counters in response to the occurrence of each matching pixel pattern.
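The template comparison can be sketched as below, assuming 2 x 2 templates and only two orientations. The actual template set, template size, and orientation names are not given in this passage, so `TEMPLATES`, `TEMPLATE_SIZE`, and `count_orientations` are hypothetical.

```python
# Hypothetical 2 x 2 templates, grouped by the orientation of the edge
# they represent. A "vertical" edge is a left/right black-white transition.
TEMPLATES = {
    "vertical": [((0, 1), (0, 1)), ((1, 0), (1, 0))],
    "horizontal": [((0, 0), (1, 1)), ((1, 1), (0, 0))],
}
TEMPLATE_SIZE = 2


def count_orientations(cell):
    """Compare templates to non-overlapping, contiguous tiles of a binarized cell.

    Returns one counter per orientation, incremented for each matching tile.
    """
    counters = {name: 0 for name in TEMPLATES}
    rows, cols = len(cell), len(cell[0])
    for top in range(0, rows - TEMPLATE_SIZE + 1, TEMPLATE_SIZE):
        for left in range(0, cols - TEMPLATE_SIZE + 1, TEMPLATE_SIZE):
            tile = tuple(
                tuple(cell[top + r][left + c] for c in range(TEMPLATE_SIZE))
                for r in range(TEMPLATE_SIZE)
            )
            for name, patterns in TEMPLATES.items():
                if tile in patterns:
                    counters[name] += 1
    return counters


binarized_cell = [
    [0, 1, 0, 0],
    [0, 1, 1, 1],
    [1, 0, 1, 1],
    [1, 0, 0, 1],
]
counters = count_orientations(binarized_cell)
print(counters)  # {'vertical': 2, 'horizontal': 1}
```

Because the tiles do not overlap and together cover the cell, every pixel is compared against each template exactly once.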
To compute the totalized edge-occurrence value for each cell based on the transitions and their respective counter values, the multi-resolution label locator system applies a totalization formula that filters the counter values to increment the totalized edge-occurrence value in response to random orientations that indicate the presence of text within the cell. With this totalization formula, the multi-resolution label locator system avoids incrementing the totalized edge-occurrence value in response to uniformly oriented or parallel transitions that indicate the presence of a barcode within the cell. This allows the multi-resolution label locator system to eliminate candidate areas within the input image that correspond to barcode labels that do not contain text and hence destination address information.
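The totalization formula itself is not stated in this passage. Purely as a hypothetical stand-in that shows the intended effect, the sketch below scores a cell by its least-populated orientation counter, so that transitions spread across orientations (as in text) score highly while transitions confined to one orientation (as in a barcode) score zero. The function name and the formula are assumptions.

```python
def totalized_edge_value(counters):
    """Hypothetical totalization: reward transitions present in every orientation.

    Balanced counters give a high value; if any orientation has a count of
    zero, as with parallel barcode bars, the result is zero.
    """
    return min(counters.values()) * len(counters)


text_like = {"horizontal": 3, "vertical": 4}
barcode_like = {"horizontal": 0, "vertical": 8}
text_score = totalized_edge_value(text_like)
barcode_score = totalized_edge_value(barcode_like)
print(text_score, barcode_score)  # 6 0
```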
The multi-resolution label locator system may compute many different feature values for each bounding window. One feature value includes a normalized height representing a ratio of a height defined by the bounding window to a height defined by the segmented image. Another bounding window feature value includes a normalized width representing a ratio of a width defined by the bounding window to a width defined by the segmented image. An additional bounding window feature value includes a normalized area representing the ratio of an area defined by the bounding window to an area defined by the segmented image. Another bounding window feature value includes an aspect ratio representing a ratio of the width defined by the bounding window to the height defined by the bounding window.
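These four ratios can be written out directly. In the sketch below, the `(top, left, bottom, right)` box convention with inclusive coordinates and the dictionary keys are assumptions chosen for illustration.

```python
def bounding_window_features(box, image_height, image_width):
    """Compute the normalized height, width, area, and aspect ratio of a box.

    box is (top, left, bottom, right) with inclusive cell coordinates.
    """
    top, left, bottom, right = box
    height = bottom - top + 1
    width = right - left + 1
    return {
        "normalized_height": height / image_height,
        "normalized_width": width / image_width,
        "normalized_area": (height * width) / (image_height * image_width),
        "aspect_ratio": width / height,
    }


features = bounding_window_features((2, 4, 5, 11), image_height=16, image_width=32)
print(features)
```

For a window 4 cells high and 8 cells wide inside a 16 x 32 segmented image, the normalized height and width are both 0.25, the normalized area is 0.0625, and the aspect ratio is 2.0.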
In addition to the bounding window feature values, the multi-resolution label locator system can compute many different feature values that correspond to the average light intensity for cells within the bounding window. The multi-resolution label locator system may compute a feature value based upon a normalized edge-occurrence intensity representing a ratio of the sum of edge-occurrence values for cells within the bounding window to a total number of cells within the bounding window. The multi-resolution label locator system may also compute a feature value based upon a normalized edge-occurrence intensity representing a ratio of the sum of the totalized edge-occurrence values for cells within the bounding window to an area defined by the bounding window. To remove noise when computing the normalized edge-occurrence intensity (the transition intensity for the preferred embodiment), the multi-resolution label locator system zeroes totalized transition values for cells within the bounding window below a predefined threshold value.
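A minimal sketch of the normalized edge-occurrence intensity, including the noise-suppression step, is given below. The function name, the inclusive box convention, and the `noise_threshold` parameter name are assumptions; the division by the number of cells follows the first ratio described above.

```python
def normalized_edge_intensity(edge_image, box, noise_threshold):
    """Average the edge-occurrence values over the cells of a bounding window.

    Values below noise_threshold are treated as zero before summing.
    box is (top, left, bottom, right) with inclusive cell coordinates.
    """
    top, left, bottom, right = box
    total = 0
    cell_count = 0
    for r in range(top, bottom + 1):
        for c in range(left, right + 1):
            value = edge_image[r][c]
            if value >= noise_threshold:
                total += value
            cell_count += 1
    return total / cell_count


edge_image = [
    [0, 1, 9],
    [8, 2, 7],
    [0, 0, 1],
]
intensity = normalized_edge_intensity(edge_image, (0, 1, 1, 2), noise_threshold=3)
print(intensity)  # 4.0
```

In the example the window covers the values 1, 9, 2, and 7; the two values below the threshold are zeroed, leaving 16 spread over 4 cells.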
Based upon the feature value characteristics, the multi-resolution label locator system may pre-classify candidate areas by applying threshold values that are typical of the characteristics of interest. For example, if the multi-resolution locator is designed to find destination address labels on a package or parcel, the multi-resolution locator can eliminate candidate areas based upon a size of the area since labels typically have a minimum and maximum size. The multi-resolution label locator system can then eliminate one or more candidate areas having a corresponding bounding window defining an area below a predefined minimum threshold value. Similarly, the multi-resolution label locator system can eliminate one or more candidate areas having a corresponding bounding window defining an area above a predefined maximum value. In addition, the multi-resolution label locator system may crop one or more candidate areas to correspond to a bounding window having a predefined size centered about a center of mass computed for the feature values of the corresponding candidate area.
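The area-based elimination can be sketched as a simple filter over the bounding windows. The function name, the box convention, and the threshold values in the example are assumptions; cropping about the center of mass is not shown.

```python
def preclassify_by_area(boxes, min_area, max_area):
    """Keep only bounding windows whose area lies within [min_area, max_area].

    Each box is (top, left, bottom, right) with inclusive cell coordinates.
    """
    kept = []
    for top, left, bottom, right in boxes:
        area = (bottom - top + 1) * (right - left + 1)
        if min_area <= area <= max_area:
            kept.append((top, left, bottom, right))
    return kept


boxes = [
    (0, 0, 0, 0),      # area 1: too small to be a label
    (2, 2, 5, 9),      # area 32: plausible label
    (0, 0, 39, 59),    # area 2400: too large, likely the parcel itself
]
candidates = preclassify_by_area(boxes, min_area=16, max_area=1000)
print(candidates)  # [(2, 2, 5, 9)]
```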
After pre-classifying candidate areas, the multi-resolution label locator system classifies the candidate areas according to the likelihood of containing indicia having the characteristics of interest by comparing respective feature vectors of respective candidate areas. To create a list that classifies the candidate areas, the multi-resolution label locator system computes a first decision value corresponding to one or more of the bounding window feature values by comparing the bounding window feature value to an expected value of the bounding window feature value. In this case, the expected value of the bounding window feature value is among the characteristics of interest. For example, in a label locator design, the bounding window of an actual label may have a predetermined expected area, a predetermined expected perimeter, and/or a predetermined expected aspect ratio.
After computing a first decision value based on the bounding window feature values, the multi-resolution label locator system then computes a second decision value corresponding to one or more of the remaining feature values (i.e., other than the bounding window feature values) by comparing the feature values to expected values of the feature values. The expected values of the feature values are also among the characteristics of interest.
After computing decision values, the multi-resolution label locator system may list candidate areas in a prioritized order by defining a decision space having a plurality of decision sub-spaces. The multi-resolution label locator system then calculates the decision spaces and maps the feature vectors to the decision spaces based on the relative values of the bounding window and feature values of the feature vectors.
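One way such a prioritized list could be produced is sketched below. The binary decision values, the tolerance tables, and the three sub-spaces are all hypothetical simplifications introduced for this sketch; the passage does not specify how the decision space is partitioned.

```python
def decision_value(features, expected, tolerance):
    """Return 1 when every listed feature is within tolerance of its expected value."""
    return int(all(
        abs(features[name] - expected[name]) <= tolerance[name]
        for name in expected
    ))


def prioritize(candidates, expected_window, tol_window, expected_other, tol_other):
    """Order candidate names from most to least label-like.

    Hypothetical sub-spaces: 0 when both decision values agree with the
    expected values, 1 when only one does, 2 when neither does.
    """
    ranked = []
    for name, window_features, other_features in candidates:
        first = decision_value(window_features, expected_window, tol_window)
        second = decision_value(other_features, expected_other, tol_other)
        ranked.append((2 - (first + second), name))
    ranked.sort()
    return [name for _, name in ranked]


candidates = [
    ("C", {"aspect_ratio": 5.0}, {"edge_intensity": 0.5}),
    ("A", {"aspect_ratio": 1.5}, {"edge_intensity": 4.0}),
    ("B", {"aspect_ratio": 4.0}, {"edge_intensity": 4.5}),
]
ordered = prioritize(
    candidates,
    expected_window={"aspect_ratio": 1.6}, tol_window={"aspect_ratio": 0.3},
    expected_other={"edge_intensity": 5.0}, tol_other={"edge_intensity": 2.0},
)
print(ordered)  # ['A', 'B', 'C']
```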
The present invention may be embodied in a video image system operable for receiving a data stream comprising pixel values defining an input image and processing the pixel values to locate indicia within the input image having characteristics of interest. The video-image system typically includes a first image video processor operable for dividing the input image into a plurality of multi-pixel cells. The video image system also creates a decimated image corresponding to the input image comprising an element corresponding to each cell of the input image.
Each element of the decimated image represents a common characteristic, such as an average light intensity, of the pixels within a corresponding cell of the input image. To generate the decimated image, the first video image processor includes a buffer memory operative to serially receive pixel values. The first video image processor is typically implemented within a field programmable gate array (FPGA) connected to the buffer memory and operative to receive a pixel stream. The first video image processor further includes a static memory device and is configured to perform its operations as the pixels flow through the FPGA.
In addition to the first video processor, the video image system typically includes a second video processor operable for creating an edge-occurrence image corresponding to the input image comprising an element corresponding to each cell of the input image. Each element of the edge-occurrence image represents the number of occurrences of an edge within the pixels of the corresponding cell of the input image.
Like the first video processor, the second video processor is typically implemented within an FPGA. To create the edge-occurrence image, the second video image processor typically includes a buffer memory operatively linked to a plurality of shift registers. The plurality of shift registers are operatively linked to a dynamic memory device.
A third video processor, preferably configured as a software system running on a general purpose computer, identifies one or more regions within the decimated image having characteristics corresponding to the expected characteristics of the indicia. The third video processor combines the decimated image and the edge-occurrence image and classifies candidate areas according to the likelihood of these areas containing indicia having the characteristics of interest. After classifying, the third video processor compiles a prioritized list of one or more candidate areas that most likely contain indicia having the characteristics of interest.
To combine the data of the first and second video processors and to compute the prioritized list of candidate areas, the third video image processor includes a central processing unit and memory storage device. The third video image processor is operable for identifying one or more candidate areas within the input image having decimated image characteristics and edge-occurrence image characteristics corresponding to the characteristics of interest. The third video image processor is further operable for classifying the candidate areas according to the likelihood of containing indicia having the characteristics of interest and compiling a prioritized list of the one or more candidate areas that most likely contain indicia having the characteristics of interest.
The present invention provides a system operable for locating labels having characteristics of interest on a moving stream of parcels or packages. The system includes a package, a conveyor operable for moving the package, and a video device positioned adjacent to, and typically above, the conveyor. The video device scans each package as each package passes by the video device. The video processor, operatively linked to the video device, generates a decimated image and edge-occurrence image of the package.
To evaluate the decimated image and edge-occurrence image, the system further includes a microprocessor operatively linked to the video processor. The microprocessor compiles a prioritized list of one or more candidate areas that most likely contain indicia having the characteristics of interest.
That the invention improves over prior automated parcel sorting systems and accomplishes the advantages described above will become apparent from the following detailed description of the exemplary embodiments and the appended drawings and claims.