Previously, attempts have been made to develop systems for identifying pills by medication type from images (optical or laser) taken of the pills. If accurate, such systems would be useful for identifying medications that may have been separated from their original packaging, or about which there is otherwise some uncertainty or need for confirmation as to their identity. For example, a user could take a picture of an unknown pill, and be told what the medication type and dosage are for that pill. However, tolerances for inaccuracy or misidentification of a pill can be small, given the importance of properly identifying and administering the right prescriptions. Moreover, the availability, reliability, and timeliness of a pill identification are critical, yet are dependent upon system architecture and available system resources. In this sense, as existing systems become more complex and adopt specialized systems in an attempt to become more accurate, they sacrifice reliability and timeliness. And, in many cases, they become essentially unavailable to the average member of the public given their complexity.
More specifically, existing approaches to pill identification through image analyses have not been accurate enough, due in large part to their inability to overcome the challenges that stem from inherent variability in image attributes (such as white balance, focus, blurriness, the orientation of the pill, whether the pill is centered in the image, and ambient color casting, to name a few). The development of a pill recognition system is thus challenging in part because, in real-world scenarios, the quality of photos taken by users varies dramatically, and can be easily degraded by a variety of factors such as illumination, shading, background color, blurriness, and phone orientation. Two images of the same pill may have drastically different appearances depending on, for example, the lighting conditions, angles, phone settings, etc. For example, FIGS. 1A-1F depict pairs of images of the same pill, wherein the left image in each pair suffers from various types of degradation including blurriness, shadowing, and background blending. Moreover, the number of types of prescription pills manufactured by pharmaceutical companies is very large, and a given pill may appear similar or identical to other pills. Existing solutions are not capable of handling such a large number of types and thus fail when the problem scale is increased considerably. This is the case despite the fact that each manufactured pill (prescription or otherwise) may be uniquely characterized by, for example, the combination of its shape, color, and particularized imprints.
Attempts to overcome these image variability challenges have resulted in increased complexity and inflexibility. For example, some techniques aim to simply avoid image inhomogeneity altogether, by trying to force the user to make the user's images more uniform. FIG. 2 depicts one such attempt, which relied upon a wallet-sized target card 10 with a known border 12, and checkerboard pattern 14 onto which the unknown subject pill(s) 16 would be placed. This system forced the user to try to center the image, take the image from a certain distance, and try to correct white balance. Even then, the system was still insufficiently accurate, as it relied on a predetermined and inflexible set of features to be analyzed. For example, the system attempted to measure the real-world length and width of the pill 16 (attributes that might be distinguishing to the human eye, but not necessarily as distinguishing to a computer), a measurement which again is dependent on the distance from which the image of the pill 16 was taken and necessitates the use of the predetermined checkerboard background 14. As another example, some systems have used enclosed chambers that force more uniform image conditions, often using laser images to detect pill contours. Such systems may be too complex for the average individual, and would generally be unavailable for use in emergency situations or in any environment other than the lab in which the system was set up.
Other attempts have involved the use of machine learning to improve the ability of a system to distinguish pills despite the presence of image inhomogeneities. However, these systems suffer from two problems of their own. First, the computational demands of such systems make them difficult to implement. Generally, they require either high memory and computational resources onsite, or a fast and robust network connection to a remote computational service. Second, despite the increased amount of computation involved, such systems still are not accurate enough in the face of the types of variability that can be encountered when untrained users take images.
Conventional machine learning-based approaches to recognition problems are generally very expensive in terms of computation, memory, and power consumption. Given the resource constraints of the types of devices that a user would want to employ for a mobile, flexible pill recognition system (e.g., cell phones, handheld readers, etc.), most of the existing deep learning-based mobile vision systems offload the workload to another computational resource (for example, relying on “cloud”-based computation or transferring data to another networked computer for processing). However, this approach can, depending upon the circumstances, suffer from one or more of several undesirable shortcomings. For example, if network connectivity is lost or otherwise unavailable, the task at hand cannot be accomplished because the data cannot be transferred for processing, and/or results cannot be retrieved. In addition, the delay experienced can be unpredictable from task to task and from day to day. For example, depending on the distance between the device and the networked computer, the network bandwidth available, the quantity of data to be transferred, the current demands on the processing power of the networked computers, scheduled maintenance, power loss, etc., there can be great variability in the length of time required to achieve a pill recognition task, and even in whether the task can be performed at all.
Even aside from the computational and memory needs (and concomitant security and reliability risks) of such machine learning systems, they still have not been shown to be sufficiently accurate. Because these systems rely on simple or conventional machine learning algorithms, they are unable to focus the feature recognition result (i.e., the “learnings” of the algorithms) on the proper distinguishing features of images. Thus, they are more susceptible to errors caused by image fluctuations and, as a result, cannot handle the range of possible image types that a user might submit to the system.
In view of the above, a need exists for a pill recognition system that may be simple to use, yet robust and highly accurate, and that may be constrained in terms of its computational and memory needs such that it may be operated on a mobile device (or other similar device having limited computational and/or memory resources) without the need to rely on external computing resources. It is within this context that embodiments of the present disclosure arise.