Using computer vision applications to perform automated diagnosis functions by analyzing medical images is a complex and challenging task. To perform these functions accurately, the computer vision applications must account for a variety of technical problems. One such technical problem relates to training a model that can accurately perform object segmentation on the images to detect medical objects of interest (e.g., lesions or cancerous cells) with pixel-level accuracy. In many cases, this is difficult because the medical objects are often very small and can have large intra-class variations, which can cause the model to miss some or all of the objects in the images. Another technical problem relates to training a model that can accurately predict classification labels associated with diagnosing a disease or medical condition. The accuracy of these predictions can be negatively affected if the medical objects are not accurately identified and/or the model is unable to distinguish between similar, but different, medical objects (e.g., different types of lesion conditions or cancer conditions).
Another technical problem relates to providing an appropriate training procedure that can be used to train the object segmentation and disease grading models. Although it may be preferable in many cases to employ a fully supervised learning approach in which all training data is fully annotated, doing so is often impractical because the available training data is typically very limited and annotating medical images is expensive, given that it typically requires a substantial time commitment from medical domain experts. This is especially true for pixel-level annotations that identify the medical objects of interest. On the other hand, a purely unsupervised learning approach can also be unacceptable in many cases due to the limited accuracy of the models that such approaches can generate.