The use of digital imaging of human anatomy, for medical or other purposes, is ubiquitous and comes in many forms and methods of acquisition. Digital imaging encompasses the conversion (“digitization”) of analog imaging media into digital representations, such as scanning a physical x-ray film, as well as the use of digital detectors on devices such as computed tomography (CT) machines. In the latter case, the imaging data exists only digitally. The wide use of computer-based imaging and display systems has been spurred by enhancements in speed, reductions in the cost of materials and storage space, and ease of transfer, storage, and display, among other factors.
While digital imaging provides computerized representations of a patient's anatomy, the delineation of individual structures (“delineated anatomy” or “contours” of structures such as tissues, organs, etc.) within those images can be important for diagnostic and/or therapeutic purposes. In clinical practice, this delineation (also called “contouring”) of a patient's anatomy is performed with user-driven, computer-based tools and/or computer-vision-based automated techniques (“auto-contouring”). The delineated anatomy serves a number of useful purposes: in radiology, for example, delineated anatomy can be used to aid in the detection of tumors from screening images, while in radiation oncology, the delineated anatomy can be used to guide and optimize the planning of cancer treatment.
Due to current technological limitations in resolving structures/objects of interest in digital images (arising from insufficient contrast, resolution, or both), as well as the variability in anatomy from person to person, the anatomy delineation process is prone to error. As a result, before clinical use, the delineated anatomy requires evaluation and verification. In current practice, this requires one or more users to conduct a manual evaluation to assess the accuracy of the delineated anatomy and ensure proper surface contouring and labeling. This manual evaluation is both time consuming and dependent on user expertise, alertness, and other human factors to identify potential errors in the delineated anatomy. Failure to detect errors in delineated anatomy can lead to complications, ranging from negligible to catastrophic, in medical procedures that rely on this data. As such, evaluation of the anatomy delineation accuracy is a mandatory and important step in all cases.
The importance of this quality assurance (QA) for delineated anatomy has been recognized, as have the limitations of the current system of evaluation. One major impediment to enhancing and automating this process has been the development of a benchmark against which the delineated anatomical structures can be compared. Two primary solutions previously developed for this purpose are atlas-based systems and systems that assign a patient-specific “gold standard” dataset.
In an atlas-based system, a population-based “standard” atlas of structures is developed as the benchmark, and subsequent sets of delineated anatomy are then compared to the atlas. Validation of such atlas-type algorithms is typically performed using Dice coefficients or other similarity indices, generally against some defined landmarks (i.e., a set of points, frequently user-determined).
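For reference, the Dice coefficient of two sets A and B is 2|A∩B| / (|A| + |B|), ranging from 0 (no overlap) to 1 (identical sets). The following is a minimal illustrative sketch only, not the validation procedure of any particular atlas-based system. The function name `dice_coefficient`, the representation of each structure as a set of pixel coordinates, and the sample data are all assumptions introduced here for illustration.

```python
def dice_coefficient(region_a, region_b):
    """Dice similarity of two regions given as collections of (row, col) pixels.

    Hypothetical helper for illustration; the set-of-pixels representation
    is an assumption, not something specified in the text above.
    """
    a, b = set(region_a), set(region_b)
    if not a and not b:
        # Assumed convention: two empty regions are treated as identical.
        return 1.0
    # Twice the shared pixel count, divided by the sum of both region sizes.
    return 2.0 * len(a & b) / (len(a) + len(b))


# Made-up example data: a 4x4 "atlas" structure and a delineated structure
# of the same size shifted down by one row.
atlas_structure = {(r, c) for r in range(2, 6) for c in range(2, 6)}
delineated_structure = {(r, c) for r in range(3, 7) for c in range(2, 6)}

score = dice_coefficient(atlas_structure, delineated_structure)
print(f"Dice coefficient: {score:.2f}")
```

In this made-up example the two 16-pixel regions share 12 pixels, so the score is 2 × 12 / (16 + 16) = 0.75. When the same index is computed over a small set of landmark points rather than full regions, the result reflects agreement only at those points.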
Atlas-based algorithms, however, suffer from a number of limitations. Anatomical structures vary not only from person to person, but also between populations of persons (based, for example, on sex, height, weight, ethnicity, socioeconomic status, and medical history, among many other factors). Thus, these methods generally result in some broad dataset (one “man” or “woman”, for example), which limits their overall utility. Furthermore, atlases, once created, tend to remain fixed in terms of what they contain, so analyzing an object or structure not contained within the atlas is not an option for an end-user. Finally, an atlas's reliance on landmark-based similarity coefficients may limit the accuracy of the evaluation to locations near the points of interest. The accuracy of point-based similarity coefficients is, as a result, largely dependent on the number of points/landmarks used.
Likewise, systems using a patient-specific “gold standard” dataset also suffer from a number of limitations. In particular, a gold standard approach requires that some set of delineated anatomy determined for a given person be designated as the “true” set of structures for that person. This can be done, for example, by assuming that the first set in a series of delineated anatomical structures for a patient is the “truth,” such as the first of a time series of images acquired throughout therapy to determine how one or more structures of interest change shape, size, and/or position. Such “gold-standard” algorithms are limited by the fact that the “truth” dataset must itself be validated manually and is thus prone to errors that would then be propagated through all subsequently analyzed images. Point-based similarity indices are also frequently used with these algorithms and carry with them the limitations noted above.
There is, therefore, a need in the art for decreasing the time required for evaluating the accuracy of delineated anatomies as well as for increasing the robustness and reliability of these evaluations in identifying any errors in the evaluated delineated anatomy. At the same time, there is a need in the art for an evaluation system that provides those benefits while also allowing for flexibility and customizability to evaluate an array of image modalities, populations of persons, and/or anatomical structures of interest.