Modern radiation therapy is highly customized per patient plan and, with the advent of intensity modulated radiation therapy (IMRT) and volume modulated arc therapy (VMAT), treatment plans can be very complex in nature. This precipitates the need for customized and stringent verification to ensure that: 1) the treatment planning system (TPS) calculates the patient dose accurately; and 2) the delivery system delivers the dose accurately. The process of dose verification of complex plans can be generally called dose quality assurance (QA), and will be referred to as “Dose QA” from this point forward. (Note that a common term in the industry is “IMRT QA”, but this is too limiting in its literal sense as not all modern plans are by definition IMRT.) Modern Dose QA purposes and methods have been well described in literature (e.g., B. E. Nelms and J. A. Simon, “A survey on planar IMRT QA analysis,” J. Appl. Clin. Med. Phys. 8(3), 76-90 (2007); G. A. Ezzell et al., “IMRT commissioning: Multiple institution planning and dosimetry comparisons, a report from AAPM Task Group 119,” Med. Phys. 36(11), 5359-5373 (2009); V. Feygelman, G. Zhang, C. Stevens, B. E. Nelms, “Evaluation of a new VMAT QA device, or the “X” and “O” array geometries,” J. Appl. Clin. Med. Phys. 12(2), 146-168 (2011); B. E. Nelms, H. Zhen, and W. A. Tomé, “Per-beam, planar IMRT QA passing rates do not predict clinically relevant patient dose errors,” Med. Phys. 38(2), 1037-1044 (2011); and H. Zhen, B. E. Nelms, and W. A. Tomé, “Moving from gamma passing rates to patient DVH-based QA metrics in pretreatment dose QA,” Med. Phys. 38(10), 5477-5489 (2011), the contents of which references are herein incorporated by reference in their entirety).
Dose QA performance must be quantified, and quantification requires one or more metrics of performance. Accepting a level of performance (safety, accuracy, etc.) implies verifying against benchmarks and setting clear acceptance criteria. This general strategy, of course, relies on each performance metric being a good indicator or predictor of quality. Scientifically and statistically speaking, a good performance metric will be both: a) sensitive and b) specific.
Sensitivity and specificity can be defined using results falling into one of four main categories, illustrated below. Nelms et al. have clearly translated these categories in terms of Dose QA (see FIG. 1).
True Positive: “Sick” person correctly diagnosed as sick (in Dose QA: unacceptable dose correctly detected as unacceptable).
False Positive: “Healthy” person incorrectly diagnosed as sick (in Dose QA: acceptable dose incorrectly detected as unacceptable).
True Negative: “Healthy” person correctly diagnosed as healthy (in Dose QA: acceptable dose correctly detected as acceptable).
False Negative: “Sick” person incorrectly diagnosed as healthy (in Dose QA: unacceptable dose incorrectly detected as acceptable).
Sensitivity can be defined broadly as the ability to correctly detect a problem. In the case of medicine, sensitivity is the ability of a test to correctly diagnose a sick patient as sick. In Dose QA, sensitivity is the ability to correctly detect an error when there is an error of clinical relevance. Sensitivity can be quantified by the following equation:
$$\text{Sensitivity} = \frac{\text{Number of True Positives}}{\text{Number of True Positives} + \text{Number of False Negatives}}$$
Specificity can be defined broadly as the ability to correctly identify a negative result. In the case of medicine, specificity is the ability of a test to correctly diagnose a healthy patient as healthy. In Dose QA, specificity is the ability to correctly identify that there are no clinically relevant errors due to the calculation or delivery of the dose. Specificity can be quantified by the following equation:
$$\text{Specificity} = \frac{\text{Number of True Negatives}}{\text{Number of True Negatives} + \text{Number of False Positives}}$$
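As an illustration of the two definitions above, the following minimal sketch counts the four outcome categories for a set of Dose QA results and computes sensitivity and specificity from them. The `QAOutcome` record, the function names, and the example counts are hypothetical and chosen only for illustration; they do not come from any cited study.

```python
from dataclasses import dataclass


@dataclass
class QAOutcome:
    # Hypothetical record of one Dose QA result (illustrative only).
    flagged: bool             # True if the QA metric flagged the dose as unacceptable
    truly_unacceptable: bool  # True if a clinically relevant error is actually present


def confusion_counts(outcomes):
    """Return (TP, FP, TN, FN) counts over an iterable of QAOutcome records."""
    tp = fp = tn = fn = 0
    for o in outcomes:
        if o.flagged and o.truly_unacceptable:
            tp += 1   # unacceptable dose correctly detected as unacceptable
        elif o.flagged:
            fp += 1   # acceptable dose incorrectly detected as unacceptable
        elif not o.truly_unacceptable:
            tn += 1   # acceptable dose correctly detected as acceptable
        else:
            fn += 1   # unacceptable dose incorrectly detected as acceptable
    return tp, fp, tn, fn


def sensitivity(tp, fn):
    """True positives divided by all truly unacceptable cases."""
    if tp + fn == 0:
        raise ValueError("no truly unacceptable cases; sensitivity is undefined")
    return tp / (tp + fn)


def specificity(tn, fp):
    """True negatives divided by all truly acceptable cases."""
    if tn + fp == 0:
        raise ValueError("no truly acceptable cases; specificity is undefined")
    return tn / (tn + fp)


# Made-up example: 3 true positives, 1 false negative,
# 4 true negatives, 2 false positives.
outcomes = (
    [QAOutcome(True, True)] * 3
    + [QAOutcome(False, True)] * 1
    + [QAOutcome(False, False)] * 4
    + [QAOutcome(True, False)] * 2
)
tp, fp, tn, fn = confusion_counts(outcomes)
print(f"sensitivity = {sensitivity(tp, fn):.2f}")  # 0.75
print(f"specificity = {specificity(tn, fp):.2f}")  # 0.67
```

Note that the "truth" column requires an independent judgment of whether an error is clinically relevant, which is exactly what a passing rate by itself does not supply.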
Typically the conventional QA metric is a “passing rate” (%) of calculated dose points vs. measured dose points, where the criteria for passing are a composite of percent difference, distance-to-agreement (DTA) (e.g., J. Van Dyk et al., “Commissioning and quality assurance of treatment planning computers,” Int. J. Radiat. Oncol., Biol., Phys. 26(2), 261-273 (1993)), or a hybrid metric called the Gamma Index (D. A. Low, W. B. Harms, S. Mutic, and J. A. Purdy, “A technique for the quantitative evaluation of dose distributions,” Med. Phys. 25, 656-661 (1998)—the contents of which references are herein incorporated by reference in their entirety). Both the DTA and the Gamma analyses serve to dampen the failures in high dose gradient regions.
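To make the passing-rate idea concrete, the following is a simplified one-dimensional sketch of a Gamma-style analysis in the spirit of Low et al.: for each measured point, it searches the calculated profile for the minimum combined distance and dose deviation, each scaled by its criterion, and counts the point as passing when that minimum is at most 1. The sketch assumes both profiles are sampled on the same grid, performs no interpolation between grid points, and uses an absolute dose-difference criterion; the function names and the sample data are hypothetical.

```python
import math


def gamma_index_1d(positions_mm, measured, calculated, dose_crit, dta_mm):
    """Gamma value at each measured point of a 1D profile (simplified sketch).

    dose_crit: dose-difference criterion in the same units as the dose values
    dta_mm:    distance-to-agreement criterion in mm
    Only the discrete calculated grid points are searched (no interpolation).
    """
    gammas = []
    for x_m, d_m in zip(positions_mm, measured):
        best = math.inf
        for x_c, d_c in zip(positions_mm, calculated):
            dist_term = (x_c - x_m) / dta_mm     # scaled spatial offset
            dose_term = (d_c - d_m) / dose_crit  # scaled dose difference
            best = min(best, math.hypot(dist_term, dose_term))
        gammas.append(best)
    return gammas


def passing_rate(gammas):
    """Percentage of points with gamma <= 1."""
    passed = sum(1 for g in gammas if g <= 1.0)
    return 100.0 * passed / len(gammas)


# Made-up profile: the last measured point is 10 dose units too high.
positions_mm = [0.0, 1.0, 2.0, 3.0, 4.0]
calculated = [10.0, 20.0, 30.0, 40.0, 50.0]
measured = [10.0, 20.0, 30.0, 40.0, 60.0]

# Assumed criteria: 3 dose units (e.g., 3% of a 100-unit normalization) and 3 mm.
gammas = gamma_index_1d(positions_mm, measured, calculated, dose_crit=3.0, dta_mm=3.0)
print(f"passing rate = {passing_rate(gammas):.1f}%")  # 80.0%
```

Because each point only needs to agree with some nearby calculated point, a small spatial shift in a steep gradient can still pass, which is how the DTA and Gamma analyses dampen failures in high dose gradient regions.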
In terms of conventional passing rate metrics, the regions of false positives and false negatives are illustrated in FIG. 1, as is what would be expected if these metrics are well correlated to clinically relevant errors, i.e. errors in dose volume histogram (DVH) results for patient dose distributions.
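For readers less familiar with the DVH, the following minimal sketch shows how a cumulative DVH point can be computed for one structure, and how a DVH-based error could be expressed as the change in volume receiving at least a given dose. It assumes all voxels in the structure have equal volume; the function name and the voxel dose values are hypothetical, and the sketch is not the method of any product or reference cited in this document.

```python
def cumulative_dvh(voxel_doses, dose_levels):
    """Percent of structure volume receiving at least each dose level.

    Assumes every voxel represents the same volume (illustrative simplification).
    """
    n = len(voxel_doses)
    return [
        100.0 * sum(1 for d in voxel_doses if d >= level) / n
        for level in dose_levels
    ]


# Made-up voxel doses (Gy) for one structure: planned vs. an estimate of delivered.
planned = [58.0, 60.0, 61.0, 62.0, 63.0]
delivered = [55.0, 57.0, 60.0, 61.0, 62.0]
levels_gy = [50.0, 60.0]

dvh_planned = cumulative_dvh(planned, levels_gy)
dvh_delivered = cumulative_dvh(delivered, levels_gy)

# Change in the volume receiving at least 60 Gy, in percentage points.
v60_change = dvh_delivered[1] - dvh_planned[1]
print(dvh_planned, dvh_delivered, v60_change)
```

A metric of this kind is expressed directly in terms of patient dose coverage, which is the sense in which DVH errors are described above as clinically relevant.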
Conventional passing rate metrics, though used for many years in IMRT, were never proven in terms of either sensitivity or specificity. Recent studies of both per-beam planar IMRT methods (e.g., J. J. Kruse, “On the insensitivity of single field planar dosimetry to IMRT inaccuracies,” Med. Phys. 37(6), 2516-2524 (2010); G. Yan, C. Liu, T. A. Simon, L. C. Peng, C. Fox, and J. G. Li, “On the sensitivity of patient-specific IMRT QA to MLC positioning errors,” J. Appl. Clin. Med. Phys. 10(1), 120-128 (2009), the contents of which references are herein incorporated by reference in their entirety) and 3D composite dosimetry methods have proven the passing rates to be poor metrics in terms of both sensitivity and specificity, for all common methods. In other words, conventional methods/metrics cannot reliably detect significant errors (i.e., they lack sensitivity), nor can they reliably prove accuracy (i.e., they lack specificity). As such, there is a clear need for improved metrics that are not only reliable and useful, but also clinically possible and practical.
The potential limitations of conventional metrics (and especially the 3%/3 mm criteria for both %/DTA and Gamma passing rates) were postulated by Nelms and Simon and, in the same publication, the authors suggest that moving towards prediction of impact of errors on patient dose and DVH would be more useful and relevant. The authors summarize their point well: “The underlying limitation of today's planar IMRT QA approach is that it does not make the connection between the individual field analyses and the “big picture” of how the patient dose distribution might be affected—that is, how the plan DVHs might be degraded as a result of the combined planning and delivery imperfections. Today, the DVH is the critical tool for IMRT dose prescription and plan analysis. An estimated DVH (based on measurements) should perhaps be the new goal of IMRT QA. Although careful field-by-field analyses are now efficient and very effective at detecting differences between the measured fields and the planned fields, they do not predict the overall perturbations of the volumetric patient dose and DVH statistics. If meaningful standards for IMRT QA acceptance testing are to be derived and adopted, that connection needs to be made. Estimating DVH perturbations attributable to IMRT QA measurements would be a wise first step in trying to introduce meaningful standards to IMRT QA, because the benchmarks could be set based on more clinically relevant and intuitive endpoints.”
A software product called “3DVH” (Sun Nuclear Corporation, Melbourne, Fla.) is one answer to solving the problems of Dose QA metrics. 3DVH uses the strategy and algorithm called “Planned Dose Perturbation” (PDP), which uses conventional QA data (measured vs. calculated phantom dose) to accurately estimate the impact of any/all observed errors on the 3D patient dose and DVH. In addition to providing these more useful metrics, the aim of 3DVH is to be clinically practical and cost-effective, specifically by allowing existing and ubiquitous QA devices to gather the required PDP measurement inputs. PDP is further described in U.S. Pat. No. 7,945,022, the contents of which are herein incorporated by reference in their entirety.
One method of dose QA is to deliver all treatment beams at their actual treatment geometries to a dosimetry phantom that acts as a patient surrogate. We will call this “true composite” dose QA (as opposed to “single gantry angle composite” where all IMRT beams are delivered at the same geometry to a flat QA phantom). The dose distribution measured and calculated in the true composite QA phantom will not be equal to the patient dose, of course (due to density and size differences), but there are advantages in delivering a full fraction and verifying the 3D dose, even if it is a phantom dose.
Though IMRT beams are dynamic in nature (moving multi-leaf collimator (MLC) leaves creating intensity modulation) they do not have dynamic beam geometries; rather, they have static beam angles per beam. However, recently dynamic arc therapy has become more commonplace. In arc therapy, the beam geometry (typically just the gantry angle) changes dynamically during a single treatment beam. Arc therapy with C-arm linear accelerators is often generalized as Volume Modulated Arc Therapy (VMAT), though it is sometimes called by vendor-specific commercial names such as RapidArc (Varian Medical Systems) or Smart Arc (Philips Radiation Oncology Systems). Another common method of dynamic arc therapy that delivers dose through a modulated fan beam that rotates in a helical loop around the patient is called helical tomotherapy, with a trade name Tomotherapy (Accuray).
Because of their dynamic beam geometries, arc therapies lend themselves to true composite dose QA rather than per-beam planar dose QA.
True composite dose QA dosimetry phantoms have followed, in a sense, the same evolution as per-beam planar dose QA devices. Namely, the industry has migrated towards electronic 3D arrays, which measure dose without the need for time-consuming processing. Between the planar film-in-3D-phantom era and the modern 3D electronic array era lies a history of using 2D electronic arrays embedded in 3D phantoms, a limited and semi-inaccurate method that remains common due to its efficiency and cost-effectiveness. A summary of methods and devices used in true composite dose QA is given in Table 1.
TABLE 1. Summary of True Composite Dose QA Methods

| Method | Brief Description | Products | Pros | Cons |
|---|---|---|---|---|
| Film in 3D Phantom | Embed planes of film inside 3D phantoms | Many (options for both film and phantoms) | User can customize measurement plane(s); user can choose their phantom (shape, material, etc.); high density measurements | Inefficient (film processing and analysis); dose always in planes, not volumetric |
| 3D Gel | 3D chemical gel that acts analogous to a 3D “film” | BANG Gel | High density; volumetric | Inefficient (processing requires special equipment such as MR or laser scanning); expensive; limited measurement accuracy |
| 2D Array in 3D Phantom | Place a 2D array (ion chamber or diode) inside a 3D phantom | MapCHECK; MapCHECK2; PTW 729; Matrixx; MapPhan*; Octavius*; MultiCube* | Inexpensive (use ubiquitous 2D IMRT QA devices); efficient | Angular dependencies cause measurement errors; single dose plane, not volumetric; low detector density |
| 3D Array | High resolution (small) detectors embedded in 3D volumetric phantom | Delta4; ArcCHECK | Efficient; volumetric | Limited to fixed detector arrangement; low detector density |

* Phantoms
The ArcCHECK (AC) device has been well-described in literature (e.g., Kozelka J, Robinson J, Nelms B, Zhang G, Savitskij D, Feygelman V, “Optimizing the accuracy of a helical diode array dosimeter: a comprehensive calibration methodology coupled with a novel virtual inclinometer,” Med Phys. 38(9), 5021-32 (2011)—the contents of which reference is hereby incorporated by reference in its entirety). The AC device is unique in its detector geometry, which features a cylindrical surface of diodes with a near-circular cross section of 22 facets and a helical progression of diodes along the long axis of the cylinder.