Assessment of changes in tumor burden is an important feature for defining tumor response in clinical trials. Both tumor shrinkage (objective response) and development of disease progression are important endpoints in clinical trials as these often determine objective response, which in turn defines time to progression (TTP) and progression-free survival (PFS). In order to standardize tumor response assessment in clinical trials, various response criteria have been described, including Response Evaluation Criteria in Solid Tumors (RECIST) version 1.0 or more commonly version 1.1, modified RECIST (mRECIST), World Health Organization (WHO) Criteria, Choi Criteria, Vascular Tumor Burden (VTB) Criteria, Morphology Attenuation Size and Structure (MASS) Criteria, immune-related Response Criteria (irRC), immune-related RECIST (irRECIST), Cheson Criteria, Lugano Classification lymphoma response criteria, Positron Emission Tomography Response Criteria in Solid Tumors (PERCIST), European Organization for Research and Treatment of Cancer (EORTC) Response Criteria, Response Assessment in Neuro-Oncology (RANO) Criteria, and International Myeloma Working Group (IMWG) consensus criteria, among others.
In order to assess objective response, an estimate of the overall tumor burden at baseline is needed and used as a comparator for subsequent measurements. Each set of tumor response criteria specifies parameters that define a measurable lesion at baseline. For example, RECIST 1.1 defines a non-nodal lesion as measurable if it measures ≥1.0 cm in long axis at baseline and defines a lymph node as measurable if it measures ≥1.5 cm in short axis at baseline. When one or more measurable lesions are present at baseline, each set of tumor response criteria specifies which lesions should be considered as target lesions. Target lesions are typically selected based on being the largest in size or most metabolically active but also should lend themselves to reproducible repeated measurements. Most tumor response criteria limit the total number of target lesions and the number of target lesions per organ. For example, RECIST 1.1 limits the total number of target lesions to 5 and the number of target lesions per organ to 2. Each set of tumor response criteria also specifies how the target lesions should be measured. For example, RECIST 1.1 states that non-nodal lesions should be measured in the longest dimension on axial cross-sectional images, while lymph nodes should be measured in short axis on axial cross-sectional images. The total tumor burden is then a mathematical calculation made from the individual target lesions. For example, the sum of the diameters (longest for non-nodal lesions, short axis for nodal lesions) for all target lesions is calculated and reported as the baseline sum diameters per RECIST 1.1.
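As a minimal illustrative sketch of the RECIST 1.1 example above, the following Python code filters measurable lesions, picks target lesions under the 5-total and 2-per-organ limits, and computes the baseline sum of diameters. The `Lesion` class, the function names, and all lesion values are hypothetical and are not taken from any criteria document. Sizes are in millimeters (10 mm = 1.0 cm). Selecting purely by size is a simplifying assumption, since target lesions should also lend themselves to reproducible repeated measurements.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Lesion:
    """Hypothetical record for one baseline lesion (sizes in mm)."""
    name: str
    organ: str
    is_lymph_node: bool
    long_axis_mm: float
    short_axis_mm: float

    @property
    def recist_diameter_mm(self) -> float:
        # Short axis for lymph nodes, longest dimension for non-nodal lesions.
        return self.short_axis_mm if self.is_lymph_node else self.long_axis_mm


def is_measurable(lesion: Lesion) -> bool:
    # RECIST 1.1 size thresholds: 15 mm short axis (nodal), 10 mm long axis (non-nodal).
    threshold_mm = 15.0 if lesion.is_lymph_node else 10.0
    return lesion.recist_diameter_mm >= threshold_mm


def select_targets(lesions, max_total=5, max_per_organ=2):
    # Assumption: rank by size only; a reviewer would also weigh reproducibility.
    candidates = sorted(
        (lesion for lesion in lesions if is_measurable(lesion)),
        key=lambda lesion: lesion.recist_diameter_mm,
        reverse=True,
    )
    per_organ = Counter()
    targets = []
    for lesion in candidates:
        if len(targets) == max_total:
            break
        if per_organ[lesion.organ] < max_per_organ:
            targets.append(lesion)
            per_organ[lesion.organ] += 1
    return targets


def sum_of_diameters(targets) -> float:
    return sum(lesion.recist_diameter_mm for lesion in targets)


# Hypothetical baseline exam.
baseline = [
    Lesion("lung A", "lung", False, 32, 20),
    Lesion("lung B", "lung", False, 25, 18),
    Lesion("lung C", "lung", False, 18, 12),
    Lesion("liver A", "liver", False, 40, 30),
    Lesion("liver B", "liver", False, 12, 9),
    Lesion("node A", "lymph node", True, 22, 16),
    Lesion("node B", "lymph node", True, 14, 11),   # below 15 mm short axis
    Lesion("adrenal A", "adrenal", False, 8, 6),    # below 10 mm long axis
]

targets = select_targets(baseline)
baseline_sum_mm = sum_of_diameters(targets)
print([lesion.name for lesion in targets], baseline_sum_mm)
```

In this sketch, "lung C" is measurable but is skipped because two lung lesions have already been selected, and the two sub-threshold lesions would be followed as non-target lesions.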
The baseline measurements are used as a reference to characterize objective tumor regression or progression in the measurable dimension of the disease. All other lesions (or sites of disease) are identified as non-target lesions. The site of disease of all non-target lesions should be recorded at baseline. At subsequent time points, measurement of non-target lesions is not required, and these lesions are typically followed and defined as ‘complete response’ (CR), ‘unequivocal progressive disease’ (PD), ‘non-CR/non-PD’, or ‘not evaluable’ (NE). Alternatively, the non-target lesions could be qualitatively evaluated, such as ‘present’, ‘absent’, ‘larger’, or ‘smaller’.
While most tumor response criteria utilize measured changes in target lesion length or size as a means of defining objective response, some criteria (e.g., PERCIST and EORTC Response Criteria) utilize measured changes in target lesion radiotracer activity as a means of defining objective response, and other criteria use a combination of both. Different tumor response criteria may utilize different metrics, mathematical calculations, or cut points to define objective response. Computer-implemented methods that automate one or more processes or method acts and/or ensure user compliance with one or more criteria may be used to reduce errors and improve efficiency in tumor response assessment.
A common method for determining objective response in a phase 2 or 3 industry-sponsored oncologic clinical trial includes a combination of local radiologic review (LRR) and independent central review (ICR). With LRR, local physician reviewers (often radiologists) generally do not receive training on the protocol or the specific response criteria, multiple different local physician reviewers may interpret the images, different local physician reviewers may choose and measure different target lesions, and local physician reviewers are often unaware of the subject enrollment date and therefore do not have the ability to judge objective response. Reports generated by LRR may not initially conform to protocol-specific case report forms (CRFs), and the local investigator team typically translates the information in the LRR report into a CRF per the clinical trial study protocol.
ICR of images is advocated by regulatory authorities as a means of independent verification of clinical trial endpoints dependent on medical imaging. In this context, ICR is the process by which all radiologic exams and selected data acquired as part of a clinical trial study protocol are submitted to a central location and reviewed by independent physician reviewer(s) who are not involved in the treatment of the patients. The independent physician reviewer(s) are blinded to various components of the data, frequently including blinding to treatment arm, patient demographics, assessments made by the investigator, and the results or assessments of other physician reviewers participating in the review process. With ICR, the independent physician reviewer(s) undergo training on the specifics of the study protocol and response criteria, the same physician reviewer(s) follow all patients throughout the study, the same target lesions are followed throughout the study, and the independent physician reviewer(s) fill out the CRF and determine objective response in comparison to the baseline study or lowest tumor burden (nadir). The workflow process is more tightly regulated and standardized with ICR than with LRR. In many phase 2 and 3 studies, ICR is performed by two primary physician reviewers who independently review each patient's images, and a third adjudicating physician reviewer resolves discordant results when the two primary physician reviewers disagree.
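The comparison to baseline and nadir, and the two-reader-plus-adjudicator arrangement, can be sketched as follows. This is an illustrative Python sketch only. The function names are hypothetical, and the cut points are the RECIST 1.1 target lesion thresholds (at least a 30% decrease from baseline for partial response, and at least a 20% increase plus at least a 5 mm absolute increase from nadir for progressive disease). Non-target lesions and new lesions, which also affect the overall response, are deliberately left out.

```python
def classify_target_response(baseline_sum_mm, nadir_sum_mm, current_sum_mm):
    """Classify target lesion response from sums of diameters (mm).

    Simplification: complete response is treated as a sum of zero, ignoring
    the rule that lymph nodes count as normal below 10 mm short axis.
    """
    if current_sum_mm == 0:
        return "CR"
    # Progression is judged against the smallest sum on study (nadir).
    increase_mm = current_sum_mm - nadir_sum_mm
    if increase_mm >= 5 and increase_mm >= 0.20 * nadir_sum_mm:
        return "PD"
    # Partial response is judged against the baseline sum.
    if baseline_sum_mm - current_sum_mm >= 0.30 * baseline_sum_mm:
        return "PR"
    return "SD"


def adjudicated_response(reader_1, reader_2, adjudicator_choice=None):
    """Return the agreed response, or the adjudicator's call on discordance.

    Assumption: the adjudicator's decision is supplied as a plain value;
    actual adjudication procedures are defined by each study's charter.
    """
    if reader_1 == reader_2:
        return reader_1
    if adjudicator_choice is None:
        raise ValueError("Readers disagree; adjudication is required.")
    return adjudicator_choice


# Hypothetical sums of diameters in mm: (baseline, nadir, current).
print(classify_target_response(125, 125, 87))   # 38 mm decrease from baseline
print(classify_target_response(125, 80, 97))    # 17 mm increase from nadir
print(adjudicated_response("PR", "SD", adjudicator_choice="SD"))
```

Keeping the nadir separate from the baseline matters: in the second call the current sum is still below baseline, yet it qualifies as progression because it has grown sufficiently from the lowest recorded tumor burden.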
There is frequently discordance among different physician reviewers, resulting in discordance between LRR and ICR and between central physician reviewers participating in ICR. Factors influencing discordance include target lesion selection, inter- and intra-reader differences in target lesion measurement technique, mathematical and data transfer errors, target lesion selection errors, errors in following objective response criteria, workflow differences, limited amount of clinical information, treatment bias, handling of missing data, variability in protocol training, variability in understanding of and application of tumor response criteria, failure to compare to all prior studies, perception of new lesions, subjective assessment of non-target lesions and perception of unequivocal progression of non-target lesions, tumor type, drug efficacy, precision of the response criteria, and complexity of the response criteria.
A critical component of any tumor response criteria is the choice of target lesions on the baseline exam. In clinical practice and clinical trials, the choice of target lesions is at the discretion of the physician reviewer, who could be a radiologist, oncologist, radiation oncologist, surgeon, etc. Most tumor response criteria provide guidance on target lesion selection. For example, RECIST 1.1 provides guidance on which lesions are measurable or non-measurable and then provides additional details on how to select target lesions. In general, target lesions and lymph nodes are selected based on their size, though the target lesions must be representative of all involved organs and should lend themselves to reproducible repeated measurements.
The single factor that historically contributes the most to discordance in objective response between physician reviewers is the choice of the target lesions on the baseline scan. In patients with multiple potential target lesions, different physician reviewers will frequently pick different target lesions on the baseline exam. For example, in a patient with multiple potential target lesions in multiple organs, one physician reviewer may select two target lesions in the lungs, two in the liver, and one lymph node while another physician reviewer may pick two different target lesions in the lungs, one in the liver, one in the adrenal, and a different lymph node. Each potential target lesion may grow or regress at a slightly different rate, contributing to different objective responses between physician reviewers that choose different target lesions.
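The effect described above can be illustrated with a small worked example. In this Python sketch, two reviewers follow different target lesions in the same hypothetical patient and arrive at different percent changes in the sum of diameters. All lesion names and measurements are invented for illustration, and the 30% decrease cut point for partial response is the RECIST 1.1 threshold.

```python
# Hypothetical measurements in mm: lesion -> (baseline, follow-up).
# Lymph node values are short-axis diameters; all others are long-axis.
measurements = {
    "lung 1": (30, 18),
    "lung 2": (25, 15),
    "lung 3": (20, 19),
    "lung 4": (18, 17),
    "liver 1": (40, 28),
    "liver 2": (22, 20),
    "adrenal": (15, 15),
    "node 1": (20, 12),
    "node 2": (16, 15),
}

# Each reviewer stays within 5 targets total and 2 per organ.
reviewer_a = ["lung 1", "lung 2", "liver 1", "liver 2", "node 1"]
reviewer_b = ["lung 3", "lung 4", "liver 1", "adrenal", "node 2"]


def percent_change(targets):
    """Percent change in the sum of diameters from baseline to follow-up."""
    baseline_sum = sum(measurements[name][0] for name in targets)
    followup_sum = sum(measurements[name][1] for name in targets)
    return 100.0 * (followup_sum - baseline_sum) / baseline_sum


def label(pct):
    # Only the partial response cut point is modeled in this sketch.
    return "PR" if pct <= -30.0 else "SD"


pct_a = percent_change(reviewer_a)
pct_b = percent_change(reviewer_b)
print(f"Reviewer A: {pct_a:.1f}% ({label(pct_a)})")
print(f"Reviewer B: {pct_b:.1f}% ({label(pct_b)})")
```

With these invented values, reviewer A's sum falls from 137 mm to 93 mm (about a 32% decrease) while reviewer B's falls from 109 mm to 94 mm (about a 14% decrease), so the same patient at the same time point would be reported as a partial response by one reviewer and stable disease by the other.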
Furthermore, tracking of target lesions over time is advantageous for obtaining accurate and precise objective response. Conventional methods for tracking target lesions include recording target lesion size, organ location, and image number or slice position on CRFs. Some image viewing workstations also keep track of key images. Even with these techniques, local physician reviewers often do not have access to the CRFs or key images of other reviewers, leading to variability in longitudinal tracking of target lesion growth and regression. Similarly, conventional commercial image viewers do not include sophisticated target lesion tracking systems that are readily available when evaluating subsequent time points.