Breast cancer is a heterogeneous disease, which makes it challenging for clinicians to predict the likelihood of disease progression, particularly in patients whose disease is detected at an early stage. For these women, the conventional clinico-pathological parameters (tumour size, lymph node status, patient age, tumour grade, and expression of biomarkers including Estrogen Receptor (ER), Progesterone Receptor (PR), Human Epidermal growth factor Receptor 2 (Her2) and Ki67) are not sufficient to characterise disease complexity and accurately predict the likelihood of tumour recurrence following adjuvant treatment or tumour removal by surgery. As a result of this inaccurate risk stratification, many patients who are inherently at a low risk of recurrence are assigned to receive chemotherapy, when in fact the majority of these women would remain cancer-free even without this toxic treatment.
Indeed, it is estimated that, for node-negative, ER-positive disease, up to 85% of patients would be overtreated if given chemotherapy (Fisher et al., 2004). Furthermore, surviving patients treated with chemotherapy face a higher risk of developing a second, independent, primary cancer in unrelated tissues within their lifetime (Boffetta and Kaldor, 1994). Considering the severe side-effects, the public health burden and the future health implications of chemotherapy, the overtreatment of patients represents a major problem in the clinical management of early-stage breast cancer.
The challenge is to develop a method of accurately and reproducibly distinguishing low-risk from high-risk patients so that therapy can be assigned accordingly. Current guidelines often lead to differing opinions among breast oncologists as to whether to assign neoadjuvant and/or adjuvant therapy, as many are reluctant to forgo such therapy without a reliable assessment of recurrence risk. The addition of more accurate and reliable prognostic and predictive biomarkers to the standard clinical assessment would greatly improve the ability of both doctors and patients to make better-informed treatment decisions. Some progress is being made in this regard with the multigene assays Oncotype Dx® Breast Cancer Assay and MammaPrint™, which are currently being assessed in the Trial Assigning IndividuaLized Options for Treatment (Rx) (TAILORx) and Microarray In Node-negative and 1 to 3 positive lymph node Disease may Avoid ChemoTherapy (MINDACT) trials, respectively (Cardoso et al., 2008; Sparano, 2006). MammaPrint™ and Prosigna™ are examples of Food and Drug Administration-approved prognostic tests in this arena.
WO 2005/039382 describes a number of gene sets used in predicting the likelihood of breast cancer recurrence, which form the basis of the Oncotype Dx® assay referred to above. The invention is related to a gene set comprising ‘one or more’ genes from a panel of 50 genes. WO 2014/130825 describes a gene set comprising at least 4 genes from a panel of cell cycle genes for detecting risk of lung cancer. U.S. Pat. No. 7,914,988 describes a gene expression signature to predict relapse in prostate cancer, known as the GEX score. The invention is related to a gene set comprising ‘all or a sub-combination of’ genes from a panel of 21 genes.
The widespread use of gene expression profiling has led to a rapid expansion in the identification of gene expression signatures found to correlate with different aspects of tumour progression. These include the ‘poor prognosis’ (van de Vijver et al., 2002; Wang et al., 2005), ‘invasiveness’ (Liu et al., 2007), and ‘genomic grade’ (Sotiriou et al., 2006) signatures. US 2008/275652 describes how this genomic grade signature comprises at least 2 or 4 genes selected from a panel of 97 genes. However, despite the ability of these signatures to predict breast cancer prognosis, there is surprisingly little overlap between them. The Applicants suggest that many genes in these signatures may be ‘passengers’, rather than ‘drivers’, of tumour progression. Recent advances in genome-wide reverse engineering have made it possible to identify regulatory interactions between transcription factors and downstream genes that are causal rather than correlative (Carro et al., 2010). One such algorithm, the Algorithm for the Reconstruction of Accurate Cellular Networks (ARACNe) (Margolin et al., 2006), uses gene interaction networks constructed from transcriptomic datasets to identify ‘hubs’, usually transcription factors, which are predicted to directly regulate multiple genes in the signature.
It is an object of the present invention to overcome at least one of the above-mentioned problems.