Over the next decade biomarkers are predicted to significantly change the efficiency and economics of drug discovery and development. The pharmaceutical industry is shifting its focus from biomarkers that simply differentiate therapeutic responder/non-responder populations to identifying new biomarkers that are themselves validated as therapeutic targets. It is estimated that if biomarker data could improve just 10% of the critical decisions in the drug development process, then savings of up to $100 million per drug could be achieved (Barton, 2006). The global biomarker market is estimated to reach $20.5 billion by 2014, growing at a CAGR of about 20% from 2009 to 2014. Market growth has been primarily driven by high demand for biomarkers for drug discovery and development.
Presently, oncology is the most active field for disease biomarker research and development, primarily because cancer therapy routinely provides biopsy and surgically excised tissues that are utilized in new therapy development. For example, imatinib targets an enzyme produced as a result of chromosomal translocation discovered to be associated with chronic myelogenous leukemia (Druker, 2004; Baselga, 2006). In breast cancer, the expression of the estrogen receptor is used as a biomarker for prognosis and to identify women who are likely to benefit from antiestrogen therapy (Duffy, 2005; Ariazi et al., 2006), while over-expression of HER2 (a growth factor receptor) serves as a biomarker for prognosis and for treatment with trastuzumab (Yeon and Pegram, 2005; Duffy, 2005; Baselga, 2006). However, despite these notable achievements, relatively few patients ever benefit from biomarker-guided therapy since most biomarkers are identified in only a few percent of the population, and very few are sufficiently validated as drug targets. Most candidate biomarkers never advance beyond the discovery phase, and the number of biomarkers validated for use in drug development or qualified for clinical applications is still very small (Duffy, 2005; Hayes, 2005; Gasparini et al., 2006). There is a critical need to identify more biomarkers, especially early stage cancer biomarkers, as therapeutic targets and to develop drugs that can benefit the rest of the cancer patients.
Biomarker research exploded primarily due to the use of proteomics approaches focusing on identifying differences in protein structure and abundance between diseased and normal states. Once identified, these biomarker proteins can be utilized for developing diagnostic tools, and because they are functional molecules, they are also more likely to be valid therapeutic targets.
The accessibility of blood plasma and the large number of proteins it contains make it an excellent matrix in which to search for new biomarkers. However, the estimated dynamic range of protein concentrations in human serum is up to 10 orders of magnitude (Corthals et al., 2000), making the rapid identification of individual disease-associated proteins a tremendous analytical challenge. While total serum protein concentration is approximately 70-90 mg/ml, most useful biomarkers, such as cytokines and prostate specific antigen, are present in the picogram range, and disease-specific changes can be expected to be incrementally small, especially in the early stages of disease (Merrell et al., 2004). Compounding these problems, many disease-specific proteins (e.g. cancer biomarkers) are degraded inside the cell by proteolytic enzymes, generating peptide fragments that are subsequently released into the blood. Being of low molecular weight, these peptide fragments generally have a half-life of only about two hours, and most of them are cleared from circulation by the kidney (Lowenthal et al., 2005).
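The concentration figures above can be checked with a short calculation. The following is only an illustrative sketch: it assumes that "picogram range" means roughly 1 pg/ml, a reading the text does not state explicitly, and the names used are hypothetical.

```python
import math

# Total serum protein concentration quoted in the text: 70-90 mg/ml,
# expressed here in grams per millilitre.
TOTAL_PROTEIN_G_PER_ML = (70e-3, 90e-3)

# Assumption (not stated in the text): a biomarker in the "picogram range"
# is taken to be present at about 1 pg/ml.
BIOMARKER_G_PER_ML = 1e-12


def orders_of_magnitude(high, low):
    """Return how many powers of ten separate two concentrations."""
    return math.log10(high / low)


# Span between total protein and a picogram-level biomarker,
# for the low and high ends of the quoted total protein range.
spans = [orders_of_magnitude(total, BIOMARKER_G_PER_ML)
         for total in TOTAL_PROTEIN_G_PER_ML]

for total, span in zip(TOTAL_PROTEIN_G_PER_ML, spans):
    print(f"{total * 1e3:.0f} mg/ml vs 1 pg/ml: {span:.1f} orders of magnitude")
```

Under that assumption the gap is between ten and eleven orders of magnitude, which is on the scale of the dynamic range cited above.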
To overcome the challenges presented by the low concentration and rapid turnover of potentially useful peptides, the albumin-associated fraction of proteins and peptides has been investigated as a source of new disease-specific biomarkers. Albumin, the most abundant plasma protein (40-50 mg/ml), functions as a scaffold for binding small molecules, lipids, and proteins in the extracellular space. It has been found to form complexes with peptide hormones such as insulin and glucagon, as well as with bradykinin, serum amyloid A, interferons, the amino-terminal peptide of HIV-1 gp41, and the 14-kDa fragment of streptococcal protein G, among others. Interestingly, it was found that a small percentage of the secreted peptide fragments from degraded cancer proteins have high affinity for serum albumin complexes, which increases their half-life to about 19 days (Lowenthal et al., 2005). Through their association with serum albumin to form complexes, the longevity of these cancer-related peptide fragments (cancer peptide motifs) can be increased by approximately 60 to 100-fold (Dennis et al., 2002). Because albumin has high affinity for such a diverse range of ligands, the serum albumin population is expected to be highly heterogeneous, most likely comprising hundreds of different albumin complexes.
Techniques currently in use were designed to separate proteins or peptides, and they cannot be used to separate serum albumin complexes. For example, current technologies for protein and peptide separation include electrophoresis (one-dimensional and two-dimensional, capillary, etc.), chromatography (reversed-phase, ion exchange, size exclusion, affinity, etc.), and solvent precipitation. Different combinations of these multidimensional separation technologies have been used to separate mixtures of proteins. Typically, a sample is separated into individual spots or fractions using different separation techniques, and the individual proteins are then analyzed by mass spectrometry to establish their identity. A “shotgun” strategy was also developed in which, without prior separation, entire samples containing a large number of different proteins, such as plasma or serum, are proteolytically digested into peptides by trypsin (He et al., 2005). The peptides in the tryptic digest are then separated by multidimensional separation techniques and analyzed by mass spectrometry to establish the identities of the proteins present in the sample. It was hoped that these multidimensional separation technologies would significantly enhance sensitivity for low abundance proteins by removing the masking effect of the highly abundant proteins, thereby enabling deeper penetration into the plasma proteome. However, since none of these technologies separates serum albumin complexes, they have not yielded useful biomarkers.
Even the most widely used technology for protein separation, 2-dimensional polyacrylamide gel electrophoresis (2-D PAGE), introduced by O'Farrell (1975), cannot separate serum albumin complexes, as it is typically conducted under “denaturing” conditions. Additionally, 2-D PAGE has many other shortcomings, including requiring large amounts of sample (about 50 to 100 μg of protein per experiment) and producing a rather streaky and mostly diffuse profile when serum is analyzed. Furthermore, proteins separated by 2-D PAGE must be “blotted” or transferred onto blotting membranes such as polyvinylidene difluoride (PVDF) for Western blot analysis, and the efficiency of protein blotting is variable.
As described in WO 2011/008746, the present inventors developed a new electrophoresis procedure that separates protein complexes directly on the PVDF membrane, thus bypassing both the cumbersome, time-consuming gel electrophoresis and its subsequent blotting steps (Chang and Yonan, 2008; Chang et al., 2009). The separation of albumin complexes in the present inventors' 2-D High Performance Liquid Electrophoresis (2-D HPLE) is based on their net charge or isoelectric points (pI). The association of a newly produced cancer peptide fragment (cancer peptide motif) with a pre-existing albumin complex changes its pI and this new complex migrates to a different location on the PVDF membrane, allowing its detection among hundreds of already present albumin complexes. Because it focuses on disease-specific protein fragments, the technique not only enables the identification of new cancer protein biomarkers but also identifies the cancer peptide motifs within these proteins. When LC/MS/MS analysis is preceded by fraction separation using 2-D HPLE, its dynamic range is enhanced to the 10^10 range required for detecting low copy number cancer biomarkers, a sensitivity that has not previously been achieved using other protein separation techniques.
Using a yeast two-hybrid screen with the carboxyl-terminal tail of a G protein-coupled receptor (delta opioid receptor) as bait, Whistler et al. (2002) discovered that one of its interacting partners is G-protein coupled receptor-associated sorting protein 1 (GASP-1). GASP-1 was later found to interact with the cytoplasmic tails of many other G-protein coupled receptors, including D2 dopamine receptor/DRD2, beta-2 adrenergic receptor/ADRB2 and D4 dopamine receptor/DRD4 (Simonin et al., 2004). GASP-1 is involved in silencing signals by targeting receptors for degradation in lysosomes, and in the functional down-regulation of a variety of G-protein coupled receptors (GPCRs).
In the same study, Whistler et al. (2002) performed structure-function experiments that identified the C-terminal fragment of GASP-1 (cGASP-1) as the domain of the protein that binds to lysosomes. Interestingly, these truncated forms of GASP-1 (cGASP-1) were also shown to function as dominant negative mutants, inhibiting degradation and favoring the recycling of the delta opioid receptor. These experiments provided the first evidence that GASP-1 has the capacity to regulate either the recycling or the degradation of GPCRs. Since then, the trafficking of numerous GPCRs has been shown to be regulated by GASP-1 (Moser et al., 2010).
As described in WO 2011/008746, the present inventors have used 2-D HPLE to investigate disease-specific albumin complexes in plasma from patients with breast cancer (Chang et al., 2009; Tuszynski et al., 2011). One of the newly identified serum albumin complexes from Stage I breast cancer was cut out from the PVDF membrane after 2-D HPLE and subjected to on-membrane digestion with trypsin. The cancer peptide motif was identified as the 16-amino acid sequence EEASPEAVAGVGFESK (SEQ ID NO: 1) by liquid chromatography with tandem mass spectrometry sequencing of individual peptides (LC/MS/MS). Protein identity was determined from searches of virtual tryptic peptide databases against the fragmentation spectra of the tryptic peptides. This 16-amino acid sequence came specifically from GASP-1. No studies had previously linked GASP-1 to cancer pathogenesis. However, the present inventors used a polyclonal antibody raised against the 16-amino acid sequence from GASP-1 to detect the expression of GASP-1 and its fragments in tumor extracts of cancer patients (Chang et al., 2009). Western blot analysis revealed that GASP-1 was expressed in all 7 cases of late stage (Stage II and Stage III) breast cancer but not in adjacent normal tissue (Chang et al., 2009; Tuszynski et al., 2011). Thus, the 2-D HPLE process not only discovered the 1,395-amino acid GASP-1 as a new late stage cancer protein biomarker but also specifically identified the 16-amino acid cancer peptide motif (covering amino acid residues 850 to 865) in this protein.
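The notion of a virtual tryptic peptide database can be illustrated with a minimal in-silico digestion sketch. It assumes the commonly used simplified trypsin rule (cleavage after lysine or arginine unless the next residue is proline) and is not the database search software actually used. The flanking residues in TOY_PROTEIN are hypothetical placeholders, not the real GASP-1 sequence; only the 16-residue motif is taken from the text.

```python
# Cancer peptide motif reported in the text (SEQ ID NO: 1).
MOTIF = "EEASPEAVAGVGFESK"

# Hypothetical toy protein: invented flanking residues around the motif.
# This is NOT the GASP-1 sequence; it only makes the example self-contained.
TOY_PROTEIN = "GSAMR" + MOTIF + "TTLPR"


def tryptic_digest(sequence):
    """Split a protein sequence into peptides using a simplified trypsin
    rule: cleave after K or R, except when the next residue is P."""
    peptides = []
    start = 0
    for i, residue in enumerate(sequence):
        next_residue = sequence[i + 1] if i + 1 < len(sequence) else ""
        if residue in "KR" and next_residue != "P":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    # Keep any C-terminal remainder that does not end in K or R.
    if start < len(sequence):
        peptides.append(sequence[start:])
    return peptides


peptides = tryptic_digest(TOY_PROTEIN)
print(peptides)

# The motif is recovered intact because it contains no internal K or R
# and ends in K, as expected for a tryptic peptide.
print(MOTIF in peptides)

# Residues 850 to 865 inclusive span 16 positions, matching the motif length.
print(len(MOTIF), 865 - 850 + 1)
```

In an actual search, every protein in a sequence database would be digested in this way and the predicted peptide masses and fragment ions compared with the measured LC/MS/MS spectra.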
Currently, no reports on biomarkers for the detection of early stage cancer are available. People are regularly told to watch for early symptoms of cancer. However, by the time symptoms occur, many tumors have already grown quite large and may have metastasized. Moreover, many cancers have no symptoms. There remains a need for biomarkers of early stage cancer to enable the detection, diagnosis, and treatment of cancer at its earliest stages of development.