All references, including any patents or patent applications, cited in this specification are hereby incorporated by reference. No admission is made that any reference constitutes prior art. The discussion of the references states what their authors assert, and the applicants reserve the right to challenge the accuracy and pertinency of the cited documents. It will be clearly understood that, although a number of prior art publications are referred to herein, this reference does not constitute an admission that any of these documents forms part of the common general knowledge in the art, in Australia or in any other country.
Following the successful sequencing of the complete human genome in the Human Genome Project, and corresponding successes with other genomes such as those of the mouse and the rat, there is an urgent need in the art to determine the function of the proteins which these genomes encode, and to determine how these proteins are expressed during various physiological states and in disease.
Proteomics is an area of research which seeks to define the function and relative expression profiles of subsets of proteins encoded by a given genome at a given time in a given cellular location. Proteomics separates, identifies, and characterizes the proteins expressed, retained, secreted or released by a cell or tissue in order to establish their function(s) and their potential relationship to the onset, type, stage and progression of diseases, as well as response to therapy and/or relapse.
Proteomics may be used to compare tissue samples from diseased and healthy people, in order to identify proteins whose expression is changed in disease. Proteins which are significantly altered in their expression, location or post-translational modification (PTM) in patients with a disease, compared to those in a group of healthy individuals, may represent protein targets for drug discovery or for the discovery of biological markers, for example, endpoint and/or surrogate biomarkers. One application of proteomics is in the search for biological markers of disease onset, progression and treatment in elements of the blood, such as serum or plasma.
Serum proteins are useful diagnostic tools, and alteration of the expression of some serum proteins is an early sign of an altered physiology, which may be indicative of disease. In routine diagnostic laboratories, identification of specific low-abundance, disease-associated proteins in serum relies heavily on time-consuming and expensive radiolabelled or enzyme-linked immunoassay methods (RIA or ELISA), which can evaluate only a single protein component at a time. Due to the heterogeneous nature of most physiological disorders, it is generally considered that no single marker is likely to be sufficiently predictive of disease, so that there is a need for more than one candidate biomarker to enhance already available diagnostic or prognostic tests. It has been suggested that a panel of multiple diagnostic/prognostic markers in serum can be identified by utilizing proteomic approaches which have the capacity to profile multiple biomarkers (Daly and Ozols, 2002).
One primary tool used in proteomic methods for the separation and analysis of proteins is two-dimensional gel electrophoresis (2DE). Following separation by 2DE, proteins are characterized and identified, usually using matrix-assisted laser desorption/ionization (MALDI) peptide mass fingerprinting or other forms of advanced mass spectrometry (MS), for example, electrospray MS, time-of-flight (TOF)/TOF MS, or surface-enhanced laser desorption/ionization time-of-flight mass spectrometry (SELDI-TOF MS), coupled to protein and genomic database searching.
Unfortunately, the analysis by 2DE gels of proteins in samples of biological fluids such as serum and plasma is very difficult. This is because of the limited amount of protein able to be resolved by a gel, and the great variation in the concentration of proteins in many samples. This variation in concentration is frequently referred to as “dynamic range”. These factors result in data obtained by 2DE from complex samples, such as unfractionated serum and plasma, being dominated by the presence of proteins which are of high abundance in blood, for example human serum albumin, immunoglobulin G (IgG), haptoglobin, fibrinogen, transferrin, α1-antitrypsin, α2-macroglobulin, IgA, and IgM. Of these, six (albumin, IgG, IgA, α1-antitrypsin, transferrin and haptoglobin) constitute 85-90% of the protein mass in blood serum. Proteins with a concentration higher than 1 mg/mL are generally considered to be of high abundance, and such proteins may represent 2-60% of the total protein present.
Thus the application of current proteomic technologies is limited by the presence of high-abundance “housekeeping” proteins such as albumin and immunoglobulins, which constitute approximately 60-97% of the total serum protein (Georgiou et al, 2001). Such proteins hinder the detection of hundreds of low-abundance proteins, some of which might potentially be relevant to a particular disease state. Moreover, the widely spread pattern of albumin and immunoglobulin in the 2DE gel can also obscure proteins with a similar pI and molecular weight. Theoretically, by removing albumin and immunoglobulin, which together constitute 60-97% of the total serum protein, 3-5-fold more protein can be analyzed. If proteomic technologies are to be used routinely for diagnostic purposes, a rapid, inexpensive and simple method is required to remove the high-abundance proteins.
In particular, the presence of these abundant proteins severely limits the utility of methods used in wide-scale analysis of proteins present in complex mixtures, such as single-dimension electrophoresis (1DE), 2DE, multi-dimensional liquid chromatography and MS. These methods are often used in the investigation of low-abundance proteins such as cytokines, signal transduction proteins, hormonal mediators, and cancer biomarkers. The dynamic range problem is illustrated in FIG. 1, which shows the results of 2DE of a sample of unfractionated human plasma. This illustrates the problem presented by very abundant proteins, such as albumin, which comprises more than 80% of the total protein present in plasma; see the circle in FIG. 1. As the total amount of protein which can be loaded onto a gel is limited to less than approximately 120 mg, the maximum amount of “non-albumin” proteins which can be loaded is limited to approximately 36 mg, thus limiting the ability of this technique to visualize and identify putative clinically relevant low-abundance biomarker proteins. Rare proteins may be difficult if not impossible to detect. Similar, although less extreme, dynamic range problems are experienced with 2DE analyses of other types of biological samples, such as urine, tissue extracts, and cell lysates.
One approach to solving this problem is to develop methods for removing albumin and other highly abundant proteins from blood samples such as serum and plasma before analysis, thus increasing the sensitivity of the analysis and hence the likelihood of identifying low-abundance protein biological markers. In particular, a method for removing the 50 to 100 most abundant proteins from plasma before analysis would be greatly advantageous, in order to permit the use of a higher relative mass loading of samples.