A genome holds the “genetic blueprint” that determines the genotypic make-up of every organism and its individual susceptibility or resistance to disease. However, the onset of many common chronic diseases is not predictable and does not follow Mendelian inheritance patterns. Instead, these diseases appear to be caused by a usually unknown number of genes interacting with various environmental factors. Examples include coronary heart disease, hypertension, diabetes, obesity, various cancers, Alzheimer's disease, and Parkinson's disease. Even two organisms with the same genome can develop functionally different cells in different organs and tissues, and at different stages of development and disease progression. For humans, exercise, diet, social interaction, psychology, and environmental factors (e.g., toxins) all play an important role in giving each person a unique protein portfolio, which results from the interaction of many genes with that person's environment. Even within the same person, the expression pattern of a gene may differ between tissues, organs, or cells because of differences in timing and environmental control.
Therefore, it is unrealistic to expect that individual variability in disease onset and drug treatment outcomes will be wholly attributable to genetic causes. Personalized medicine must include testing for variations in proteins in addition to genes, gene expression, and metabolites. The test results are correlated with drug response, disease state, prevention, or treatment prognosis, and they help physicians individualize treatment for each patient with greater precision. Because the dynamic state of a person's proteins reflects that person's real-time physiology, protein testing is more relevant to diagnosis or prognosis, defining disease states, assessing risk profiles and outcomes, and setting up individual therapeutic strategies, and thus to choosing the right treatment (the right drug in the right dose) for the right patient at the right time.
Cancer remains a devastating disease throughout the world. The diagnostic classification of a cancer has traditionally been based on the organ or tissue in which it originated. However, the malignant cells that constitute a tumor are markedly heterogeneous. For example, ~38 leukemia types and ~51 lymphoma types have been identified, and this diversity cannot be explained by differences in genomic information alone. Despite advances in diagnostic technologies, surgical management, and therapeutic modalities, long-term survival remains poor for most cancer patients because the majority of cancers are detected at an advanced stage, some with distant metastases, which renders treatment ineffective. Detecting and classifying a patient's cancer by key protein spectrums inside and/or on the surface of the cell would provide new information on how rapidly the cancer might spread and how it might respond to specific treatments. It would also make early diagnosis possible, allowing intervention and treatment long before clinical signs and symptoms appear. Early detection and classification can also help in understanding preclinical molecular events and in identifying individuals at risk. Instead of adopting a trial-and-error approach, physicians could then choose the most effective medication with the fewest side effects from the start. New assays could also be developed for clinical diagnostics, where the currently available assays are designed for consensus biomarkers and cannot detect protein signals of unknown danger.
Detecting the key protein spectrums that determine these characteristics is of vital importance in accelerating the understanding of disease biology, which in turn facilitates the discovery of new drug targets and diagnostic disease markers. Thus, the identification of phenotype-specific or disease-related protein spectrums is becoming increasingly important in biology and medicine.
The current strategy for detecting protein function or disease-related protein spectrums involves comparing the proteomic profiles of contrasting samples. The core proteomic analysis technologies for separating proteins and/or peptides are one- and two-dimensional gel electrophoresis and one- or multi-dimensional liquid chromatography, coupled almost exclusively with mass spectrometry. A comparison is then made between the protein profiles of the different samples. Co-analysis by two-dimensional gel electrophoresis of samples coded with different fluorescent dyes, such as Cy2, Cy3, or Cy5, offers a higher degree of reproducibility in sample comparison than side-by-side separation or sequential analysis of one sample at a time.
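By way of illustration only, the profile comparison described above can be sketched in a few lines of Python. The sketch assumes that each sample has already been reduced to a mapping from protein identifier to measured abundance; the function name `differential_proteins`, the pseudocount, the two-fold threshold, and the example abundances are all hypothetical choices made for this illustration and do not come from any of the cited methods.

```python
import math


def differential_proteins(profile_a, profile_b, min_abs_log2_fc=1.0, pseudocount=1.0):
    """Return (protein, log2 fold change) pairs that differ between two profiles.

    profile_a, profile_b: dicts mapping a protein identifier to an abundance.
    A protein missing from one profile is treated as having zero abundance.
    The pseudocount (an assumption of this sketch) avoids division by zero.
    """
    hits = []
    for name in sorted(set(profile_a) | set(profile_b)):
        a = profile_a.get(name, 0.0) + pseudocount
        b = profile_b.get(name, 0.0) + pseudocount
        log2_fc = math.log2(b / a)  # positive: higher in profile_b
        if abs(log2_fc) >= min_abs_log2_fc:
            hits.append((name, log2_fc))
    # Largest absolute change first.
    hits.sort(key=lambda hit: abs(hit[1]), reverse=True)
    return hits


# Made-up abundances for two contrasting samples (illustrative only).
control = {"P1": 99.0, "P2": 49.0, "P3": 7.0}
disease = {"P1": 99.0, "P2": 199.0, "P4": 15.0}

hits = differential_proteins(control, disease)
for protein, log2_fc in hits:
    print(f"{protein}: log2 fold change = {log2_fc:+.1f}")
```

In this made-up example, P1 is unchanged and is filtered out, while P2, P3, and P4 are reported as differential. A real comparison would also need replicate measurements and a statistical test, which this sketch omits.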
An important improvement is the two-dimensional chromatographic fractionation of samples, followed by isotope labeling of proteins. A comparison is then made using mass spectrum profiling. The protocols include heavy isotope (e.g., 15N, 13C, and 18O) incorporation and isotope-coded labeling reagents, which are described in WO01/94935; WO03/102220; US20050069961; US20050100956; US2002/0168644; U.S. Pat. No. 6,670,194; WO03/102018; and WO00/11208. A set of other labeling reagents has been used to label a plurality of samples. A protocol for combining the labeled samples before or after selecting/enriching for labeled molecules, and then assaying them together for reliable comparison, is described in US20050074794.
Another important improvement is the use of a limited number of immuno-affinity methods to remove more than 20 high-abundance serum proteins, which improves the detection of low-abundance proteins (Yocum A K et al., J. Proteome Res. 2005, 4:1722-1731; Schuchard M D et al., Origins 2005, 21:17-23). However, the task of comprehensively profiling and characterizing all the proteins in a given sample is overwhelming. Through alternative gene splicing and post-translational modifications, the approximately 35,000 genes in the human genome could generate about 100,000-500,000 potentially expressed proteins. It is estimated that the proteome of a given sample may comprise 30,000-50,000 proteins. Environmental, nutritional, and developmental circumstances directly affect the dynamics of protein expression, which leads to even greater molecular complexity and to variation between individuals, or within the same individual under different circumstances. Hundreds of thousands of proteins are present in biological samples (cells, tissue, serum, etc.). Current technologies lack the sensitivity to detect the few proteins that differ between samples reliably and reproducibly.
As a result, there is a need for more efficient methods to screen, identify and characterize differential proteins between samples.