Despite being an important part of bottom-up proteomic protocols, RP-HPLC is still viewed as a “simple sample preparation technique” employed prior to mass spectrometry (MS) analysis. Recent trends in the development of proteomic procedures have shown the growing utility of peptide RP retention prediction for protein identification and quantification (for example, in scheduled multiple reaction monitoring/selected reaction monitoring (MRM/SRM) protocols). A number of peptide retention prediction models have been recently developed.8,9 However, future advances in this direction still require a better understanding of the mechanism of peptide RP-LC separation. This is particularly true for “bottom-up” proteomic approaches, where separation of thousands (if not millions) of peptides is required.1
Reversed-phase chromatography and MS separation techniques utilize different properties of the species for fractionation. MS possesses much higher separation power and is based on the well-studied principles of “gaseous” ion chemistry. The same cannot be said about peptide RP-LC: the very basic principles of separation are still unknown despite years of intensive study and application. The separation process is often viewed in a simplified form as “catch and release” of peptide species when the critical concentration of organic solvent is reached. The real picture, however, is much more complex: under gradient conditions, peptides are constantly “on the move” with different accelerations, which depend on intrinsic molecular features encoded in the slope S of the basic equation of linear-solvent-strength (LSS) theory.17 Separation selectivity is therefore affected by the value of S in the basic LSS equation:
log k = log k_0 − S·φ    (1)
where k is the retention factor at an organic solvent volume fraction φ (for example, φ = % ACN/100) and k_0 is the retention factor at φ = 0.
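As an illustration of Equation (1), the following minimal sketch evaluates k at several values of φ for two peptides with different slopes S and locates the value of φ at which their retention factors cross. The log is assumed to be base 10, and the (log k0, S) pairs are hypothetical values chosen only to make the crossover visible; they are not measured data.

```python
def retention_factor(log_k0: float, s: float, phi: float) -> float:
    """Return k at organic solvent volume fraction phi via Equation (1).

    Assumes a base-10 logarithm: log10(k) = log10(k0) - S * phi.
    """
    return 10.0 ** (log_k0 - s * phi)


def crossover_phi(log_k0_a: float, s_a: float, log_k0_b: float, s_b: float):
    """Return the phi at which two peptides have equal k.

    Returns None when the slopes are equal (parallel lines never cross).
    """
    if s_a == s_b:
        return None
    return (log_k0_a - log_k0_b) / (s_a - s_b)


# Hypothetical (log k0, S) pairs, for illustration only.
PEPTIDE_A = (3.0, 15.0)
PEPTIDE_B = (3.8, 22.0)

if __name__ == "__main__":
    for phi in (0.05, 0.10, 0.15, 0.20):
        k_a = retention_factor(*PEPTIDE_A, phi)
        k_b = retention_factor(*PEPTIDE_B, phi)
        print(f"phi={phi:.2f}  k_A={k_a:.3g}  k_B={k_b:.3g}")
    print("crossover phi:", crossover_phi(*PEPTIDE_A, *PEPTIDE_B))
```

Below the crossover the peptide with the larger S is retained more strongly, and above it the order is inverted, which is one way to see why S governs separation selectivity.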
Peptides can exhibit unexpected and generally unpredictable changes in relative retention when the physical parameters of an LC system (gradient slope, flow rate, column size) are altered. For example, running identical samples with a 4-fold difference in gradient slope (for example, 1% and 0.25% acetonitrile per minute) will reduce the retention time correlation (R²) from the ideal 1.00 to ~0.99. Calculations suggest that retention time vs. retention time correlations of ~0.95 and ~0.92 will be observed for 32-fold and 100-fold changes in the gradient slope, respectively. Some species will even change their retention order. Such a dramatic variation in separation selectivity threatens to make the application of retention time prediction protocols, the transfer of scheduled MRM/SRM procedures between LC systems, and inter-laboratory data collection and comparison very problematic.
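The change in retention order with gradient slope can be sketched numerically. The example below uses the standard closed-form LSS expression for retention time in a linear gradient, which is not stated in the text above and is brought in here as an assumption: t_R = t0 + t_D + (t0/b)·log10(2.303·k0·b + 1), with b = S·(dφ/dt)·t0. The column dead time t0 = 1 min, the zero dwell time, and both (log k0, S) pairs are hypothetical.

```python
import math


def gradient_retention_time(log_k0: float, s: float, gradient_slope: float,
                            t0: float, dwell_time: float = 0.0) -> float:
    """Retention time (min) in a linear gradient under LSS assumptions.

    gradient_slope is the change in phi per minute (1% ACN/min = 0.01).
    log_k0 is log10 of the retention factor at the starting phi.
    Assumes the peptide elutes before the gradient ends.
    """
    b = s * gradient_slope * t0          # dimensionless gradient steepness
    k0 = 10.0 ** log_k0
    return t0 + dwell_time + (t0 / b) * math.log10(math.log(10.0) * k0 * b + 1.0)


# Hypothetical (log k0, S) pairs, for illustration only.
PEPTIDE_A = (3.0, 15.0)
PEPTIDE_B = (3.8, 22.0)
T0_MIN = 1.0  # assumed column dead time in minutes

if __name__ == "__main__":
    for slope in (0.01, 0.0025):  # 1% and 0.25% acetonitrile per minute
        t_a = gradient_retention_time(*PEPTIDE_A, slope, T0_MIN)
        t_b = gradient_retention_time(*PEPTIDE_B, slope, T0_MIN)
        first = "B" if t_b < t_a else "A"
        print(f"slope={slope:.4f}/min  tR_A={t_a:.2f}  tR_B={t_b:.2f}  elutes first: {first}")
```

With these assumed parameters the peptide with the larger S elutes first at the steeper gradient and second at the shallower one, so the same pair of peptides swaps order purely because the gradient slope changed.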
Classical LSS theory suggests a direct correlation between the slope S in the basic LSS equation and the molecular weight (MW) of peptides and proteins.17 This theory, however, does not work for the typical peptide mixtures that proteomics researchers deal with: the suggested formula S = a(MW)^b gives at best an R² of ~0.3. Real tryptic peptides show significant variability in structure, which strongly affects the accuracy of predictions made using this model.
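For reference, a power law of the form S = a(MW)^b is commonly fitted by linear least squares after taking logarithms of both sides. The sketch below shows one such fit together with the resulting R². It is an assumption that the fit is done in log-log space; the text above does not say how its R² was computed, and the function name and inputs are illustrative.

```python
import math


def fit_power_law(mw, s):
    """Least-squares fit of log10(S) = log10(a) + b * log10(MW).

    Returns (a, b, r2), where r2 is the coefficient of determination
    computed in log-log space. Requires at least two distinct MW values.
    """
    x = [math.log10(v) for v in mw]
    y = [math.log10(v) for v in s]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx                      # exponent of the power law
    log_a = mean_y - b * mean_x        # intercept gives log10(a)
    ss_res = sum((yi - (log_a + b * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    r2 = 1.0 - ss_res / ss_tot
    return 10.0 ** log_a, b, r2
```

Applied to measured (MW, S) pairs, a low r2 from this function would indicate, as described above, that molecular weight alone explains little of the variation in S.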
In chromatography, retention times represent the affinity of peptides for the stationary phase; the precise calculation of these affinities has proven to be a very complicated task. So far, attempts have been limited mostly to RP-HPLC, where retention correlates linearly with peptide hydrophobicity. It was postulated in the early 1980s that peptide hydrophobicity could be calculated as the sum of the hydrophobicities of the constituent amino acid residues.4 Several similar models were developed,4-6 some of which introduced correction factors for peptide length. These additive approaches remained state-of-the-art until around 2004, despite compelling evidence that peptide retention in RP-HPLC should also possess sequence-dependent features.7 The situation changed dramatically with the development of new ionization techniques for biological macromolecules, such as ESI and MALDI, accompanied by rapid improvements in mass measurement techniques. Abundant data sets of peptides with their measured retention times became available, rejuvenating interest in peptide retention modeling. Several research groups have used proteomics-derived data to develop peptide retention prediction models.8-13 While the typical additive models were able to reach correlations of experimental vs. predicted retention times of ~0.90, the best sequence-specific models have shown correlations of ~0.97-0.98.8,9
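The additive model described above can be sketched in a few lines. The coefficient table is supplied by the caller; the values shown are placeholders invented for this example and are not published retention coefficients. The optional length correction is likewise a hypothetical hook, since the form of the correction differs between models.

```python
def additive_hydrophobicity(sequence, coefficients, length_correction=None):
    """Sum per-residue retention coefficients over a peptide sequence.

    coefficients maps one-letter residue codes to retention coefficients.
    length_correction, if given, maps peptide length to a multiplier.
    """
    total = sum(coefficients[residue] for residue in sequence)
    if length_correction is not None:
        total *= length_correction(len(sequence))
    return total


# Placeholder coefficients for illustration only; NOT published values.
PLACEHOLDER_COEFFICIENTS = {"L": 8.0, "K": -2.0, "G": 0.0, "W": 10.0}

if __name__ == "__main__":
    for peptide in ("LGK", "KGL", "WLGK"):
        print(peptide, additive_hydrophobicity(peptide, PLACEHOLDER_COEFFICIENTS))
```

Because the sum does not depend on residue order, a peptide and any permutation of its sequence (here LGK and KGL) receive the same predicted hydrophobicity, which is the limitation that sequence-specific models address.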
Despite the progress in modeling peptide retention in RP-HPLC, some fundamental questions remain unanswered. Retention prediction algorithms have generally been optimized for a specific set of chromatographic conditions: the type of sorbent, the ion-pairing modifier, column size, flow rate, and gradient slope. To date, no quantitative models have been developed for predicting S for peptidic compounds. This may be due in part to peptidic compounds belonging to the category of “irregular compounds” from the point of view of LSS theory.22 Peptides exhibit significant, unpredictable variation in S and in the resulting separation selectivity in reversed-phase chromatography. Understanding the factors that control the retention of peptides in reversed-phase chromatography, such as S, will result in improved separation selectivity and improved methods for the analysis and isolation of peptides.
Accordingly, there is a need for improved methods and compositions for predicting S and separating peptides using RP-HPLC.