Recently developed technologies provide libraries of peptides attached to solid phase supports, expressed by bacteria, or in solution for biological testing (Lam et al., 1991, Nature 354:82; Parmley et al., 1988, Gene 73:305; Scott and Smith, 1990, Science 249:386; Cwirla et al., 1990, Proc. Natl. Acad. Sci. U.S.A. 87:6378; Devlin et al., 1990, Science 249:404; Houghten et al., 1991, Nature 354:84; Fodor et al., 1991, Science 251:767; and Furka et al., 1991, Int. J. Peptide Protein Res. 37:487). Evaluation of peptides selected from such libraries requires rapid and efficient methods of peptide sequencing at the picomole level. Presently, Edman degradation (Niall, 1973, Methods Enzymol. 27:942) is the only widely used practical method for the direct determination of the amino acid sequence of polypeptides. However, in the past few years mass spectrometry has proven to be a powerful and sensitive tool for peptide sequencing and is becoming an increasingly useful alternative or complementary approach (Carr et al., 1991, Anal. Chem. 63:2801; Papayannopoulos and Biemann, 1992, Peptide Res. 5:83). Generating mass spectra that contain the information necessary for sequencing is not difficult, and typical fragmentation pathways for both fast atom bombardment (FAB) and electrospray (ESI) collisionally induced dissociation have been characterized (Roepstorff and Fohlman, 1984, Biomed. Mass Spectrom. 11:601; Biemann, 1988, Biomed. Environ. Mass Spectrom. 16:99). Nevertheless, sequence determination of an unknown peptide from mass spectral data remains a difficult task because of the huge number of possible sequences consistent with the molecular weight (MW) of the peptide. The correct sequence must be chosen from among these by using spectral information about fragment ions and additional data (if any) about the peptide.
The recent advances in peptide synthesis discussed above allow generation of libraries of thousands to millions of peptide sequences. One advantage of such libraries is that non-natural amino acids can be incorporated in the peptide sequence. Such non-natural amino acids may not be amenable to Edman degradation. Thus, sequence determination of such peptides proceeds most readily by mass spectrometric methods. However, the present state of peptide sequencing by mass spectrometry remains imperfect.
There are two main approaches to sequence elucidation of peptides using mass spectrometry: (1) generation of all possible sequences consistent with the molecular weight of the peptide as the first step, with subsequent removal of those which are not consistent with experimental fragment ions (Matsuo et al., 1981, Biomed. Mass Spectrom. 8:139; Sakurai et al., 1984, Biomed. Mass Spectrom. 11:396; Hamm et al., 1986, Computer Appl. Biosci. 2:115); and (2) generation of all possible two- to three-membered subsequences and extension of these subsequences by one or more amino acids, from either the N- or the C-terminus, such that only those subsequences which account for the greatest number of observed fragment ions are saved at every step (Ishikawa and Niwa, 1986, Biomed. Environ. Mass Spectrom. 13:373; Siegel and Bauman, 1988, Biomed. Environ. Mass Spectrom. 15:333; Johnson and Biemann, 1988, Biomed. Environ. Mass Spectrom. 18:945; Bartels, 1990, Biomed. Environ. Mass Spectrom. 19:363; Scoble et al., 1987, Fresenius' Z. Anal. Chem. 327:239; Yates et al., 1991, Techniques in Protein Chem. 2:477; and Zidarov et al., 1990, Biomed. Environ. Mass Spectrom. 19:13). In the first approach, invalid sequences are removed in the final step of analysis, whereas the second approach uses spectral information to limit the number of possible subsequences at every step. In both cases, deduction of the amino acid sequence becomes easier if additional information about the peptide is available. Thus, as discussed by Matsuo et al., supra, information concerning the kind and number of amino acids dramatically decreases the number of possible compositions, and sometimes a unique composition can be found. A correct answer for the sequence was obtained in each case when amino acid composition was used as input data in an algorithm of Ishikawa and Niwa (1986, Biomed. Environ. Mass Spectrom. 15:333).
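By way of illustration only, the first step of approach (1) can be sketched in Python as an enumeration of every amino acid composition consistent with a measured molecular weight. The sketch is not taken from any of the cited algorithms. The function name `compositions`, the mass tolerance, and the example peptide are assumptions made for illustration, and the table holds standard monoisotopic residue masses with the isobaric residues Leu and Ile merged into one entry.

```python
# Monoisotopic residue masses in Da. Leu and Ile are isobaric and share "L".
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "N": 114.04293, "D": 115.02694, "Q": 128.05858, "K": 128.09496,
    "E": 129.04259, "M": 131.04049, "H": 137.05891, "F": 147.06841,
    "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056  # added once per linear peptide with free termini


def compositions(target_mw, tol=0.01):
    """Yield each composition (residue -> count) matching target_mw within tol.

    Assumes tol is far smaller than the lightest residue mass, so a match
    can never be extended by a further residue.
    """
    items = sorted(RESIDUE_MASS.items(), key=lambda kv: -kv[1])

    def search(index, remaining, counts):
        if abs(remaining) <= tol:
            yield dict(counts)  # copy, because counts is mutated below
            return
        if index == len(items) or remaining < -tol:
            return
        name, mass = items[index]
        max_count = int((remaining + tol) // mass)
        for n in range(max_count, -1, -1):
            if n:
                counts[name] = n
            yield from search(index + 1, remaining - n * mass, counts)
            counts.pop(name, None)

    yield from search(0, target_mw - WATER, {})


# Hypothetical example: the molecular weight of the dipeptide Gly-Ala.
candidates = list(compositions(146.06913))
# Additional knowledge (for instance, that Gly is present) narrows the list.
with_gly = [c for c in candidates if "G" in c]
```

Even for this small mass the enumeration returns more than one candidate, because Gly plus Ala is isobaric with a single Gln residue. Requiring the presence of Gly leaves a single composition, which mirrors the observation attributed to Matsuo et al. that composition data can reduce the candidates to a unique answer.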
Unfortunately, combining mass spectrometry with other techniques such as amino acid analysis, chemical derivatization, etc., is time-consuming. Moreover, these analytical methods become very difficult when analysis must be carried out on picomoles of peptide.
It is an object of the present invention to provide a method for reducing the number of candidate peptide amino acid compositions or sequences elucidated by mass spectrometry by eliminating those candidates that do not contain the observed number of exchangeable protons.
It is another object of the present invention to provide a method for using hydrogen-deuterium exchange to reduce the number of amino acid composition or sequence possibilities of a peptide of a particular mass.
It is a further object of the invention to provide a method for determining the composition or sequence of a peptide. Yet another object of the invention is to provide a method for sequencing a peptide that cannot be sequenced by Edman degradation.
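As a minimal sketch of the elimination step named in the foregoing objects, the following Python fragment counts the exchangeable hydrogens of each candidate composition and discards candidates whose count differs from the observed value. The per-residue counts are assumptions of this sketch, not values stated above: they assume a linear peptide with free termini, neutral side chains, one backbone amide hydrogen per residue other than proline, and no allowance for the charge-carrying proton or deuteron of the observed ion. The names `exchangeable_hydrogens` and `filter_by_exchange` and the candidate list are hypothetical.

```python
# Exchangeable side-chain hydrogens (N-H, O-H, S-H), neutral form.
# These counts are assumptions of this sketch.
SIDE_CHAIN_EXCHANGEABLE = {
    "S": 1, "T": 1, "Y": 1, "C": 1, "D": 1, "E": 1,
    "N": 2, "Q": 2, "K": 2, "R": 4, "H": 1, "W": 1,
}


def exchangeable_hydrogens(composition):
    """Count exchangeable hydrogens for a composition (residue -> count)."""
    # Free termini add two: the second N-H of the N-terminal amine and the
    # C-terminal carboxyl O-H. This holds regardless of residue order.
    total = 2
    for residue, count in composition.items():
        backbone = 0 if residue == "P" else 1  # proline has no amide N-H
        total += count * (backbone + SIDE_CHAIN_EXCHANGEABLE.get(residue, 0))
    return total


def filter_by_exchange(candidates, observed):
    """Keep only candidates whose exchangeable-hydrogen count equals observed."""
    return [c for c in candidates if exchangeable_hydrogens(c) == observed]


# Hypothetical candidates that share one molecular weight (Gly + Ala vs. Gln).
candidates = [{"G": 1, "A": 1}, {"Q": 1}]
survivors = filter_by_exchange(candidates, observed=4)
```

In this example the two candidates cannot be told apart by mass alone, but they differ in their exchangeable hydrogen counts under the stated assumptions, so an observed count of four retains only the Gly and Ala composition.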