Determining the amino acid sequence, i.e., primary structure, of a peptide is central to understanding the structure of the peptide, as well as to manipulating the peptide to achieve desired properties in a modified or altered form. In addition, the amino acid sequence of a peptide is useful in a variety of recombinant DNA procedures for identifying the gene coding sequence of the peptide, for producing the peptide recombinantly, and/or for producing site-specific modifications of the peptide.
Methods for use in N-terminal sequencing are well known (e.g., Edman). Despite the relative ease and reliability of N-terminal sequencing methods, it is often desirable to obtain C-terminal amino acid sequence information, which may be inaccessible or obtainable only with difficulty by these methods. Information about the carboxy terminal sequence may be useful for certain types of recombinant DNA procedures, particularly since the C-terminal end of the coding region of a protein corresponds to the end closest to a poly A tail, which is likely to be present in cDNA clones.
Three general approaches have been proposed for C-terminal peptide sequencing: enzymatic, physical, and chemical. These methods and their inherent limitations have been summarized in the above-cited parent application. Briefly, neither enzymatic nor physical determinations have proven satisfactory to date. In view of this, considerable effort has been invested in developing chemical methods for determining C-terminal amino acid residues, and for C-terminal sequencing. An inherent difficulty in C-terminal sequencing is the relatively poor reactivity of the carboxyl group, in contrast to the relative ease of addition at the N-terminal amino group.
Of the reaction methods which have been proposed for C-terminal sequencing, three have received special attention.
The first method involves generating a carboxyamido derivative at the C-terminal end of the peptide, followed by reaction with bis(I,I-trifluoroacetoxy)iodobenzene, to form a derivative which rearranges and hydrolyses to a shortened carboxyamidopeptide and the aldehyde derivative of the C-terminal amino acid (Parham). The method has been successfully carried out only to 3-6 cycles before the reaction halts. In a second, related approach, the carboxy terminus is reacted with pivaloylhydroxamate to effect a Lossen rearrangement. One limitation of the method is that the chemistry does not degrade aspartic and glutamic acid residues (Miller, 1977).
The most widely studied of the C-terminal chemistries is the thiohydantoin (TH) reaction. In one general method for carrying out the TH method, the carboxyl group is activated with an anhydride, such as acetic anhydride, in the presence of an isothiocyanate (ITC) salt or acid, to form a C-terminal peptidyl-TH via a C-terminal ITC intermediate (Stark, 1972). The peptidyl-TH can be cleaved to produce a shortened peptide and a C-terminal amino acid TH, which can be identified, e.g., by high pressure liquid chromatography (HPLC). The coupling conditions in this method typically require about 90 minutes at 60° C. to 70° C. (Meuth), and often lead to degradation of some of the amino acid side chains in the peptide. Further, the anhydride reagent is relatively unstable, and therefore presents storage problems.
A C-terminal TH sequencing method which can be carried out under milder conditions has been described by one of the inventors and co-workers (Hawke). Using trimethylsilyl isothiocyanate (TMS-ITC) as the reagent, TH formation was achieved by activation of the peptide with acetic anhydride for 15 min at 50° C., followed by reaction with TMS-ITC for an additional 30 min at 50° C. The method suffers from the disadvantage, noted above, of peptide exposure to a highly reactive anhydride activating agent. In addition, and like the related TH-generating methods described above, the TH-amino acid reaction products are racemized, and thus the method cannot be used to distinguish D- and L-form amino acids.
The C-terminal sequencing methods involving TH formation just described commonly lead to racemized products. A modification of the C-terminal reaction employing a phosphoryl isothiocyanatidate reagent has been proposed (Kenner). Although TH was produced, the reaction was too slow to be very useful. Miller et al. have proposed a related method, but using a mercaptobenzothiazole derivative. The rationale for using this compound is that cyclization could occur with concomitant opening of the thiazole ring.
In the co-pending parent application for "Method of C-Terminal Peptide Sequencing" there is disclosed an improved C-terminal sequencing method which (a) is relatively rapid, (b) can be carried out under mild reaction conditions, and (c) under acidic peptidyl TH cleavage conditions, maintains the stereochemistry of the C-terminal amino acid. In the disclosed method, the peptide is reacted with a mixed anhydride of isothiocyanic acid and a carboxylic, carbonic, or sulfonic acid, preferably a carboxylic acid, under basic conditions, to produce a C-terminal peptidyl TH. Subsequent hydrolysis of the reaction product releases the amino acid TH, which can then be identified as an amino acid TH adduct. The residual peptide can be recycled through the method steps, for successive C-terminal sequence determination.
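The cyclic character of such a method, in which each round releases one C-terminal residue and the shortened peptide is carried into the next round, may be illustrated by the following minimal sketch. It is a bookkeeping model only and does not represent any reaction chemistry; the function name c_terminal_cycles, the one-letter residue string, and the max_cycles parameter are hypothetical and are introduced here solely for illustration.

```python
from typing import List, Optional


def c_terminal_cycles(peptide: str, max_cycles: Optional[int] = None) -> List[str]:
    """Toy model of successive C-terminal degradation (hypothetical helper).

    Each cycle stands in for: form the C-terminal peptidyl TH, cleave it,
    identify the released amino acid TH, and recycle the residual peptide.
    Residues are returned in the order released, i.e. from the C-terminus inward.
    """
    residual = list(peptide)  # one-letter codes, N-terminus first
    cycles = len(residual) if max_cycles is None else min(max_cycles, len(residual))
    released = []
    for _ in range(cycles):
        # The last element is the current C-terminal residue; removing it
        # leaves the shortened peptide for the next cycle.
        released.append(residual.pop())
    return released


if __name__ == "__main__":
    # Illustrative peptide only (Gly-Ala-Val-Leu-Phe), three cycles.
    print(c_terminal_cycles("GAVLF", max_cycles=3))
```

Reading the released residues in reverse order gives the C-terminal portion of the sequence in the conventional N-to-C direction.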
One feature of the just-described method is that the C-terminal amino acid TH must be purified from solution-phase reactants and byproducts, such as the mixed anhydride reactant, the organic acid byproduct of thiohydantoin formation, and the cleavage reagent and its byproducts.