One of the primary goals of protein chemists is to relate the function of a protein to its structure. With this goal in mind, an early step in the structural characterization of proteins is the determination of primary structure, or sequence. Currently, primary structure can be determined either by sequencing the protein on an automated sequencer using Edman chemistry for successive degradation or by sequencing the gene for that protein using established DNA sequencing methodology. Although protein sequencing is generally considered more difficult and slower than DNA sequencing, it often provides information not obtainable by the latter method. Protein sequencing can provide information concerning post-translational modifications that are not predictable from the gene sequence, such as the location of proteolytic cleavage sites. Furthermore, it is a key method for obtaining protein sequence information that can be used to design oligonucleotide probes complementary to predicted gene sequences. In many cases, these oligonucleotide probes, designed from protein sequence analysis, have been the only route to the cloning of a particular gene.
Currently, protein sequence analysis is primarily accomplished with an automated sequencer using chemistry developed by Edman over forty years ago (1) (FIG. 1). Since that time, improvements in the instrumentation (2, 3) have made it possible to sequence smaller and smaller sample quantities (mmol to pmol), although the original chemistry has remained essentially unchanged. Current automated instrumentation permits 10-20 cycles of sequence determination on 10-50 pmol of sample.
Advances in protein isolation methodology have recently made it possible to isolate proteins of biological interest that are present in tissues in sub-picomole quantities. Techniques such as 1- and 2-dimensional electrophoresis (with electroblotting to membranes), microcolumn liquid chromatography, and capillary electrophoresis have allowed protein and peptide purification down to the 10-100 femtomole level. Many of these proteins have been shown to have key roles in the development and treatment of human disease. Improved methods of protein sequencing that require less sample would make it possible to obtain the sequence information needed to clone and express these proteins, and thereby to study their structure-function relationships. It is generally anticipated that this information could set the stage for advances in the treatment of human diseases through rationalized drug design and gene therapy.
A major limitation to increasing the sensitivity of protein sequencing down to the femtomolar level involves the intrinsic detectability of the released PTH amino acids. The PTH amino acids, which are detected by absorption at 269 nm, have relatively low extinction coefficients. Intrinsic background noise associated with absorbance measurements at this wavelength and chemical background from the reagents used in sequencing also contribute to the limit of detection. Although a recently published method involving the use of absorbance detection with capillary electrophoresis rather than HPLC for separation of the PTH amino acids has shown femtomolar detection (4), this technique requires subnanoliter injection volumes. Current automated sequencer technologies dissolve the PTH amino acids in a 50-200 µl volume for injection. Use of only a small fraction of this volume would negate any value in the increased sensitivity of detection using capillary electrophoresis.
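The volume mismatch described above can be made concrete with a short calculation. The following is only an illustrative sketch: the function name, the 1 pmol sample amount, and the 1 nl injection volume (taken as an upper bound for a "subnanoliter" injection) are assumptions chosen for the example and are not values reported in the cited work.

```python
def injected_amount_fmol(sample_pmol, dissolved_volume_ul, injected_volume_nl):
    """Return the amount (in fmol) that reaches the detector when only part
    of the dissolved PTH amino acid sample is injected.

    All three arguments are hypothetical example inputs.
    """
    # Fraction of the dissolved sample that is injected (nl converted to ul).
    fraction = (injected_volume_nl * 1e-3) / dissolved_volume_ul
    # Convert the sample amount from pmol to fmol and scale by the fraction.
    return sample_pmol * 1e3 * fraction


# Assumed example: 1 pmol of PTH amino acid, 1 nl injected,
# dissolved in the 50-200 ul volumes quoted in the text.
for volume_ul in (50, 200):
    amount = injected_amount_fmol(1.0, volume_ul, 1.0)
    print(f"{volume_ul} ul dissolved volume: {amount:.3f} fmol injected")
```

Under these assumed inputs, only a few hundredths of a femtomole or less would be injected, which illustrates why a more sensitive detector alone does not lower the amount of sample required.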
Numerous attempts have been made to increase the sensitivity of Edman degradation through the use of radiolabeled, chromophoric, or fluorescent isothiocyanate reagents. 4-(N,N'-dimethylamino)azobenzene-4'-isothiocyanate (DABITC), a highly chromophoric reagent first described by Chang et al. (5), has primarily been used as a manual sequencing reagent with a DABITC/PITC double coupling procedure (6), although it has been used in automated solid-phase sequencing (7). More recently, Aebersold et al. (8, 9) reported a DABITC solid-phase sequencing method in which proteins were immobilized on DITC-derivatized aminopropyl glass-fiber sheets. Sequence analysis was performed at the 20-50 picomole level, a substantial improvement over previous methods, but still less sensitive than current gas-phase sequence analysis. Fluorescent reagents, such as fluorescein isothiocyanate (10, 11) and dansyl-containing isothiocyanates (12-16), have also been evaluated as sensitivity-enhancing reagents. Although synthetic amino acid derivatives prepared using these reagents show subpicomole sensitivity by HPLC analysis, they have not surpassed the sensitivity of gas-phase Edman degradation during automated sequence analysis. In general, it has been found that the use of large bulky chromophores on the isothiocyanate reagent interferes with the efficiency of the derivatization and cleavage reactions of the Edman degradation. The inhibition of the coupling and cleavage reactions with these reagents is postulated to be caused by a combination of steric and electronic effects. The use of radiolabeled reagents has also proven unsuccessful, since radiolabeled reagents undergo autoradiodegradation, which results in decreasing product yields and increasing amounts of labeled by-products.
Modified phenyl isothiocyanates such as 4-(Boc-aminomethyl)-PITC, which are designed to react with post-column fluorescent reagents, have also been investigated (17) but have been found to undergo side reactions during the cleavage reaction resulting in loss of the amino group (14).
An alternative to the use of modified Edman reagents is the reaction of the anilinothiazolinone (ATZ)-amino acid intermediate with sensitivity-enhancing nucleophilic reagents. The use of radiolabeled amines produced amino acid derivatives which could be detected at the femtomole level (18, 19), but the handling of radioactive materials was inconvenient. Horn et al. (20) have extended earlier studies on the use of MeOH/HCl as a conversion reagent (21) to include chromophoric or fluorophoric alcohols, resulting in the formation of phenylthiocarbamyl amino acid esters. Tsugita et al. (22) have recently reported a modification of the Edman degradation scheme in which ATZ amino acids are reacted with 4-aminofluorescein, resulting in highly fluorescent phenylthiocarbamyl amino acid aminofluorescein amides (PTCAF-amino acids) (FIG. 2). PTCAF-amino acids were separated by reversed-phase HPLC and were detectable at the 0.1-1 femtomole level. Several known and unknown protein samples were reported to be sequenced at the 100 femtomole to 10 picomole level using an Applied Biosystems 477A sequencer. An experiment based on the chemistry shown in FIG. 2 resulted in the data reflected by FIG. 3.
Sequencing of β-lactoglobulin and a synthetic peptide was performed at the 5-10 picomole level. 5.5 picomoles of β-lactoglobulin was spotted on a 1 × 10 mm piece of Polybrene-coated PVDF and inserted into a reaction cartridge for sequence analysis. Approximate initial and repetitive yields of 50% and 96%, respectively, were observed. Poor yields were seen for threonine (cycles 4 and 6). Aspartic acid (cycle 11) was not observed (it is shown as 0.1 picomoles for graphing purposes only). The computer-generated line in FIG. 3 appears to have the correct slope although it is placed rather low on the graph. This is most likely due to the low yields on the threonine and aspartic acid cycles. One major background peak (~3 picomole equivalents) was seen with several minor background peaks of <1 picomole equivalent. By extrapolation, the most sensitive detector setting would be expected to permit sequence analysis at the 10-100 femtomole level.
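The initial and repetitive yields quoted above can be related to the amount expected at each cycle. The sketch below assumes the usual geometric model, in which the recovery at cycle n equals the loaded amount times the initial yield times the repetitive yield raised to the power (n - 1). The function name is hypothetical, and the model is an idealization, not the fitting procedure used to generate the line in FIG. 3.

```python
def expected_yield_pmol(loaded_pmol, initial_yield, repetitive_yield, cycle):
    """Idealized amount (pmol) recovered at a given sequencing cycle.

    Assumes a simple geometric model: each cycle after the first retains a
    constant fraction (the repetitive yield) of the previous cycle's amount.
    """
    if cycle < 1:
        raise ValueError("cycle numbering starts at 1")
    return loaded_pmol * initial_yield * repetitive_yield ** (cycle - 1)


# Values taken from the text: 5.5 pmol loaded, about 50% initial yield,
# about 96% repetitive yield.
for cycle in (1, 4, 6, 11):
    amount = expected_yield_pmol(5.5, 0.50, 0.96, cycle)
    print(f"cycle {cycle:2d}: {amount:.2f} pmol expected")
```

Under this model, roughly 2.75 pmol would be expected at cycle 1 and about 1.8 pmol at cycle 11, so the absence of aspartic acid at cycle 11 reflects a residue-specific loss and not the normal cycle-to-cycle decline.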
This work shows that the approach suffers from a number of problems which make it of little practical value toward the goal of more sensitive sequencing. The most serious difficulty with this method, in its present form, is the low yields obtained with the hydrophilic amino acids, in particular threonine, histidine, glutamate, lysine, and glutamine, and the total lack of yield obtained with aspartate. Recent studies of the aminolysis of the ATZ-amino acids by Pavlik et al. (23) showed that many of the ATZ-amino acids, in particular the hydrophilic amino acids, can rearrange so rapidly to the more thermodynamically stable PTH amino acids that, by the time the ATZ-amino acid is brought over to the conversion flask of an automated instrument, anywhere from 5 to 70% of the amino acid has already been converted. Once an ATZ-amino acid has converted to a PTH amino acid, it is no longer capable of reacting with aminofluorescein. This explanation is consistent with the observed data. By analogy with the data presented above, it is anticipated that any chemical scheme that relies on tagging the ATZ analogue with a fluorescent molecule, such as reaction of the ATZ analogues with alcohols (20), will not offer any practical gains in the sensitivity of N-terminal microsequencing.
The present status of microsequence analysis, protein and peptide purification, and other related techniques was the subject of a recent review (24) and a number of recent monographs (25-30). It was concluded that improvement in analytical techniques such as microsequence analysis was necessary in order to match the capability of the purification methods. Improvements in microsequence analysis could have far-reaching effects. For example, the ability to detect differences in complex biological samples such as cerebrospinal fluid by 2D-electrophoresis (31), and also to obtain meaningful sequence information, could be of great importance in understanding various pathological states.