This invention relates to methods for the determination of the amino acid sequence of polypeptides and proteins, and equipment for making such determinations.
The most widely used method of protein sequence analysis is the Edman degradation for the sequential removal of amino acid residues. In this scheme, amino acids are removed from the N-terminal of the peptide in a two-step chemical process. The operation for one cleavage is illustrated below. ##STR1##
In the first step an activating group, termed a coupling reagent and illustrated by phenylisothiocyanate in the above diagram, is attached to the free amino group of the N-terminal amino acid of the polypeptide whose sequence is to be determined. This step is called coupling and is carried out in a buffer containing a coupling base at high pH (pH 8-9). The Edman process typically uses phenylisothiocyanate (PITC) as a coupling reagent. Other reagents such as methylisothiocyanate or pentafluorophenylisothiocyanate have been used.sup.1,2. The function of the coupling reaction is to make the peptide bond between the first and second residues more easily acid hydrolyzed than any of the other peptide bonds in the protein. After removal of excess coupling reagent and buffer, the second step is the addition of a cleavage reagent, anhydrous acid, to hydrolyze this activated peptide bond. The cleaved amino acid derivative can then be extracted with a suitable organic solvent. The residual peptide with the newly formed N-terminal is left behind for subsequent cycles. The extracted derivative contains information on the identity of the initial N-terminal residue since the amino acid is incorporated in its structure. By differentiating the twenty or so derivatives on the basis of their side chain (R.sub.1), the derivative formed after each cleavage can be identified and the amino acid ascertained. If this process is repeated, each subsequent residue can ideally be determined. However, it is not practical to carry out repetitive chemical reactions indefinitely, since the coupling and cleavage reactions never attain 100% yield. Although the coupled N-terminal peptide bond is more susceptible to acid hydrolysis than any other bond, random cleavage can and does occur.
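The effect of incomplete coupling and cleavage on a long run can be illustrated with a short calculation. The sketch below assumes, as a simplification not stated in the text, that every cycle has the same combined yield (here called `repetitive_yield`) and that losses compound multiplicatively; the function name and the sample yields are illustrative only.

```python
def remaining_fraction(repetitive_yield: float, cycles: int) -> float:
    """Fraction of the starting peptide still degrading in step after `cycles` cycles.

    Assumes a constant per-cycle yield, which is a simplification:
    real yields vary with the residue and the conditions.
    """
    if not 0.0 < repetitive_yield <= 1.0:
        raise ValueError("repetitive_yield must be in (0, 1]")
    if cycles < 0:
        raise ValueError("cycles must be non-negative")
    # Losses compound: each cycle keeps only `repetitive_yield` of what was left.
    return repetitive_yield ** cycles


if __name__ == "__main__":
    # Hypothetical per-cycle yields, chosen only to show the trend.
    for y in (0.90, 0.95, 0.99):
        fractions = [round(remaining_fraction(y, n), 3) for n in (10, 30, 60)]
        print(f"yield {y:.2f}: fraction left after 10/30/60 cycles = {fractions}")
```

Under these assumptions even a small shortfall from 100% per cycle leaves little in-phase material after a few dozen cycles, which is why the number of residues that can be read in one run is limited.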
The Edman process has been used in manual methods and in automated methods for amino acid sequence determination.
The manual procedures are most frequently used for sequence determinations of small peptides or short sections of proteins, or when the cost of an automated sequencer cannot be justified. Many such methods have been reported.sup.3,5. Most approaches first apply the protein or peptide to a support such as a paper strip. After the sample is dried, phenylisothiocyanate in a solvent-buffer system (e.g. dioxane or pyridine, etc.) is introduced to the immobilized peptide. The coupling reaction may take several hours at 40.degree.-50.degree. C. for completion. It is important at this step that oxygen be excluded to prevent blocking of the N-terminal via side reactions. After coupling is complete, the excess reagents (PITC, etc.) and byproducts (e.g. diphenylthiourea) are removed without loss of PTC (phenylthiocarbamyl)-peptides. Several solvent systems have been suggested for this step (e.g. benzene alcohol-ether).sup.6. Extraction with benzene alone to remove these byproducts is slow but it does not remove coupled peptides. Either ethyl acetate or an alcohol-ether mixture is better for removing the byproducts but these will also extract small hydrophobic peptides.
After the first wash the PTC-peptide is cleaved into the thiazolinone amino acid and a free peptide. Since internal peptide bond cleavage can occur under aqueous conditions.sup.7, most procedures call for anhydrous acid, such as trifluoroacetic or heptafluorobutyric acid. During this step the reaction is carried out at a lower temperature than when coupling, and water is excluded from the sample chamber. The cleavage of coupled residues is more difficult with prolyl or glycyl residues, and these may require a higher temperature or a longer reaction period. Overly vigorous hydrolysis conditions at this point can lead to spurious cleavage of internal peptide bonds.
In the final step, the 2-anilino-5-thiazolinone amino acid (ATZ) is extracted with benzene and ethyl acetate. The phase transfer is quantitative for most amino acid derivatives except ATZ-Arg and ATZ-His. Ethyl acetate alone will give better extraction of ATZ-Arg and ATZ-His but may also extract small hydrophobic peptides. Acetone is a satisfactory compromise if all traces of water and the acid cleavage reagent are removed earlier by drying under vacuum. The extracted ATZ-amino acid is unstable and must be converted to the stable PTH (3-phenyl-2-thiohydantoin) form by aqueous hydrolysis. The method of Edman is generally used. The conversion reaction consists of hydrolysis of the thiazolinone to the PTC-amino acid intermediate followed by rearrangement to the PTH form. The benzene/ethyl acetate extract is evaporated to dryness under a stream of nitrogen and then dissolved in dilute HCl. The temperature is quickly brought to 80.degree. C. and maintained for 10 minutes, then lowered. The solution is dried and dissolved in a small volume of buffer, whereafter the PTH-amino acid derivative is analyzed.
In general, amino acid sequence determinations are made by automated methods in equipment dedicated to that purpose. The chemistry employed in such automated methods is basically the same as that used in the manual procedure. Present automatic sequencers are based on either the liquid phase (spinning cup) or solid phase designs. In the liquid phase instruments, the protein sample is spread out as a thin film on the inner wall of a rotating reaction cup. The protein is immobilized while liquid Edman reagents introduced into the reaction cup at the bottom move up over the protein film by centrifugal force. Liquids are removed from the top of the cup by means of a scoop protruding into a groove around the top of the cup.
A description of the spinning cup sequencer is given in the original paper by Edman and Begg.sup.7. In the operation of such sequencers a solution of the sample is introduced into the cup and dried under vacuum while the cup is turning, thus forming a thin film on the lower walls of the cup. Sample size is generally around 100 to 300 nanomoles of sample dissolved in about 500 microliters of the appropriate solvent. After the sample has been dried the automatic cycle is started.
The first step is the introduction of coupling reagent (5% PITC in heptane) and buffer into the spinning reaction cup. The buffer generally contains N,N-dimethyl-N-allylamine (DMAA) to maintain the alkaline pH needed for the coupling reaction. A suitable buffer containing DMAA and a detergent is sold under the trademark Quadrol. The coupling mixture spreads out over the protein film and dissolves it. The ensuing reaction proceeds for about 20 minutes at 55.degree. C. After partial removal of PITC and solvent by vacuum, the coupling reaction is stopped by the introduction of benzene. The benzene precipitates the protein and carries off the excess PITC reagent and some of the breakdown products of PITC. If Quadrol is used as the buffer, the cup is washed with ethyl acetate to remove excess buffer and more of the breakdown products. After vacuum drying the protein remains in the cup as a white film.
Anhydrous heptafluorobutyric acid (HFBA) is added to initiate cleavage. The volatile HFBA covers and dissolves the protein film and after only two to three minutes the N-terminal amino acid is cleaved as the anilinothiazolinone derivative. Finally, the remaining HFBA is removed by vacuum, then the released ATZ-amino acid is extracted with butyl chloride and delivered to a fraction collector. A new residue is released to the fraction collector with each cycle of the above procedure.
The collected fractions of ATZ-amino acids now represent the sequential order of amino acid residues comprising the peptide or protein sample. The fractions can be converted to the more stable PTH-amino acid products. The solution is heated for 10 minutes in 1.0M HCl at 80.degree. C. or 25% TFA (trifluoroacetic acid) in H.sub.2 O at 60.degree. C. After removal from heat all PTH-amino acid derivatives except PTH-Arg and PTH-His can be extracted with ethyl acetate. Liquid chromatography analysis at this stage is advantageous since there is no need to separate the two phases: all PTH-amino acids present can be determined in a single injection. For preconcentration purposes, the fraction is usually taken to dryness at low temperature prior to the chromatography.
The spinning cup sequencer suffers from several disadvantages: precisely calibrated reagent quantities must be delivered, else protein is easily washed from the cup; the protein must be continuously cycled through successive precipitations and resolubilizations, leading to protein loss and denaturation; and extenders such as Polybrene or blocked proteins are often required to aid in the precipitation of the test sample. The disadvantage of proteinaceous extenders is that they are frequently hydrolyzed during cycles of Edman degradation. These hydrolyzed extenders contain free amino termini that are sequenced along with the test sample, thereby introducing interfering residues into the determination.
Automatic solid-phase sequencers perform the Edman degradation on peptides in much the same way as liquid-phase systems, except that the peptide is immobilized by covalent attachment to a solid support material and does not undergo cycles of solubilization and precipitation. Reagents and solvents are unidirectionally pumped through a column of bound peptide as required. In this type of sequencer the sample peptide first must be covalently linked to the support material. Several methods have been reported for achieving this task. The most reliable coupling procedures utilize the .epsilon.-amino group of lysine.sup.10 or a C-terminal homoserine.sup.15. Coupling yields are usually up to about 80% but the peptide must contain lysine or a C-terminal carboxyl group.sup.16. The two types of solid supports for covalent coupling generally used are polystyrene.sup.17 and porous glass.sup.13. Both are highly substituted with functional groups and inert to the reagents and solvents used in sequencing. Small peptides containing lysine are usually attached to aminopolystyrene by the diisothiocyanate coupling procedure. Peptides without lysine are attached to triethylenetetramine resin by carboxyl activation. Large peptides and proteins are affixed to amino glass supports after activation with diisothiocyanate.sup.18.
After peptide attachment the resin is washed and packed into a small glass column. The reaction column is then placed into a heated holder in the sequencer. From this point, the solid-phase instrument follows much the same chemical procedure as the manual and spinning cup methods except that a wider range of reagents, buffers and solvents can be passed through the column without fear of washing out the covalently bound peptide. The routinely used solid-phase sequencing chemicals are: PITC (5% V/V in acetonitrile), pyridine: N-methylmorpholinium trifluoroacetate buffer (2:3 V/V), and trifluoroacetic acid. Ethylene dichloride and methanol are used as solvents. Fractions of the ATZ-amino acids are collected in a fraction collector and later converted to the PTH-amino acid derivative either manually or automatically as described before.
The solid phase sequencer using covalent immobilization of the test protein has never achieved widespread commercial acceptance. This is predominantly the result of the nature of the covalent immobilization, which requires specialized conditions for each polypeptide and results in protein losses.
The gas phase sequencer is related to the solid phase sequencer in that it uses preimmobilized polypeptide. However, rather than avoiding peptide loss by covalent immobilization, this system uses a gaseous form of the alkaline buffer coupling reagent to avoid elution of non-covalently adsorbed polypeptide. The gas phase sequencer has enjoyed considerable commercial success, supplanting both the spinning cup and solid phase sequencers.
An early version of a gas phase sequencer is described in U.S. Pat. No. 4,065,412. A commercial sequencer based upon the system described in that patent is sold by Applied Biosystems of Foster City, Calif. In that system, the protein or peptide is noncovalently deposited on a glass fiber disc which contains a protein extender (Polybrene). The protein and extender form an immobilized film in the glass fiber disc which is held in a small glass chamber. Gas and liquid Edman reagents enter through a small opening at the top of the chamber and exit through the bottom.
The coupling reagent is added in an organic solvent (heptane) that will not dislodge the peptide. The coupling reaction occurs after wetting of the entire surface of the glass disc with the coupling reagent solution and drying off the organic solvent. The reaction is started by introducing the gaseous coupling base, trimethylamine (TMA). The vapor stream of coupling base and water vapor increases the pH of the protein film. In contrast to the spinning cup sequencer, the sample chamber is small and simple. Since there is no liquid buffering solution, certain peptides may be sequenced without covalent attachment. However, this requires that the coupling reagent be added in an organic solvent and that the coupling base be introduced in a separate step. Furthermore, a disadvantage of using the gaseous coupling base is that the reaction is not as easily controlled as with a liquid buffer solution. In solution, the optimum pH is approximately 9.0. At a pH higher than 9.5, the coupling reagent begins to react more rapidly with water to form byproducts (anilide and diphenylthiourea). At higher pH levels, breakdown of the peptide or protein can become a problem as set forth in the aforementioned U.S. Pat. No. 4,065,412. To effectively control the pH on the reaction surface, the flow rate of the gaseous phase must be precisely controlled as well as the concentration of base (e.g., TMA) in the gaseous atmosphere. This requirement for precise control of flow rates and concentrations results in a very complex and expensive instrument that requires highly skilled operators. Total instrument temperature control is needed to ensure precisely calibrated reagent aliquots.
Another disadvantage of the gas phase system is the requirement that Polybrene (e.g., in amounts of 1.2 mg) be used to retain the protein on the small glass disc. However, Polybrene also retains byproducts more efficiently. It has been reported that covalently linked peptides when sequenced in a gas phase sequencer without using Polybrene produce much less of the byproduct peak.sup.19. Large amounts of byproduct peaks can obscure the identification of some amino acid derivatives.
Another disadvantage of performing the reaction on the surface of the gas phase sequencer with little if any aqueous solvent is that small amounts of salts, denaturants such as urea, or buffer ions deposited from the test sample can interfere with the reaction of the alpha-amino of the N-terminal residue, or interfere with the solvent extraction of amino acid derivatives for identification or the washing out of undesired byproducts.
Another disadvantage of the gas phase sequencer is that after the coupling reaction is complete, all remaining water vapor must be removed by drying with an inert gas. Then the byproducts are removed by flowing an organic solvent through the disc holding chamber. It is important that the flow of solvent be precisely controlled so as not to dissolve or dislodge any immobilized protein. Since the flow is in one direction only and there is no film reforming step in the process, any dissolved peptide is lost in the wash.
It has been conventional for quite a long time to prepare test samples for amino acid sequencing by separating polypeptides from one another or from contaminants through the use of dialysis membranes or high pressure liquid chromatography. However, such procedures have not been incorporated into amino acid sequencing devices, and in fact are considered undesirable because they result in the loss of sample.
Accordingly, it is an object of the invention to provide a sequencer which is less expensive than the sequencers of the prior art in not requiring finely calibrated fluid or gas delivery systems or complete instrument temperature control.
It is a further object of the invention to provide a sequencer which is simpler to use than the systems of the prior art, thereby enabling unskilled persons to operate the sequencer.
It is another object to provide a sequencer capable of a higher percentage repetitive yield than has heretofore been available, and thus which is capable of being employed for a larger number of cycles than prior art systems.
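The link between repetitive yield and the number of usable cycles can be made concrete with a rough estimate. The sketch below assumes a constant per-cycle yield and an arbitrary cutoff (`min_fraction`) below which the in-phase signal is treated as too weak to read; both the model and the sample numbers are illustrative assumptions, not figures from this description.

```python
import math


def max_useful_cycles(repetitive_yield: float, min_fraction: float) -> int:
    """Largest cycle count n for which repetitive_yield ** n >= min_fraction.

    `min_fraction` is a hypothetical detection cutoff expressed as a fraction
    of the starting material; a real limit depends on the detector and on
    accumulated background, which this model ignores.
    """
    if not 0.0 < repetitive_yield < 1.0:
        raise ValueError("repetitive_yield must be in (0, 1)")
    if not 0.0 < min_fraction <= 1.0:
        raise ValueError("min_fraction must be in (0, 1]")
    # Solve y ** n >= f for n: n <= ln(f) / ln(y), both logarithms being <= 0.
    n = math.floor(math.log(min_fraction) / math.log(repetitive_yield))
    # Correct for floating-point rounding at the boundary.
    while repetitive_yield ** (n + 1) >= min_fraction:
        n += 1
    while n > 0 and repetitive_yield ** n < min_fraction:
        n -= 1
    return n


if __name__ == "__main__":
    # Hypothetical yields and cutoff, chosen only to show the trend.
    for y in (0.90, 0.95, 0.99):
        print(f"yield {y:.2f}: about {max_useful_cycles(y, 0.5)} cycles above a 50% cutoff")
```

Because the cycle count scales with 1/ln(1/yield), modest gains in repetitive yield near 100% translate into disproportionately longer readable runs under this model.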
It is another object of the invention to provide a sequencer capable of use with contaminated peptides which have not been previously purified. This avoids protein losses in sample preparation and recovery procedures heretofore used with conventional sequencers. This means that over the combined process of sample preparation and sequencer operation far less polypeptide is required to obtain amino acid sequence.
Another object herein is to provide a system that requires a substantially lower cycle time by optimizing the reaction conditions for and selection of Edman reagents, and which provides for more precise control of the coupling reaction than is possible with commercially available gas phase sequencers.
It is an additional object to provide a system for performing sequencing chemistry in an inexpensive multiple column sequencer for simultaneously determining amino acid sequences on a plurality of test samples, i.e. it is the objective to be able to use inexpensive turret valves with low fluid delivery tolerances in sequencers.
It is still a further object to provide a system for amino acid sequencing wherein the polypeptide is not so denatured or modified as to be insoluble in aqueous reagents. Polypeptides that are allowed to freely associate with water can be more accurately sequenced because the amino terminus is not potentially folded into an insoluble matrix and thus is not inaccessible to the sequencing reagents.
Another object is to dispense with the expense and difficulty of working with protein extenders such as Polybrene.
Another object is to provide a device for amino acid sequencing in which the advantages of temperature control during amino acid sequencing reactions can be fully realized.
An additional object is to provide an amino acid sequencer having replacement sample chamber cassettes for convenience and ease of use.
Further objects and features of the invention will be apparent from the following description taken in conjunction with the accompanying drawings.