Considerable experimental work and time are required to precisely characterize the structure of a polypeptide of interest. In general, the techniques that are easiest to use and that give the quickest answers yield only an approximate idea of the nature of the critical structural features. Techniques in this category include the study of proteolytically generated fragments of the protein which retain binding function; recombinant DNA techniques, in which proteins are constructed with altered amino acid sequence (for example, by site-directed mutagenesis); epitope scanning peptide studies (construction of a large number of small peptides representing subregions of the intact protein followed by study of the ability of the peptides to inhibit binding of the ligand to receptor); covalent crosslinking of the protein to its binding partner in the area of the binding site, followed by fragmentation of the protein and identification of cross-linked fragments; and affinity labeling of regions of the receptor which are located near the ligand binding site of the receptor, followed by characterization of such “nearest neighbor” peptides.
Other techniques that are capable of finely characterizing polypeptide three-dimensional structure are considerably more difficult in practice. The most definitive techniques for the characterization of polypeptide structure, and receptor binding sites in particular, have been NMR spectroscopy and X-ray crystallography. While these techniques can ideally provide a precise characterization of relevant structural features, they have major limitations, including inordinate amounts of time required for study, inability to study large proteins, and, for X-ray analysis, the need for protein and/or protein-binding partner crystals.
A critical shortcoming of present high-throughput crystallographic structure determination efforts is the failure to produce crystals for around 80% of the proteins of interest. It is clear that advances in automation and crystallography data analysis have not been matched by a similar pace of progress in methods for generating protein crystals for analysis (Chayen and Saridakis, Acta Crystal. D. Biol. Crystal. 58:921-927, 2002). The process of generating protein crystals suitable for structural analysis is commonly recognized as the most difficult and time-consuming step in a crystallographic structure determination (see, e.g., Wiencek, Ann. Rev. Biomed. Eng. 1:505-534, 1999). Floppy, unstructured regions of proteins can play a dominant role in this problem; the energetics and kinetics of crystallization of proteins containing such regions are often less favorable than those of fully structured proteins, and additionally, these regions are often more susceptible to degradation during purification than are structured regions, thus promoting sample heterogeneity.
Measurement of the exchange rates of peptide amide hydrogens within a protein can report its stability at the individual amino acid scale. Essentially, hydrogen exchange can be used to determine a stability map of a protein, reflecting the degree of ordered conformation of all regions of the protein being analyzed. Ranking and comparison of the exchange rates of a protein's amide hydrogens therefore allows direct identification and localization of structured versus unstructured regions of the protein.
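As a rough illustration of how ranked exchange rates could be converted into a map of structured versus unstructured regions, the following sketch groups runs of consecutive fast-exchanging residues into segments. The function name, the example rates, the cutoff and the minimum run length are all hypothetical choices made for illustration and are not taken from any cited method.

```python
def unstructured_segments(rates, cutoff, min_len=3):
    """Return (start, end) index pairs (0-based, inclusive) for runs of
    at least `min_len` consecutive residues whose amide exchange rate is
    at or above `cutoff`. All parameters are illustrative assumptions."""
    segments = []
    start = None
    for i, rate in enumerate(rates):
        if rate >= cutoff:
            if start is None:
                start = i  # a fast-exchanging run begins here
        else:
            if start is not None and i - start >= min_len:
                segments.append((start, i - 1))
            start = None
    # Close a run that extends to the C-terminal end of the sequence.
    if start is not None and len(rates) - start >= min_len:
        segments.append((start, len(rates) - 1))
    return segments


# Hypothetical per-residue exchange rates in arbitrary units.
example_rates = [5.0, 4.0, 6.0, 0.01, 0.02, 0.001, 0.01,
                 3.0, 0.02, 7.0, 8.0, 9.0, 6.0]
print(unstructured_segments(example_rates, cutoff=1.0))
```

In this sketch the isolated fast-exchanging residue at index 7 is not reported, because a single exposed amide inside an otherwise protected stretch is treated as too short to count as an unstructured region; a real analysis would need an experimentally justified cutoff.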
Accordingly, there is considerable advantage in producing modified forms of proteins of interest that contain structured regions in their native conformation, but have unstructured regions modified or removed (in part, or in whole). Thus, there remains a need in the art for a robust technique to discern structured versus unstructured regions of proteins of interest at the pace required for high-throughput crystallographic structure determination.
Hydrogen (Proton) Exchange
When a protein in its native folded state is incubated in buffers containing an isotope of hydrogen (for example, tritium or deuterium labeled water), isotope in the buffer reversibly exchanges with normal hydrogen present in the protein at acidic positions (for example, —OH, —SH, and —NH groups) with rates of exchange which are dependent on each exchangeable hydrogen's chemical environment, temperature, and most importantly, its accessibility to the isotope of hydrogen present in the buffer (see, e.g., Englander et al., Meth. Enzymol. 49:24-39,1978; Englander et al., Meth. Enzymol. 26:406-413, 1972). Accessibility is determined in turn by both the surface (solvent-exposed) disposition of the hydrogen, and the degree to which it is hydrogen-bonded to other regions of the folded polypeptide. Simply stated, an acidic hydrogen present on amino acid residues which are on the outside (buffer-exposed) surface of the protein and which are hydrogen-bonded to solvent water will often exchange more rapidly with heavy hydrogen in the buffer than will a similar acidic hydrogen which is buried and hydrogen-bonded within the folded polypeptide.
Hydrogen exchange reactions can be greatly accelerated by both acid- and base-mediated catalysis, and the rate of exchange observed at any particular pH is the sum of the rates of the acid- and base-mediated mechanisms. For many acidic hydrogens, a pH of 2.2-2.7 results in an overall minimum rate of exchange (Englander et al., Anal. Biochem. 147:234-244, 1985; Englander et al., Biopolymers 7:379-393, 1969; Molday et al., Biochemistry 11:150, 1972; Kim et al., Biochemistry 21:1, 1982; Bai et al., Proteins: Struct. Funct. Genet. 17:75-86, 1993; and Connelly et al., Proteins: Struct. Funct. Genet. 17:87-92). While hydrogens in protein hydroxyl and amino groups exchange with tritium or deuterium in buffer at millisecond rates, the exchange rate of one particular acidic hydrogen, the peptide amide bond hydrogen, is considerably slower, having a half-life of exchange (when freely accessible, and freely hydrogen-bonded to solvent water) of approximately 0.5 seconds at 0° C., pH 7, which is greatly slowed to a half-life of exchange of 70 minutes at 0° C., pH 2.7. When a polypeptide is in a denatured, unstructured configuration (also termed a “random coil”), all of its amide hydrogens can freely exchange with solvent hydrogen. However, the precise rate of exchange varies up to 200-fold from amide to amide in such unstructured configurations, the rate of exchange at each particular amide being determined by localized primary amino acid sequence-dependent effects that can be calculated from a knowledge of the peptide's primary sequence (Bai et al., supra). When peptide amide hydrogens are buried within a folded polypeptide, or are hydrogen bonded to other parts of the polypeptide, exchange half-lives with solvent hydrogens are often considerably lengthened, at times being measured in hours to days.
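The statement that the observed rate is the sum of acid- and base-mediated terms, and therefore passes through a minimum at acidic pH, can be sketched with a simple two-term model. This is a minimal illustration only: the rate constants `k_acid` and `k_base` and the default `pKw` of 14.0 are hypothetical values chosen so that the minimum falls inside the stated pH 2.2-2.7 range, and they are not measured values from the cited references.

```python
import math


def exchange_rate(pH, k_acid, k_base, pKw=14.0):
    """Observed rate as the sum of an acid-catalyzed term (proportional
    to [H+]) and a base-catalyzed term (proportional to [OH-])."""
    h_conc = 10.0 ** (-pH)
    oh_conc = 10.0 ** (pH - pKw)
    return k_acid * h_conc + k_base * oh_conc


def ph_of_minimum(k_acid, k_base, pKw=14.0):
    """pH at which the two catalytic terms are equal, which is where
    the summed rate is lowest."""
    return 0.5 * (pKw + math.log10(k_acid / k_base))


# Hypothetical rate constants, for illustration only.
K_ACID = 1.0
K_BASE = 1.0e9

ph_min = ph_of_minimum(K_ACID, K_BASE)
print("pH of minimum exchange:", ph_min)
for ph in (2.0, ph_min, 3.0, 7.0):
    print(ph, exchange_rate(ph, K_ACID, K_BASE))
```

Because each term changes tenfold per pH unit in this model, moving even a few tenths of a pH unit away from the minimum raises the rate noticeably, and moving to neutral pH raises it by several orders of magnitude.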
Hydrogen exchange at peptide amides is a fully reversible reaction, and rates of on-exchange (solvent deuterium replacing protein-bound normal hydrogen) are identical to rates of off-exchange (hydrogen replacing protein-bound deuterium) if the state of a particular peptide amide within a protein, including its chemical environment and accessibility to solvent hydrogens, remains identical during hydrogen exchange conditions.
Hydrogen exchange is commonly measured by performing studies with proteins and aqueous buffers that are differentially tagged with pairs of the three isotopic forms of hydrogen (1H, normal hydrogen; 2H, deuterium; 3H, tritium). If the pair of normal hydrogen and tritium are employed, it is referred to as tritium exchange; if normal hydrogen and deuterium are employed, as deuterium exchange. Different physicochemical techniques are in general used to follow the distribution of the two isotopes in deuterium versus tritium exchange. The rates of exchange of other acidic protons (—OH, —NH, and —SH) are so rapid that they cannot be followed in these techniques and all subsequent discussion refers exclusively to peptide amide proton exchange.
Tritium Exchange Techniques
Tritium exchange techniques (where the amount of the isotope is determined by radioactivity measurements) have been extensively used for the measurement of peptide amide exchange rates within an individual protein. In these studies, purified proteins are on-exchanged by incubation in buffers containing tritiated water for varying periods of time, optionally transferred to buffers free of tritium, and the rate of off-exchange of tritium is determined. By analysis of the rates of tritium on- and off-exchange, estimates of the numbers of peptide amide protons in the protein whose exchange rates fall within particular exchange rate ranges can be made. These studies do not allow a determination of the identity (location within the protein's primary amino acid sequence) of the exchanging amide hydrogens measured.
Extensions of these techniques have been used to detect the presence within proteins of peptide amides which experience allosterically-induced changes in their local chemical environment and to study pathways of protein folding (Englander et al., Meth. Enzymol. 26:406-413, 1972; Englander et al., J. Biol. Chem. 248:4852-4861, 1973; Englander, Biochemistry 26:1846-1850, 1987; Louie et al., J. Mol. Biol. 201:765-772, 1988). For these studies, tritium on-exchanged proteins are often allowed to off-exchange after they have experienced either an allosteric change, or have undergone time-dependent folding upon themselves, and the number of peptide amide hydrogens which experience a change in their exchange rate subsequent to the allosteric/folding modifications is determined. Changes in exchange rate indicate that alterations of the chemical environment of particular peptide amides have occurred which are relevant to proton exchange (solvent accessibility, hydrogen bonding, etc.). Peptide amide hydrogens which undergo an induced slowing in their exchange rate are referred to as “slowed amides”, and if previously on-exchanged tritium is sufficiently slowed in its off-exchange from such amides, there results a “functional tritium labeling” of these amides. From these measurements, inferences are made as to the structural nature of the shape changes which occurred within the isolated protein. Again, determination of the identity of the particular peptide amides experiencing changes in their environment is not possible with these techniques.
Several investigators have described technical extensions (collectively referred to as “medium resolution tritium exchange”) which allow the locations of particular slowed, tritium labeled peptide amides within the primary sequence of small proteins to be localized to a particular proteolytic fragment, though not to a particular amino acid.
Rosa and Richards were the first to describe and utilize medium resolution tritium techniques in their studies of the folding of ribonuclease S protein fragments (Rosa et al., J. Mol. Biol. 133:399-416, 1979; Rosa et al., J. Mol. Biol. 145:835-851, 1981; and Rosa et al., J. Mol. Biol. 160:517-530, 1982). However, the techniques described by Rosa and Richards were of marginal utility, primarily due to their failure to optimize certain critical experimental steps. No studies employing related techniques were published until the work of Englander and co-workers in which extensive modifications and optimizations of the Rosa and Richards technique were first described.
Englander's investigations utilizing tritium exchange have focused exclusively on the study of allosteric changes which take place in tetrameric hemoglobin (α and β subunits, each 16 kD in size) upon deoxygenation (Englander et al., Biophys. J. 10:577, 1979; Rogero et al., Meth. Enzymol. 131:508-517, 1986; Ray et al., Biochemistry 25:3000-3007, 1986; and Louie et al., J. Mol. Biol. 201:755-764, 1988). In the Englander procedure, native hemoglobin in the oxygenated state is on-exchanged in tritiated water. The hemoglobin is then deoxygenated (inducing allosteric change), transferred to tritium-free buffers by gel permeation column chromatography, and then allowed to off-exchange for 10-50 times the on-exchange time. On-exchanged tritium present on peptide amides which experience no change in exchange rate subsequent to the induced allosteric change in hemoglobin structure off-exchanges at rates identical to its on-exchange rates, and therefore is almost totally removed from the protein after the long off-exchange period. However, peptide amides which experience slowing of their exchange rate subsequent to the induced allosteric changes preferentially retain the tritium label during the period of off-exchange.
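The selective retention of label on slowed amides can be illustrated with a simple first-order sketch. It assumes single-exponential on- and off-exchange at each amide; the function names and the example rate constants and times are hypothetical and are not taken from the Englander publications.

```python
import math


def retained_label(k_on, t_on, k_off, t_off):
    """Fraction of an amide site still labeled after on-exchange at rate
    k_on for time t_on, followed by off-exchange at rate k_off for time
    t_off (first-order kinetics assumed)."""
    labeled = 1.0 - math.exp(-k_on * t_on)
    return labeled * math.exp(-k_off * t_off)


def worst_case_unchanged(ratio):
    """Largest label fraction that an amide with an unchanged rate can
    retain when the off-exchange time is `ratio` times the on-exchange
    time, maximized over all possible exchange rates."""
    return (1.0 / (ratio + 1.0)) * (ratio / (ratio + 1.0)) ** ratio


# Hypothetical amide: rate 1.0 per unit time, on-exchange for 1 unit,
# off-exchange for 10 units (the low end of the 10-50 fold range).
unchanged = retained_label(k_on=1.0, t_on=1.0, k_off=1.0, t_off=10.0)
slowed = retained_label(k_on=1.0, t_on=1.0, k_off=0.01, t_off=10.0)
print("unchanged amide retains:", unchanged)
print("100-fold slowed amide retains:", slowed)
print("worst case, ratio 10:", worst_case_unchanged(10.0))
print("worst case, ratio 50:", worst_case_unchanged(50.0))
```

Under these assumptions an amide whose rate does not change can never retain more than a few percent of full labeling after a 10-fold longer off-exchange, whatever its rate, whereas a substantially slowed amide keeps most of the label it acquired.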
To localize (in terms of hemoglobin's primary sequence) the slowed amides bearing the residual tritium label, Englander then proteolytically fragments the off-exchanged hemoglobin with the protease pepsin, separates, isolates and identifies the various peptide fragments by reverse phase high pressure liquid chromatography (RP-HPLC), and determines which fragments bear the residual tritium label by scintillation counting. However, as the fragmentation of hemoglobin proceeds, each fragment's secondary and tertiary structure is lost and the unfolded peptide amide hydrogens become freely accessible to H2O in the buffer. At physiologic pH (>6), any amide-bound tritium label would leave the unfolded fragments within seconds. Englander therefore performs the fragmentation and HPLC peptide isolation procedures under conditions which minimize peptide amide proton exchange, including cold temperature (4° C.) and use of phosphate buffers at pH 2.7. This technique has been used successfully by Englander to coarsely identify and localize the peptide regions of hemoglobin α and β chains which participate in deoxygenation-induced allosteric changes. The ability of the Englander technique to localize tritium labeled amides, while an important advance, remains low; at best, Englander reports that his technique localizes amide tritium label to hemoglobin peptides 14 amino acids or greater in size, without the ability to further sublocalize the label. Moreover, in Englander's work, there is no appreciation that a suitably adapted exchange technique might be used to identify the peptide amides which reside in the contacting surface of a protein receptor and its binding partner. Instead, these Englander disclosures are concerned with the mapping of allosteric changes in hemoglobin.
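The reason for performing fragmentation and HPLC under slowed exchange conditions can be made concrete with a short calculation. The two half-lives below are the figures quoted earlier for a freely accessible amide at 0° C. (about 0.5 seconds at pH 7 and 70 minutes at pH 2.7); the 30-minute analysis time and the function name are assumptions made for illustration, and the calculation ignores the difference between 0° C. and 4° C.

```python
def surviving_fraction(elapsed_seconds, half_life_seconds):
    """Fraction of amide-bound label remaining on an unfolded fragment
    after `elapsed_seconds`, assuming simple first-order loss."""
    return 0.5 ** (elapsed_seconds / half_life_seconds)


HALF_LIFE_PH_2_7 = 70 * 60.0   # 70 minutes at 0 deg C, pH 2.7
HALF_LIFE_PH_7 = 0.5           # about 0.5 seconds at 0 deg C, pH 7

ANALYSIS_TIME = 30 * 60.0      # hypothetical 30-minute digestion plus HPLC

print("pH 2.7:", surviving_fraction(ANALYSIS_TIME, HALF_LIFE_PH_2_7))
print("pH 7:  ", surviving_fraction(ANALYSIS_TIME, HALF_LIFE_PH_7))
```

With these inputs roughly three quarters of the label survives the analysis at pH 2.7, while essentially none survives at neutral pH, which is consistent with the choice of cold, acidic conditions described above.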
Unfortunately, acid proteases are very nonspecific in their sites of cleavage, leading to considerable HPLC separation difficulties. Englander tried to work around these problems, for the localization of hemoglobin peptides experiencing allosteric changes, by taking advantage of the fact that some peptide bonds are somewhat more sensitive to pepsin than others. Even then, the fragments were “difficult to separate cleanly”. They were also, of course, longer (on average), and therefore the resolution was lower. Englander concludes, “At present the total analysis of the HX (hydrogen exchange) behavior of a given protein by these methods is an immense task. In a large sense, the best strategies for undertaking such a task remain to be formulated. Also, these efforts would benefit from further technical improvements, for example in HPLC separation capability and perhaps especially in the development of additional acid proteases with properties adapted to the needs of these experiments” (Englander et al., Anal. Biochem. 147:234-244, 1985).
In the years since this observation was made, no advances have been disclosed which address these critical limitations of the medium resolution hydrogen exchange technique. Most acid-reactive proteases are in general no more specific in their cleavage patterns than pepsin. Efforts to improve the technology by employing acid-reactive proteases other than pepsin have not significantly improved the technique.
Allewell and co-workers have disclosed studies utilizing the Englander techniques to localize induced allosteric changes in the enzyme Escherichia coli aspartate transcarbamylase (Burz et al., Biophys. J. 49:70-72, 1986; Mallikarachchi et al., Biochemistry 28:5386-5391, 1989). Burz et al. is a brief disclosure in which the isolated R2 subunit of this enzyme is on-exchanged in tritiated buffer of specific activity 100 mCi/ml, allosteric change is induced by the addition of ATP, and the conformationally altered subunit is then off-exchanged. The enzyme R2 subunit was then proteolytically cleaved with pepsin and analyzed for the amount of label present in certain fragments. Analysis employed techniques which rigidly adhered to the recommendations of Englander, utilizing a single RP-HPLC separation in a pH 2.8 buffer.
ATP binding to the enzyme was shown to alter the rate of exchange of hydrogens within several relatively large peptide fragments of the R2 subunit. In a subsequent more complete disclosure (Mallikarachchi, supra), the Allewell group discloses studies of the allosteric changes induced in the R2 subunit by both ATP and CTP. They disclose on-exchange of the R2 subunit in tritiated water-containing buffer of specific activity 22-45 mCi/ml, addition of ATP or CTP followed by off-exchange of the tritium in normal water-containing buffer. The analysis comprised digestion of the complex with pepsin, and separation of the peptide fragments by reverse phase HPLC in a pH 2.8 or pH 2.7 buffer, all of which rigidly adhere to the teachings of Englander. Peptides were identified by amino acid composition or by N-terminal analysis, and the radioactivity of each fragment was determined by scintillation counting. In both of these studies the localization of tritium label was limited to peptides which averaged 10-15 amino acids in size, without higher resolution being attempted.
Beasty et al. (Biochemistry 24:3547-3553, 1985) have disclosed studies employing tritium exchange techniques to study folding of the α subunit of E. coli tryptophan synthetase. The authors employed tritiated water of specific activity 20 mCi/ml, and fragmented the tritium labeled enzyme protein with trypsin at pH 5.5, conditions under which the protein and the large fragments generated retained sufficient folded structure to protect amide hydrogens from off-exchange during proteolysis and HPLC analysis. Under these conditions, the authors were able to produce only 3 protein fragments, the smallest being 70 amino acids in size. The authors made no further attempt to sublocalize the label by further digestion and/or HPLC analysis. Indeed, under the experimental conditions they employed (they performed all steps at 12° C. instead of 4° C., and performed proteolysis at pH 5.5 instead of pH in the range of 2-3), it would have been impossible to further sublocalize the labeled amides by tritium exchange, as label would have been immediately lost (off-exchanged) by the unfolding of subsequently generated proteolytic fragments at pH 5.5 if they were less than 10-30 amino acids in size. Additional references disclosing tritium exchange methods include Fromageot et al., U.S. Pat. No. 3,828,102, which discloses using hydrogen exchange to tritium label a protein and its binding partner, and Benson, U.S. Pat. Nos. 3,560,158 and 3,623,840, which disclose using hydrogen exchange to tritiate compounds for analytical purposes.
Deuterium Exchange Techniques
Fesik et al. (Biochem. Biophys. Res. Commun. 147:892-898,1987) disclose measuring by NMR the hydrogen (deuterium) exchange of a peptide before and after it is bound to a protein. From this data, the interactions of various hydrogens in the peptide with the binding site of the protein are analyzed.
Paterson et al. (Science 249:755-759, 1990) and Mayne et al. (Biochemistry 31:10678-10685, 1992) disclose NMR mapping of an antibody binding site on a protein (cytochrome-C) using deuterium exchange. This relatively small protein, with a solved NMR structure, is first complexed to anti-cytochrome-C monoclonal antibody, and the preformed complex is then incubated in deuterated water-containing buffers and NMR spectra are obtained at several time intervals. The NMR spectrum of the antigen-antibody complex is examined for the peptide amides which experience slowed hydrogen exchange with solvent deuterium as compared to their rate of exchange in uncomplexed native cytochrome-C. Benjamin et al. (Biochemistry 31:9539-9545, 1992) employ an identical NMR-deuterium technique to study the interaction of hen egg lysozyme (HEL) with HEL-specific monoclonal antibodies. While both this NMR-deuterium technique and medium resolution tritium exchange rely on the phenomenon of proton exchange at peptide amides, they utilize radically different methodologies to measure and localize the exchanging amide hydrogens. Furthermore, study of proteins by the NMR technique is not possible unless the protein is small (generally less than 30 kD), large amounts of the protein are available for the study, and computationally intensive resonance assignment work is completed.
Subsequently, others have disclosed techniques in which exchange-deuterated proteins are incubated with binding partner, off-exchanged, the complex fragmented with pepsin, and deuterium-bearing peptides identified by single stage fast atom bombardment (FAB) or electrospray mass spectrometry (MS) (Thevenon-Emeric et al., Anal. Chem. 64:2456-2458, 1992; Winger et al., J. Am. Chem. Soc. 114:5897-5989, 1992; Zhang et al., Prot. Sci. 2:522-531, 1993; Katta et al., J. Am. Chem. Soc. 115:6317-6321, 1993; Chi et al., Org. Mass Spectrometry 7:58-62, 1993; Engen and Smith, Anal. Chem. 73:256A-265A, 2001; Englander et al., Protein Sci. 6:1101-1109, 1997; Dharmasiri and Smith, Anal. Chem. 68:2340-2344, 1996; Smith et al., J. Mass Spectrometry 32:135-146, 1997; and Deng and Smith, Biochemistry 37:6256-6262, 1998). In these studies, only the enzyme pepsin is employed to effect enzymatic fragmentation under slowed exchange conditions, and no attempt is made to increase the number and quantity of useful fragments produced and studied beyond employing the methods disclosed by Englander and colleagues some decades prior. The resolution of the deuterium-exchange mass spectrometry work disclosed in these publications therefore remained at the 10-14 amino acid level, with the primary limitation of their art being the ability to generate only a small number of peptides with the endopeptidase pepsin, as they employed it. See FIG. 3 for an overview of this method of exchanged deuterium localization.
U.S. Pat. Nos. 5,658,739; 6,291,189; and 6,331,400 issued to Woods, Jr. (each of which is hereby incorporated by reference herein in its entirety), disclose improved methods of determining polypeptide structure and binding sites utilizing hydrogen-exchange-labeled peptide amides, importantly including a method of increasing the resolution of the technique to the 1-5 amino acid level. This increased ability to more precisely localize exchanged amide hydrogens was afforded by the novel use of acid-resistant carboxypeptidases to effect a subsequent progressive sub-fragmentation of the small number of relatively large-sized pepsin-generated peptides initially produced in the method (see FIG. 4 for an overview of the progressive proteolysis method). In these prior methods, finer localization of the labels is achieved by analysis of subfragments generated by controlled, stepwise, sub-degradation (“progressive degradation”) of each pepsin-generated, labeled peptide under slowed exchange conditions. According to these prior methods, the protein or a peptide fragment is said to be “progressively”, “stepwise” or “sequentially” degraded if a series of fragments is obtained which is similar to that which would be achieved with an ideal exopeptidase. Carboxypeptidase P, carboxypeptidase Y, and several other acid-reactive (i.e., enzymatically active under acid conditions) carboxypeptidases are specified for use in said progressive degradation of peptides under acidic conditions. To date, no aminopeptidases have been reported that are acid resistant; as a practical matter, the only exopeptidases known or likely to be useful for this method are therefore carboxypeptidases.
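The way a progressively degraded series of fragments sharpens localization can be sketched as a simple subtraction: when a carboxypeptidase removes the C-terminal residue of a labeled peptide, the drop in label between the longer and the shorter fragment is attributed to the amide of the removed residue. The sketch below is an idealized illustration, not the procedure of the cited patents; the peptide sequence, the label values and the function name are hypothetical, and label lost to exchange during the degradation itself is ignored.

```python
def label_per_residue(ladder):
    """Assign label to individual residues from a C-terminal ladder.

    `ladder` is a list of (sequence, label) pairs ordered from longest to
    shortest, each sequence being the previous one with its C-terminal
    residue removed. Returns (position, residue, label) tuples, with
    positions 1-based within the longest peptide."""
    assigned = []
    for (longer, label_long), (shorter, label_short) in zip(ladder, ladder[1:]):
        if longer[:-1] != shorter:
            raise ValueError("fragments must differ by one C-terminal residue")
        position = len(longer)
        residue = longer[-1]
        assigned.append((position, residue, round(label_long - label_short, 3)))
    return assigned


# Hypothetical pepsin-generated peptide and its sub-fragments, with the
# amount of label measured on each (arbitrary units).
example_ladder = [
    ("LVKAF", 3.1),
    ("LVKA", 2.2),
    ("LVK", 2.1),
    ("LV", 1.0),
]
for position, residue, label in label_per_residue(example_ladder):
    print(position, residue, label)
```

In this example most of the label resolved by the ladder sits on residues 3 and 5, while residue 4 carries very little, a distinction that could not be drawn from the label on the intact five-residue peptide alone.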
By performing such measurement of the exchange rates of peptide amide hydrogens within a protein, one can determine its stability at the individual amino acid level. Ranking and comparison of the exchange rates of a protein's amide hydrogens therefore allows direct identification and localization of structured versus unstructured regions of the protein. Despite the utility of such exchange data, the methods used to obtain it have remained labor intensive and time consuming, with substantial limitations in throughput, comprehensiveness and resolution.
High-resolution structures are required for a fundamental understanding of protein structure and function. It is widely anticipated that access to these important structures will be facilitated by novel high-throughput protein structure determination approaches and improvements to conventional crystallographic methods. Proteomic-scale crystallography is one avenue being vigorously pursued by several groups, involving large-scale global efforts (see, e.g., Stevens and Wilson, Science 293:519-520, 2001; and Stevens et al., Science 294:89-92, 2001).
Despite the availability of many enhancements that facilitate such efforts, high-throughput production of stable protein constructs that suitably crystallize continues to be a serious bottleneck. While definition of successful constructs for protein production has long been a problem for conventional crystallography, the inadequacies of current approaches are particularly acute and costly for structural genomics efforts that presently show only a 10-20% success rate in target crystallization. Bacterial genomes are currently the focus of many of the structural genomics efforts. However, a switch to higher eukaryotes, such as mouse and human, will entail even lower success rates, due in part to more complex and higher molecular weight proteins.
Thus, there remains a need in the art for improved simple, robust, quick and efficient methods whereby the structure of a protein of interest can be analyzed to efficiently define protein domain boundaries, the location of unstructured or floppy regions between or within domains, as well as disordered regions within single-domain proteins; and then modified in order to refine and optimize the processes of crystallization and crystallographic structure determination in a high-throughput manner.