The revolution in our ability to determine the three-dimensional structures of biological macromolecules began with X-ray diffraction analysis of crystals and was then extended to high-resolution nuclear magnetic resonance of proteins in non-crystalline environments. These methods have been enormously successful, and thousands of structures are now deposited in the Brookhaven Protein Databank and Nucleic Acid Databank. Such techniques are often used for rational drug design. They often take many years, however, and require a sufficient amount of pure product to allow proper analysis of the protein.
Despite the success of crystallographic and nuclear magnetic resonance (NMR) approaches in tertiary structure determination, there remain much larger numbers of proteins and nucleic acids whose structures are not known and for which success remains problematic, e.g., membrane proteins and proteins with insufficient solubility for crystal formation. In addition, the various genome projects promise to identify tens of thousands of new proteins in the next few years alone, which will undoubtedly create a backlog of undetermined structures and will require new high-throughput strategies if scientists are to take advantage of this vast new sequence information.
One approach that has been examined as an alternative to NMR or crystallography is chemical crosslinking. Crosslinking and monovalent labeling experiments have been carried out for many years and can provide low-resolution structural information. Cohen et al. “On the Use of Chemically Derived Distance Constraints in the Prediction of Protein Structure with Myoglobin as an Example.” J. Mol. Biol. 1980 137:9–22; Mitra et al. “Reagents for the cross-linking of proteins by equilibrium transfer alkylation.” J. Am. Chem. Soc. 1979 101, 3097. For example, amino acid surface accessibility in proteins has been probed using selective chemical modifications followed by proteolytic digestion and mass spectrometry profiling of the resulting modified (and unmodified) peptides. Suckau et al. “Protein surface topology-probing by selective chemical modification and mass spectrometric peptide mapping.” Proc Natl Acad Sci USA. 1992 Jun. 15;89(12):5630–4; Glocker et al. “Molecular characterization of surface topology in protein tertiary structures by amino-acylation and mass spectrometric peptide mapping.” Bioconjug. Chem. 1994 Nov.–Dec.;5(6):583–90; Seielstad et al. “Analysis of the structural core of the human estrogen receptor ligand binding domain by selective proteolysis/mass spectrometric analysis.” Biochemistry. 1995 Oct. 3;34(39):12605–15; Seielstad et al. “Molecular characterization by mass spectrometry of the human estrogen receptor ligand-binding domain expressed in Escherichia coli.” Mol. Endocrinol. 1995 Jun.;9(6):647–58; Zappacosta et al. “Surface Topology of Minibody by Selective Chemical Modifications and Mass Spectrometry.” Protein Sci. 1997 Sep.;6(9):1901–9; Scaloni et al. “Structural investigations on human erythrocyte acylpeptide hydrolase by mass spectrometric procedures.” J Protein Chem. 1999 Apr.;18(3):349–60.
Amide hydrogen exchange experiments with subsequent proteolysis and mass spectrometry have also been used to map solvent-accessible regions in protein structures. Smith et al. “Probing the non-covalent structure of proteins by amide hydrogen exchange and mass spectrometry.” J. Mass Spectrom. 1997 32(2):135–146. Susceptibility to proteolysis has been employed by several groups as a measure of site accessibility, which indirectly identifies amino acid regions as exposed or buried. Papac et al. “Epitope mapping of the gastrin-releasing peptide/anti-bombesin monoclonal antibody complex by proteolysis followed by matrix-assisted laser desorption ionization mass spectrometry.” Protein Sci. 1994 Sep.;3(9):1485–92; Cohen et al. “Probing the solution structure of the DNA-binding protein Max by a combination of proteolysis and mass spectrometry.” Protein Sci. 1995 Jun.;4(6):1088–99; Gomes et al. “Proteolytic mapping of human replication protein A: evidence for multiple structural domains and a conformational change upon interaction with single-stranded DNA.” Biochemistry. 1996 Apr. 30;35(17):5586–95; Zappacosta et al. “Probing the tertiary structure of proteins by limited proteolysis and mass spectrometry: the case of Minibody.” Protein Sci. 1996 May;5(5):802–13; Gervasoni et al. “Identification of the binding surface on beta-lactamase for GroEL by limited proteolysis and MALDI-mass spectrometry.” Biochemistry. 1998 Aug. 18;37(33):11660–9. Both proteolytic and acylation approaches have been applied to characterize the topology of integral membrane proteins, such as the acetylcholine receptor, for which one would expect to observe distinct patterns for cytoplasmic, extracellular and membrane-spanning elements. Moore et al. “Proteolytic fragments of the nicotinic acetylcholine receptor identified by mass spectrometry: implications for receptor topography.” Biochemistry. 1989 Nov. 14;28(23):9184–91; Massotte et al. “Structure of the membrane-bound form of the pore-forming domain of colicin A: a partial proteolysis and mass spectrometry study.” Biochemistry. 1993 Dec. 21;32(50):13787–94. However, one of the major limitations of these labeling strategies has been the lack of methods for rapid and unambiguous identification of the protein modifications. Further, these types of labels are of little use in determining overall structure.
There are also several purely computational methods for predicting a protein's fold that have been examined as potential alternatives to chemically deducing the tertiary structure of a protein. However, none of these computational methods is reliable. Twenty years ago, one study showed that low-resolution distance information, combined with distance geometry, could determine a protein structure. Havel et al. “Effects of Distance Constraints on Macromolecular Conformation. II. Simulation of Experimental Results and Theoretical Predictions.” Biopolymers. 1979 18:73–81. Havel et al. reconstructed the alpha carbon backbones of bovine pancreatic trypsin inhibitor (PTI) and carp calcium-binding protein-B (carp myogen) to within 1 Å RMS of the experimentally determined structures by specifying whether each alpha carbon was closer to or farther than 10 Å from every other alpha carbon in the structure and using distance geometry to solve for structures that satisfied the constraints. Despite the obvious implications of this theoretical demonstration, there has been little progress in experimental approaches that might provide the required distance constraints, short of NMR and/or X-ray crystallography itself.
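The binary-constraint idea described above can be illustrated with a minimal sketch. It is not the distance geometry algorithm of Havel et al.; it substitutes plain gradient descent on a squared-violation penalty. The idealized helical alpha-carbon trace, the function names, the step size, and the random starting coordinates are all assumptions made for illustration only.

```python
import math
import random

CUTOFF = 10.0  # Å, the closer/farther threshold quoted in the text


def helix_trace(n_residues=30):
    """Hypothetical reference structure: an idealized alpha-helical C-alpha trace."""
    coords = []
    for i in range(n_residues):
        angle = math.radians(100.0) * i  # about 100 degrees of rotation per residue
        # about 2.3 Å helix radius and 1.5 Å rise per residue
        coords.append((2.3 * math.cos(angle), 2.3 * math.sin(angle), 1.5 * i))
    return coords


def binary_constraints(coords, cutoff=CUTOFF):
    """Reduce a structure to one bit per residue pair: True if closer than cutoff."""
    n = len(coords)
    return {(i, j): math.dist(coords[i], coords[j]) < cutoff
            for i in range(n) for j in range(i + 1, n)}


def penalty_and_gradient(coords, close, cutoff=CUTOFF):
    """Sum of squared constraint violations and its gradient per coordinate."""
    penalty = 0.0
    grad = [[0.0, 0.0, 0.0] for _ in coords]
    for (i, j), is_close in close.items():
        d = math.dist(coords[i], coords[j])
        # Positive: a "close" pair is too far apart. Negative: a "far" pair is too near.
        signed = max(d - cutoff, 0.0) if is_close else -max(cutoff - d, 0.0)
        penalty += signed ** 2
        if signed == 0.0 or d == 0.0:
            continue  # constraint satisfied, or direction undefined
        for k in range(3):
            g = 2.0 * signed * (coords[i][k] - coords[j][k]) / d
            grad[i][k] += g
            grad[j][k] -= g
    return penalty, grad


def refine(start, close, steps=500, lr=0.005):
    """Gradient descent on the penalty; returns the best-scoring model seen."""
    coords = [list(p) for p in start]
    best, best_pen = None, math.inf
    for _ in range(steps):
        pen, grad = penalty_and_gradient(coords, close)
        if pen < best_pen:
            best_pen, best = pen, [p[:] for p in coords]
        for p, g in zip(coords, grad):
            for k in range(3):
                p[k] -= lr * g[k]
    return best, best_pen


true_coords = helix_trace()
close = binary_constraints(true_coords)

random.seed(0)
start = [[random.gauss(0.0, 15.0) for _ in range(3)] for _ in true_coords]
start_pen, _ = penalty_and_gradient(start, close)
model, model_pen = refine(start, close)
print(f"penalty before refinement: {start_pen:.1f}, after: {model_pen:.1f}")
```

Because the constraints depend only on inter-atomic distances, a model that satisfies them is defined only up to rotation, translation, and reflection, so comparing it with a reference structure would additionally require a superposition step that this sketch omits.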
There is thus a need in the art for a fast, high-throughput method for determining the tertiary structure of a protein. There is also a need for methods that can provide at least a moderate-resolution determination of protein structure from small amounts of protein without the need for extensive purification processes. In addition, there is a need for improved methods to orient multimeric proteins or domains.