The present invention relates to the field of protein structure determination via mass spectrometry. More particularly, the present invention provides the ability to analyze the structure of intact proteins, using distance constraints obtained from the analysis of MS/MS spectra of proteins after cross-linking, via a top-down approach.
Proteins are a class of compounds composed of α-amino acid residues, covalently bonded through amide linkages after elimination of water between the carboxyl group of one amino acid and the amino group of another amino acid. A protein can be considered a polymer consisting of a large number of α-amino acid residues.
Proteins are complex polymers, containing carbon, hydrogen, nitrogen, oxygen, and sulfur, and composed of linear chains of amino acids connected by peptide bonds.
Understanding the structure of proteins is important for a complete understanding of the physiological reactions involving proteins. The structure of a protein is typically described by its primary, secondary, tertiary, and quaternary structures. The amino acid sequence of the protein defines the primary structure. Proteins seldom form random coils, and the high specificity of their function depends on a defined local conformation of the polypeptide chain, known as the secondary structure. The most common types of secondary structures are α-helices and β-sheets. The elements of secondary structure may be connected via loops and turns of various types into a larger tertiary structure. The present invention is concerned with elucidating the secondary and tertiary structure of a given protein. Proteins may also consist of several folded polypeptide chains (known as sub-units) which associate with each other not through covalent peptide bonds, but through non-covalent interactions. This arrangement of sub-units is the quaternary structure, which the present invention can also be used to probe.
Determination of the three-dimensional structures of proteins has traditionally been accomplished through the use of X-ray crystallography and nuclear magnetic resonance (NMR) spectroscopy, and both techniques produce high-resolution data. However, preparing large amounts of pure analyte in a suitable solution, or growing a suitable crystal for analysis, is difficult and time-consuming. Even once these conditions are met, a substantial amount of data acquisition and analysis is required, and it can take weeks to months to complete a picture of the molecular structure of a protein.
The number of novel proteins discovered in recent years has dramatically increased, and the time-consuming traditional techniques of structural determination discussed above are not keeping pace. An alternative approach to structure determination that could match the rate of the identification of new proteins is provided by the present invention. The present invention uses cross-linking reagents, which can provide sufficient low-resolution interatomic distance constraints to solve the tertiary structure of a protein when combined with state-of-the-art computational methods.
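As an illustration of how a cross-link can act as a low-resolution distance constraint, the following minimal sketch checks whether pairs of cross-linked residues in a candidate model lie within a maximum span. The function name, the residue coordinates, and the 24.0 Å limit are hypothetical values chosen for illustration only and are not taken from this disclosure.

```python
import math

def check_cross_link_constraints(coords, cross_links, max_distance):
    """Score cross-link distance constraints against a candidate model.

    coords       : dict mapping residue number -> (x, y, z) in angstroms
    cross_links  : list of (residue_i, residue_j) pairs observed as cross-linked
    max_distance : assumed upper bound (angstroms) spanned by the cross-linker
    Returns a list of (residue_i, residue_j, distance, satisfied) tuples.
    """
    results = []
    for i, j in cross_links:
        d = math.dist(coords[i], coords[j])  # Euclidean distance (Python 3.8+)
        results.append((i, j, d, d <= max_distance))
    return results

# Hypothetical coordinates for three residues of a candidate model.
model = {1: (0.0, 0.0, 0.0), 5: (3.0, 4.0, 0.0), 9: (0.0, 0.0, 30.0)}
for i, j, d, ok in check_cross_link_constraints(model, [(1, 5), (1, 9)], 24.0):
    print(f"residues {i}-{j}: {d:.1f} A, satisfied={ok}")
```

A candidate fold that violates many observed cross-links would be ranked below one that satisfies them; how the constraints are weighted in practice depends on the computational method used.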
This invention relates to a top-down approach, as opposed to a bottom-up approach, to this new method of protein structure determination. The entire crude cross-linked protein mixture is injected into, for example, an electrospray ionization Fourier transform mass spectrometer (ESI-FTMS), and the cross-link positions are localized by multiple stages of fragmentation and mass spectrometry.
A bottom-up approach has typically been used in applications such as protein identification via peptide mass mapping and protein structure elucidation using hydrogen/deuterium exchange, chemical labeling, and cross-linking. In the bottom-up approach, after purification of a protein, the protein is digested by a proteolytic enzyme such as trypsin, and the masses of the resulting peptides are then measured using mass spectrometry. Identifiable peptides from a single proteolysis typically represent only 50-90% of the protein sequence, complicating the identification of mass modifications in the remainder of the protein sequence. In addition, false mass values resulting from self-proteolysis and protein impurities commonly appear in spectra.
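The sequence-coverage limitation of the bottom-up approach can be sketched in a few lines. The example below performs an in-silico tryptic digest using the commonly stated rule that trypsin cleaves C-terminal to lysine (K) or arginine (R), except when the next residue is proline (P), and then computes the fraction of the sequence covered by the peptides that were identified. The function names and the short sequence are hypothetical and serve only as an illustration.

```python
import re

def tryptic_digest(sequence):
    """Split a one-letter amino acid sequence after K or R, unless followed by P."""
    # Zero-width split points: preceded by K/R and not followed by P (Python 3.7+).
    return [p for p in re.split(r"(?<=[KR])(?!P)", sequence) if p]

def sequence_coverage(sequence, identified_peptides):
    """Return the fraction of residues covered by the identified peptides."""
    covered = [False] * len(sequence)
    for peptide in identified_peptides:
        start = sequence.find(peptide)
        while start != -1:  # mark every occurrence of the peptide
            for k in range(start, start + len(peptide)):
                covered[k] = True
            start = sequence.find(peptide, start + 1)
    return sum(covered) / len(sequence)

sequence = "MKRPAAKGGR"                # hypothetical sequence
peptides = tryptic_digest(sequence)    # all theoretical tryptic peptides
observed = peptides[1:]                # suppose the first peptide went undetected
print(peptides, sequence_coverage(sequence, observed))
```

Any modification, including a cross-link, that falls in the uncovered portion of the sequence would go unobserved in such an experiment.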
In a recent paper, Kelleher et al. (J. Am. Chem. Soc. 1999, 121, 806-812) describe the advantages of the top-down versus bottom-up approach to protein characterization by tandem high-resolution mass spectrometry. In the top-down approach, Kelleher et al. chose conditions that gave limited dissociation of the ionized protein, yielding a small number of large fragments. Among these fragments, one or more complementary sets, the masses of which sum to the expected mass of the protein, can easily be identified.
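The search for complementary fragments can be illustrated with a minimal sketch. It assumes a list of neutral, deconvoluted fragment masses and looks for pairs whose sum matches the expected protein mass within a tolerance; small mass offsets that depend on the fragment ion type are ignored here. The function name, the 10 ppm default tolerance, and the mass values are hypothetical and are not taken from Kelleher et al. or from this disclosure.

```python
from itertools import combinations

def complementary_pairs(fragment_masses, protein_mass, tol_ppm=10.0):
    """Find pairs of fragment masses that sum to the intact protein mass.

    fragment_masses : neutral (deconvoluted) fragment masses in Da
    protein_mass    : expected neutral mass of the intact protein in Da
    tol_ppm         : assumed tolerance, in parts per million of protein_mass
    """
    tolerance = protein_mass * tol_ppm * 1e-6
    pairs = []
    for a, b in combinations(sorted(fragment_masses), 2):
        if abs((a + b) - protein_mass) <= tolerance:
            pairs.append((a, b))
    return pairs

# Hypothetical fragment masses (Da) for a protein of expected mass 8564.60 Da.
fragments = [2100.10, 6464.50, 3000.00, 5564.60, 1234.56]
print(complementary_pairs(fragments, 8564.60))
```

For a cross-linked protein, the expected mass passed to such a search would presumably include the mass added by the cross-linker, so that a complementary pair localizes the modification to one of the two fragments.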
In the present invention, chemical cross-linking is performed before sample cleanup. Purification of the cross-linked species occurs in the gas phase within the mass spectrometer. The proteins are also ‘digested’ within the mass spectrometer, using techniques such as, but not limited to, collision-induced dissociation (CID), infrared multiphoton dissociation (IRMPD), and electron capture dissociation (ECD). The fragmentation conditions can be varied to give either minor fragmentation, yielding large complementary fragments, or extensive fragmentation, which is useful for localization of the cross-links.
More recently, the utilization of chemical cross-linking in conjunction with mass spectrometry to elucidate three-dimensional protein structures has been disclosed.
Patterson et al., U.S. Pat. No. 5,821,063, disclose methods for sequencing polymers utilizing mass spectrometry. In particular, the methods of Patterson et al. involve varying ratios of hydrolyzing agent to polymer and integrating mass spectral data obtained from the analysis of a series of hydrolyzed polymer fragments. The methods of Patterson et al. provide an optional use of statistical interpretation paradigms and computer software. Patterson et al. also require the hydrolysis of polymers before they are introduced into the mass spectrometer. The present invention, however, utilizing the top-down approach, does not require this step because intact proteins are injected into the mass spectrometer. Moreover, the present invention is capable of determining the three-dimensional structure of biological macromolecules. Therefore, the methods of the present invention, unlike Patterson et al., do not require preliminary hydrolysis and yield three-dimensional structural information.
Woods, Jr., U.S. Pat. Nos. 6,291,189 B1 and 6,331,400 B1, discloses methods of labeling polypeptides and proteins with heavy hydrogen to aid in the analysis of protein structure and the fine structure of protein binding sites. However, the methods disclosed require degradation of the polypeptide, or protein, into peptide fragments which are then analyzed by mass spectrometry in a bottom-up approach. Again, the methods of the present invention utilize the top-down approach, where analysis of intact proteins is possible.
Schneider et al., U.S. Pat. No. 6,379,971, disclose methods for sequencing proteins involving labeling proteins and subsequently analyzing the proteins in a mass spectrometer wherein the proteins undergo mass spectral fragmentation. Although Schneider et al. use in-source fragmentation, they use this technique in order to determine the primary structure of a polypeptide. In contrast, the present invention uses cross-linking and the top-down approach in mass spectrometry to address the secondary and tertiary structure.
The advantage of the methods of the present invention over previously used methods is the utilization of high-resolution mass spectrometry of intact proteins. New instrumental developments for enhancing the signal from the desired modified proteins, and methods for producing controlled protein fragments in the mass spectrometer in order to eliminate complex microseparations, are disclosed herein. Also disclosed herein are the preparatory chemical steps necessary for analysis by the methods disclosed herein.
The use of chemical cross-links to elucidate protein structure has been previously disclosed in the art, and therefore, will not be discussed in great detail. Young et al. (“High Throughput Protein Fold Identification by Using Experimental Constraints Derived From Intramolecular Cross-Links and Mass Spectrometry,” Proc. Natl. Acad. Sci. (USA), 2000, 97, 5802-5806) describe the use of chemical cross-links in the determination of protein structure. The approach, unlike the present invention, utilizes mass spectrometry of fragment ions of proteins, generated using chemical or enzymatic cleavage of proteins. According to the present invention, where the more efficient top-down approach is used, enzymatic digestion is unnecessary as intact proteins may be introduced into the mass spectrometer. Moreover, the complexities of preparatory separations may also be avoided, such as determining the proper conditions for enzymatic digestion with trypsin and the separation and purification of peptides with high-pressure liquid chromatography (HPLC).