The present invention generally relates to methods for determination of protein structure and in particular concerns the use of immobilized proteases to identify domain boundaries of proteins, and an apparatus for digesting proteins.
The molecular structure of proteins allows them to play crucial roles in virtually all biological processes, including: enzymatic catalysis; transport and storage; coordinated motion; mechanical support; immune protection; generation and transmission of nerve impulses; and control of growth and differentiation. In particular, the side chains of the different amino acids that comprise proteins enable these long macromolecules to fold into distinctive structures, to form complementary surfaces and clefts, and to specifically recognize and interact with highly diverse molecules. The catalytic power of enzymes comes from their capacity to bind substrates in precise orientations and to stabilize transition states in the making and breaking of chemical bonds. Conformational changes transmitted between distant sites in protein molecules are at the heart of the capacity of proteins to transduce energy and information. Thus, the three dimensional structure of a protein is the key to its ability to function in virtually all biological processes.
Discussions pertaining to protein architecture concern four levels of structure, which are described as follows. Primary structure is the amino acid sequence. Secondary structure refers to the spatial arrangement of amino acid residues that are near one another in the linear sequence. Some of these steric relationships are of a regular kind, giving rise to a periodic structure. Tertiary structure refers to the spatial arrangement of amino acid residues that are far apart in the linear sequence, and to the pattern of disulfide bonds. Quaternary structure applies to proteins containing more than one polypeptide chain, where each polypeptide chain is referred to as a subunit; it refers to the spatial arrangement of the subunits and the nature of their contacts.
Some polypeptide chains fold into two or more compact regions that may be joined by a flexible segment of polypeptide chain. These compact globular units, called domains, range in size from about 50 to 400 amino acid residues, and they seem to be the modular units from which proteins are constructed. While small proteins may contain only a single domain, larger proteins contain a number of domains, which are often connected by relatively open lengths of polypeptide chain. Although all information required for folding a protein chain is contained in the protein's amino acid sequence, it is not yet known how to "read" this information so as to predict the detailed three-dimensional structure of a protein whose sequence is known. Consequently, folded conformation currently can only be determined by an elaborate X-ray diffraction analysis performed on crystals of the protein or, if the protein is very small, by nuclear magnetic resonance ("NMR") techniques. Although considerable advances are being made in the area of high field NMR, presently, the only method capable of producing a highly accurate three dimensional structure of most proteins is by the application of X-ray crystallography.
Recent advances in the field of X-ray crystallography, such as high speed computer graphics and X-ray area detection technologies, have revolutionized the pace at which three-dimensional structures can be determined. The resulting three dimensional structure produced from the protein crystals can have enormous implications in the fundamental understanding of molecular biology such as how enzymes perform various catalytic activities, switch on biological pathways, or transport molecules within the circulatory system. In the past few years the determination of protein structures important as therapeutic targets has made possible the rational design of new, more effective pharmaceuticals.
The technique of X-ray crystallography utilizes the diffraction of X-rays from crystals in order to determine the precise arrangement of atoms within the crystal. The limiting step in the technique involves the growth of a suitable crystalline sample. This requires the growth of reasonably ordered protein crystals (crystals which diffract X-rays to a resolution of 3.0 angstroms or better).
Because of the complexity of proteins, obtaining suitable crystals can be quite difficult. Typically, several hundred to several thousand individual experiments must be performed to determine crystallization conditions, each examining a matrix of pH, buffer type, precipitant type, protein concentration, temperature, etc. This process is extremely time consuming and labour intensive.
A strategic approach has been developed to identify protein domains that are amenable to NMR analysis or X-ray crystallography. Different domains of a protein may be linked together by intervening sections of polypeptide chain to form the protein molecule. Analysis of a single domain is more easily conducted in isolation from its parent protein. The determination of an individual domain structure facilitates elucidation of the parent structure. Limited proteolysis has been used to isolate and identify stable domains of proteins, which have a high likelihood of being good targets for protein structure determination. The approach is based upon the observation that low concentrations of one or more specifically chosen proteases cleave proteins into proteolytically stable domains amenable to NMR analysis or crystallography (Morin, P. E.; et al. Proc. Natl. Acad. Sci. 1996, 93, 10604-10608; Barswell, J. A.; et al. J. Biol. Chem. 1995, 270, 20556-20559; Pfuetzner, R. A.; et al. J. Biol. Chem. 1997, 272, 430-434; and Malhotra, A.; et al. Cell 1996, 87, 127-136).
Generally, in a limited proteolysis experiment, the protein is incubated with four to six different proteases at different concentrations for different amounts of time. Typical protease digestion reactions are conducted by dissolving an enzyme in water, thereby allowing the enzyme to act on a substrate in an aqueous solution. However, the fact that the enzyme reaction is a homogeneous reaction in an aqueous solution is a great hindrance to the performance of a continuous reaction in industrial applications, and also makes it very difficult to recover remaining active enzymes for repeated use after the reaction. In addition, complicated operational procedures are necessary for separation and purification of the reaction product.
The digestion products can be analyzed by SDS/polyacrylamide gel electrophoresis and proteolytically stable fragments can be identified on the basis of approximate mass. These products can then be isolated by reverse phase chromatography and an accurate mass determination can be performed by mass spectrometry. The accurate mass of a proteolytic fragment is sufficient to uniquely identify the boundaries of the fragment within a sequence of a protein. The identification of the proteolytic fragment sequence facilitates the recombinant preparation of the domain in sufficient quantities for X-ray and NMR analysis.
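The mass-matching step described above can be illustrated with a minimal sketch. It assumes standard average amino acid residue masses, a free N-terminus and C-terminus (one added water per fragment), and no post-translational modifications. The function names `fragment_mass` and `match_fragments` and the `tolerance` parameter are hypothetical and are not taken from the source. The sketch returns every contiguous fragment whose calculated mass lies within the tolerance, so that any ambiguity at a given mass accuracy is visible.

```python
# Average residue masses in daltons (standard values); an added water
# accounts for the free N- and C-termini of a fragment.
AVERAGE_RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER_MASS = 18.01528


def fragment_mass(sequence):
    """Average mass of an unmodified peptide with free termini."""
    return sum(AVERAGE_RESIDUE_MASS[aa] for aa in sequence) + WATER_MASS


def match_fragments(sequence, observed_mass, tolerance=1.0):
    """Return (start, end, mass) for every contiguous fragment whose
    calculated mass is within `tolerance` daltons of `observed_mass`.
    Positions are 1-based and inclusive."""
    # Prefix sums make each fragment mass a constant-time subtraction.
    prefix = [0.0]
    for aa in sequence:
        prefix.append(prefix[-1] + AVERAGE_RESIDUE_MASS[aa])

    hits = []
    for start in range(len(sequence)):
        for end in range(start + 1, len(sequence) + 1):
            mass = prefix[end] - prefix[start] + WATER_MASS
            if abs(mass - observed_mass) <= tolerance:
                hits.append((start + 1, end, mass))
    return hits


if __name__ == "__main__":
    # Hypothetical sequence used only for illustration.
    protein = "MKTAYIAKQRQISFVKSHFSRQ"
    observed = fragment_mass(protein[3:15])  # simulate a measured mass
    for start, end, mass in match_fragments(protein, observed, tolerance=0.5):
        print(f"residues {start}-{end}: {mass:.2f} Da")
```

A tighter tolerance leaves fewer candidate boundaries; restricting candidates to cleavage sites consistent with the protease used would be a natural refinement, which this sketch does not attempt.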
A limitation of the aforementioned strategy is the time consuming nature of cleaving, identifying and isolating the domains of a protein from the digestion solution.
The present invention provides a method and device to assist in the determination of protein domain boundaries. In accordance with an aspect of the invention there is provided a method for the preparation of proteolytically digested fragments of a protein in one step for purification and further processing for determination of domains and their boundaries in the protein. The method developed by the present inventor comprises a one step degradation of a protein, which comprises contacting a quantity of the protein with two or more concentrations of one or more immobilized proteases for a time sufficient to allow degradation of the protein to provide digested fragments, and then separating the immobilized protease from the fragments. Each protease is immobilized in an apparatus which comprises a plurality of compartments, each compartment containing a quantity of a protease immobilized on a surface in that compartment. According to a preferred embodiment each compartment contains a concentration of the protease distinct from the concentration of the protease in every other compartment. Preferably the surface upon which the protease is immobilized has been treated with a blocker of surface interaction. The protein is contacted with the protease in each compartment at about the same time.
In accordance with a further aspect of the invention the method is automated, allowing for simultaneous addition of a quantity of the protein to each of the compartments. Downstream automation removes proteolytically treated fragments to a purification step to yield one or more samples for further processing and structure determination. Accordingly, the present invention provides a high throughput method and device for enzymatically cleaving proteins into domains, which allows for an efficient means to identify protein domain boundaries for use in protein crystallization, in a manner that is amenable to automation.
In accordance with one aspect of the present invention a device of the invention is connected downstream to an automatic means of adding a protein solution, or a solution of a protein fragment, and upstream to an automatic means of removing and subjecting the proteolytically digested product to a purification step.
In accordance with another aspect of the present invention there is provided a method for determining the boundaries of a proteolytically digested fragment of a protein which comprises the steps of: (i) incubating a protein, or a fragment thereof, with at least one immobilized protease to yield protein fragments; (ii) subjecting the resulting protein fragments to one or more purification steps to isolate the fragments of interest; (iii) subjecting the isolated fragment(s) to either nanospray or matrix-assisted time-of-flight mass spectrometry; and (iv) matching the mass of the proteolytic fragment to the protein sequence of the protein originally digested.
According to another aspect of the present invention there is provided an apparatus for degradation of a protein, where the apparatus comprises a plurality of compartments, with at least two different concentrations of protease in separate compartments. In a preferred embodiment each compartment contains a concentration of protease distinct from the concentration of protease in every other compartment.
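By way of illustration only, the arrangement of compartments can be sketched as a plate map in which each row holds one protease and each column holds a successively lower concentration. The protease names, the concentration values and units, the serial dilution scheme, and the function name `plate_layout` are all assumptions made for this example and are not specified by the source.

```python
from string import ascii_uppercase


def plate_layout(proteases, top_concentration, dilution_factor, n_columns):
    """Map compartment labels (e.g. "A1") to (protease, concentration).

    Each row holds one protease; each column divides the previous
    concentration by `dilution_factor`, so every compartment in a row
    carries a distinct concentration of that protease.
    """
    if dilution_factor <= 1:
        raise ValueError("dilution_factor must be greater than 1")
    if len(proteases) > len(ascii_uppercase):
        raise ValueError("too many proteases for single-letter row labels")

    layout = {}
    for row_label, protease in zip(ascii_uppercase, proteases):
        for column in range(n_columns):
            concentration = top_concentration / dilution_factor ** column
            layout[f"{row_label}{column + 1}"] = (protease, concentration)
    return layout


if __name__ == "__main__":
    # Hypothetical proteases and concentrations (arbitrary units).
    layout = plate_layout(["trypsin", "chymotrypsin"], 100.0, 10, 3)
    for well, (protease, concentration) in layout.items():
        print(f"{well}: {protease} at {concentration:g}")
```

A map of this kind could also serve as the record that links each digestion product back to the protease and concentration that produced it.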
The present invention also includes a kit containing the device of the present invention together with appropriate reagents and instructions for its use.
Once protein domains have been determined by the methods of the present invention, these protein domains can be used in screens of protein-protein interaction, for example by affinity chromatography.
Advantages of the present invention include the primary factor that proteases and protease fragments do not substantially contaminate the proteolytically digested protein fragments as the proteases are immobilized and not in solution.
Another advantage is the ability to re-use the treated plates. Further advantages are the reproducibility and, as mentioned, the high throughput of protease digestion.
Because many devices can be generated at the same time, there is an efficiency of production and consistency of immobilized protease concentration that can be attained by this invention that is not currently available in the art. Some of the plates with an immobilized protease can be mass produced and stored for at least one week at 4 °C while maintaining activity. Plates can be manufactured as standards or as custom models as requested.
In contrast to autolytic proteases (and mixtures of proteases that can degrade due to one protease cleaving another protease, in solution), the ready use format of the present invention provides an immobilized product that: (1) is essentially unable to undergo autolytic cleavage; and (2) is essentially unable to degrade due to one protease cleaving another protease.