1. Field of the Invention
The present invention generally relates to microbatch crystallization under oil using a high throughput method. Microbatch crystallization under oil requires very low volumes of both a protein and a crystallization cocktail solution. This is critical for a high throughput application where a large number of experiments are being conducted at the same time. Many proteins can be difficult if not impossible to obtain in large quantities and cocktail solutions are costly to produce. According to the present invention, one plate of 1,536 crystallization experiments is set up using as little as 600 µl of protein solution. From this volume, 100 µl of the protein is retrieved at the conclusion of the experiments for use in further studies.
2. Prior Art
A number of investigators have attempted to condense their experiences in the crystal growth laboratory into a list of recipes of reagents that have found success as crystallizing agents. The most used of these is the list compiled by Jancarik, J. and Kim, S.-H. (1991), J. Appl. Cryst. 24, 409-411, which is often referred to as the “sparse matrix sampling” screen. The list is a “heavily biased” selection of conditions out of many variables including sampling pH, additives and precipitating agents. The bias is a reflection of personal experience and literature reference towards pH values, additives and agents that have successfully produced crystals in the past. Commercialization of the sparse matrix screen has led to its popularity; easy and simple to use, it is often the first strategy in the crystal growth lab.
The agents chosen by Jancarik and Kim are designed to maximize the frequency of precipitation outcomes for a broad variety of proteins. They were chosen because in a large percentage of experiments employing them “something happened”. This highlights a fundamental difference between the present invention and the approach taken by Jancarik, Kim and their successors. The latter try to identify sets of chemical agents that maximize the probability of inducing precipitation (preferably crystallization) across the board, while the present invention relies on a set of chemical agents with precise patterns of precipitation, patterns that are as diverse as possible among the proteins that constitute the information repository. The fact that the sparse matrix approach does not always work has led to the design of other lists targeted at proteins. These include those by Cudney, R., Patel, S., Weisgraber, K., Newhouse, Y. and McPherson, A. (1994), Acta Cryst. D50, 414-423 directed to nucleic acids, by Berger, I., Kang, C. H., Sinha, N., Wolters, M., and Rich, A. (1996), Acta Cryst. D52, 465-468 directed to other classes of macromolecules, and by Garavito, M. (1991), in “Crystallization of Membrane Proteins”, H. Michel (Ed.), CRC Press, pp. 89-105.
The sparse matrix approach is based on unbiased attempts to sample the multi-dimensional space of crystallization parameters. At least 23 parameters have been identified as having had an effect on crystallization outcomes. If one were to attempt a simple, exhaustive two-level experimental design for an unknown protein, i.e., two pH values, two temperatures, two kinds of crystallizing agents, etc., it would require 2^23, or over eight million, experiments. Hence the need for sampling.
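The size of such an exhaustive design can be checked with a few lines of Python. This is only an illustrative sketch: the function name and the three example factors are assumptions introduced here, not part of any cited work.

```python
from itertools import product


def full_factorial_size(levels_per_factor):
    """Return the number of runs in a full factorial design.

    levels_per_factor: one integer per factor, giving its number of levels.
    """
    size = 1
    for n_levels in levels_per_factor:
        size *= n_levels
    return size


# 23 parameters, each tested at two levels.
two_level_23 = full_factorial_size([2] * 23)
print(two_level_23)  # 8388608

# A tiny full design with three hypothetical two-level factors,
# enumerated explicitly to show what "exhaustive" means.
small_design = list(product(["pH 5", "pH 8"], ["4 C", "22 C"], ["salt", "PEG"]))
print(len(small_design))  # 8
```

Each additional two-level factor doubles the number of runs, which is why exhaustive designs are impractical beyond a handful of factors.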
Carter, C. W. (1997), in Methods in Enzymology 276, 74-99 made major advances in the area of crystal growth by applying partial factorial designs, principally incomplete factorial designs. In these designs, relative levels of important chemical factors are sampled to achieve good coverage and good balance in the sampling. However, incomplete factorial designs are no more than (or no less than) scaffolds upon which the crystal grower must build experiments.
In other words, once the crystal grower has defined the multi-dimensional space which should be sampled, the factorial designer chooses from the large number of possible experiments those that should be executed to ensure good coverage of the space identified. The crystal grower must decide upon the important variables to be tested, and the limits on those variables within which to sample. The machinery of factorial design offers no guidance on those issues. Other sampling strategies based on orthogonal arrays by Kingston, R. L., Baker, H. M. and Baker, E. N. (1994), Acta Cryst. D50, 429-440 and on random samplings by Shieh, H.-Y., Stallings, W. C., Stevens, A. M., and Stegeman, R. A. (1995), Acta Cryst. D51, 305-310 have been described as well.
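The idea of sampling a user-defined space with good balance can be illustrated with a minimal sketch. It is not the incomplete factorial procedure of Carter, nor an orthogonal array; it is a simpler balanced random sampling, swapped in for illustration, in which each level of each factor appears about equally often. The factor names, level values, and the function `balanced_sample` are all hypothetical.

```python
import random
from collections import Counter


def balanced_sample(factors, n_runs, seed=0):
    """Draw n_runs experiments so that, within each factor, every level
    is used about equally often (single-factor balance only)."""
    rng = random.Random(seed)
    columns = {}
    for name, levels in factors.items():
        # Repeat the level list enough times to cover n_runs, then trim.
        reps = -(-n_runs // len(levels))  # ceiling division
        column = (list(levels) * reps)[:n_runs]
        # Shuffle each factor independently to decouple the factors.
        rng.shuffle(column)
        columns[name] = column
    return [{name: columns[name][i] for name in factors} for i in range(n_runs)]


# Hypothetical space chosen by the crystal grower (illustrative values only).
factors = {
    "pH": [5.0, 6.5, 8.0],
    "temperature_C": [4, 22],
    "agent": ["ammonium sulfate", "polyethylene glycol", "agent C"],
}

design = balanced_sample(factors, n_runs=12)
ph_counts = Counter(run["pH"] for run in design)
temperature_counts = Counter(run["temperature_C"] for run in design)
```

As the text notes, the choice of factors and of their limits is supplied by the user; the sampling machinery only decides which of the 18 possible combinations to run.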
A fundamentally different approach to strategic planning of crystallization experiments is one in which physical principles believed to augur well for success are exploited. This class includes the work of Riès-Kautt, M. and Ducruix, A. (1997), Methods in Enzymology 276, 23-59 who have investigated solubility determinants for proteins as a function of pH and pI. In particular, Riès-Kautt, Ducruix and co-workers investigated the Hofmeister series, as described by Cacace, M. G., Landau, E. M., and Ramsden, J. J. (1997), Quart. Rev. Biophys. 30, 241-277, and found that protein solubility follows the series or its reverse, depending on the pH of the experiment and the pI of the protein. Also within this approach are recent advances by George, A., Chiang, Y., Guo, B., Arabshahi, A., Cai, Z., and Wilson, W. W. (1997), Methods in Enzymology 276, 100-109 in the use of light scattering as a predictive tool and by George, A. and Wilson, W. W. (1994), Acta Cryst. D50, 361-365 who have shown that a dilute solution property, the second virial coefficient of the osmotic pressure lowering, falls within a narrow range of values (the “crystallization slot”) for solutions conducive to crystallization. Work by Rosenbaum, D., Zamora, P. C., and Zukoski, C. F. (1996), Phys. Rev. Lett. 76, 150-153; Rosenbaum, D. and Zukoski, C. F. (1996), J. Crystal Growth 169, 752-758; Gripon, C., Legrand, L., Rosenman, I., Vidal, O., Robert, M. C., and Boué, F. (1997), J. Crystal Growth 177, 238-247 and 178, 575-584; and Gripon, C., Legrand, L., Rosenman, I., Boué, F., and Regnaut, C. (1998), J. Crystal Growth 183, 258-268 suggests that the second virial coefficient is a fundamentally important determinant of crystallization from aqueous protein solutions.
The final approach to strategic planning tactics is the construction and analysis of the Biological Macromolecule Crystallization Database (BMCD), in which details of macromolecular crystallizations abstracted from the primary literature have been collected. The BMCD was created by Gilliland, G. L., Tung, M., Blakeslee, D. M., and Ladner, J. E. (1994), Acta Cryst. D50, 408-413 and has, over the last decade, grown to include crystallization data on over three thousand crystal entries covering over two thousand distinct macromolecules (Version 3.0). The record structures of the BMCD, while not requiring any particular record to be complete, include entries for the macromolecule, the crystal data, the crystallization conditions, the primary literature references, and a field for comments. These data have been abstracted, where available, from the primary literature and there are entries for every major class of macromolecule (protein, nucleic acid, virus, etc.) that has been studied in the diffraction lab. Each record is a record of success; there are no records describing crystallization experiments that failed to yield crystals. Gilliland has pointed out that the data in the BMCD “have not been verified and the information present in this data set often represents the author's [Gilliland's] interpretation of the literature”.
Gilliland was first to analyze the BMCD to develop crystal growth strategies for macromolecules. He showed that ammonium sulfate and polyethylene glycol were favored crystallizing agents and that vapor diffusion was a favored crystallization method. While both observations were part of the common lore of crystal growth, Gilliland used the BMCD to quantitate their use.
Samudzi and co-workers delved more deeply into the BMCD, looking for general strategies that might be effectively used for smaller sub-populations of the database. The statistical tool they chose to employ was cluster analysis. Using version 1.0 of the BMCD in 1992, Samudzi, C. T., Fivash, M. J., and Rosenberg, J. M. (1992), J. Crystal Growth 123, 47-58 and version 3.0 in 1998, Farr, R. G., Perryman, A. L., and Samudzi, C. T. (1998), J. Crystal Growth 183, 653-668 searched for clusters involving the following parameters: molecular weight, macromolecular concentration, pH, temperature, crystallizing agent type, and crystallization method. Focusing on very recent results, Samudzi identified 25 clusters within the BMCD that were judged statistically distinct. Fully a third of the clusters (8 out of 25) were sparsely populated and were, therefore, ignored in further treatment. Clustering identified nucleic acids, protein-nucleic acid complexes and viral assemblies as behaving distinctly in successful crystallizations from the general class of soluble proteins. A further weak distinction between the behaviors of very small proteins and all other proteins was drawn, but apart from the method of crystallization no other single parameter (macromolecular concentration, pH, temperature, crystallizing agent type) was shown to cluster in any significant manner. While Samudzi reports strategies for the 17 populated clusters, it is virtually impossible on the basis of molecular weight (the only intrinsic property of the macromolecule that could be used as a pointer) to decide which strategy to employ for any particular protein.
Hennessy, D., Gopalakrishnan, V., Buchanan, B. G., Rosenberg, J. M., and Subramanian, D. (1994), Proceedings, Second International Conference on Intelligent Systems for Molecular Biology, ISMB-94, AAAI Press, pp. 179-187 took a different approach. They attempted to use the BMCD Version 1.0 to induce rules for macromolecular crystallization. This so-called machine induction is an automatic construction of arguments from the particular to the general which attempts to identify “a disjunctive set of weighted conjunctive rules”. Conjunctive rules are of the form IF (A and B and C, etc.) OR IF (D and E and F, etc.), THEN (conclude, do) something. An example of a very simple rule might be: if the crystal habit has the value “plates” then the diffraction limit is under 3.5 Å. Rules are generated in an automatic fashion and then are tested against the data to see if they hold.
In actual applications with databases as small and sparse as the BMCD, the depth of the rules generated is severely limited because their numbers grow exponentially and there are insufficient data to adequately test complicated rules. When the rules outnumber the data, it is difficult to evaluate if one rule is to be preferred over another. To counter this problem, Buchanan incorporated “domain knowledge to guide the induction of rules”. Here the domain in question is the crystal growth domain. While the formal techniques employed to introduce “domain knowledge” into the logic are described, it is unclear how they were implemented in detail to limit and guide the rule generation and testing. Buchanan pointed out that the absence of negative results (crystallization failures) in the BMCD severely hampered the search for crystal growth rules. Rules that would be most useful, such as “if you carried out the following crystallization experiment, you would observe the following results”, were not induced, suggesting that rule-based approaches to strategy planning would not likely succeed.
Finally, Bob Cudney, owner of Hampton Research, Laguna Hills, Calif., in the commercial pamphlet “Crystallization Research Tools” has surveyed the BMCD and produced graphs of the frequency of successful employment of various crystallizing agents and of various pH values that give a feel for the limits on each when contemplating a crystallization screening. In combination with formal search techniques such as incomplete factorial designs, the analyses put forth by Cudney are extremely useful.
In that light, the present invention relates to an integrated decision-support system that aids the crystal grower in devising successful crystallization strategies. The goal is to be able to predict, through analysis of carefully selected sets of precipitation reactions, the key elements of successful crystallization strategies. This strategy is predicated on the principle that successful crystallization strategies employed for similar proteins are the best guide when plotting strategies for new proteins. In other words, the pattern of outcomes in successful precipitation reactions yields the objective measure of similarity for designing new reaction experiments. This requires an objective measure of similarity between successfully grown crystals and those being planned.
This objectivity is provided by the execution, evaluation and binary-encoding of the outcomes of reactions involving hundreds of proteins and precipitating agents. In method and form these precipitation reactions are indistinguishable from microbatch crystallization experiments: solubilized proteins are incubated with agents that have the potential to reduce their solubilities, aggregation and phase separation either does or does not result, and the extent to which it does is assessed visually. The distinction is that the present method is a high throughput process. The many outcomes are then used to develop a set of precipitation reaction indices that allow the crystal grower to efficiently, objectively and quantitatively evaluate the similarity of any two proteins with respect to a physical property intimately connected to crystal growth, namely, solubility.
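One way to picture binary-encoded outcomes and a similarity measure built on them is sketched below. The text does not specify how the precipitation reaction indices are computed, so the measure used here (the fraction of reactions with identical outcomes, a simple matching coefficient) is an assumption chosen for illustration, as are the function names and the synthetic outcome patterns.

```python
N_REACTIONS = 1536  # one plate of precipitation reactions


def encode_outcomes(precipitated):
    """Pack per-reaction outcomes (True = precipitate observed) into one
    integer, with bit i holding the outcome of reaction i."""
    bits = 0
    for i, hit in enumerate(precipitated):
        if hit:
            bits |= 1 << i
    return bits


def matching_fraction(a, b, n_reactions=N_REACTIONS):
    """Fraction of reactions in which two proteins gave the same outcome.

    Assumed similarity measure, not the index defined by the invention.
    """
    differing = bin(a ^ b).count("1")  # XOR marks reactions that disagree
    return (n_reactions - differing) / n_reactions


# Synthetic outcome patterns for two hypothetical proteins.
protein_a = encode_outcomes(i % 2 == 0 for i in range(N_REACTIONS))
protein_b = encode_outcomes(i % 4 == 0 for i in range(N_REACTIONS))

similarity = matching_fraction(protein_a, protein_b)
print(similarity)  # 0.75
```

Encoding each plate as a single bit pattern keeps comparison between any two proteins cheap, which matters when a new protein must be compared against hundreds of previously screened ones.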
In that respect, the goal of the present invention is to develop an opening strategy that gets the crystal grower into optimization experimentation quickly. On the basis of a small number of precipitation reactions, i.e., 1,536 precipitation reactions, requiring less than a milligram of protein, taking no more than a few hours to set up and perhaps as little as one day to evaluate, the crystal grower is able to propose the crystallization method of choice, the crystallizing agent, the pH and temperature, and approximate concentration ranges for all solutes. The practical implications are tremendous.
Structural biology is at a point where the floodgates of structure determination are beginning to open. Among methods employed to reveal the details of molecular structure, none rivals single crystal X-ray diffraction for its generality of application, clarity of view, and lack of ambiguity in interpretation. Entry into the diffraction method is via growth of a suitable single crystal of the target macromolecule. The crystal growth problem has repeatedly been identified as the rate-limiting step in macromolecular structure determination.
According to the present invention, experiences with similar crystallization problems, successfully engaged in the past, are the best guide to the solution of new crystallization problems in the future. The present invention, therefore, provides a predictive, objective, quantitative and absolute measure of similarity in experimental outcomes called a “precipitation reaction index.” This index is based on the results of 1,536 precipitation reactions between an unknown protein and 1,536 standardized cocktail solutions, and is the link between structural, physical, chemical and biological properties of macromolecules and their behavior in crystallization experiments. In that respect, experimentally determined precipitation reaction indices of known proteins and the unknown protein with the 1,536 standardized cocktail solutions are strongly linked to both crystallization outcomes and to the intrinsic properties of macromolecules. By analyzing three independent types of data (intrinsic properties, precipitation reaction indices, and crystallization strategies), it is believed that previously unsuspected, non-trivial relationships between intrinsic properties and crystallization outcomes will significantly aid the crystal grower in a fundamental understanding of the crystal growth process.
These and other objects of the present invention will become increasingly more apparent to those skilled in the art by reference to the following description and to the appended drawings.