The invention relates to analysis of molecular structures using spectropolarimeters. In particular, an apparatus and method are provided to determine the proportional combination of structures of optically active substances in solution.
Spectropolarimetry, or circular dichroic analysis (CD), is the preferred technique for determining the structures of optically active molecules in solution. Many alternative techniques exist, including nuclear magnetic resonance, hydrogen exchange, calorimetry, fluorescence polarization, fluorescence lifetime, steady state fluorescence measurement, differential light absorption measurements and standard spectrophotometry, all of which yield information about the structure of molecules, including those that are not optically active. However, CD is preferred because it is sensitive to basic structural forms and defines the percentages of these forms globally in the molecule over a wide range of conditions. The alternative techniques either are not directly sensitive to structure or measure structure much more locally than CD.
The general method for utilizing CD data to obtain structure in test solutions of molecules is to develop a set of basis vectors. A set of basis vectors is a set of spectra which characterizes the authentic differential absorption of left and right circularly polarized light in samples that are putatively of one pure form only. For example, vectors can be generated from spectra to represent protein molecules that are entirely in the alpha helical, beta sheet or other pure forms. Protein or other molecules are then characterized in terms of these pure forms, based on the spectra (basis vectors) created for them.
Particular molecules in a solution can be modeled on the basis of a set of appropriately chosen pure forms. For example, a block copolymer crystal is observed to be substantially in one of several fundamental forms. If it can be shown that the copolymer in solution is also in this fundamental form, then it can provide an experimentally estimated vector for a molecule that is purely of the fundamental form.
Generally, a set of basis vectors representing all of the forms that are present in significant quantities in a sample of a molecule is generated. Typically, for proteins, this includes a minimum of four or five forms. In some situations eight or nine vectors may be necessary to distinguish certain closely related forms.
Conventionally, a spectrum of the sample molecule is generated and stored in a memory as a sample vector. Standard mathematical techniques are used to produce a linear combination of the basis vectors which is used to characterize the structure of the sample molecule. Alternatively, the basis set can consist of vectors each obtained from a sample thought to have a well characterized structure. These vectors then serve as substitute basis vectors. An estimate of the percentage of each of the basis forms contained in the sample molecular structure is then generated from mathematical combinations of the substitute basis vectors that are optimized to fit the sample vector. The estimate is a model of the sample molecule structure. Knowledge of the structure underlying each of these substitute basis vectors, together with this model, allows a further estimate of the predominant structure of the sample molecule in terms of a true basis set.
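The conventional fitting step described above can be sketched in a few lines. The following is a minimal illustration only, not the procedure of any particular instrument or of the invention: it assumes an unconstrained ordinary least-squares fit solved through the normal equations, the function names `fit_fractions` and `to_percentages` are hypothetical, and the four-point spectra are synthetic numbers chosen for the example rather than measured CD data.

```python
def fit_fractions(basis, sample):
    """Return coefficients c minimizing ||sum_i c[i] * basis[i] - sample||^2.

    basis  : list of basis vectors (each a list of values, one per wavelength)
    sample : sample vector sampled at the same wavelengths
    Assumption: plain unconstrained least squares via the normal equations.
    """
    n = len(basis)
    # Normal equations: (B B^T) c = B s
    gram = [[sum(x * y for x, y in zip(basis[i], basis[j])) for j in range(n)]
            for i in range(n)]
    rhs = [sum(x * y for x, y in zip(basis[i], sample)) for i in range(n)]
    aug = [row + [r] for row, r in zip(gram, rhs)]

    # Gaussian elimination with partial pivoting
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("basis vectors are linearly dependent")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[r][k] -= factor * aug[col][k]

    # Back substitution
    coeffs = [0.0] * n
    for i in reversed(range(n)):
        acc = aug[i][n] - sum(aug[i][k] * coeffs[k] for k in range(i + 1, n))
        coeffs[i] = acc / aug[i][i]
    return coeffs


def to_percentages(coeffs):
    """Rescale coefficients so that they sum to 100."""
    total = sum(coeffs)
    if total == 0:
        raise ValueError("coefficients sum to zero")
    return [100.0 * c / total for c in coeffs]


# Synthetic basis vectors (hypothetical values, four wavelengths each)
helix = [-30.0, -35.0, -20.0, 5.0]
sheet = [-10.0, -18.0, -8.0, 2.0]
coil = [5.0, -3.0, -1.0, 0.0]

# Synthetic sample built as a known 60/30/10 mixture of the three forms
sample = [0.6 * h + 0.3 * s + 0.1 * c for h, s, c in zip(helix, sheet, coil)]

percentages = to_percentages(fit_fractions([helix, sheet, coil], sample))
print([round(p, 2) for p in percentages])
```

Because the fit is unconstrained, noisy or poorly matched basis vectors can yield negative coefficients. A non-negative least-squares solver is one possible alternative where physically meaningful fractions are required.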
The fundamental problem with the above mentioned approach is the conventional assumption that the sample molecule is predominantly in one structural form. In general this assumption is incorrect, because the sample molecule exists in more than one structural form in a solution. For example, proteins at body temperature are very close to 100% in a native state. Under industrial conditions, however, such as high or low temperatures, high or low pH, very low or very high salt, or the presence of surface modifying organic materials such as detergents, a significant percentage of the molecules are destabilized into one or more alternative forms. Accordingly, assuming the existence of one structural form when several structural forms exist leads to meaningless results.
In addition, current techniques generate the fundamental vectors under conditions different from those under which the sample vectors are generated. Thus, even if current techniques are modified to account for a mixture of sample molecular structures, the fundamental vectors cannot be used to estimate the sample molecular structures. Further, sample vectors estimated using current techniques do not fit actual sample vectors very well.