The present invention is directed to detectors, and more particularly to methods and apparatus for determining the boundaries of a detector response profile.
Detectors and sensors are used in industry and in research to determine the presence or absence of a molecule, material, chemical or object, or a change in a physical parameter of the environment around or in contact with the detector. By way of example, a thermometer senses changes in temperature. If viewed over time, temperature, one variable, having a value in degrees, can be compared to a second variable, time, measured in seconds, minutes or years. These variables can be graphed or plotted.
As used herein, the term "detector" is used to refer to any instrument or device which creates a signal in response to the presence or absence of a molecule or a change in a physical parameter of the environment in contact with the instrument or device. Common detectors used in industry and research include absorbance detectors, fluorescence detectors, mass spectrometry detectors, chemi-luminescence detectors, refractometry detectors, viscometry detectors, radiation detectors and thermometers.
As used herein, the word "profile" refers to a depiction of a plot of data points, usually with lines drawn between the data points, whether in electronic form or printed. A "plot" refers to a graphical organization of a series of data points, in electronic and printed forms. Such a plot is normally presented on an X-axis and a Y-axis where each axis represents one variable. Typically detectors measure responses at a uniform sample rate. Thus the time difference between adjoining data points is a constant that is termed the sample period. Typically, a plot of data points from a detector will have data points associated with no activity, which result in consistent readings or consistent slope. These consistent readings, representative of no activity or change, are referred to as a baseline. A change in the environment surrounding the detector will alter the profile of the plot, creating a "peak" or "valley." As used herein, the word "peak" refers to any change in the profile, whether plotted as an upwardly projecting or downwardly descending plot. Whether the plot is directed up or down is a matter of choice and the present discussion will address each as a peak for convenience and clarity. The words plot and profile, as used herein, are not intended to be limited to visually perceived representations. Rather, the words are used to represent how data points are managed or processed to depict information.
Chromatography is the science of separation based upon specific or nonspecific binding of molecules to a stationary phase. Aspects of the present invention have special application in gas and liquid chromatography. In liquid chromatography, a liquid carrying one or more compositions of interest is carried through a solid phase. The compositions elute from the solid phase at different times, producing changes in one or more physical parameters measured by the detector. These changes are plotted over time and such a graphical representation is known as a chromatogram. Chromatograms typically exhibit peaks that correspond to the compounds that have been separated. It is often desirable to direct fluids containing the compounds to vessels or further processes. By way of example, it may be useful to direct a fluid containing a compound, determined by absorbance, to a mass spectrometer to determine its molecular weight. Also, it may be useful to collect the fluid defining a separated peak into a collection vial. This is an operation known as "fraction collection." Thus, chromatographic instruments often are equipped with valves for directing compounds from the common stream.
It is accepted practice to analyze a peak in order to obtain two response factors, peak height and peak area. Each of these factors gives a response that is in proportion to the amount of material injected onto the column. But the height and area can be obtained only when the underlying baseline of a peak and the start time and stop time of the peak are known. Ideally a chromatogram consists of a series of peaks, with all pairs separated by a region of baseline. These peaks are termed baseline-resolved peaks. Chromatograms, however, can be complex. A peak from a compound may appear within a cluster of peaks, or merge with other peaks, or appear not as a well-defined peak, but as a shoulder.
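The dependence of height and area on the baseline and on the start and stop times can be illustrated with a minimal sketch. It assumes a straight baseline drawn between the start and stop points and a trapezoidal area estimate; the function name and its parameters are hypothetical and are not taken from the specification.

```python
def peak_height_and_area(times, responses, start, stop):
    """Height and area of a peak above a straight baseline.

    `start` and `stop` are the indices of the peak start and stop points.
    The baseline is assumed to be the straight line joining those points.
    """
    t0, t1 = times[start], times[stop]
    y0, y1 = responses[start], responses[stop]
    baseline_slope = (y1 - y0) / (t1 - t0)

    # Subtract the interpolated baseline from each response in the peak.
    corrected = [
        responses[i] - (y0 + baseline_slope * (times[i] - t0))
        for i in range(start, stop + 1)
    ]

    height = max(corrected)

    # Trapezoidal rule over the baseline-corrected responses.
    area = 0.0
    for k in range(1, len(corrected)):
        dt = times[start + k] - times[start + k - 1]
        area += 0.5 * (corrected[k] + corrected[k - 1]) * dt
    return height, area


# A small peak sitting on a rising baseline (slope of 1 per unit time).
height, area = peak_height_and_area([0, 1, 2, 3, 4], [0, 1, 4, 3, 4], 1, 3)
```

Moving `start` or `stop` changes the baseline that is subtracted, and therefore changes both returned values, which is why the start and stop times must be known before height and area can be obtained.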
In the case of a baseline-resolved peak, the boundaries of the peak are taken to be the start and stop times of the peak's baseline. The start time, also known as "lift-off," is when the peak first appears above the baseline response. The stop time, also known as "touchdown," is when the peak response becomes coincident with the baseline response. These times determine the baseline drawn under the peak, which is needed for the determination of height and area. It is useful to determine height and area with computers equipped with appropriate software, such as software including an integration algorithm. For the integration algorithm to be useful, it must determine lift-off and touchdown accurately and reproducibly for peaks of varying heights and shape asymmetries.
Unfortunately, the times of "lift-off" and "touchdown" are dependent on the height of the peak. As the height of the peak changes, the positions of "lift-off" and "touchdown" change. The higher the peak, the further apart these points become. This dependence on peak height is undesirable, as it requires the practitioner to find a compromise value, or to change the value as the peak heights change. Further, the results, using prior art methods, are dependent on the baseline slope. If the slope of the baseline changes, the positions of "lift-off" and "touchdown" change. Problems with this dependence on the baseline slope become significant in the case of small peaks. In the case of small peaks, positive slopes yield start and stop times that occur earlier and negative slopes yield start and stop times that occur later.
In the case of a cluster, the baseline start time is the lift-off time for the first peak in the cluster, and the baseline stop time is the touchdown time for the last peak in the cluster. These lift-off and touchdown points must be determined in the presence of adjacent peaks of varying heights, resolutions and shape asymmetries.
Once lift-off and touchdown are established, the next step is to determine boundaries between peaks in a cluster. If a valley separates a pair of peaks, the determination of the boundary is straightforward. Valleys are the local minima between peak apices. The point at the minimum of the valley is the boundary, defining the stop time of the prior peak and the start time of the following peak.
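The valley rule can be sketched as follows, assuming the apex indices are already known and that at least one sample lies between each pair of adjacent apices. The function name is hypothetical.

```python
def valley_boundaries(responses, apices):
    """Boundary index between each pair of adjacent apices.

    The boundary is taken as the sample with the lowest response strictly
    between the two apices. Assumes `apices` is sorted and that at least
    one sample separates each adjacent pair.
    """
    boundaries = []
    for left, right in zip(apices, apices[1:]):
        interior = range(left + 1, right)
        boundaries.append(min(interior, key=lambda i: responses[i]))
    return boundaries


# Two peaks with apices at indices 2 and 6; the valley minimum is at index 4.
boundaries = valley_boundaries([0, 2, 5, 3, 1, 4, 6, 2, 0], [2, 6])
```

Each returned index serves as both the stop point of the prior peak and the start point of the following peak.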
The identification of an appropriate peak boundary for a shoulder is a more difficult problem. A shoulder occurs when two peaks co-elute with low enough resolution such that there is no valley between the peak apices. The shoulder cannot be detected separately from the main peak because there is no valley point separating them. Further, even if the apex of the shoulder is identified, there is no obvious means to demarcate the shoulder from the adjoining peak. The demarcation between the main peaks and the shoulder is hard to define and there is no accepted method within the prior art to demarcate a shoulder from an adjoining peak.
Accurate and reliable determination of lift-off and touchdown is essential to accurate and reliable quantitation. Lift-off and touchdown establish a baseline, and it is the baseline that affects all subsequent determinations of peak heights and areas. Accuracy is compromised if the determination of baseline is erroneous or non-reproducible.
The data analysis problem for fraction collection is similar to problems found in quantitation, in that peak boundaries must be determined. Fraction collection, however, because of its real time nature, requires an algorithm that operates in real time. It must be able to handle a wide variety of situations reliably, such as can occur with peaks whose concentrations are so high as to saturate the detector.
The goal of a fraction collection algorithm is to identify times at which a valve should be opened and closed. In collecting fractions, one may want to recover one hundred (100) percent of a material, or one may only want to collect the most concentrated material in the heart of a peak. To achieve one hundred percent collection the collection valve should open at or near lift-off and close at or near touchdown. If one wants to collect in the heart of the peak, the collection valve needs to open later and close sooner. A robust fraction collection algorithm must be able to reproducibly identify in real time a variety of features in the peak.
The simplest known scheme for fraction collection is to trigger collection on a response threshold. This prior art method, however, will not collect low-level peaks whose responses all fall below the threshold. While reducing the threshold will detect lower-level peaks, an unstable baseline may cause whole clusters to be collected or missed.
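The limitation of a response threshold can be seen in a short sketch. The function below is a hypothetical illustration of the prior art scheme, not code from any instrument: it reports the sample indices at which a collection valve would open and close.

```python
def threshold_windows(responses, threshold):
    """(open_index, close_index) pairs for response-threshold collection.

    The valve opens at the first sample above the threshold and closes at
    the first later sample at or below it.
    """
    windows = []
    open_at = None
    for i, y in enumerate(responses):
        if y > threshold and open_at is None:
            open_at = i
        elif y <= threshold and open_at is not None:
            windows.append((open_at, i))
            open_at = None
    if open_at is not None:
        # Data ended while the valve was still open.
        windows.append((open_at, len(responses) - 1))
    return windows


# A tall peak followed by a low-level peak.
trace = [0, 0, 1, 5, 9, 5, 1, 0, 0, 2, 3, 2, 0]
high = threshold_windows(trace, 4)    # only the tall peak is collected
low = threshold_windows(trace, 1.5)   # both peaks are collected
```

With the higher threshold the low-level peak is never collected, because none of its responses rise above the threshold.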
Another approach of the prior art for fraction collection is to trigger collection on slope thresholds. While this is an improvement over the response threshold, unpredictable results may still occur with low-level peaks on unstable baselines. Other problems with slope thresholds arise due to the interaction between peak height and slope threshold. For a given value of slope threshold, the fraction of the peak collected decreases as the peak height drops. For a given peak height, the fraction of the peak collected decreases as the threshold rises. An additional problem with using slope thresholds to collect fractions is that shoulders cannot be separately collected using this method.
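The interaction between peak height and slope threshold can be sketched under stated assumptions: a Gaussian peak shape, forward-difference slopes, and a collection window running from the first sample whose slope exceeds the threshold to the last sample whose slope falls below its negative. All names here are hypothetical.

```python
import math


def slope_threshold_window(responses, dt, threshold):
    """(open_index, close_index) for slope-threshold collection, or None.

    Opens at the first sample whose forward slope exceeds +threshold and
    closes one sample after the last forward slope below -threshold.
    """
    slopes = [
        (responses[i + 1] - responses[i]) / dt
        for i in range(len(responses) - 1)
    ]
    rising = [i for i, s in enumerate(slopes) if s > threshold]
    falling = [i for i, s in enumerate(slopes) if s < -threshold]
    if not rising or not falling:
        return None
    return rising[0], falling[-1] + 1


def gaussian_peak(height, center=50, width=5.0, n=101):
    """Synthetic peak used only for illustration."""
    return [
        height * math.exp(-0.5 * ((i - center) / width) ** 2)
        for i in range(n)
    ]


# Same peak shape and same threshold, two different heights.
tall = slope_threshold_window(gaussian_peak(100.0), 1.0, 0.5)
short = slope_threshold_window(gaussian_peak(5.0), 1.0, 0.5)
```

Although both peaks have the same width, the window found for the shorter peak is narrower, so a smaller fraction of that peak would be collected.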
The present invention provides an improved method and apparatus for identifying peaks and for determining the boundaries of those peaks, and boundaries of multiple peaks and shoulders within chromatograms. The method and apparatus effect more accurate identification and determination of the boundaries of these peaks for the purpose of quantitation. The method and apparatus also effect real time identification and determination of the boundaries of these peaks for the purpose of fraction collection. The boundaries of the detector response profile are used to control processes, including directing the flow of fluids, controlling heating or cooling, altering the course of chemical reactions and the like. These boundaries can also be used to extract quantitative information about the profile, such as the computation of the profile's area between boundaries.
Embodiments of the present invention are directed to methods and apparatus for determining the boundary of a peak within a detector response profile. The profile comprises data points plotted graphically on an X and Y-axis wherein the X-axis represents a first variable and the Y-axis represents a second variable, each variable having a value. The plot at each of the data points has a slope. Data points having a slope which deviates from a consistent value define a peak, and data points having a slope with a consistent value define a baseline.
The inventive method comprises the step of determining the presence of a peak having an apex and two sides. The method further comprises the step of selecting a first data point from a plurality of data points on one side of the apex of the peak, and selecting a second data point from a plurality of data points on a side of said peak opposite the side containing the first data point. The first data point and second data point have a position on the plot with one or more distal data points. The distal data points are further removed from the apex. One or more proximal data points are closer to or are at the apex. The plot at the first data point has a first slope and the plot at the second data point has a second slope.
The method further comprises the step of comparing the first and second slopes to the slope of the line extending between such first and second data points. If the first slope and the second slope are both equal to, or are both within a selected value of, the slope of a line extending between such first and second data points, then such first and second data points are baseline and such first and second data points define the boundary of a peak. Accordingly, the process terminates.
If the first slope and second slope have a different value from the slope of a line extending between the first and second data points, one distal data point is selected and the slope of the plot at the selected distal data point is determined. If this distal data point is on the same side of the apex as the first data point, this distal data point becomes the new first data point. If this distal data point is on the same side of the apex as the second data point, this distal data point becomes the new second data point. The method then returns to the previous step that compares the slope at these two points to the slope of the line joining these two points.
The process iterates until the slopes of the plot at such distal data points are equal to, or within the selected value of, the slope of a line extending therebetween. Such data points are baseline data points which define a baseline and define the boundary of the peak.
In one embodiment, the distal data point is selected as the distal data point proximal to the first or second data point having a slope which exhibits the greatest deviation from the slope of the line extending between the first and second data points.
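The iterative comparison described above can be sketched as follows. This is a minimal illustration under several assumptions that the text does not fix: slopes are estimated by central differences, the initial first and second data points are the samples immediately adjacent to the apex, the point showing the greater deviation is the one moved outward, and ties move the first point. The function and parameter names are hypothetical.

```python
def local_slopes(times, responses):
    """Central-difference slope at each point (one-sided at the ends)."""
    n = len(responses)
    slopes = []
    for i in range(n):
        lo, hi = max(i - 1, 0), min(i + 1, n - 1)
        slopes.append((responses[hi] - responses[lo]) / (times[hi] - times[lo]))
    return slopes


def find_peak_boundaries(times, responses, apex, tolerance):
    """Return (first, second) indices bounding the peak around `apex`."""
    last = len(responses) - 1
    if not 0 < apex < last:
        raise ValueError("apex must have a sample on each side")

    slopes = local_slopes(times, responses)
    first, second = apex - 1, apex + 1   # assumed starting points

    while True:
        # Slope of the line joining the two candidate boundary points.
        chord = (responses[second] - responses[first]) / (times[second] - times[first])
        dev_first = abs(slopes[first] - chord)
        dev_second = abs(slopes[second] - chord)

        # Both slopes agree with the line: the points are baseline.
        if dev_first <= tolerance and dev_second <= tolerance:
            return first, second

        can_first, can_second = first > 0, second < last
        if not (can_first or can_second):
            return first, second         # ran out of data

        # Move the point with the greater deviation one sample outward.
        if can_first and (dev_first >= dev_second or not can_second):
            first -= 1
        else:
            second += 1


# A peak at index 5 superimposed on a baseline rising at 0.5 per sample.
times = list(range(11))
responses = [0, 0.5, 1, 1.5, 4, 7.5, 5, 3.5, 4, 4.5, 5]
bounds = find_peak_boundaries(times, responses, apex=5, tolerance=0.01)
```

Because the test compares slopes at the candidate points with the slope of the line joining them, rather than with a fixed threshold, the constant baseline slope in this example does not prevent the search from terminating on baseline points.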
The peak is determined by computing the second derivative of the plot to form a second derivative plot and identifying a minimum of the second derivative plot. The minimum of the second derivative plot corresponds to the apex of the peak of the plot.
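Apex detection from the second derivative can be sketched as follows, assuming uniformly sampled data and a three-point second difference. The `min_curvature` parameter is a hypothetical guard against noise and is not part of the description above.

```python
def second_derivative(responses, dt):
    """Three-point second difference; the two end points are left at 0.0."""
    d2 = [0.0] * len(responses)
    for i in range(1, len(responses) - 1):
        d2[i] = (responses[i - 1] - 2 * responses[i] + responses[i + 1]) / dt ** 2
    return d2


def find_apices(responses, dt, min_curvature):
    """Indices where the second derivative has a local minimum.

    Only minima more negative than -min_curvature are reported.
    """
    d2 = second_derivative(responses, dt)
    return [
        i
        for i in range(1, len(d2) - 1)
        if d2[i] < -min_curvature and d2[i] <= d2[i - 1] and d2[i] < d2[i + 1]
    ]


# A peak at index 5 superimposed on a baseline rising at 0.5 per sample.
responses = [0, 0.5, 1, 1.5, 4, 7.5, 5, 3.5, 4, 4.5, 5]
apices = find_apices(responses, dt=1.0, min_curvature=1.0)
```

A straight baseline contributes nothing to the second difference, so the location of the minimum in this sketch is unaffected by the baseline slope.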
The inventive method has application wherein the peak corresponds to a chemical entity flowing through a conduit, and the conduit has one or more valves that can direct the chemical entity to a further conduit, vessel or vent. The determination of the peak boundary allows the step of opening one or more valves to direct said chemical entity into the further conduit, vessel or vent with greater accuracy and reproducibility.
A further embodiment of the inventive method comprises an apparatus for processing detector response profiles. The profiles comprise a plot of data points plotted graphically on an X and Y-axis wherein the X-axis represents a first variable and the Y-axis represents a second variable, each variable having a value. The plot at each of the data points has a slope, wherein the data points having a slope which deviates from a consistent value define a peak and data points having a slope with a consistent value define a baseline.
The inventive method further comprises an apparatus having computing means for identifying a boundary of a peak. The computing means determines the presence of a peak, which peak has an apex and two sides. The computing means selects a first data point from a plurality of data points on one side of the apex of the peak and a second data point from a plurality of data points on a side of the peak opposite the side containing the first data point. The first data point and the second data point have a position on the plot with one or more distal data points. The distal data points are further removed from the apex and one or more proximal data points are closer to or are at the apex. The plot at the first data point has a first slope and the plot at the second data point has a second slope.
The computing means compares the first and second slopes to the slope of the line extending between such first and second data points. If the slopes of the plot at the first data point and the second data point are equal to, or within a selected value of, the slope of a line extending between such points, such points are baseline data points and such points define the boundary of the peak. Accordingly, the process terminates.
Where the first slope and second slope differ from the slope of a line extending between the first and second data points by more than the selected value, the computing means selects one distal data point and determines the slope of the plot at the selected distal data point. If this distal data point is on the same side of the apex as the first data point, this distal data point becomes the new first data point. If this distal data point is on the same side of the apex as the second data point, this distal data point becomes the new second data point. The computing means returns to the previous step that compares the slope at these two points to the slope of the line joining these two points.
In the event such slopes are not equal or within the selected value, the computing means repeats the step of the preceding paragraph until the slopes of the plot at such distal data points are equal to, or within the selected value of, the slope of a line extending therebetween. Such data points are baseline data points which define a baseline and such data points define the boundary of said peak.
Computing means of the apparatus selects the distal data point as the distal data point proximal to the first or second data point having a slope which exhibits the greatest deviation from the slope of the line extending between the first and second data points.
Computing means of the apparatus determines the presence of the peak by computing the second derivative of the plot to form a second derivative plot and identifying a minimum of the second derivative plot. The minimum of the second derivative plot corresponds to the apex of the peak of the plot.
The apparatus of the present invention is illustratively a chromatographic instrument wherein said detector response profiles are chromatograms. Computing means can comprise a computer or a processor unit programmed to perform in the manner above.
In an illustrative embodiment, the apparatus comprises one or more conduits, control means and valves. The peak corresponds to a chemical entity flowing through one of the conduits having one or more valves that can direct the chemical entity to a further conduit, vessel or vent. Valve control means is in communication with computing means to receive a signal corresponding to the peak. Valve control means opens one or more valves to direct the chemical entity into the further conduit, vessel or vent in response to a signal from the computing means to open or close the valve(s). Valve control means may comprise a further computer or processor unit or the same computer or processor unit as the computing means.
Another embodiment of the invention is directed to a method and apparatus for determining the presence of a peak of a detector response profile. Again, the profile comprises a plot of data points plotted graphically on an X and Y-axis wherein the X-axis represents a first variable and the Y-axis represents a second variable, each variable having a value. The plot at each of the data points has a slope wherein the data points having a slope which deviates from a consistent value define a peak and data points having a slope with a consistent value define a baseline. The method comprises the step of, and the apparatus comprises computing means for, computing the second derivative of the plot to form a second derivative plot and identifying a minimum of the second derivative plot. The minimum of the second derivative plot corresponds to the apex of the peak of the plot.
A further embodiment of the invention is directed to a one-point-tangent method for demarcating the boundary between a shouldered peak and an adjoining peak. There are two implementations of this method. The apex method comprises connecting a line from a point on an apex to a downside point on the down-slope side of the apex. The slope of the profile at the downside point equals the slope of the line. The line is tangent to the peak profile at the point of contact. In this further embodiment, the algorithm for finding the tangent point is straightforward. The initial line connects the apex to the down-slope inflection point. The point on the apex remains fixed while the point at the inflection point is moved away from the apex, sample point by sample point. As the point moves away, the slope of the line becomes larger while the slope at the contact point decreases. When the slope of the line is equal to or greater than the slope at the contact point, the line is fixed.
The inflection point method comprises connecting a line from a point on an up-slope inflection point to a downside point on the down-slope side of the apex. The slope of the profile at the downside point equals the slope of the line. The line is tangent to the peak profile at the point of contact. In this further embodiment, the algorithm for finding the tangent point is straightforward. The initial line connects the up-slope inflection point to the down-slope inflection point. The point on the up-slope inflection point remains fixed while the point at the down-slope inflection point is moved away from the apex, sample point by sample point. As the point moves away, the slope of the line becomes larger while the slope at the contact point decreases. When the slope of the line is equal to or greater than the slope at the contact point, the line is fixed.
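Both implementations of the one-point-tangent search can be sketched with a single routine that differs only in its fixed anchor point. The sketch reads "slope" on the down-slope side as steepness of descent, which is an interpretation rather than something the text states, and it estimates the profile slope by a central difference. The function name, the synthetic Gaussian peak and the index choices are hypothetical.

```python
import math


def tangent_contact(times, responses, anchor, start):
    """Index at which a line from `anchor` becomes tangent to the profile.

    `anchor` stays fixed (the apex, or the up-slope inflection point).
    `start` is the down-slope inflection point; the contact point moves
    away from the apex one sample at a time. Returns None if no tangent
    point is found before the data ends.
    """
    last = len(responses) - 1
    contact = start
    while contact < last:
        # Descent rate of the line from the anchor to the contact point.
        line_descent = (responses[anchor] - responses[contact]) / (
            times[contact] - times[anchor]
        )
        # Descent rate of the profile at the contact point.
        local_descent = (responses[contact - 1] - responses[contact + 1]) / (
            times[contact + 1] - times[contact - 1]
        )
        if line_descent >= local_descent:
            return contact
        contact += 1
    return None


# Synthetic peak: apex at index 50, inflection points at indices 45 and 55.
times = list(range(101))
peak = [100.0 * math.exp(-0.5 * ((i - 50) / 5.0) ** 2) for i in times]

apex_contact = tangent_contact(times, peak, anchor=50, start=55)
inflection_contact = tangent_contact(times, peak, anchor=45, start=55)
```

On this synthetic peak the line anchored at the up-slope inflection point touches the profile further from the apex than the line anchored at the apex, so the two implementations place the demarcation at different points.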
Features of the invention include provision of methods and apparatus that can locate a peak in a manner that is independent of the slope of the underlying baseline and independent of the height of the peak.