The present invention relates to analyzing chemical reactions and, more particularly, but not exclusively, to systems and methods for automatic classification of assays of chemical reactions.
Real-time monitoring of chemical assays through photometric measurements, as in real-time Polymerase Chain Reaction (PCR) and Quantitative Fluorescent Polymerase Chain Reaction (QF-PCR), produces a time series of values.
The values may be represented in a two-dimensional graph depicting spectral changes over time, say of a real-time PCR based assay. The values may further be represented in a three-dimensional graph depicting spectral changes vs. molecule length vs. time, say of a Capillary PCR based assay, etc., as known in the art.
For example, the spectral changes may include Fluorescence Intensity (FI) values measured over a PCR reaction apparatus, as known in the art. The measured FI values are indicative of the number of specific molecules detected in the PCR reaction.
The measured values may be used to classify the kind of chemical reaction that the assay involves.
For example, in QF-PCR, a graph representing the values measured over time may have linear properties, which indicate no amplification takes place. The graph may include a sigmoid curve interval, which indicates that a DNA amplification reaction occurs within the reaction apparatus.
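By way of illustration only, the distinction between a linear graph and a graph containing a sigmoid interval may be sketched as follows. The sketch fits a straight line to the measured values and treats a poor linear fit, combined with a sufficient overall rise, as a sign of amplification. The function names, the `min_rise` and `r2_threshold` parameters, their default values, and the synthetic data are all assumptions made for this example and are not taken from any of the methods discussed here. The check only separates straight-line behavior from non-straight-line behavior; it does not verify that the curve is specifically sigmoid.

```python
import math

def linear_r_squared(values):
    """R^2 of a least-squares straight line fitted to values vs. cycle index.

    Assumes a non-empty list of fluorescence intensity (FI) readings.
    """
    n = len(values)
    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(values))
    syy = sum((y - mean_y) ** 2 for y in values)
    if sxx == 0 or syy == 0:
        return 1.0  # a single point or a perfectly flat line is trivially linear
    return (sxy * sxy) / (sxx * syy)

def shows_amplification(values, min_rise=50.0, r2_threshold=0.95):
    """Heuristic: a large rise that a straight line explains poorly.

    min_rise and r2_threshold are illustrative values, not calibrated ones.
    """
    if max(values) - min(values) < min_rise:
        return False  # essentially flat: no amplification
    return linear_r_squared(values) < r2_threshold

# Synthetic data (hypothetical): 40 cycles of linear drift vs. a logistic curve.
linear_run = [100.0 + 2.0 * c for c in range(40)]
sigmoid_run = [100.0 + 1000.0 / (1.0 + math.exp(-0.5 * (c - 20))) for c in range(40)]

print(shows_amplification(linear_run))
print(shows_amplification(sigmoid_run))
```

In practice the thresholds would have to be tuned to the instrument and assay, and a noisy baseline may call for smoothing before the fit.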
Parameters extracted from the graph are used to determine the properties of the amplification. The right combination of parameters, say slopes of the graph at selected points, may indicate the existence of a specific subject, say the existence of a specific bacterial DNA sequence.
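As a minimal sketch of such parameter extraction, slopes at selected points may be estimated by central differences over the measured values. The helper names `slope_at` and `slope_features`, the choice of central differences, and the sample data are assumptions made for illustration only.

```python
def slope_at(values, index):
    """Central-difference slope at an interior index, in FI units per cycle."""
    if not 0 < index < len(values) - 1:
        raise IndexError("index must be an interior point")
    return (values[index + 1] - values[index - 1]) / 2.0

def slope_features(values, indices):
    """Slopes at the selected indices, usable as a simple feature vector."""
    return [slope_at(values, i) for i in indices]

# Hypothetical readings: FI grows quadratically with the cycle number.
fi = [0.0, 1.0, 4.0, 9.0, 16.0, 25.0]
print(slope_features(fi, [1, 2, 4]))  # [2.0, 4.0, 8.0]
```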
In Capillary PCR, DNA fragment length is used to determine the DNA structure. Capillary PCR may further indicate the existence or absence of known fragment patterns, or variation from certain known patterns. The patterns may indicate the existence, absence, or mutation of a specific gene.
In RT-PCR (Reverse-Transcriptase PCR), one of the above mentioned methods may be used to determine gene expression under specific conditions.
In gene expression analysis, it is determined whether a certain DNA sequence (i.e. a certain gene) is used for manufacturing RNA (say using QF-RT-PCR), or what the structure of the manufactured RNA is.
Traditionally, the classification of the assays is based on manual examination by an expert in the field. The expert manually examines hundreds or thousands of samples (say thousands of graphs derived from QF-PCR based assays), detects certain features in the samples, and classifies each sample into one of two or more groups of chemical reactions.
Some currently used methods provide for automatic detection of certain features in chemical reactions, such as PCR.
For example, US Patent Publication No. 20070148632, to Kurnik et al., describes systems and methods for determining characteristic transition values, such as elbow values, in sigmoid or growth-type curves, utilizing a Levenberg-Marquardt (LM) regression process.
U.S. patent application Ser. No. 11/861,188, to Kurnik et al., filed on Sep. 25, 2007, entitled “PCR elbow determination using curvature analysis of a double sigmoid”, describes a method utilizing a first or second degree polynomial curve that fits a growth-type curve, and determination of a statistical significance value for the curve fit. The significance value indicates whether the data represents significant or valid growth.
Some of the currently used methods are based on a multivariate statistical model.
For example, PCT Patent Application No. PCT/IB2006/051025, filed on Apr. 4, 2006, to Tichopad et al., entitled “Assessment of Reaction Kinetics Compatibility between Polymerase Chain Reactions”, describes the use of a large training set to statistically compare properties of chemical assays.
Similarly, Wold et al. describe, in a 1977 article entitled “SIMCA: A method for analyzing chemical data in terms of similarity and analogy”, in Kowalski, B. R., ed., Chemometrics Theory and Application, American Chemical Society Symposium Series 52, Washington, D.C., American Chemical Society, p. 243-282, a method which requires the availability of a large training data set of samples, with a set of attributes and class memberships.
The above described methods rely on a training set built manually, by an expert. In order to build the training set, the expert has to manually examine hundreds or thousands of samples. The expert further has to classify each sample into one of two or more groups of chemical reactions.
The manual classification of the assays may also be based on a set of logical conditions used to validate a new sample against a constructed list of decisions, say using a decision tree, as known in the art.
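A list of decisions of this kind may be sketched, purely for illustration, as an ordered set of logical conditions evaluated against features extracted from a sample. The feature names (`total_rise`, `max_slope`), the thresholds, and the class labels below are hypothetical and stand in for whatever conditions an expert would actually construct.

```python
def classify(sample, rules, default="unclassified"):
    """Return the label of the first rule whose condition holds for the sample."""
    for condition, label in rules:
        if condition(sample):
            return label
    return default

# Hypothetical expert rules, evaluated top to bottom like a flattened decision tree.
RULES = [
    (lambda s: s["total_rise"] < 50.0, "no amplification"),
    (lambda s: s["max_slope"] >= 100.0, "strong amplification"),
    (lambda s: s["max_slope"] >= 20.0, "weak amplification"),
]

new_sample = {"total_rise": 900.0, "max_slope": 125.0}
print(classify(new_sample, RULES))  # strong amplification
```

Because the rules are checked in order, each later rule implicitly assumes that all earlier conditions failed, which is the same property a path through a decision tree has.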