1. Field of the Invention
The present invention concerns a computerized method and apparatus for processing data using the general linear model, wherein the data are composed of a number of data sets with a plurality of different random samples.
2. Description of the Prior Art
Data processing methods are known wherein, for each independent random sample contained in the data set, its dependency on an order quantity is compared with the dependency on the order quantity in model functions contained in a model matrix G. The comparison is made using the general linear model in order to check the occurrence of specific characteristics in the dependency of the random sample on the order quantity. The calculations on the data that are necessary for the comparison are performed, in a predetermined sequence of the data sets, for all relevant data of a data set and are stored as intermediate results. The intermediate results of an immediately preceding data set are updated with the new calculations, such that an intermediate result exists at any time and a final result exists after at least one pass through all data sets, from which a conclusion can be derived about the occurrence of the characteristics in the dependency of the random sample on the order quantity.
There exists a need to acquire information about brain activity in humans and animals, in particular in the field of medical technology and medical research. Neuronal activation is expressed as an increase of the blood flow in activated brain areas, leading to a decrease of the blood deoxyhemoglobin concentration. Deoxyhemoglobin is a paramagnetic substance that reduces the magnetic field homogeneity and, since it accelerates the T2* signal relaxation, can be shown with magnetic resonance techniques.
A localization of brain activity is enabled by the use of a functional imaging method that measures the change of the NMR signal relaxation with a time delay. The underlying biological mechanism is known in the literature as the BOLD (Blood Oxygen Level Dependent) effect. Fast magnetic resonance imaging enables the BOLD effect to be examined in vivo dependent on activation states of the brain. In functional magnetic resonance tomography, magnetic resonance data acquisitions of the subject volume to be examined (for example the brain of a patient) are made at short temporal intervals. A stimulus-specific neuronal activation can be detected and spatially localized by comparing the signal curve measured by functional imaging for each volume element of the subject volume with the time curve of a model function. The stimulus can be, for example, a somatosensory, acoustic, visual or olfactory stimulus as well as a mental or motor task. The model function or the model time series describes the expected signal change of the magnetic resonance signal as a result of neuronal activation. Short temporal intervals between the individual measurements can be realized by the use of fast magnetic resonance techniques such as, for example, the echo planar method.
In many multivariate statistical analyses, a model known as the general linear model (GLM) is used for the comparison of the measured signal curve with the time curve of a model function. The general linear model is a least squares fit of measurement data to one or more model functions. With the aid of the general linear model it is determined which linear combination of the model functions best approximates the measurement data series.
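The least squares fit described above can be illustrated with a minimal sketch. The model matrix G, the three model functions (constant offset, linear drift, boxcar stimulus), the function name fit_glm and all numeric values are assumptions chosen for illustration and do not come from any cited implementation.

```python
import numpy as np

def fit_glm(G, y):
    """Return the coefficients beta minimizing ||y - G @ beta||^2 and the residuals."""
    beta, _, _, _ = np.linalg.lstsq(G, y, rcond=None)
    residuals = y - G @ beta
    return beta, residuals

# Hypothetical model matrix: one column per model function, one row per time point.
n = 8
t = np.arange(n, dtype=float)
G = np.column_stack([
    np.ones(n),                                    # constant offset
    t,                                             # linear drift
    np.array([0, 0, 1, 1, 0, 0, 1, 1], float),     # boxcar stimulus
])

# Synthetic, noise-free time series of one voxel built from known coefficients.
true_beta = np.array([100.0, 0.5, 2.0])
y = G @ true_beta

beta, residuals = fit_glm(G, y)
```

Because the synthetic series is an exact linear combination of the model functions, the fit recovers the chosen coefficients and the residuals vanish up to rounding.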
Furthermore, for each model function a calculation can be made as to how significantly the measurement data contradict the null hypothesis that the respective model function makes no contribution to the measurement data series. The general linear model is used for analysis of measurement data in many fields such as, for example, physics or sociology. It is also suitable for analysis of time series measured in functional magnetic resonance imaging (fMRI). By use of the general linear model, it can be analyzed whether the measured time series show a pattern that corresponds to the local neuronal activity. In addition to this pattern, however, the time series frequently also show other characteristics such as, for example, drifts or other effects that can likewise be modeled in the framework of the general linear model. This allows a better analysis of the measurement data than, for example, t-test or correlation methods. For example, with the general linear model it is also possible to analyze a number of effects in the brain in parallel. Group statistics over a number of test subjects are also possible. Further application possibilities of the general linear model are found, for example, in “Human Brain Function” by R. Frackowiak et al., Academic Press.
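One common way to quantify how strongly the data contradict the null hypothesis of no contribution is a t-value for a contrast of the coefficients. The following is a minimal sketch under the textbook assumption of independent, equal-variance noise; the function name glm_t_value, the contrast vector c and the synthetic data are hypothetical and serve only as illustration.

```python
import numpy as np

def glm_t_value(G, y, c):
    """t-value for the contrast c @ beta under i.i.d. noise (textbook assumption)."""
    n, m = G.shape
    GtG_inv = np.linalg.inv(G.T @ G)
    beta = GtG_inv @ G.T @ y
    resid = y - G @ beta
    sigma2 = resid @ resid / (n - m)          # residual variance, n - m degrees of freedom
    return (c @ beta) / np.sqrt(sigma2 * (c @ GtG_inv @ c))

# Hypothetical model: constant offset plus a boxcar stimulus.
boxcar = np.array([0, 0, 1, 1, 0, 0, 1, 1], float)
G = np.column_stack([np.ones(8), boxcar])

# Deterministic stand-in for noise, chosen orthogonal to both model functions.
e = 0.1 * np.array([1, -1, 1, -1, 1, -1, 1, -1], float)

c = np.array([0.0, 1.0])                      # test the stimulus coefficient only
t_active = glm_t_value(G, 100.0 + 2.0 * boxcar + e, c)   # voxel with stimulus response
t_null = glm_t_value(G, 100.0 + e, c)                    # voxel without response
```

A large magnitude of the t-value indicates that the data contradict the null hypothesis for that model function; in the sketch, the voxel with a stimulus response yields a large t-value and the voxel without one yields a t-value near zero.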
Conventionally, in the processing of measurement data that are acquired from a subject volume with the method of functional magnetic resonance tomography, it has been necessary first to load the entire measurement data (which are composed of a number of volume data sets created by temporally successive measurements) into the main memory of the computer. The signal curve or the time series then had to be extracted for each volume element of the measured subject volume and compared with the respective model function. In the known implementation of the general linear model in the freely available SPM software (Wellcome Department of Cognitive Neurology; University of London; published under the GNU General Public License; http://www.fil.ion.ucl.ac.uk/spm/), it is likewise necessary to load the complete data set to be analyzed into the main memory of the computer. For long fMRI studies, possibly with multiple test subjects, this data set can contain several hundred megabytes up to gigabytes of data. The values to be analyzed, which belong to a time series of measurement data, are then extracted and the general linear model is directly calculated.
To reduce the main memory requirement as well as the calculation time in the calculation of the general linear model, methods for incremental calculation of the general linear model are proposed in WO 03/016824 A2 and DE 102 54 606 A1. In methods of this type, the calculations on the measurement data that are required for the comparison are performed, in an order resulting from the temporal sequence of the measurements, for all relevant measurement data of a data set and are stored as intermediate results. The intermediate results are updated with each new calculation, such that an intermediate result exists at any time and an end result exists after at least one pass through all data sets, from which a conclusion can be derived about the occurrence of the characteristics in the measured curve.
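The principle of such an incremental calculation can be sketched as follows. This is a generic accumulation of the normal equations, assumed here for illustration only; it is not taken from WO 03/016824 A2 or DE 102 54 606 A1, and the class name IncrementalGLM and the synthetic data are hypothetical.

```python
import numpy as np

class IncrementalGLM:
    """Accumulates G^T G and G^T Y one data set (volume) at a time."""

    def __init__(self, n_regressors, n_voxels):
        self.GtG = np.zeros((n_regressors, n_regressors))
        self.GtY = np.zeros((n_regressors, n_voxels))
        self.count = 0

    def update(self, g_row, volume):
        # g_row: values of all model functions at this time point
        # volume: measured values of all voxels at this time point
        self.GtG += np.outer(g_row, g_row)
        self.GtY += np.outer(g_row, volume)
        self.count += 1

    def coefficients(self):
        # Only defined once the accumulated rows make G^T G invertible.
        return np.linalg.solve(self.GtG, self.GtY)

# Hypothetical model matrix (offset, drift, boxcar) and three synthetic voxels.
n = 8
G = np.column_stack([
    np.ones(n),
    np.arange(n, dtype=float),
    np.array([0, 0, 1, 1, 0, 0, 1, 1], float),
])
B_true = np.array([[100.0, 90.0, 110.0],
                   [0.5, 0.0, -0.5],
                   [2.0, 0.0, 1.0]])
Y = G @ B_true                      # one row per volume, one column per voxel

glm = IncrementalGLM(n_regressors=3, n_voxels=3)
for g_row, volume in zip(G, Y):     # volumes arrive in temporal order
    glm.update(g_row, volume)
B_incremental = glm.coefficients()
```

Only the accumulated matrices are held in memory, so the storage requirement depends on the number of model functions and voxels but not on the number of volumes.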
However, it has become apparent that intermittent problems can occur in the calculation of the intermediate results. The time curve of the measurements of a voxel can be interpreted as an n-dimensional vector. From a geometric perspective, m model functions form the basis of an m-dimensional sub-space onto which the vector of the measurement data is projected. In order to be able to uniquely specify the vector of the projection in the m-dimensional sub-space on the basis of the model functions, the model functions must not be collinear. For a practical application with numerical stability, the model functions must contain components that are sufficiently orthogonal to one another. Model functions that exhibit sufficiently orthogonal components when considered over the entire time curve of the measurement can nevertheless be exactly or nearly collinear within individual time segments when those segments are considered on their own. This prevents an evaluation in the corresponding time segment or the associated data set because the calculations are either not defined or do not supply numerically stable results.
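The collinearity problem within a time segment can be made concrete with a small sketch. The two model functions (a constant offset and a boxcar stimulus) and the helper name segment_is_estimable are assumptions for illustration only.

```python
import numpy as np

def segment_is_estimable(G_segment):
    """True if the model functions are linearly independent within the segment."""
    return np.linalg.matrix_rank(G_segment) == G_segment.shape[1]

# Hypothetical model: constant offset and a boxcar that is "on" for the first half.
G = np.column_stack([
    np.ones(8),
    np.array([1, 1, 1, 1, 0, 0, 0, 0], float),
])

full_ok = segment_is_estimable(G)          # entire time curve
first_half_ok = segment_is_estimable(G[:4])  # boxcar equals the constant here
```

Over the entire time curve the two model functions are linearly independent, but within the first four time points the boxcar is identical to the constant offset, so the coefficients cannot be determined uniquely from that segment alone.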
E. Bagarinao et al., “Estimation of general linear model coefficients for real-time application”, NeuroImage 19 (2003) 422-429, describe a method for incremental calculation of the general linear model in which the calculations on the measurement data that are required for the comparison are likewise performed in an order of the data sets that arises from the temporal sequence of the measurements. In this method, the model functions of the general linear model are initially orthogonalized with the Gram-Schmidt method. The ancillary coefficients associated with the orthogonalized functions are then estimated based on the measurement data, and the coefficients of the model functions are finally calculated from the ancillary coefficients. This enables a reliable calculation of intermediate results of the method.
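The three steps named above (orthogonalization, estimation of ancillary coefficients, back-calculation of the model coefficients) can be sketched in batch form as follows. This is a generic modified Gram-Schmidt procedure applied to the complete model matrix and is not the incremental scheme of the cited publication; the function names, the collinearity threshold and the synthetic data are assumptions for illustration.

```python
import numpy as np

def gram_schmidt(G, tol=1e-12):
    """Modified Gram-Schmidt: G = Q @ R, Q with orthonormal columns, R upper triangular."""
    n, m = G.shape
    Q = np.zeros((n, m))
    R = np.zeros((m, m))
    for j in range(m):
        v = G[:, j].astype(float)          # astype returns a copy
        for i in range(j):
            R[i, j] = Q[:, i] @ v          # project the running remainder
            v = v - R[i, j] * Q[:, i]
        R[j, j] = np.linalg.norm(v)
        if R[j, j] < tol:                  # hypothetical threshold
            raise ValueError("model functions are (nearly) collinear")
        Q[:, j] = v / R[j, j]
    return Q, R

def glm_via_orthogonalization(G, y):
    Q, R = gram_schmidt(G)
    ancillary = Q.T @ y                    # coefficients of the orthogonalized functions
    return np.linalg.solve(R, ancillary)   # coefficients of the original model functions

# Hypothetical model matrix (offset, drift, boxcar) and a synthetic voxel series.
n = 8
G = np.column_stack([
    np.ones(n),
    np.arange(n, dtype=float),
    np.array([0, 0, 1, 1, 0, 0, 1, 1], float),
])
true_beta = np.array([100.0, 0.5, 2.0])
y = G @ true_beta

Q, R = gram_schmidt(G)
beta = glm_via_orthogonalization(G, y)
```

Because the orthogonalized functions are mutually orthonormal, each ancillary coefficient is a simple inner product with the data, and the triangular system relating them to the original coefficients is solved only at the end.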