Neuroimaging researchers frequently acquire multi-modality image data together with various non-imaging measurements. For example, FDG-PET and structural (e.g., volumetric) MRI brain images, as well as a complete battery of neuropsychological tests, are acquired from each healthy subject every two years in our NIH-sponsored longitudinal APOE-ε4 study. In their study of imaging neurofibrillary tangles and beta-amyloid plaques using 2-(1-[6-[(2-[18F]fluoroethyl)(methyl)amino]-2-naphthyl]ethylidene)malononitrile (FDDNP) (Shoghi-Jadid, K. et al. 2002), researchers from UCLA acquired three imaging datasets: FDG-PET, FDDNP-PET and T1-weighted volumetric MRI. Similarly, researchers at the University of Pittsburgh used two PET tracers, FDG and PIB, in their study of imaging brain amyloid in AD (Klunk, W. E. et al. 2004). The availability of multi-modality imaging datasets gives researchers an opportunity to examine multiple processes simultaneously, yet it poses a methodological challenge: how to optimally integrate and use the multiple datasets to understand the underlying biological system.
Methods already exist that use data from one image modality in the analysis of another. Image fusion has long been used to localize functional findings on the anatomical map provided by structural images (for an example, see (Reiman, E. M. et al. 2004)). Similarly, regions of interest (ROIs) defined on anatomical images can be used to extract data from a functional dataset in order to investigate brain responses manipulated by the experimental conditions. Taking advantage of its high resolution, volumetric MRI has also been routinely used to correct functional images for the combined effects of partial volume averaging and atrophy (Pietrini, P. et al. 1998). In an FDG-PET study, this correction allows researchers to determine whether the underlying cause of an observed functional alteration lies purely in the glucose metabolic pathway or is mostly structural (Reiman, Chen, Alexander, Caselli, Bandy, Osborne, Saunders, and Hardy 2004). Besides these procedures, which are used mostly in structural-functional studies, findings from one imaging modality are often correlated with those from another imaging modality or from non-imaging measurements using conventional correlation analysis (Shoghi-Jadid, Small, Agdeppa, Kepe, Ercoli, Siddarth, Read, Satyamurthy, Petric, Huang, and Barrio 2002). Overall, the approaches listed here are relatively straightforward, and they mostly analyze data from a single primary modality with another modality in a supportive, secondary role. In contrast, the approach proposed in the current study is multi-modality, inter-network and multivariate in nature. It aims to establish the optimal way to link multiple datasets and to combine the information from each of them, enhancing the researcher's ability to detect alterations related to experimental conditions or to the onset, progression or treatment of disease.
As mentioned above, our approach is multivariate in nature. Multivariate analysis has long been used in single-modality studies as a complement to univariate analysis. These single-modality, intra-network multivariate analyses, whether model-based or data-driven, characterize brain inter-regional covariances/correlations. The methods, voxel- or ROI-based, include principal component analysis (PCA) (Friston, K. J 1994), the PCA-based Scaled Subprofile Model (SSM) (Alexander, G E and Moeller, J R 1994), independent component analysis (McKeown, M. J. et al. 1998; Duann, J. R. et al. 2002; Arfanakis, K. et al. 2000; Moritz, C. H. et al. 2000; Calhoun, V. D. et al. 2001; Chen, H. et al. 2002; Esposito, F. et al. 2003; Calhoun, V. D. et al. 2003; Schmithorst, V. J. and Holland, S. K. 2004; Beckmann, C. F. and Smith, S. M. 2004) and the Partial Least Squares (PLS) method (McIntosh et al. 1996; Worsley, K. J. et al. 1997). Also included are multiple correlation analysis (Horwitz, B 1991; Horwitz, B. et al. 1999), structural equation modeling (McIntosh, A. R. and Gonzalez-Lima, F 1994; Horwitz, Tagamets, and McIntosh 1999), path analysis (Horwitz, B. et al. 1995; Worsley, K. J. et al. 1997), and dynamic causal modeling (Friston, K. J. et al. 2003). These methods have typically been used to characterize regional networks of brain function (and, more recently, brain gray matter concentration (Alexander, G et al. 2001)) and to test their relation to measures of behavior. None of these multivariate methods, however, has been used to identify patterns of regional covariance among multiple imaging datasets.
Motivated by the availability of multiple neuroimaging datasets and encouraged by the success of single-modality network analysis, especially the PLS work, we set out to search for tools that find the maximal linkage among multiple datasets or combine them optimally for increased statistical power. We believe that dual-block PLS (DBPLS) and multi-block PLS (MBPLS) should be the first tools to explore for this purpose. The challenges and difficulties of performing inter-modality analysis using PLS, and our plan for further methodological development, are described later. First, however, we review the general PLS methodology, the success of DBPLS in the neuroimaging field (mainly by McIntosh and his colleagues) and that of MBPLS, mainly in the chemometrics and bioinformatics areas.
Review of the PLS Method
According to the Encyclopedia for Research Methods for the Social Sciences, PLS regression is a relatively recent technique that generalizes and combines features from PCA and multiple regression. It is particularly useful when one needs to predict a set of dependent variables from a large set (or sets) of independent variables (Abdi, H. 2003).
The traditional use of PLS regression is to predict (not to link) a dependent dataset Y from c (c≥1) independent datasets X1, . . . Xc, hence the term PLS regression. Note that throughout this description the variables in each dataset are arranged column-wise in the data matrix. In addition to PLS regression, we are also interested in using PLS to describe the linkages among multiple datasets without labeling any of them as dependent or independent. The development of the PLS linkage methodology is described later; here we review the PLS regression methodology. In a sense, PLS is not needed when Y is a vector (a single-variable dataset) and X is full rank (assuming c=1), because the Y-X relationship can then be estimated using ordinary multiple regression. In our neuroimaging studies, and especially in our inter-network analysis, the number of voxels/variables is greater than one and in fact much larger than the number of subjects/scans, so multicollinearity exists in each dataset. Several approaches have been developed to cope with this problem when Y is a vector (which is not the case in our neuroimaging studies). One approach, called principal component regression, performs a principal component analysis (PCA) of the X matrix and then uses the principal components of X as regressors on Y. Though the orthogonality of the principal components eliminates the multicollinearity problem, nothing guarantees that the principal components, which explain X, are relevant for Y (Abdi 2003). By contrast, PLS regression finds components from X that are also relevant for Y. Specifically, PLS regression searches for a set of components that performs a simultaneous decomposition of X and Y with the constraint that these components explain as much as possible of the covariance between X and Y (Abdi 2003).
Finding the first PLS regressor is equivalent to maximizing the covariance between a linear combination of the variables in Y and a linear combination of the variables in X (the paired linear combinations are referred to as the first latent variable pair). For this first latent variable pair, the maximal covariance is symmetric in Y and X. Symmetry here means that the result does not depend on which dataset is designated as dependent. The symmetry is lost for subsequent latent pairs, however, as is demonstrated below.
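The symmetry of the first latent variable pair can be checked numerically. The following is a minimal sketch, assuming zero-mean columns and unit-norm weight vectors (called w for X and c for Y here); the function name `first_latent_pair` and the random toy data are illustrative assumptions, not part of the original work. It obtains the weights from the leading singular vectors of X′Y, so the maximized quantity is w′X′Yc, which equals the sample covariance of the latent variables up to a constant factor that depends only on the number of subjects.

```python
import numpy as np

def first_latent_pair(X, Y):
    """Return unit weights (w, c) maximizing w'X'Yc, and that maximal value."""
    # The leading singular triplet of the cross-product matrix X'Y solves the problem.
    U, s, Vt = np.linalg.svd(X.T @ Y, full_matrices=False)
    return U[:, 0], Vt[0, :], s[0]

rng = np.random.default_rng(0)          # toy data, purely illustrative
X = rng.standard_normal((20, 6))        # 20 subjects, 6 variables
Y = rng.standard_normal((20, 4))        # 20 subjects, 4 variables
X -= X.mean(axis=0)                     # zero-mean columns
Y -= Y.mean(axis=0)

w, c, s_xy = first_latent_pair(X, Y)    # X treated as the "independent" block
c2, w2, s_yx = first_latent_pair(Y, X)  # roles of the two blocks swapped

t, u = X @ w, Y @ c                     # first latent variable pair
print(s_xy, s_yx, t @ u)                # the three values agree
```

Swapping the roles of X and Y returns the same maximal value and, up to sign, the same weight vectors, which is the symmetry described in the text.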
DBPLS Algorithm:
As mentioned above, DBPLS uncovers the sequential maximal covariance between two datasets by constructing a series of latent variable pairs. Starting from the original data matrices X and Y (standardized as necessary), the first latent variable pair is constructed as follows. The latent variable of X is t=Σwixi, where wi is a scalar and xi is the ith column of X (i=1, 2, . . . ). In matrix form, t=Xw, where w=(w1, w2, . . . )T with ∥w∥=1. Similarly, the Y latent variable can be expressed as u=Yc (∥c∥=1). In the context of dual-imaging datasets, and as a matter of convenience, we refer to w and c as the singular images of X and Y, respectively. The covariance of the two latent variables t and u is therefore cov(t,u)=w′X′Yc (assuming zero mean for the variables in both datasets). The maximal covariance value with respect to w and c can be proven to be the square root of the largest eigenvalue of the matrix Ω=[X′YY′X], with w being the corresponding eigenvector of Ω and c being the corresponding eigenvector of Y′XX′Y. Before the second latent variable pair is constructed, the effects of the first latent variable pair need to be regressed out from X and Y, a step referred to as deflation in the chemometrics PLS literature:
Compute the loading vectors

$$p_1 = \frac{X' t}{\lVert t \rVert^2}, \qquad q_1 = \frac{Y' u}{\lVert u \rVert^2}, \qquad r_1 = \frac{Y' t}{\lVert t \rVert^2}$$

and calculate the new matrices X1 and Y1 as X1=X−tp1′ and Y1=Y−tr1′.
The same calculation is then repeated for the new matrix pair X1 and Y1 to construct the second latent variable pair. The third and remaining latent variable pairs (up to the rank of X) are calculated similarly.
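The sequential procedure can be sketched in a few lines of NumPy. This is an illustrative sketch under stated assumptions rather than a definitive implementation: the function name `dbpls` and the toy data are hypothetical, the weights are taken from the singular value decomposition of X′Y (equivalent to the eigenvector formulation of Ω), and only the loadings p and r that enter the deflation are computed (q is not needed for that step and is omitted).

```python
import numpy as np

def dbpls(X, Y, n_pairs):
    """Sequential latent variable pairs with deflation X1 = X - t p', Y1 = Y - t r'."""
    X = X - X.mean(axis=0)               # work on centered copies
    Y = Y - Y.mean(axis=0)
    W, C, T, U = [], [], [], []
    for _ in range(n_pairs):
        # Leading singular vectors of X'Y give the unit weights w and c.
        A, s, Bt = np.linalg.svd(X.T @ Y, full_matrices=False)
        w, c = A[:, 0], Bt[0, :]
        t, u = X @ w, Y @ c              # latent variable pair
        p = X.T @ t / (t @ t)            # loading of X on t
        r = Y.T @ t / (t @ t)            # loading of Y on t
        X = X - np.outer(t, p)           # deflate X by t
        Y = Y - np.outer(t, r)           # deflate Y by t (not by u)
        W.append(w); C.append(c); T.append(t); U.append(u)
    return (np.column_stack(W), np.column_stack(C),
            np.column_stack(T), np.column_stack(U))

rng = np.random.default_rng(1)           # toy data, purely illustrative
X = rng.standard_normal((20, 6))
Y = rng.standard_normal((20, 4))
W, C, T, U = dbpls(X, Y, n_pairs=3)
gram_T = T.T @ T                          # diagonal up to rounding error
```

Because both matrices are deflated with the X score t, the successive t vectors are mutually orthogonal while the u vectors generally are not. This is one concrete way to see why the symmetry between X and Y holds only for the first latent variable pair.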
MBPLS Algorithm:
The calculation of MBPLS is based on the DBPLS procedure described above, with a deflation scheme that accounts for the presence of more than one independent block. Westerhuis and Smilde described the following numerical procedure (Westerhuis, J. A. and Smilde, A. K. 2001):
1. Calculate the first latent variable pair of the DBPLS model between X=[X1, . . . , Xc] and Y. The scores t and u, weight w and loadings p and q are obtained. From these, the multiblock PLS block weights wb, the super weights ws and the block scores tb are obtained.
2. wb=w(b)/∥w(b)∥2
3. tb=Xbwb
4. ws(b)=tbTu/uTu
5. ws=ws/∥ws∥2
Block score deflation:
6a. pb=XbTtb/tbTtb
7a. Eb=Xb−tbpbT
8a. F=Y−tqT
Super score deflation:
6b. Eb=Xb−tp(b)T
7b. F=Y−tqT
For additional components, set X=[E1, . . . , Ec] and Y=F and go back to step 1.
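One component of the procedure above, using the super score deflation variant (steps 1 to 5, then 6b and 7b), might be sketched as follows. The function name `mbpls_component`, the block sizes and the random data are hypothetical. Two points are assumptions of this sketch rather than statements from the source: the weights in steps 2 and 5 are normalized to unit Euclidean length, and the Y loading q is computed by regressing Y on the super score t.

```python
import numpy as np

def mbpls_component(blocks, Y):
    """One MBPLS component with super score deflation (steps 1-5, 6b, 7b)."""
    X = np.hstack(blocks)
    # Step 1: first latent pair of the DBPLS model between X = [X1, ..., Xc] and Y.
    A, s, Bt = np.linalg.svd(X.T @ Y, full_matrices=False)
    w, c = A[:, 0], Bt[0, :]
    t, u = X @ w, Y @ c
    p = X.T @ t / (t @ t)                 # X loadings
    q = Y.T @ t / (t @ t)                 # Y loadings (assumption: regressed on t)
    # Split w and p into their per-block segments w(b) and p(b).
    edges = np.cumsum([b.shape[1] for b in blocks])[:-1]
    w_parts, p_parts = np.split(w, edges), np.split(p, edges)
    w_blocks = [wp / np.linalg.norm(wp) for wp in w_parts]         # step 2
    t_blocks = [Xb @ wb for Xb, wb in zip(blocks, w_blocks)]       # step 3
    w_super = np.array([tb @ u / (u @ u) for tb in t_blocks])      # step 4
    w_super = w_super / np.linalg.norm(w_super)                    # step 5
    E = [Xb - np.outer(t, pb) for Xb, pb in zip(blocks, p_parts)]  # step 6b
    F = Y - np.outer(t, q)                                         # step 7b
    return t, w_blocks, t_blocks, w_super, E, F

rng = np.random.default_rng(2)            # toy data, purely illustrative
n = 15
blocks = [rng.standard_normal((n, k)) for k in (5, 3, 4)]
blocks = [b - b.mean(axis=0) for b in blocks]
Y = rng.standard_normal((n, 2))
Y -= Y.mean(axis=0)
t, w_blocks, t_blocks, w_super, E, F = mbpls_component(blocks, Y)
```

For additional components, the residual blocks E and the residual F would be passed back into the same function, mirroring the final instruction of the procedure.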
Different deflation steps can be used, and the choice plays a crucial part in the MBPLS calculation. The block score deflation, suggested by Gerlach and Kowalski (Gerlach, R. W. and Kowalski, B. R. 1979), led to inferior prediction. Westerhuis et al. showed that super score deflation gave the same results as when all variables were kept in one large X-block and a DBPLS model was built. The super scores summarize the information contained in all blocks, whereas the block scores summarize the information of a specific block. However, the super score deflation method mixes variation between the separate blocks and therefore leads to interpretation problems. To avoid mixing up the blocks, deflating only Y using the super scores was proposed (Westerhuis and Smilde 2001). This leads to the same predictions as super score deflation of X, but because X is not deflated, the information in the blocks is not mixed up.
Review of DBPLS in the Intra-Modality Neuroimaging Studies
McIntosh and his colleagues first introduced DBPLS into the neuroimaging field in 1996 (McIntosh, Bookstein, Haxby, and Grady 1996) for intra-modality spatial pattern analysis in relation to behavior or experimental conditions. Subsequent to this study, Worsley considered an alternative PLS procedure, which he referred to as orthonormalized PLS (Worsley, Poline, Friston, and Evans 1997), to address the issue of invariance to arbitrary linear transformations. Since then, DBPLS has been extended, improved and applied extensively in various brain studies, mainly by McIntosh and his group. Their efforts include further methodological developments, such as the extension from PET to functional MRI studies and from the original PLS to seed-PLS (McIntosh, A. R. et al. 1999) or spatiotemporal-PLS (Lobaugh, N. J. et al. 2001; Lin, F. H. et al. 2003), and numerous applications in brain function/disease studies (McIntosh, A. R. 1998; McIntosh, A. R. 1999; Rajah, M. N. et al. 1999; O'Donnell, B. F. et al. 1999; Anderson, N. D. et al. 2000; Iidaka, T. et al. 2000; Lobaugh, West, and McIntosh 2001; Nestor, P. G. et al. 2002; Keightley, M. L. et al. 2003; Habib, R. et al. 2003). Another significant contribution from McIntosh's group is the introduction of non-parametric inference procedures, permutation or bootstrapping, for intra-modality PLS neuroimaging studies (for example, see the paper that first introduced them (McIntosh, Bookstein, Haxby, and Grady 1996)).
Review of DBPLS in the Inter-Modality Neuroimaging Studies
At the World Congress on Medical Physics and Biomedical Engineering in Sydney, Australia, in 2003 (Chen, K et al. 2003), our group reported preliminary inter-network results linking FDG-PET to MRI-segmented gray matter, overcoming a huge computing obstacle related to the size of the covariance matrix between two imaging datasets (the number of voxels in one image dataset × the number of voxels in the other). Our aim is to seek direct linkage or regression between dual-modality imaging datasets (MBPLS regression or MBPLS linkage analysis).
One year later, researchers from McIntosh's group reported alternative approaches for analyzing multi-modality imaging data at the 13th Annual Rotman Research Institute Conference, Mar. 17-18, 2004 (Chau, W et al. 2004). They used the same operational procedure as in their intra-modality PLS studies to answer the same question: which neuroimaging covariance patterns are related to experimental conditions or behavior. In other words, the neuroimaging datasets always and only serve as the X blocks in the PLS regression notation above, with the experimental conditions or behavior data as the dependent Y block (Chau, Habib, and McIntosh 2004). Since direct linkage between or among multi-modality datasets was not the purpose of their investigation, there was no need to deal computationally with the size of the covariance matrix. Also, since the number of X blocks is more than one, an investigation of the deflation scheme is needed, but this was not considered in their study.
Review of DBPLS and MBPLS in Chemometrics and Bioinformatics
Though the successes of DBPLS in the neuroimaging field have indeed been impressive, the application of MBPLS in this field has yet to mature: its success remains to be demonstrated and new algorithms remain to be developed. Numerous successful applications of both DBPLS and MBPLS, however, have been reported in the field of fermentation and granulation for the food and pharmaceutical industries. The importance of PLS in the chemometrics field is evidenced by the online editorial in the Journal of Chemometrics (Höskuldsson, A 2004). An incomplete review of MBPLS in these fields is provided here, together with some discussion of its relevance to our intended neuroimaging applications.
Esbensen et al. analyzed data from an electronic tongue (an array of 30 non-specific potentiometric chemical sensors) using PLS regression for qualitative and quantitative monitoring of a batch fermentation process of starting culture for light cheese production (Esbensen, K. et al. 2004). They demonstrated that PLS-generated control charts allow samples from fermentation batches run under “abnormal” operating conditions to be discriminated from “normal” ones as early as 30-50% of the way through a fully evolved fermentation (Esbensen, Kirsanov, Legin, Rudnitskaya, Mortensen, Pedersen, Vognsen, Makarychev-Mikhailov, and Vlasov 2004). Relevant to our proposal, this study is a clear demonstration of the predictive power of MBPLS based on multiple historical datasets, a power that a physician would dream of duplicating for the early diagnosis of a disease.
In another study (Lopes, J. A. et al. 2002), the performance of an industrial pharmaceutical process (production of an active pharmaceutical ingredient by fermentation) was modeled by MBPLS. With the multiblock approach, the authors were able to calculate weights and scores for each independent block (defined as manipulated or quality variables for different process stages). They found that the inoculum quality variables had a high influence on the final active pharmaceutical ingredient (API) production for nominal fermentations. For the non-nominal fermentations, the manipulated variables of the fermentation stage explained the amount of API obtained. As demonstrated in this study, the contributions of individual data blocks to the final output can be determined. The neuroimaging analog of their study is to use PLS to evaluate the relative contributions of various datasets (MRI, FDG-PET, neuropsychological tests) in accurately predicting the onset of AD or in evaluating the effects of treatments.
Hwang and colleagues discussed the application of MBPLS to the field of tissue engineering in a recent publication (Hwang, D. et al. 2004). They used an MBPLS model to relate environmental factors and fluxes to levels of intracellular lipids and urea synthesis. The MBPLS model enabled them to identify (1) the most influential environmental factors and (2) how the metabolic pathways are altered by these factors. Moreover, the authors inverted the MBPLS model to determine the concentrations and types of environmental factors required to obtain the most economical solution for achieving optimal levels of cellular function in practical situations. The multiple datasets (or multi-groups, as the authors call them) included the group of environmental factors and C groups, each consisting of a number of metabolites and fluxes that have similar metabolic behaviors. Like the study by Lopes et al., this study illustrates the power of MBPLS to assess the relative importance of each independent dataset in predicting the behavior of interest. Moreover, this study showcases the use of MBPLS to determine the variable combinations that give rise to the optimal level of the dependent variables.
Note that the MBPLS applications reviewed above are all in the framework of multiple independent (predictor) blocks and a single dependent block, each consisting of no more than N variables, where N is ten thousand times smaller than the number of voxels/variables in the neuroimaging datasets.
With respect to neuroimaging, a major challenge to the multivariate analysis of regional covariance with multiple imaging modalities is the extremely high dimensionality of the data matrix created by including relatively high-resolution neuroimaging datasets. What is needed is a strategy that makes multivariate computation on such high-dimensional datasets feasible.
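To give a sense of scale, the sketch below estimates the storage needed for a voxel-by-voxel cross-covariance matrix and then checks, on small toy data, a standard linear-algebra identity: the nonzero eigenvalues of X′YY′X (voxels × voxels) equal those of (YY′)(XX′) (subjects × subjects). The voxel counts, matrix sizes and variable names are hypothetical values chosen for illustration, and the identity is shown only as one generic observation; it is not presented as the strategy developed in this work.

```python
import numpy as np

# Hypothetical image sizes, chosen only to illustrate the scale of the problem.
n_vox_x = 200_000
n_vox_y = 200_000
cross_cov_bytes = n_vox_x * n_vox_y * 8      # float64 storage for X'Y
print(cross_cov_bytes / 1e9, "GB")

# Small-scale check: few subjects (rows), many variables (columns).
rng = np.random.default_rng(3)
n, px, py = 10, 300, 250
X = rng.standard_normal((n, px)); X -= X.mean(axis=0)
Y = rng.standard_normal((n, py)); Y -= Y.mean(axis=0)

small = (Y @ Y.T) @ (X @ X.T)                # n x n, never forms X'Y
vals, vecs = np.linalg.eig(small)
k = np.argmax(vals.real)
lam = vals[k].real                           # largest eigenvalue
w_small = X.T @ vecs[:, k].real              # map the eigenvector back to voxel space
w_small /= np.linalg.norm(w_small)

# Reference computed from the full px x py cross-product matrix.
U, s, Vt = np.linalg.svd(X.T @ Y, full_matrices=False)
print(np.isclose(s[0] ** 2, lam, rtol=1e-6), abs(w_small @ U[:, 0]))
```

With only a handful of subjects, the subjects × subjects matrix stays tiny regardless of image resolution, whereas the direct cross-product matrix grows with the product of the two voxel counts.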