Can the output of plsregress be used to calculate Q residuals and T2 for new X data

Assume we have spectral data xcal, ycal, xval, yval where
xcal is mxn : m spectra, or observations, of a sample, n wavelengths per spectrum
ycal is mx1 : m concentrations of the sample corresponding to the m observations in xcal
xval is 1xn : 1 new spectrum, or new observation, of the sample (i.e., not a member of xcal)
yval is 1x1 : 1 new concentration of the sample corresponding to the observation in xval
Assume m>n and ncomp<n, and let xcal0 be xcal with its column means subtracted:
xcal0 = xcal - ones(m,1)*mean(xcal);
Then the outputs of
[XL,YL,XS,YS,BETA,PCTVAR,MSE,STATS] = plsregress(xcal,ycal,ncomp);
can be used to compute the Q residuals, i.e. the row-wise sums of squares of the STATS.Xresiduals matrix,
and
STATS.T2 holds the Hotelling T^2 value
for each of the m observations in xcal.
Q residuals and T^2 values can be used to determine whether the observations in xcal are outliers.
Can the outputs of plsregress as described above be used to compute a Q residual and a T^2 value for the single observation in xval to determine if it seems to be an outlier with respect to xcal?

Answers (1)

Suraj Kumar on 3 Sep 2024
Hi Jeremy,
To calculate the Q residual and T^2 value of a new observation from the outputs of the ‘plsregress’ function in MATLAB, and so judge whether the observation is an outlier, you can follow these steps and code snippets:
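1. Perform the PLS regression and center the new observation with the calibration mean, because ‘plsregress’ centers xcal internally.
% Perform PLS regression
[XL, YL, XS, YS, BETA, PCTVAR, MSE, STATS] = plsregress(xcal, ycal, ncomp);
% Center the new observation using the calibration mean
xval0 = xval - mean(xcal, 1);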
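2. Project the centered observation onto the latent space using the weight matrix STATS.W, which satisfies XS = xcal0*STATS.W. This gives the scores for the new observation. The loading matrix XL maps scores back to the original space, so it is used for reconstruction rather than projection.
t_val = xval0 * STATS.W;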
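3. Calculate the T2 statistic by dividing each squared score by the variance of the corresponding calibration score (a column of XS) and summing over the PLS components. Using the same scaling as the calibration set keeps the value comparable with STATS.T2.
% T² value
T2_val = sum(t_val.^2 ./ var(XS, 0, 1), 2);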
4. For the Q residual, reconstruct the observation in the original space from its scores and the loadings, take the residual as the difference between the centered observation and its reconstruction, and sum the squared residuals over the wavelengths.
% Q residuals
xval_pred = t_val * XL';
xval_residual = xval0 - xval_pred;
Q_val = sum(xval_residual.^2, 2);
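5. To assess whether the observation is an outlier, compare the calculated T2 and Q values with thresholds derived from the calibration set, for example a high percentile of STATS.T2 and of the calibration Q residuals, sum(STATS.Xresiduals.^2, 2).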
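The steps above can be sketched end to end as follows. This is a minimal sketch under stated assumptions, not a definitive implementation: the synthetic data stands in for real spectra, the helper names scoresOf, t2Of and qOf are hypothetical, and the 95th-percentile limits are just one simple empirical choice of threshold. It assumes MATLAB R2016b or later (for implicit expansion) and the Statistics and Machine Learning Toolbox (for plsregress and prctile). As a sanity check, it applies the same formulas to xcal and compares the results with STATS.T2 and STATS.Xresiduals.

```matlab
% Synthetic stand-in data (assumption: replace with your own spectra)
rng(0);
m = 60; n = 20; ncomp = 3;
P = randn(ncomp, n);                         % hypothetical latent spectra
xcal = randn(m, ncomp) * P + 0.05 * randn(m, n);
ycal = xcal * randn(n, 1) + 0.05 * randn(m, 1);
xval = randn(1, ncomp) * P + 0.05 * randn(1, n);

% Fit the PLS model on the calibration set
[XL, ~, XS, ~, ~, ~, ~, STATS] = plsregress(xcal, ycal, ncomp);

% Quantities taken from the calibration set
mu = mean(xcal, 1);          % calibration mean used for centering
scoreVar = var(XS, 0, 1);    % variance of each calibration score

% Hypothetical helpers: scores, T^2 and Q for one or more rows of x
scoresOf = @(x) (x - mu) * STATS.W;
t2Of = @(x) sum(scoresOf(x).^2 ./ scoreVar, 2);
qOf = @(x) sum(((x - mu) - scoresOf(x) * XL').^2, 2);

% Sanity check: the calibration rows should reproduce the plsregress output
T2_cal = t2Of(xcal);
Q_cal = qOf(xcal);
Q_ref = sum(STATS.Xresiduals.^2, 2);
fprintf('max |T2 difference| = %g\n', max(abs(T2_cal - STATS.T2)));
fprintf('max |Q difference|  = %g\n', max(abs(Q_cal - Q_ref)));

% Statistics for the new observation
T2_val = t2Of(xval);
Q_val = qOf(xval);

% Empirical limits from the calibration set (assumption: 95th percentile)
T2_lim = prctile(STATS.T2, 95);
Q_lim = prctile(Q_ref, 95);
isOutlier = (T2_val > T2_lim) || (Q_val > Q_lim);
fprintf('T2 = %g (limit %g), Q = %g (limit %g), outlier = %d\n', ...
    T2_val, T2_lim, Q_val, Q_lim, isOutlier);
```

Percentile limits are easy to compute but depend on having a reasonably large and clean calibration set. Distribution-based limits are also commonly used, and the right choice depends on your application.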
To learn more about the ‘plsregress’ function, including the STATS output fields used above, see its page in the MATLAB documentation.
Hope this works for you!
