transform
Description
Xtransformed = transform(IncrementalMdl,X) transforms the predictor data X into principal component scores using the incremental PCA model IncrementalMdl. Xtransformed is a representation of X in the principal component space described by IncrementalMdl. For more information, see pca.
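As a rough illustration of what a principal component score is, the following sketch uses the batch pca function rather than incrementalPCA. It assumes that scores are obtained by centering the data with the estimated means and projecting onto the coefficient matrix, which is how pca computes its score output. The data Xdemo and Xnew are hypothetical, and the internals of transform may differ in detail (for example, when the model standardizes the data).

```matlab
rng(0,"twister")                      % For reproducibility
Xdemo = randn(200,5)*randn(5,5);      % Hypothetical correlated training data
Xnew  = randn(10,5);                  % Hypothetical new observations

% Batch PCA that keeps 3 components; mu holds the estimated column means
[coeff,score,~,~,~,mu] = pca(Xdemo,NumComponents=3);

% Scores are the centered data projected onto the coefficients
scoreManual = (Xdemo - mu)*coeff;

% The same projection maps new observations into the component space
scoreNew = (Xnew - mu)*coeff;
```

Here scoreManual reproduces the score output of pca up to floating-point error, and scoreNew has one row per new observation and one column per retained component.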
Examples
Perform Incremental Learning Incorporating Principal Component Analysis
Create a model for incremental principal component analysis (PCA) and a default incremental linear SVM model for binary classification. Fit the incremental models to streaming data and analyze how the principal components, model parameters, and performance metrics evolve during training. Use the final models to predict activity labels.
Load and Preprocess Data
Load the human activity data set. Randomly shuffle the data.
load humanactivity
n = numel(actid);
rng(0,"twister") % For reproducibility
idx = randsample(n,n);
X = feat(idx,:);
Y = actid(idx);
For details on the human activity data set, enter Description at the command line.
Responses can be one of five classes: Sitting, Standing, Walking, Running, or Dancing. Dichotomize the response by identifying whether the subject is moving (actid > 2).
Y = Y > 2;
Specify the first 20,000 observations and labels as streaming data, and the remaining observations and labels as test data.
n = 20000;
Xstream = X(1:n,:);
Ystream = Y(1:n,:);
Xtest = X(n+1:end,:);
Ytest = Y(n+1:end,:);
Create Incremental Models
Create a model for incremental PCA. Specify to standardize the data, keep 3 principal components, and set a warm-up period of 2000 observations.
IncrementalPCA = incrementalPCA(StandardizeData=true, ...
NumComponents=3,WarmupPeriod=2000);
details(IncrementalPCA)
  incrementalPCA with properties:

    IsWarm: 0
    NumTrainingObservations: 0
    WarmupPeriod: 2000
    Mu: []
    Sigma: []
    ExplainedVariance: [3x1 double]
    EstimationPeriod: 1000
    Latent: [3x1 double]
    Coefficients: [0x3 double]
    VariableWeights: [1x0 double]
    NumComponents: 3
    NumPredictors: 0

IncrementalPCA is an incrementalPCA model object. All its properties are read-only. By default, the software sets the hyperparameter estimation period to 1000 observations. The incremental PCA model must be warm (all hyperparameters are estimated) before the fit function returns transformed observations.
Create a default incremental linear SVM model for binary classification by using the incrementalClassificationLinear function.
IncrementalLinear = incrementalClassificationLinear;
details(IncrementalLinear)
  incrementalClassificationLinear with properties:

    Learner: 'svm'
    Solver: 'scale-invariant'
    BatchSize: 1
    Beta: [0x1 double]
    Bias: 0
    FitBias: 1
    FittedLoss: 'hinge'
    Lambda: NaN
    LearnRate: 1
    LearnRateSchedule: 'constant'
    Mu: []
    Sigma: []
    SolverOptions: [1x1 struct]
    EstimationPeriod: 0
    ClassNames: [0x1 double]
    Prior: [1x0 double]
    ScoreTransform: 'none'
    NumPredictors: 0
    NumTrainingObservations: 0
    MetricsWarmupPeriod: 1000
    MetricsWindowSize: 200
    IsWarm: 0
    Metrics: [1x2 table]

IncrementalLinear is an incrementalClassificationLinear model object. All its properties are read-only. IncrementalLinear must be fit to data before you can use it to perform any other operations. By default, the software sets the metrics warm-up period to 1000 observations and the metrics window size to 200 observations.
Fit Incremental Models
Fit the IncrementalPCA and IncrementalLinear models to the streaming data by using the fit and updateMetricsAndFit functions, respectively. To simulate a data stream, fit each model in chunks of 50 observations at a time. At each iteration:
Process 50 observations.
Overwrite the previous incremental PCA model with a new one fitted to the incoming observations.
Return the transformed observations Xtr.
Overwrite the previous incremental classification model with a new one fitted to the incoming transformed observations.
Store β1 (the first linear coefficient, IncrementalLinear.Beta(1)), the cumulative metrics, and the window metrics to see how they evolve during incremental learning.
Store topEV, the explained variance of the component with the highest variance, to see how it evolves during incremental learning.
numObsPerChunk = 50;
nchunk = floor(n/numObsPerChunk);
ce = array2table(zeros(nchunk,2),"VariableNames",["Cumulative" "Window"]);
% Element 1 of beta1 and topEV holds the initial value (0) before training,
% so each vector needs nchunk + 1 elements.
beta1 = zeros(nchunk+1,1);
topEV = zeros(nchunk+1,1);

% Incremental learning
for j = 1:nchunk
    ibegin = min(n,numObsPerChunk*(j-1) + 1);
    iend = min(n,numObsPerChunk*j);
    [IncrementalPCA,Xtr] = fit(IncrementalPCA,Xstream(ibegin:iend,:));
    IncrementalLinear = updateMetricsAndFit(IncrementalLinear,Xtr, ...
        Ystream(ibegin:iend));
    beta1(j + 1) = IncrementalLinear.Beta(1);
    ce{j,:} = IncrementalLinear.Metrics{"ClassificationError",:};
    topEV(j + 1) = IncrementalPCA.ExplainedVariance(1);
end
During the incremental PCA estimation and warm-up periods, the fit function returns the transformed observations as NaNs. After the PCA estimation period and warm-up period, updateMetricsAndFit fits the linear coefficient estimates using the transformed observations. After the metrics warm-up period, IncrementalLinear is warm, and updateMetricsAndFit checks the performance of the model on the incoming transformed observations, and then fits the model to those observations.
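The NaN behavior described above can be observed on a small synthetic stream. The following is a minimal sketch, not part of the original example: Xsim, Mdl, and the warm-up period of 200 observations are hypothetical choices, and the sketch assumes only that fit returns NaN scores for chunks processed before the model is warm.

```matlab
rng(0,"twister")                       % For reproducibility
Xsim = randn(600,4);                   % Hypothetical synthetic stream
Mdl = incrementalPCA(NumComponents=2,WarmupPeriod=200);

numObsPerChunk = 50;
nchunk = floor(size(Xsim,1)/numObsPerChunk);
isAllNaN = false(nchunk,1);

for j = 1:nchunk
    idx = (j-1)*numObsPerChunk + (1:numObsPerChunk);
    [Mdl,Xtr] = fit(Mdl,Xsim(idx,:));
    % Record whether this chunk came back entirely as NaN
    isAllNaN(j) = all(isnan(Xtr),"all");
end

% First chunk for which fit returned usable scores (empty if none did)
firstUsableChunk = find(~isAllNaN,1);
```

Inspecting isAllNaN shows which chunks a downstream model could not yet learn from; the exact iteration at which scores become available depends on the estimation and warm-up settings of the model.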
Analyze Incremental Models During Training
To see how the highest explained variance, β1 (stored in beta1), and the performance metrics evolve during training, plot them on separate tiles.
figure
t = tiledlayout(3,1);
nexttile
plot(topEV)
ylabel("Top EV [%]")
xline(IncrementalPCA.EstimationPeriod/numObsPerChunk,"r-.")
xlim([0 nchunk])
ylim([0 100])
nexttile
plot(beta1)
ylabel("\beta_1")
xline((IncrementalPCA.WarmupPeriod+ ...
    IncrementalPCA.EstimationPeriod)/numObsPerChunk,"b:")
xlim([0 nchunk])
nexttile
h = plot(ce.Variables);
xlim([0 nchunk])
ylabel("Classification Error")
xline((IncrementalLinear.MetricsWarmupPeriod+ ...
    IncrementalPCA.WarmupPeriod+ ...
    IncrementalPCA.EstimationPeriod)/numObsPerChunk,"g--")
legend(h,ce.Properties.VariableNames)
xlabel(t,"Iteration")
The highest explained variance value is 0 during the estimation period and then rapidly rises to 73%. The value then gradually approaches 77%.
The plots suggest that updateMetricsAndFit performs these steps:
Fit β1 after the PCA estimation and warm-up periods only.
Compute the performance metrics after the estimation, warm-up, and metrics warm-up periods only.
Compute the cumulative metrics during each iteration.
Compute the window metrics after processing 200 observations (four iterations).
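The positions of the reference lines in the plots follow from the period lengths reported for the two models. The arithmetic below restates those periods as iteration counts; the variable names are illustrative only, and the values are copied from the model displays in this example.

```matlab
numObsPerChunk      = 50;    % Chunk size used in the training loop
estimationPeriod    = 1000;  % IncrementalPCA.EstimationPeriod
warmupPeriod        = 2000;  % IncrementalPCA.WarmupPeriod
metricsWarmupPeriod = 1000;  % IncrementalLinear.MetricsWarmupPeriod
metricsWindowSize   = 200;   % IncrementalLinear.MetricsWindowSize

% Iteration at which the PCA estimation period ends (red line)
iterEstimationEnd = estimationPeriod/numObsPerChunk;

% Iteration at which the PCA model is warm and the classifier starts fitting (blue line)
iterFitStart = (estimationPeriod + warmupPeriod)/numObsPerChunk;

% Iteration at which the classifier starts tracking metrics (green line)
iterMetricsStart = (estimationPeriod + warmupPeriod + ...
    metricsWarmupPeriod)/numObsPerChunk;

% Number of iterations that make up one metrics window
itersPerWindow = metricsWindowSize/numObsPerChunk;
```

With these values, the reference lines fall at iterations 20, 60, and 80, and each metrics window spans 4 iterations.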
Predict Activity Labels Using Final Models
Transform the test data using the final incremental PCA model. Predict activity labels for the transformed test data using the final incremental linear classification model.
transformedXtest = transform(IncrementalPCA,Xtest);
predictedLabels = predict(IncrementalLinear,transformedXtest);
Create a confusion matrix for the test data.
figure
ConfusionTrain = confusionchart(Ytest,predictedLabels);
The final model misclassifies only 27 of 4075 observations in the test data.
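A misclassification count like the one above can also be read off a confusion matrix numerically. The sketch below uses small hypothetical label vectors (yTrue and yPred are not the example's data) and then converts the reported count of 27 out of 4075 into a rate.

```matlab
% Hypothetical labels for illustration only
yTrue = logical([1 0 1 1 0 0 1 0]');
yPred = logical([1 0 0 1 0 1 1 0]');

% Off-diagonal entries of the confusion matrix are the misclassifications
C = confusionmat(yTrue,yPred);
numWrong = sum(C,"all") - trace(C);
errorRate = numWrong/numel(yTrue);

% Test error rate implied by the counts reported for the final model
reportedRate = 27/4075;
```

The reported counts correspond to a test error rate of roughly 0.66%.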
Input Arguments
IncrementalMdl
— Incremental PCA model
incrementalPCA model object
Incremental PCA model, specified as an incrementalPCA model object. You can create IncrementalMdl by calling incrementalPCA directly.
X
— Chunk of predictor data
floating-point matrix
Chunk of predictor data to transform, specified as a floating-point matrix of n observations and IncrementalMdl.NumPredictors variables. The rows of X correspond to observations, and the columns correspond to variables.
Note
transform supports only numeric input data. If your input data includes categorical data, you must prepare an encoded version of the categorical data. Use dummyvar to convert each categorical variable to a numeric matrix of dummy variables. Then, concatenate all dummy variable matrices and any other numeric predictors. For more details, see Dummy Variables.
Data Types: single | double
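The encoding step described in the note can be sketched as follows. The variables color and weight are hypothetical predictors introduced only for illustration.

```matlab
% Hypothetical predictors: one categorical, one numeric
color  = categorical(["red";"green";"blue";"green"]);
weight = [1.2; 3.4; 2.2; 0.7];

% One dummy column per category of color
D = dummyvar(color);

% Concatenate the numeric predictor with the dummy variables
Xnumeric = [weight D];
```

Xnumeric is a purely numeric double matrix with one row per observation, which is the form of input that the note calls for.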
Output Arguments
Xtransformed
— Principal component scores
floating-point matrix
Principal component scores, returned as a floating-point matrix. The rows of Xtransformed correspond to observations, and the columns correspond to components.
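The following minimal sketch checks the shape of the output on synthetic data. Xtrain, Xchunk, and the model settings are hypothetical, and the sketch assumes that fitting on 5000 observations in a single call is enough for the model to become warm.

```matlab
rng(0,"twister")                 % For reproducibility
Xtrain = randn(5000,6);          % Hypothetical training stream
Xchunk = randn(25,6);            % Hypothetical chunk to transform

Mdl = incrementalPCA(NumComponents=2,WarmupPeriod=100);
Mdl = fit(Mdl,Xtrain);           % Fit in one call, well past the warm-up period

% One row per observation in Xchunk, one column per retained component
Xtransformed = transform(Mdl,Xchunk);
[numObs,numComps] = size(Xtransformed);
```

Here numObs matches the number of rows in Xchunk, and numComps matches Mdl.NumComponents.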
Version History
Introduced in R2024a
See Also
incrementalPCA | pca | reset | fit