How can I reduce features using PCA and get good results in pattern recognition?

I have a 200x90 data set (rows x columns).
The input is 200x88 and the target is 200x2.
With 88 features, 200 samples are too few for my neural network to reach good accuracy.
More data would help, but I can't collect any more.
So I want to use PCA to reduce the 88 features.
I attached data200 and shared my code below.
When I apply PCA I get this error:
% Error using network/train (line 340)
% Inputs and targets have different numbers of samples.
% Error in deneme (line 50)
% [net,tr] = train(net,x,t);
I think this is because it also reduces the number of rows. I want to keep all 200 rows and reduce only the number of columns.
I couldn't work out how to apply PCA properly.
veri=xlsread('data200.xlsx');
input=veri(:,1:88);
target=veri(:,89:90);
input=pca(input);
x=input';
t=target';
% Solve a Pattern Recognition Problem with a Neural Network
% Script generated by Neural Pattern Recognition app
% Created 07-Feb-2021 15:50:44
%
% This script assumes these variables are defined:
%
% x - input data.
% t - target data.
% Choose a Training Function
% For a list of all training functions type: help nntrain
% 'trainlm' is usually fastest.
% 'trainbr' takes longer but may be better for challenging problems.
% 'trainscg' uses less memory. Suitable in low memory situations.
trainFcn = 'trainscg'; % Scaled conjugate gradient backpropagation.
% Create a Pattern Recognition Network
hiddenLayerSize = [10];
net = patternnet(hiddenLayerSize, trainFcn);
% Setup Division of Data for Training, Validation, Testing
net.divideParam.trainRatio = 70/100;
net.divideParam.valRatio = 15/100;
net.divideParam.testRatio = 15/100;
% Train the Network
[net,tr] = train(net,x,t);
% Test the Network
y = net(x);
e = gsubtract(t,y);
performance = perform(net,t,y)
tind = vec2ind(t);
yind = vec2ind(y);
percentErrors = sum(tind ~= yind)/numel(tind);
% View the Network
view(net)
% Plots
% Uncomment these lines to enable various plots.
%figure, plotperform(tr)
%figure, plottrainstate(tr)
%figure, ploterrhist(e)
%figure, plotconfusion(t,y)
%figure, plotroc(t,y)

Answers (1)

Ayush on 3 Apr 2024 at 4:25
Hi,
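It seems that you are having trouble applying PCA in your current implementation. The call input=pca(input) returns the principal component coefficients (one row per feature), not the projected data, so the network receives a different number of samples than the target has. You can follow the steps below: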
  1. Scale your features (i.e., standardize them). You can use the "zscore" function for this.
  2. Use the "pca" function to reduce the dimensionality, choosing either the number of principal components to keep or the percentage of variance to retain.
  3. Project your original data onto the reduced space obtained from PCA. The "score" output of "pca" already contains this projection.
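To make the shapes concrete, here is a minimal sketch of what "pca" returns. It assumes random stand-in data of the same size as the question's input (not the real data200 file) and needs the Statistics and Machine Learning Toolbox. It shows that the first output has one row per feature, while the second output keeps one row per sample:

```matlab
% Assumption: random data stands in for the real 200-by-88 input.
rng(0);                       % make the random numbers reproducible
X = randn(200, 88);

Xz = zscore(X);               % standardize each column (feature)
[coeff, score] = pca(Xz);     % coeff: loadings, score: projected data

disp(size(coeff))             % 88 88  (rows correspond to features)
disp(size(score))             % 200 88 (rows correspond to samples)

% score is the centered data multiplied by the coefficient matrix
mu = mean(Xz, 1);
D = score - (Xz - mu) * coeff;
maxDiff = max(abs(D(:)));     % should be close to zero (round-off only)
```

Transposing "coeff" would give the network 88 columns, which is why "train" reports a sample mismatch against the 200 target columns. Transposing "score" keeps all 200 samples.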
Refer to the code below for a better understanding:
% Load data
veri = readtable("data200.xlsx");
veri = table2array(veri);
input = veri(:,1:88);
target = veri(:,89:90);
% Standardize the input data
inputStandardized = zscore(input);
% Apply PCA
% pca returns all components; 'explained' holds the percentage of total
% variance captured by each one, which is used below to keep 95%.
[coeff, score, ~, ~, explained] = pca(inputStandardized);
% Determine the number of components to keep 95% variance
cumulativeVariance = cumsum(explained);
numComponents = find(cumulativeVariance >= 95, 1, 'first');
% Project the input data onto the reduced space
inputPCA = score(:,1:numComponents);
% Now, use inputPCA for your neural network training
x = inputPCA';
t = target';
% The rest of your code remains the same for creating and training the network
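If you would rather fix the number of components up front, the "pca" function accepts a 'NumComponents' option. The sketch below also shows one way to apply the same standardization and projection to samples the network has not seen. The random data, the variable Xnew, and the choice k = 20 are assumptions made only for illustration:

```matlab
% Assumption: random stand-ins for the real input and for new samples.
rng(1);
X = randn(200, 88);
Xnew = randn(5, 88);              % hypothetical unseen samples

% Standardize and keep the column means and standard deviations
[Xz, mu, sigma] = zscore(X);

% Request a fixed number of components (k = 20 is an arbitrary choice)
k = 20;
[coeff, score] = pca(Xz, 'NumComponents', k);

% Apply the same standardization and projection to the new samples.
% Xz already has zero column means, so no extra centering is needed.
XnewPCA = ((Xnew - mu) ./ sigma) * coeff;

x = score';                       % k-by-200, one column per sample
xNew = XnewPCA';                  % k-by-5, ready for net(xNew)
```

Reusing mu, sigma, and coeff from the original data keeps new samples in the same reduced space that the network was trained on.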
The attached image shows the resulting network that I got.
For more information on the "zscore" and "pca" functions, refer to the MATLAB documentation.
