function [result] = multisvm(TrainingSet,GroupTrain,TestSet)
%Models a given training set with a corresponding group vector and
%classifies a given test set using an SVM classifier according to a
%one vs. all relation.
%
%This code was written by Cody Neuburger cneuburg@fau.edu
%Florida Atlantic University, Florida USA...
%This code was adapted and cleaned from Anand Mishra's multisvm function
%found at http://www.mathworks.com/matlabcentral/fileexchange/33170-multi-class-support-vector-machine/
GroupTrain=GroupTrain';
u=unique(GroupTrain);
numClasses=length(u);
%TestSet=TestSet';
%TrainingSet=TrainingSet';
result = zeros(size(TestSet,1),1);
%build models
for k = 1:numClasses
    %Vectorized statement that binarizes Group
    %where 1 is the current class and 0 is all other classes
    G1vAll = (GroupTrain == u(k));
    models{k} = fitcsvm(TrainingSet, G1vAll);
end
%classify test cases
for j = 1:size(TestSet,1)
    for d = 1:numClasses
        if (predict(models{d}, TestSet(j,:)))
            break;
        end
    end
    result(j) = d;
end
%disp(result);
%disp(GroupTrain);
load Group_Test   % loads Group_Test1 from Group_Test.mat
%disp(Group_Test1);
%Accuracy = mean(Group_Test1==result)*100;
%fprintf('Accuracy = %f\n', Accuracy);
%fprintf('error rate = %f\n ', length(find(result ~= Group_Test1 ))/length(Group_Test1'));
c = 0;
for j = 1:size(TestSet,1)
    if Group_Test1(j) == result(j)
        c = c + 1;
    end
end
acc = c/size(TestSet,1)   % was c/100, which assumed exactly 100 test samples
end

2 Comments

DGM on 9 Aug 2021
I'm assuming that the equality test in the if statement in the screenshot is never true. If these are floating point numbers, that's entirely possible.
The error in the predict statement


 Accepted Answer

Walter Roberson on 9 Aug 2021


result(j) is going to be a class number, an integer represented in double precision.
GroupTrain is not necessarily an integer class number at all, and is not necessarily consecutive from 1 even if it is integer. All we know is that it is a datatype that unique() can be applied to and that == comparisons work for.
For example if GroupTrain is 10, 20, 30, then u = unique() of that would be 10, 20, 30, and the code would loop through training based upon whether the class was 10, then whether it was 20, and so on. Then it would loop over classes, and use predict() and if the prediction was non-zero then it would record the class index rather than u() indexed at the class index. So predictions might be perfect, but it would be 1, 2, 3 recorded, and those would not match the 10, 20, 30s of the classes.
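The mismatch can be seen with a toy example (the 10/20/30 labels are illustrative, not from the question's data):

```matlab
% Minimal illustration of the index-vs-label mismatch:
GroupTrain = [10 10 20 20 30 30];
u = unique(GroupTrain);        % u = [10 20 30]
d = 2;                         % suppose class 2's model fires for a sample
result_wrong = d;              % records 2, but the true label is 20
result_right = u(d);           % records u(2) = 20, matching GroupTrain
```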

21 Comments

The error in the predict statement
I disagree. I think the error is in
result(j) = d;
and that it should be
result(j) = u(d);
There is no change after that fix; there is still an error, and the accuracy is still zero.
We will need your code and your data to test with. You can zip it all up and attach the .zip .
Your label code needed a lot of change.
But even then your accuracy was zero because of poor choice of training parameters.
Revised code is attached.
Along the way, I made it easier to configure for different directories.
The accuracy is now up to about 19%. To get higher accuracy, you will need to do a bunch of testing with the options described in the Tips section of the fitcsvm() documentation, such as adjusting Nu or alpha or similar parameters.
You may see messages about particular classes not having converged. If you change verbose to 1 you will get a bunch of output showing that for those classes and those parameters, the fitting is not doing well. Changing parameters has the potential to help a lot.
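As a sketch only, one way to start experimenting with fitcsvm options inside the training loop (the parameter values here are guesses to try, not known-good settings for this dataset):

```matlab
% Possible starting point for tuning the per-class SVMs:
models{k} = fitcsvm(TrainingSet, G1vAll, ...
    'KernelFunction', 'rbf', ...
    'KernelScale',    'auto', ...   % let fitcsvm pick a heuristic scale
    'BoxConstraint',  1, ...        % try e.g. 0.1, 1, 10
    'Standardize',    true);
% fitcsvm can also search for KernelScale/BoxConstraint itself (slow):
% models{k} = fitcsvm(TrainingSet, G1vAll, 'OptimizeHyperparameters', 'auto');
```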
When I changed the imresize from 200*50 to 400*200, I got this error. I need the size to be 400*200.
Error using classreg.learning.impl.CompactSVMImpl/score (line 63)
You must pass X as a matrix with 42336 columns.
Error in classreg.learning.classif.CompactClassificationSVM/score (line 591)
f = score(this.Impl,X,true,varargin{:});
Error in classreg.learning.classif.ClassificationModel/predict (line 411)
scores = score(this,X,varargin{:});
Error in classreg.learning.classif.CompactClassificationSVM/predict (line 433)
predict@classreg.learning.classif.ClassificationModel(this,X,varargin{:});
Error in multisvm (line 39)
if(predict(models{d},TestSet(t,: )))
result1= multisvm(TrainingSet,Group_Train1,TestSet,Group_Test1);
All I had to do was change both imresize() in HOG_NEW and then rerun HOG_NEW and then rerun HOG2 .
With the larger image size, classification took notably longer, but it did finish. Accuracy was less than 21% though.
The existing data is only part of the dataset, but when I used all the data I got an "out of memory" error:
Error using cat
Out of memory. Type "help memory" for your options.
Error in HOG_NEW (line 14)
Feat1 = cat(1, Feature{:}); % Or cat(2, ...) ?!
The attached changes will reduce the total memory use, by getting rid of some variables after they are no longer needed.
However, you might still need about 16 gigabytes of memory to process that dataset.
With that extended dataset, Feat1 will be about 8.5 gigabytes, and Feat2 will be just under 2 gigabytes.
The code accumulates Feat1 in pieces and then puts the pieces together, so during the time it is putting the pieces together, it temporarily needs about 16-ish gigabytes (half occupied by the cell array containing the pieces, half occupied by the array that is formed by putting all the pieces together.) It would be possible to rewrite that to not use the cell array.
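A sketch of that rewrite, replacing `cat(1, Feature{:})` with a preallocated array so the pieces and the combined array never coexist in full (it assumes every `Feature{i}` has the same column count):

```matlab
% Alternative to Feat1 = cat(1, Feature{:}) that frees each piece as it is copied:
nRows = sum(cellfun(@(f) size(f,1), Feature));
nCols = size(Feature{1}, 2);
Feat1 = zeros(nRows, nCols, 'like', Feature{1});  % preallocate once
r = 0;
for i = 1:numel(Feature)
    n = size(Feature{i}, 1);
    Feat1(r+1:r+n, :) = Feature{i};
    Feature{i} = [];        % release this piece immediately
    r = r + n;
end
```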
The hog_feature_vector function could be rewritten to improve speed and reduce memory use, by the way. The hog_feature_vector construction takes most of the time.
Running the classification phase is taking about 23 gigabytes.
You only have 12 gigabytes on your computer, but the size of HOG training feature database is about 8.5 gigabytes and the size of the HOG test feature database is about 2.0 gigabytes, and the way your code is arranged, both of those need to be in memory at the same time.
You can reduce memory use a bit: after you have trained, you can remove the training data. If you split up training and testing, then you could set things up so that you do not load the test data until after you have cleared the training data; that would reduce your peak memory use.
But you should pretty much expect that the SVM training process is going to need temporary variables about as large as the training data, so you should expect that you are going to need about 17 gigabytes to do this work, if you still need the 400 x 200 image sizes.
I have been running the classification task for the last day. A short time ago it peaked at over 100 gigabytes of memory used (it is swapping to disk.) On your system with 12 gigabytes of memory, the task would be pretty much impossible (and would take days.)
Before going ahead, you need to concentrate on improving the performance of the code.
For example: you only need one trained classifier to exist at a time. You train a classifier, and use it to predict against all the remaining test samples; any rows that match get recorded and then get removed from the test sample, and the classifier can then be thrown away.
Note: you can run any one classifier against an array of samples, instead of running predict against only one sample at a time; you just have to change your if logic into logical indexing.
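Both ideas together might look like this (variable names follow the question's code; this is an untested sketch, not a drop-in replacement):

```matlab
% Train one classifier at a time, predict the whole remaining batch,
% record and discard matching rows, then throw the classifier away.
remaining = (1:size(TestSet,1)).';   % indices of still-unclassified rows
result    = zeros(size(TestSet,1), 1);
for k = 1:numClasses
    G1vAll = (GroupTrain == u(k));
    mdl    = fitcsvm(TrainingSet, G1vAll);                  % one model in memory
    hit    = logical(predict(mdl, TestSet(remaining, :)));  % batch prediction
    result(remaining(hit)) = u(k);   % record the class label, not the index
    remaining(hit) = [];             % drop classified rows from further passes
    clear mdl                        % classifier no longer needed
end
```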
sun rise on 26 Aug 2021
Edited: Walter Roberson on 26 Aug 2021
When the image was resized to 400 * 100, I got this error:
Error using classreg.learning.FullClassificationRegressionModel.prepareDataCR (line 231)
X and Y do not have the same number of observations.
Error in ClassificationSVM.prepareData (line 632)
classreg.learning.FullClassificationRegressionModel.prepareDataCR(...
Error in classreg.learning.FitTemplate/fit (line 217)
this.PrepareData(X,Y,this.BaseFitObjectArgs{:});
Error in ClassificationSVM.fit (line 240)
this = fit(temp,X,Y);
Error in fitcsvm (line 334)
obj = ClassificationSVM.fit(X,Y,RemainingArgs{:});
Error in multisvm (line 29)
models{k} =
fitcsvm(TrainingSet,G1vAll,'KernelFunction','polynomial','polynomialorder',3,'Solver','ISDA','Verbose',0,'Standardize',true);
Error in HOG2 (line 15)
result1= multisvm(Feat1,Group_Train1,Feat2,Group_Test1);
I let the code run for about 40 hours. It was up to 132 gigabytes of memory. I got tired of it and canceled it; it really needs a rewrite.
I changed the imresize() to [400,100] in both places, and reran HOG_NEW, which ran without problem.
I then re-ran HOG2. After about 2 hours I asked it to pause; about half an hour later it did pause, having just completed building the first classification tree out of 937. Estimated time to build the trees is therefore roughly
days(hours(2.5) * 937)
ans = 97.6042
which is more than 3 months.
I then asked it to apply that classification tree to all of the training data, which took about 20 minutes. That adds another
days(hours(1/3) * 937)
ans = 13.0139
So you should expect your code to take more than 4 months to run. Likely more, since your system is slower than mine.
There is not much you can do to speed up building the classification trees... though possibly dropping the members of each class after that class is trained might help. Results would probably be less robust.

