My kNN algorithm code and while loop are not working. Please help.

Hello,
I have a question about the kNN algorithm in MATLAB. I'm trying to use kNN classification and I want to find the best k. I wrote a small loop, but it only works with small data sets.
My data is the "Adult Data Set", which I downloaded from the internet. I have 36179 objects as training data and 9044 objects as test data.
I want to make predictions for the test data and find the best k value according to the error.
My code does not give any error, but it also does not give any output. If you can help me, I will be very glad.
Here is my code:
training = xlsread('AdultDataTraining.xlsx','a2:n36179');
trainingclass =xlsread('AdultDataTraining.xlsx','o2:o36179');
test = xlsread('AdultDataTest.xlsx','a2:n9045');
testclass = xlsread('AdultDataTest.xlsx','o2:o9045');
num = 9044;
list_prediction = [];
class_list = [testclass];
error_list = [];
i = 1;
while i <= 36178
    dissimilarity = 0;
    knn = fitcknn(training,trainingclass,'NumNeighbors',i,'Standardize', 1);
    prediction_test = predict(knn, test(:,:));
    list_prediction = [list_prediction, prediction_test];
    k = 1;
    while k <= length(class_list)
        if class_list(k) == list_prediction(k)
            num = 0;
            dissimilarity = dissimilarity + num;
        else
            num = 1;
            dissimilarity = dissimilarity + num;
        end
        k = k + 1;
    end
    error = dissimilarity / num;
    error_list = [error_list, error];
    i = i + 1;
end
min(error_list)
  1 Comment
Image Analyst on 2 Jan 2021
You forgot to attach your workbook files with the paper clip icon. I'll check back later for them.


Answers (1)

Anay on 2 Apr 2025
Hi Eda,
I understand that you get no output when you run kNN predictions on your data. Since you have not provided your data files, I cannot tell whether there is a problem with the data itself. However, the code you posted contains an issue that is a likely cause.
The variable "num", which holds the number of samples in the test data set, is unintentionally overwritten inside the inner loop. This makes the error calculation wrong and also risks a division by zero:
while k <= length(class_list)
    if class_list(k) == list_prediction(k)
        num = 0; % value of num is getting reset
        dissimilarity = dissimilarity + num;
    else
        num = 1; % value of num is getting reset
        dissimilarity = dissimilarity + num;
    end
    k = k + 1;
end
error = dissimilarity / num; % risk of divide by zero
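One way to avoid the overwritten counter is to keep the sample count in its own variable and count mismatches with a vectorized comparison. The following is only a minimal sketch: the label vectors are made-up toy values, and the names trueLabels, predLabels, numTest, mismatches, and errRate are illustrative, not taken from the original code.

```matlab
% Toy labels standing in for testclass and one set of predictions (made-up values)
trueLabels = [1; 0; 1; 1];
predLabels = [1; 1; 1; 0];

numTest    = numel(trueLabels);              % fixed sample count, never reused as a counter
mismatches = sum(predLabels ~= trueLabels);  % number of wrong predictions
errRate    = mismatches / numTest;           % misclassification rate
```

Because numTest is only assigned once, the denominator can be zero only if the test set itself is empty.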
Also, you are iterating from K = 1 to K = 36178, the number of training samples, where K is the number of nearest neighbours. Fitting and predicting with a K that large is computationally expensive and needs substantial memory, and repeating it for every K means the loop may not finish in a reasonable time. Consider a much smaller maximum K for your data, such as K = 20.
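As a rough sketch of a bounded search, the loop below tries K = 1 to 20 and records one error rate per K. It assumes the Statistics and Machine Learning Toolbox is available; the data is synthetic (random two-class points), and the names Xtrain, ytrain, Xtest, ytest, maxK, and errList are illustrative placeholders for the real Adult data. Note that it compares each K's predictions directly with the test labels. In the posted code, predictions are appended as new columns, so list_prediction(k) appears to always read the first column, that is, the predictions for K = 1.

```matlab
% Synthetic two-class data standing in for the real training and test sets
rng(0);
Xtrain = [randn(50,2); randn(50,2) + 3];
ytrain = [zeros(50,1); ones(50,1)];
Xtest  = [randn(20,2); randn(20,2) + 3];
ytest  = [zeros(20,1); ones(20,1)];

maxK    = 20;                 % upper bound on the number of neighbours to try
errList = zeros(1, maxK);     % preallocated: one error rate per K

for k = 1:maxK
    mdl  = fitcknn(Xtrain, ytrain, 'NumNeighbors', k, 'Standardize', true);
    pred = predict(mdl, Xtest);
    errList(k) = mean(pred ~= ytest);   % fraction of misclassified test samples
end

[bestErr, bestK] = min(errList);        % lowest error and the K that achieved it
fprintf('best K = %d, error = %.4f\n', bestK, bestErr);
```

Using the second output of min returns the best K itself, not just the smallest error value.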
I hope this helps to solve the issue!

Release

R2020b
