Sir, although I can't replace the old data, I have tried several times over the past month to use the Classification Learner app to evaluate the confusion matrix using the old data, but I always fail. Can you help me with code to compute this confusion matrix, so that I can see which data are well classified and which are badly classified? Thank you for your continued support. My code is attached. Thank you!

Two requirements of the "confusionmat" function are not fulfilled in your MATLAB script:

1. The inputs must be vectors or character matrices. Your inputs are of type double. You can use the "num2str" function to convert the double type to a char array, for example: char_test_Coords = num2str(test_Coords)
2. Both character array inputs must have the same size. In your script, the length of train_Coords is 120 and the length of test_Coords is 30.

Making the above changes should fix the problem.

how to evaluate my result knn code using confusion matrix

merlin toche on 16 Jan 2023

Thanks for your help and clarification on the script.

I read it carefully, but I don't fully understand (excuse me, I'm still learning machine learning).

Regarding the size of the input vectors, I partitioned my data into training data (80%) and test data (20%). Does this mean that the confusion matrix does not take both types of data into account?

Please sir, can you explain more? Thank you and see you soon.

Rajeev on 17 Jan 2023

No worries.

As you have mentioned, the data has been partitioned into training(80%) and testing(20%) sets.

Ideally, the next step is to train your model on the training data. Once trained, you test your model using the rest of the data, i.e. the test data (20%).

Let us take an example to understand it better. I have taken this image from the website https://builtin.com/data-science/train-test-split.

For this example, I will use the variable names given in the image. The steps one should follow are:

1. Train your model using the data 'X_train' and 'y_train'. In your case, (X_train, y_train) is 80% of the total data.
2. Predict the results of 'X_test' using the model trained in step 1. Store the results in 'y_test_pred'.
3. Pass 'y_test_pred' and 'y_test' to the confusionmat function. In this case, both 'y_test_pred' and 'y_test' are of the same dimension.

The results that you obtain by running your model on the test data and the original result that you already have in your test data are given as inputs to the 'confusionmat' function.

NOTE: confusionmat works only for classification problems.
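
The three steps above can be sketched roughly as follows. This is only an illustration under stated assumptions: the data is synthetic (two well-separated clusters), the variable names follow the example rather than your script, and a plain 1-nearest-neighbour lookup with knnsearch stands in for "the model".

```matlab
rng(0);                                   % reproducible random numbers

% Hypothetical data: two well-separated 2-D clusters, 50 points each
X = [randn(50,2); randn(50,2) + 10];
y = [ones(50,1); 2*ones(50,1)];           % class label of every point

% 80/20 split: the SAME shuffled indices are used for points and labels
idx    = randperm(size(X,1));
nTrain = round(0.8 * numel(idx));
X_train = X(idx(1:nTrain),:);      y_train = y(idx(1:nTrain));
X_test  = X(idx(nTrain+1:end),:);  y_test  = y(idx(nTrain+1:end));

% Steps 1 and 2: for kNN, "training" is just keeping X_train and y_train;
% prediction looks up the nearest training point of every test point
nearest     = knnsearch(X_train, X_test);
y_test_pred = y_train(nearest);

% Step 3: compare true and predicted labels of the TEST set only
C = confusionmat(y_test, y_test_pred);
disp(C);
```

Rows of C correspond to the true classes and columns to the predicted classes, so the diagonal counts the well-classified test points.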

merlin toche on 17 Jan 2023

knncodetoche5.m

Hi mister!

My concern really is fading little by little, thanks to your patience with me despite my being a beginner. Thank you so much.

however, I think in my code I have already done steps 1 and 2 as mentioned. please I beg you can you take a look at my code below to check if I have trained my data and stored it? here is my code attached

please forgive me for disturbing you at any time, I'm new to ML, I have to use it to solve my problem. thank you

Rajeev on 17 Jan 2023

Edited: Rajeev on 18 Jan 2023

Hi, I went through your code and I think there are some pieces of code that need modification. Here are my suggestions:

You are trying to sort the array based on distances. Correct me if I am wrong, are you looking for the top n elements with the least distance along with their classes? If so, then you can replace these lines in your code

distanceofIndex=[];
temp=0;
gemp=0;
for i=1:length(distanceofIndex)
    for j=1:(length(distanceofIndex)-i)
        if(distanceofIndex(j)>distanceofIndex(j+1))
            temp=distanceofIndex(j);
            distanceofIndex(j)=distanceofIndex(j+1);
            distance(j+1)=temp;
            gemp=trainClass(j);
            trainClass(j)=trainClass(j+1);
            trainClass(j+1)=gemp;
        end
%4.take first k element from the c array now
k=5;
classy=[];
for i=1:k
    classy=[classy trainClass(i)];
end

with

distanceofIndex = distancesOfTheIndexes(:,1);
pred_classes = trainClass(indexes(:,1));
dist_class_aug_matrix = [distanceofIndex, pred_classes'];
dist_class_aug_matrix = sortrows(dist_class_aug_matrix);
% to get the top n elements from the sorted c array
n = 7 % for example
classy = dist_class_aug_matrix(1:n,2)

Now for the part where you are trying to calculate the confusion matrix: you should feed the classes to the confusionmat function instead of the coordinates. Refer to the documentation for more details: "Compute confusion matrix for classification problem - MATLAB confusionmat" (mathworks.com).

But it seems you do not have the classes for the test_Coords data: the size of the trainClass vector is 120 instead of 150. If you had the classes for test_Coords as well, you could simply pass them together with the predicted ones (pred_classes) to get the desired confusion matrix.
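
For the k-nearest-neighbour vote itself, a possible sketch is shown below. It is not the code from the attached script: the data is synthetic, testClass is a hypothetical vector of true test labels (the one the script is missing), and the majority vote is taken with mode over the classes of the k neighbours returned by knnsearch.

```matlab
rng(0);                                          % reproducible random numbers

% Hypothetical data: two clusters, with labels for BOTH train and test points
train_Coords = [randn(60,2); randn(60,2) + 8];
trainClass   = [ones(60,1); 2*ones(60,1)];
test_Coords  = [randn(15,2); randn(15,2) + 8];
testClass    = [ones(15,1); 2*ones(15,1)];       % assumed to be available

k = 5;                                           % number of neighbours that vote
[indexes, distances] = knnsearch(train_Coords, test_Coords, 'K', k);

% Row i holds the classes of the k nearest training points of test point i
neighbourClasses = trainClass(indexes);          % 30-by-5 matrix

% Majority vote along each row gives one predicted class per test point
pred_classes = mode(neighbourClasses, 2);

% Classes (not coordinates) go into confusionmat
C = confusionmat(testClass, pred_classes);
disp(C);
```

Note that mode returns the smallest value when there is a tie, so an odd k is a common choice when there are two classes.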

merlin toche on 17 Jan 2023

Hello my dear ! your explanation is masterful and educational, thank you very much. I read your suggestion with joy and I fully approve. however I tried to replace the piece of code you suggested, but here is the message that appears! please can you help me? apologize for the inconvenience. I need it to keep moving forward and above all, understand me, I'm still learning. thank you and see you soon my dear

Rajeev on 18 Jan 2023

You are getting the error because of line number 60.

That line is not required and is incorrect as well. Removing that line will fix the error.

If you look at the error, it says that the variable 'distanceofIndex' is undefined. This means that you have not declared this variable yet and are assigning it to some other variable. Line number 61 declares and initializes the variable in one go.

The knnsearch function returns, for each input coordinate in test_Coords, the indexes of and the distances to the closest coordinates in train_Coords. In your script the output has several columns, but only the first column is relevant, because each row of the output matrix is sorted in ascending order of distance. That is, the first column contains the distance to the nearest training point for each input coordinate. Using the indexes in the first column together with the trainClass vector therefore gives us the classes of the input data.
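
A tiny example may make the shape of the knnsearch output easier to see. The three training points and the single query point below are made up for illustration; only the variable names are borrowed from the script.

```matlab
% Three hypothetical training points and one query point
train_Coords = [0 0; 1 0; 5 5];
test_Coords  = [0.9 0];

% Ask for the 3 nearest neighbours of the query point
[indexes, distances] = knnsearch(train_Coords, test_Coords, 'K', 3);

disp(indexes);     % row of indexes into train_Coords, nearest first
disp(distances);   % matching distances, in ascending order

% The first column is the nearest neighbour of each query point
nearestIndex = indexes(:,1);
```

Here the query point is closest to the second training point, so nearestIndex is 2, and trainClass(nearestIndex) would be its 1-nearest-neighbour class.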

merlin toche on 18 Jan 2023

Hi @Rajeev

thank you for your prompt response!

In fact, I inserted line number 60 because the same error message was displayed for line number 61. I have now deleted that line, but the same error message still appears. I'm really sorry to bother you.

thank you for your helping

Rajeev on 18 Jan 2023

Hi, thanks for pointing it out. I have edited the code to remove the error.

I forgot to take the transpose of the pred_classes matrix.

merlin toche on 18 Jan 2023

At which line did you make the change, please?

Rajeev on 18 Jan 2023

Edited: Rajeev on 18 Jan 2023

I have changed the third line of the code snippet from

dist_class_aug_matrix = [distanceofIndex, pred_classes];

to

dist_class_aug_matrix = [distanceofIndex, pred_classes'];

To get the transpose of a matrix in MATLAB, an apostrophe is added at the end of the name of the matrix: a' is the transpose of a.
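
To illustrate why the transpose matters here, consider the small sketch below. The numbers are invented; the assumption is that distanceofIndex is a column vector while pred_classes comes out as a row vector, which is what makes the horizontal concatenation fail without the apostrophe.

```matlab
distanceofIndex = [0.4; 0.1; 0.3];   % 3-by-1 column (hypothetical distances)
pred_classes    = [2 1 3];           % 1-by-3 row (hypothetical classes)

% [distanceofIndex, pred_classes] would raise a dimension-mismatch error,
% because a 3-by-1 and a 1-by-3 array cannot be placed side by side.

% Transposing the row makes both operands 3-by-1
dist_class_aug_matrix = [distanceofIndex, pred_classes'];

% sortrows sorts by the first column and keeps each class with its distance
dist_class_aug_matrix = sortrows(dist_class_aug_matrix);
disp(dist_class_aug_matrix);
```

After sorting, the rows are (0.1, 1), (0.3, 3), (0.4, 2), so every class stays attached to its own distance.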

merlin toche on 18 Jan 2023

thank you for your prompt response sir @Rajeev

I have indeed taken the transpose of pred_classes into account, but I still get an error message, and I don't understand it, sir. Is it related to the fact that my R2015a version is older, or do you have other alternatives?

Thank you again for your unwavering support. On line number 63, I added an apostrophe at the end of pred_classes.

Rajeev on 18 Jan 2023

This is because there is a typo in the variable names.

In lines 61 and 63, distancesofTheIndexe must be replaced with distancesofTheIndexes. You have missed an 's' at the end.

In line 62, replace indexe with indexes.

merlin toche on 19 Jan 2023

Thank you my dear

It works.

Rajeev on 19 Jan 2023

I am glad it worked. If you found the answer useful, you can mark it as accepted so that if others also have the same issue, they can be reassured that the answer worked for the OP.

merlin toche on 19 Jan 2023

Thank you very much @Rajeev

Indeed, with your help that line now works.

I'm sorry to disturb you with the rest.

As for the question about the trainClass size in my code, you're right: it has 120 elements instead of 150. I figured that since I partitioned the dataset into training and test data, the classes should also be partitioned into trainClass and testClass (this is not currently the case). Maybe with that we can determine the confusion matrix.

Please, how do I get trainClass and testClass?

thank you again for your availability and your forbearance

Rajeev on 19 Jan 2023

Edited: Rajeev on 19 Jan 2023

Since the data set is just random points in the 2-D plane, there may or may not be well defined clusters.

If I am understanding it correctly, your script is trying to assign random classes to random coordinates. Instead of doing this, you can generate the data in clusters and then partition it.

I have written a simple script that you can run and use as a reference. I have also attached the helper function that creates the coordinates.

In the scripts the coordinates are not overlapping; you can make them overlap to get a non-ideal confusion matrix.
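
Independently of the attached script (which is not reproduced here), one way to obtain trainClass and testClass is to split the class vector with the same partition that splits the coordinates. The sketch below uses cvpartition with a 20% hold-out; the 150 random points, the 5 classes and the variable name classes are assumptions for illustration only.

```matlab
rng(1);                                   % reproducible random numbers

% Hypothetical dataset: 150 points, 5 classes with 30 points each
coords  = rand(150, 2);
classes = repmat((1:5)', 30, 1);          % 150-by-1 vector of labels

% Hold out 20% for testing; passing the labels stratifies the split by class
c = cvpartition(classes, 'HoldOut', 0.2);

% training(c) and test(c) are logical masks over the 150 observations,
% so the same mask selects matching rows of coords and classes
train_Coords = coords(training(c), :);
trainClass   = classes(training(c));
test_Coords  = coords(test(c), :);
testClass    = classes(test(c));

fprintf('%d training labels, %d test labels\n', ...
        numel(trainClass), numel(testClass));
```

With testClass available, the predicted classes for test_Coords can be compared against it in confusionmat.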

Image Analyst on 23 Jan 2023

confusionchart is in the stats toolbox. Do you have that?

merlin toche on 23 Jan 2023

No sir, I don't have it. Can you guide me, please?

I tried to do it with Classification Learner, but I couldn't.

Image Analyst on 23 Jan 2023

Classification Learner is also in the Statistics and Machine Learning Toolbox. If you don't have the toolbox, you won't even see Classification Learner, let alone try to use it. When you type

>> ver

do you see the stats toolbox listed?

merlin toche on 23 Jan 2023

I said that because I did not find a toolbox named stats toolbox.

However, it is in the App toolbox that I find Classification Learner.

Image Analyst on 23 Jan 2023

There is no "App" toolbox. Apps is a tab on the tool ribbon that has applets for the various toolboxes. If you have a toolbox installed, there will be applets for it listed on the Apps tab. Because you can see it there, you have the stats toolbox installed; but because you cannot run any stats functions, you do not have a valid license for it, even though it's installed. What does this show:

hasLicenseForToolbox = license('test', 'Statistics_Toolbox');	% Check for Statistics and Machine Learning Toolbox.
which -all confusionchart % See if this function is installed.

merlin toche on 23 Jan 2023

The response is 1. Does this mean that the stats toolbox is installed? I believe so, or am I wrong?

Image Analyst on 23 Jan 2023

That means you have it, and it means you should have the confusionchart function. What does this show?

>> which -all confusionchart

C:\Program Files\MATLAB\R2022b\toolbox\shared\mlearnlib\confusionchart.m

C:\Program Files\MATLAB\R2022b\toolbox\stats\bigdata\@tall\confusionchart.m % tall method

merlin toche on 24 Jan 2023

'confusionchart' not found. This is the message that I get when I enter which -all confusionchart in the console.

merlin toche on 24 Jan 2023

myknnessay.m

Hi all

please, can anyone help me with reviewing my code below and provide suggestions if possible?

I want to classify five faults whose features are voltage and current data, so I have 5 classes. My class data is too scattered, and I don't understand why.

attached my code

thank you for all the effort you put in for me

Rajeev on 24 Jan 2023

The error in the code was because of the transpose on 'y_est'. Changing the line from

Cm=confusionchart(y_test, y_est');

to

Cm=confusionchart(y_test, y_est);

will solve the issue.

Also, in the plot, the cyan dots are plotting data_class_3 instead of data_class_5. This looks like a copy-paste typo.

The data is scattered because of the constraints that were given to the random number generator. There is nothing wrong with the plot, as it follows the constraints as expected.

What kind of plot are you expecting for your problem?

What is the actual problem statement?

merlin toche on 24 Jan 2023

thank you@Rajeev

as for 'confusionchart', it seems to me that it is not integrated in my R2015a version, since before doing the transpose of y_est I tried without transposing, but the same error message appeared.

however, you're right, that's the typing error for the cyan dots.

Thank you for your two questions, it allows me to explain my problem to you again.

I want to classify a new data in a class by majority vote after calculating the distances (it's knn I believe)

my problem is related to the detection of 5 faults on the basis of two parameters (current and voltage measurement)

your ability to detect and solve problems fascinates me sir @Rajeev, a lot of pedagogy in your approach. Thank you again for your help

Rajeev on 25 Jan 2023

Can you update the MATLAB version to R2018b or later so that you can use the 'confusionchart' function?

If, for some reason, you cannot update the MATLAB version, you can simply use the 'confusionmat' function to get the matrix as the output and view it in the command line.

The problem can be solved using knn if the data points that belong to the same fault are in close proximity. That is, the points of the same class are near each other.

Provided that the assumption mentioned above holds, you can simply use the script 'myknnessay' after correcting the typo.
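
As a rough sketch of that fallback: the snippet below always computes the matrix with confusionmat and only draws a chart when confusionchart is found on the path. The label vectors are invented for illustration, and the per-class counts are one possible way to read off the well-classified and badly classified data.

```matlab
% Hypothetical true and predicted fault labels (3 classes, 6 test points)
y_test = [1 1 2 2 3 3]';
y_est  = [1 2 2 2 3 1]';

% Rows = true class, columns = predicted class
C = confusionmat(y_test, y_est);
disp(C);

% Well classified = diagonal entries; badly classified = rest of each row
wellClassified  = diag(C);
badlyClassified = sum(C, 2) - diag(C);
accuracy        = sum(wellClassified) / sum(C(:));
fprintf('Accuracy: %.2f\n', accuracy);

% Draw the chart only if this MATLAB release provides confusionchart
if exist('confusionchart', 'file') == 2
    confusionchart(y_test, y_est);
end
```

In this made-up example four of the six points lie on the diagonal, so the accuracy is about 0.67.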

merlin toche on 25 Jan 2023

myknnessay.m

@Rajeev thank'you again

i think i will try this version, thanks

yes I used the 'confusionmat' function, the output matrix showed up.

As an expert, can you reassure me that this code is good for detecting my 5 faults, and that there are no more errors, if I understand you correctly? Also, can I get the same result using Classification Learner?

I tried to plot the x_train and x_test points and got this image, but I don't understand it.

merlin toche on 25 Jan 2023

Excuse me, I forgot this question!

How do I plot x_train and x_test for any data_class? I tried, but I get this error:

Subscript indices must either be real positive integers or logicals.

Error in myknnessay (line 98)

thank'you

Rajeev on 25 Jan 2023

Edited: Rajeev on 25 Jan 2023

myknnessay.m

I did not get the error when I ran your script, although I did get the plot that you attached in your comment. That plot is incorrect because

plot(x_train,'b.','linewidth',2,'MarkerSize',10)

plots all the coordinates against [1:length(x_train)]. To plot the coordinates of x_train, you need to give the data for the x axis and the y axis separately, like:

plot(x_train(:,1),x_train(:,2),'b.')
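
A slightly fuller sketch that overlays the training and test points in one figure could look like this. The random x_train and x_test below are placeholders with two columns each, standing in for the arrays produced by your script.

```matlab
rng(2);                                   % reproducible random numbers

% Placeholder data: one row per point, column 1 = x, column 2 = y
x_train = rand(40, 2);
x_test  = rand(10, 2);

figure;
plot(x_train(:,1), x_train(:,2), 'b.', 'MarkerSize', 10);   % training points
hold on;                                  % keep the first plot when adding more
plot(x_test(:,1), x_test(:,2), 'ro', 'MarkerSize', 6);      % test points
hold off;

legend('train', 'test');
xlabel('feature 1');
ylabel('feature 2');
```

Passing the two columns separately is what places each point at its own (x, y) position instead of plotting each column against the row number.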

Here is the output for the test data and the train data. As indicated in the figure, the test data contains points from each class. The confusion chart shows that all the predictions made by 'knnsearch' are correct.

Fig 1: The original dataset as per your constraints.

Fig 2: Visual representation of split of test and train data (i.e. x_train and x_test).

Fig 3: The confusion matrix for the faults predicted by the function knnsearch.

I have also attached the file that plots the results shown in the screenshots. Since confusionchart doesn't work for you, you can simply use confusionmat and view the matrix in the command window.

merlin toche on 25 Jan 2023

@Rajeev

Perfect, thank you very much. It works! But I have a concern: I don't get the same visualization for figure 1 and figure 2.

Look at what I get with my R2015a version:

figure1

figure2

Rajeev on 27 Jan 2023

The plots that you have attached do not seem to follow the constraints that you specified before.

Can you share the script which is responsible for generating the data along with any helper function that you might have used?

merlin toche on 27 Jan 2023

Hi @Rajeev

yes, i attached it

thank you!

Rajeev on 30 Jan 2023

I ran your script in my system and I am getting the expected output. If possible, try updating the MATLAB version and try it again.

merlin toche on 30 Jan 2023

thanks for your feedback @Rajeev

Please, if I don't apply the constraints and just use my own dataset, is it not equally valid for my fault detection? I want to see how it works without constraints. My dataset is attached.

thank'you very much for your help

Rajeev on 30 Jan 2023

plot_dataset.m

Upon plotting and observing your dataset, it can be seen that the data points that belong to the same class are not close to each other. They need to be in clusters for the knn search algorithm to work.

Regardless, if the dataset is passed to the knnsearch function, the results are:

As you can see, the confusion matrix now shows a lot of errors/misclassifications, as the data is not clustered well.

I have also attached the code for your reference.

merlin toche on 30 Jan 2023

What can I say beyond thank you, sir @Rajeev? Please accept all my gratitude for your competence and your pedagogy in passing knowledge on to those of us who need it. You have a big heart! Unfortunately, my R2015a version fails to read the code you sent. While I wait to change my R2015a version, may I still turn to you and your team if something else bothers me?

One more concern, please: is it necessary to use a threshold for fault detection with kNN? If yes, how do I do it?

thank you again and again for all you do for me
