Question about KNN and how to use it.

Hello,
I need to use KNN in MATLAB to find some results.
I have data in a .mat file that holds this kind of information (training data):
1|232|34|21|542|
2|32|333|542|32|
and so on.
Then I have a second set of information that I will gather through the application, but I will only get, let's say,
a=10|343|543|43|23
So my question is: do I only need to do something like this? http://www.mathworks.com/help/toolbox/stats/classificationknnclass.html
Best regards.

4 Comments

Are the "|" literal strings there, and everything is character? Or does "|" represent column breaks in a numeric matrix?
It just represents column breaks, like in Excel. Every number is in its own box.
Your question does not contain enough information to be answered.
What does "a" represent? Is it the "response", one entry per row of your training data?
You ask if you only need to apply a particular routine, but you do not indicate what you are trying to do, what your desired outcome is.
OK, sorry for the lack of information. It's a fruit recognition system. It gets data like RGB and roundness.
  • The first number (1,2,3,....) is the fruit's code
  • The next three numbers are RGB
  • The last number is roundness (0.434,0.454,....)
Now I have fruit.mat, where I store the above information via a GUI I made. So there are around 60 rows of information for 5 fruits.
My assignment is, first, to create an application that recognizes fruit by color and roundness (I have done that); second, to store that data (done that); third, to then take a picture of a fruit and let the program, using KNN, tell you what fruit it is.
The only part that I have not done is the KNN. So I need to use the stored data and the data gathered in real time so that it will tell me what fruit it is.
So let's say I have
newdata=[x,y,z] - gathered in real time (1|223|45|5|0.34)
q=load('fruit.mat')
olddata=q.data


 Accepted Answer

Ilya on 4 Sep 2012
Edited: Ilya on 4 Sep 2012
Take a look at the User Guide for k-NN classification: http://www.mathworks.com/help/toolbox/stats/bsehyju-1.html
In your case, the fruit code would be the class label Y and the 3 RGB numbers and roundness would be predictors X.
If you are stuck, follow up with a specific question.
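The split into class labels and predictors described above can be sketched as follows. This is only an illustration: the matrix `data` is a hypothetical stand-in for whatever variable is stored in fruit.mat, assumed to have one row per sample with columns [fruit code, R, G, B, roundness].

```matlab
% Hypothetical stand-in for the matrix stored in fruit.mat.
% Assumed column layout: [fruitCode R G B roundness]
data = [1 223  43  23 0.89;
        2  22 143 123 0.69;
        3 202  34  67 0.87];

Y = data(:, 1);     % class labels: the fruit code, one per row
X = data(:, 2:5);   % predictors: R, G, B and roundness
```

Only X and Y from the stored data are used for training. A new measurement gathered in real time has the same four predictor columns but no known fruit code.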

9 Comments

Hi, thanks for the link. But I don't understand what you said about the fruit code being Y, and RGB and roundness being X. Shouldn't it be that in KNN you combine the test and train tables to get a result?
In my case the train data is fruit.mat, where I have a table that looks like this:
1 223 43 23 0.89
2 22 143 123 0.69
and so on.
  • The first number is a code (1), let's say for green apple.
  • 223 43 23 are RGB
  • 0.89 is roundness
Now, via a GUI that asks you to select a picture, click on it (to choose a color), and then press Enter, it gets only one row, for example in the case of an orange:
3 202 34 67 0.87
Now that is the test data, or the data that is going to be used to look into fruit.mat to find the closest fruit.
I'll read the information at the link you sent me. The above is just to clear things up in case you misunderstood me. Or maybe I misunderstood you :) .
You asked a question about ClassificationKNN and then you said "My assigment is to create an aplication that recognizes fruit by color and roundness". I interpret this as follows. You have a classification problem in which you want to predict the type of fruit based on its color represented by 3 rgb values and roundness. The type of fruit is then the class label, and color and roundness are the predictors. ClassificationKNN.fit uses X notation for predictors and Y notation for class labels. I don't know what you mean by "code". If it is not the same as "fruit type" and not used as a predictor either, it is irrelevant as far as classification goes. You need to form X and Y to pass to ClassificationKNN.fit and then you can use the k-NN model returned by ClassificationKNN.fit for making predictions.
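A minimal sketch of that train-then-predict workflow, assuming Statistics Toolbox R2012a or later is available. The training rows and the choice of 3 neighbors are made up for illustration. In newer releases, `fitcknn` takes the same arguments and can be used in place of `ClassificationKNN.fit`.

```matlab
% Toy training data (made up for illustration): two fruit codes, two samples each.
X = [223  43  23 0.89;
     230  50  30 0.85;
      22 143 123 0.69;
      30 150 110 0.65];   % predictors: R, G, B, roundness
Y = [1; 1; 2; 2];         % class labels: fruit codes

% Train a k-NN model on data with known class labels.
% Requires Statistics Toolbox, R2012a or later.
mdl = ClassificationKNN.fit(X, Y, 'NumNeighbors', 3);

% Predict the fruit code for one new measurement without a known label.
newX  = [225 45 25 0.88];
label = predict(mdl, newX);
```

Here `label` holds the predicted fruit code for `newX`. The number of neighbors is a tuning choice, not something the thread specifies.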
asd on 7 Sep 2012
Edited: asd on 8 Sep 2012
OK, I understand now. I use a fruit code so I don't need to type, let's say, "green apple". There is a list of fruits and their codes shown in front of you. You just need to find the fruit and type in the number next to the fruit name. In the case of green apple the number is 2.
I need to separate the fruit code to use KNN classification. I saw in the example that you linked that they have species and meas in separate tables.
So must I separate the codes into one table and RGB and roundness into another table? Do I need to classify both the data from fruit.mat and the data gathered in real time? Or just the data in fruit.mat?
NOTE:
Is ClassificationKNN.fit only in 2012a? I have 2010a.
Please can someone help me, I don't know how to do this.
ClassificationKNN was introduced in R2012a. If you have Bioinformatics toolbox, you can use knnclassify in earlier releases. You can also search on File Exchange.
The doc page I have pointed you to gives you the full workflow:
Construct a KNN Classifier
Examine the Quality of a KNN Classifier
Predict Classification Based on a KNN Classifier
I am sorry, but I cannot explain it more clearly than the doc page does.
Can you just answer these questions, please?
1. Do I need to use the KNN classifier for both fruit.mat and the data gathered in real time? In fruit.mat I have around 60 rows of information. For the real-time data I have only one row of information.
2. About roundness: it ranges from 0.0 to 1. Is that too small a range for it to have any effect, given that RGB goes from 1 to 255?
The knn classifier will return cluster centroids. When you get an individual real-time row, calculate the distance from the row to each of the centroids, and the one that is closest is the classification.
1. I don't understand your question. As I said in an earlier reply and as the doc page suggests, first you need to train a k-NN classifier using data with known class labels (training data). Then you can use the trained classifier to predict the class label for data without known class labels (new data).
The choice of a classifier is up to you. You can go with k-NN or you can choose something else, for example, decision tree.
2. For k-NN, it is advisable to standardize variables. You can use zscore function for that.
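A small sketch of the standardization step, written with base functions so that the training mean and standard deviation can be reused on a new measurement. The first two rows come from this thread, the third row is made up, and the variable names are only illustrative. `zscore(X)` from Statistics Toolbox should give the same `Xz`.

```matlab
% Training predictors: R, G, B, roundness (third row is made up).
X = [234  12  2 0.23;
     231  10  5 0.35;
     120 200 40 0.90];

mu    = mean(X, 1);      % per-column mean
sigma = std(X, 0, 1);    % per-column standard deviation (N-1 normalization)

% Standardize the training data: subtract the mean, divide by the std.
Xz = bsxfun(@rdivide, bsxfun(@minus, X, mu), sigma);

% Standardize a new measurement with the SAME mu and sigma,
% so that it lives on the same scale as the training data.
Q  = [202 34 67 0.87];
Qz = (Q - mu) ./ sigma;
```

After this step each column of `Xz` has mean 0 and standard deviation 1, so roundness (0 to 1) and RGB (up to 255) contribute to distances on a comparable scale. Values of roughly -3 to 3 are what one would normally expect.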
The k-NN classifier does not compute cluster centroids. An observation is assigned to the class that is most common among its k nearest neighbors in the training data. You can code a simple implementation of the k-NN classifier using the knnsearch or pdist2 functions from Statistics Toolbox.
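The majority-vote rule can be sketched in a few lines. To keep the example free of toolbox dependencies, it computes Euclidean distances with plain base operations instead of `knnsearch` or `pdist2`. The training rows, the query and k = 3 are assumptions made for illustration.

```matlab
% Toy training data (made up): predictors [R G B roundness] and fruit codes.
X = [223  43  23 0.89;
     230  50  30 0.85;
      22 143 123 0.69;
      30 150 110 0.65];
Y = [1; 1; 2; 2];
Q = [225 45 25 0.88];    % new measurement to classify
k = 3;                   % number of neighbors (a tuning choice)

% Standardize with the training statistics so no column dominates the distance.
mu    = mean(X, 1);
sigma = std(X, 0, 1);
Xz = bsxfun(@rdivide, bsxfun(@minus, X, mu), sigma);
Qz = (Q - mu) ./ sigma;

% Euclidean distance from the query to every training row.
d = sqrt(sum(bsxfun(@minus, Xz, Qz).^2, 2));

% Indices of the k closest training rows, then a majority vote on their labels.
[~, order] = sort(d);
nearest = order(1:k);
label = mode(Y(nearest));
```

Note that `mode` returns the smallest value when there is a tie, so an odd k or an explicit tie-breaking rule may be preferable.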
asd on 11 Sep 2012
Edited: asd on 11 Sep 2012
Hi,
Finally I have no errors and I have some results. Again I have 2 questions.
1. When I use zscore on my fruit.mat, where I have 60 rows of information like:
234 12 2 0.23
231 10 5 0.35
and so on ...
(First 3 numbers RGB, last one roundness)
I get values like:
1.2 -0.98 -1.112 -0.98
And so on...
Is that a normal range of numbers?
2. I use this line to get the result from knnsearch:
idx=knnsearch(test.X,Q);
Just to verify: idx will return the row from fruit.mat that is the closest one to Q (the real-time data, or the tested fruit)?
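With its default settings, `knnsearch(X, Q)` returns, for each row of `Q`, the index of the nearest row of `X` under Euclidean distance. The sketch below mimics that for a single query using base operations, and shows how the index maps back to a fruit code. The training rows and the label vector `Y` are assumptions for illustration, and the data is left unstandardized only to keep the arithmetic easy to follow.

```matlab
% Toy training data (third row made up): predictors and their fruit codes.
X = [223  43  23 0.89;
      22 143 123 0.69;
     200  40  60 0.85];
Y = [1; 2; 3];
Q = [202 34 67 0.87];    % real-time measurement (the orange from this thread)

% Squared Euclidean distance from Q to each training row.
d2 = sum(bsxfun(@minus, X, Q).^2, 2);

% Index of the closest row; plays the same role as idx = knnsearch(X, Q).
[~, idx] = min(d2);

nearestRow = X(idx, :);  % the stored row closest to Q
fruitCode  = Y(idx);     % the fruit code stored for that row
```

So `idx` is a row number into the training matrix, not a fruit code. The code has to be looked up in the separate label column, and if the training data was standardized, `Q` needs the same transformation before the search.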


Asked:

asd
on 31 Aug 2012
