How do I do weighted classification?
I'm using classifiers in MATLAB (e.g. [fitcsvm](<http://ch.mathworks.com/help/stats/fitcsvm.html>) or [fitcknn](<http://ch.mathworks.com/help/stats/classificationknn-class.html>)). Because my classes are highly unbalanced (10% negative class and 90% positive class), I would like to use weighting. I usually calculate the weight for class i as follows:
weight_i = numSamples / (numClasses * numSamplesClass_i)
That is, the total number of observations divided by the product of the number of classes and the number of samples in class i.
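As a quick illustration of the formula, here is a minimal sketch in base MATLAB. The label vector `Y` with its 10/90 split is made up for the example, and `classWeights` and `obsWeights` are names chosen here, not toolbox identifiers.

```matlab
% Hypothetical labels: 10 negative (0) and 90 positive (1) observations.
Y = [zeros(10,1); ones(90,1)];

[classes, ~, idx] = unique(Y);          % idx maps each observation to its class
numSamples = numel(Y);
numClasses = numel(classes);
numSamplesClass = accumarray(idx, 1);   % number of observations per class

% weight_i = numSamples / (numClasses * numSamplesClass_i)
classWeights = numSamples ./ (numClasses * numSamplesClass);

% One weight per observation, taken from the class it belongs to.
obsWeights = classWeights(idx);
```

For this split the negative class gets weight 5 and the positive class about 0.56, so both classes carry the same total weight (50 each).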
MATLAB offers the 'Weights' name-value argument to set a weight for each observation. However, its description says the following:
The software normalizes Weights to sum up to the value of the prior probability in the respective class.
I'm unsure how I should use the weights now. Can I simply assign each data point the weight that the formula above gives for its class?
MHN on 21 Apr 2016
You can simply set 'Prior' to 'uniform', which makes all class probabilities equal. The default value is 'empirical', which determines class probabilities from the class frequencies in Y. For example, if you are using a decision tree as the classifier:
tree = fitctree(X,Y, 'prior', 'uniform')
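A short derivation may show how this relates to the weighting formula in the question. It assumes the normalization works exactly as the documentation sentence quoted in the question states, and that all observations within a class start with the same weight. The symbols ($w_j$, $\tilde{w}_j$, $c_k$, $n_k$, $N$, $K$, $P(k)$) are introduced here for illustration only.

```latex
% Normalization: the weights in class k are rescaled to sum to the prior P(k).
\tilde{w}_j = w_j \,\frac{P(k)}{\sum_{i \in \text{class } k} w_i}
\qquad \text{for observation } j \text{ in class } k

% Equal weights c_k within class k, which has n_k observations:
\tilde{w}_j = c_k \,\frac{P(k)}{n_k \, c_k} = \frac{P(k)}{n_k}

% Uniform prior over K classes, P(k) = 1/K, with N observations in total:
\tilde{w}_j = \frac{1}{K \, n_k} = \frac{1}{N} \cdot \frac{N}{K \, n_k}
```

Under these assumptions, a uniform prior gives each observation an effective weight proportional to numSamples / (numClasses * numSamplesClass_i), which is the formula from the question up to the constant factor 1/N.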
More Answers (1)
MHN on 26 Apr 2016
It depends on your evaluation criteria and does not have a straightforward answer. I suggest trying the options and seeing which one gives the best result according to those criteria.
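One way to run such a comparison is sketched below. It is only an illustration: the data are synthetic, per-class recall is an arbitrary choice of criterion, and all variable names are made up here. It needs the Statistics and Machine Learning Toolbox.

```matlab
% Synthetic, unbalanced two-class data (10% class 0, 90% class 1).
rng(1);
X = [randn(20,2) - 1; randn(180,2) + 1];
Y = [zeros(20,1); ones(180,1)];

% Per-observation weights from the formula in the question (2 classes).
n = accumarray(Y + 1, 1);            % observations per class
w = numel(Y) ./ (2 * n(Y + 1));      % weight of each observation's class

% Use the same stratified 5-fold partition for every variant.
cvp = cvpartition(Y, 'KFold', 5);

cvDefault = fitcsvm(X, Y, 'CVPartition', cvp);
cvUniform = fitcsvm(X, Y, 'Prior', 'uniform', 'CVPartition', cvp);
cvWeights = fitcsvm(X, Y, 'Weights', w, 'CVPartition', cvp);

% Out-of-fold confusion matrices (rows: true class, columns: predicted).
cmDefault = confusionmat(Y, kfoldPredict(cvDefault));
cmUniform = confusionmat(Y, kfoldPredict(cvUniform));
cmWeights = confusionmat(Y, kfoldPredict(cvWeights));

% Per-class recall: correct predictions divided by true class size.
recallDefault = diag(cmDefault) ./ sum(cmDefault, 2);
recallUniform = diag(cmUniform) ./ sum(cmUniform, 2);
recallWeights = diag(cmWeights) ./ sum(cmWeights, 2);
```

Comparing the three recall vectors (or whatever measure matters for your application) shows how each setting trades accuracy on the minority class against accuracy on the majority class for your data.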