What exactly can the ROC curve tell us, or what can be inferred from it?

Hi Smart Guys,
I wrote some code to run a linear discriminant analysis based classification:
%%Construct a LDA classifier with selected features and ground truth information
LDAClassifierObject = ClassificationDiscriminant.fit(featureSelcted, groundTruthGroup, 'DiscrimType', 'linear');
LDAClassifierResubError = resubLoss(LDAClassifierObject);
Thus, I can get
Resubstitution Error of LDA (Training Error): 1.7391e-01
Resubstitution Accuracy of LDA: 82.61%
Confusion Matrix of LDA:
14 3
1 5
Then I ran an ROC analysis for the LDA classifier:
% Predict resubstitution response of LDA classifier
[LDALabel, LDAScore] = resubPredict(LDAClassifierObject);
% Fit probabilities for scores (groundTruthGroup contains the labels 'Good' or 'Bad')
[FPR, TPR, Thr, AUC, OPTROCPT] = perfcurve(groundTruthGroup(:,1), LDAScore(:,1), 'Good');
I got:
OPTROCPT =
0.1250 0.8667
Therefore, we can get:
Accuracy of LDA after ROC analysis: 86.96%
Confusion Matrix of LDA after ROC analysis:
13 1
2 7
My questions are:
1. After the ROC analysis we obtained a better accuracy. When we report the accuracy of the classifier, which value should we use? What exactly can the ROC curve tell us, or what can be inferred from it? Can we say that after the ROC analysis we found a better accuracy for the LDA classifier?
2. Why can the ROC analysis produce a better accuracy for the classifier when the original ClassificationDiscriminant.fit can't?
3. I have also done a cross-validation for the LDA classifier, like
cvLDAClassifier = crossval(LDAClassifierObject, 'leaveout', 'on');
Then how do we get the ROC analysis for the cross-validation? The 'resubPredict' method seems to accept only a discriminant object as input, so how can we get the scores?
4. MATLAB's classperf function is very handy for gathering all the information about a classifier, like
%%Get the performance of the classifier
LDAClassifierPerformace = classperf(groundTruthGroup, resubPredict(LDAClassifierObject));
However, does anyone know how to gather such information (accuracy, FPR, etc.) for the cross-validation results?
Thanks very much. I am really looking forward to seeing replies to the questions above.
A.

 Accepted Answer

1. You can report anything you like as long as you report an estimate obtained by cross-validation or using an independent test set. You can fine-tune a classifier on the training set, but then its accuracy measured on the same set is biased up.
2. Sure, you get a different accuracy by using a different threshold for assigning into the positive class.
3. All loss methods for classifiers return by default the classification error, not the mean squared error. This is stated in many places in the doc.
4. You have code in your post to obtain an ROC curve from resubstitution predictions. Just replace resubPredict with kfoldPredict.
5. Any estimate of classification performance should be obtained using data not used for training. Otherwise the estimate is optimistic. For simple models like LDA, the optimistic bias may be small. Yet it's there.
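As a minimal sketch of points 3 and 5, assuming the LDAClassifierObject fitted in the question, the cross-validated error can be obtained directly:

```matlab
% Leave-one-out cross-validation of the fitted discriminant
cvLDAClassifier = crossval(LDAClassifierObject, 'leaveout', 'on');
% kfoldLoss returns the classification error by default, i.e. the
% fraction of observations whose predicted label disagrees with the
% true label -- not a mean squared error
cvError = kfoldLoss(cvLDAClassifier);
cvAccuracy = 1 - cvError;
```

Because the cross-validated predictions come from models that never saw the held-out observations, this accuracy is the one to report.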

4 Comments

3. All loss methods for classifiers return by default the classification error, not the mean squared error. This is stated in many places in the doc.
Regarding the default classification error of the cross-validation: if it is not the mean squared error, what is it? Is it the average of the classification errors over the folds of the cross-validation? I couldn't find a place in the MATLAB help saying this.
4. You have code in your post to obtain a ROC by from resubstitution predictions. Just replace resubPredict with kfoldPredict.
Why do we need to use kfoldPredict? Since I wrote a loop for the cross-validation, should we use resubPredict in each iteration of the loop?
Sorry to bother you but I am really novice to this. Thanks a lot for your help.
A.
I just tried to use kfoldPredict outside of the loop.
cvLDAClassifierFit = kfoldPredict(cvLDAClassifier);
The result cvLDAClassifierFit is a cell array that contains the labels 'Good' and 'Bad'. Then how can we calculate the mean squared error?
mean( (cvLDAClassifierFit - groundTruthGroup).^2 )
Undefined function 'minus' for input arguments of type 'cell'.
Here is one of relevant doc pages: http://www.mathworks.com/help/stats/classificationpartitionedmodel.kfoldloss.html If you scroll down to definitions, you will see 'The default classification error is the fraction of the data X that obj misclassifies, where Y are the true classifications.' You get predicted labels and posterior probabilities by cross-validation and then you count how often the predicted label and true label disagree.
You use this code to get a ROC curve by resubstitution:
[LDALabel, LDAScore] = resubPredict(LDAClassifierObject);
[FPR, TPR, Thr, AUC, OPTROCPT] = perfcurve(groundTruthGroup(:,1), LDAScore(:,1), 'Good');
Use this code to obtain the ROC by cross-validation:
cvLDA = crossval(LDAClassifierObject);
[LDALabel, LDAScore] = kfoldPredict(cvLDA);
[FPR, TPR, Thr, AUC, OPTROCPT] = perfcurve(groundTruthGroup(:,1), LDAScore(:,1), 'Good');
If class labels are cell arrays of strings, classification error is
mean(~strcmp(cvLDAClassifierFit,groundTruthGroup))
I don't understand what groundTruthGroup(:,1) means in your post by the way.
So, if I understand correctly, you are using the posterior probability computed for one classifier as a new classifier to get an improved accuracy. What is the rationale behind this? Do you have a reference? Many thanks, Marta


More Answers (1)

1-2. You can use either accuracy. The accuracy obtained by LDA is for assigning every observation into the class with the largest posterior. For two classes, this is equivalent to setting the threshold on the posterior probability for the positive class to 0.5. ROC analysis lets you optimize this threshold and therefore obtain a better accuracy.
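As an illustration of that threshold tuning (a sketch reusing the FPR, TPR, Thr, OPTROCPT, LDAScore, and groundTruthGroup variables from the question):

```matlab
% Locate the threshold that corresponds to the optimal operating point
optIdx = find(FPR == OPTROCPT(1) & TPR == OPTROCPT(2), 1);
optThr = Thr(optIdx);
% Reassign labels using the tuned threshold instead of the default 0.5
tunedLabels = repmat({'Bad'}, numel(LDAScore(:,1)), 1);
tunedLabels(LDAScore(:,1) >= optThr) = {'Good'};
tunedAccuracy = mean(strcmp(tunedLabels, groundTruthGroup(:,1)));
```

Note that because the threshold was chosen to maximize performance on the same data, tunedAccuracy is an optimistic (resubstitution) estimate.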
The improvement obtained by the ROC analysis in your case is not statistically significant. For a small sample like yours, you would have trouble demonstrating (convincingly) superiority of one classifier over another. Look up the sign test. Let n01 be the number of observations misclassified by the 1st model and correctly classified by the 2nd model, and let n10 be the other way around. Then 2*binocdf(min(n01,n10),n01+n10,0.5) gives you a p-value for the two-sided test of equivalence for the two models.
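A sketch of that sign test, assuming label1 and label2 are cell arrays of predicted labels from the two models (hypothetical names, compared against the ground truth from the question):

```matlab
% Misclassification indicators for each model
err1 = ~strcmp(label1, groundTruthGroup(:,1));
err2 = ~strcmp(label2, groundTruthGroup(:,1));
n01  = sum(err1 & ~err2);   % wrong under model 1, right under model 2
n10  = sum(~err1 & err2);   % right under model 1, wrong under model 2
% Two-sided sign test p-value for equivalence of the two models
p = 2*binocdf(min(n01, n10), n01 + n10, 0.5);
```

Only the observations on which the two models disagree carry information, which is why n01 + n10 is the number of trials in the binomial test.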
3. Type methods(cvLDAClassifier) to see all methods of the cross-validated object (use properties to see its properties) or read the class description in the doc. The kfoldPredict method is what you want.

2 Comments

Hi Ilya,
Thanks a lot for your reply.
1. This was a test on my toy problem, and in a real case I may obtain more data. Also, even when I have more data, it is still possible that there is no significant difference between the original accuracy of the LDA and the accuracy after the ROC analysis; I think we should then report only the LDA accuracy in order to avoid misleading readers. Am I right?
2. It seems you treat the LDA classifier and the ROC analysis as 'TWO' models or classifiers; do I understand you correctly? So the ROC analysis is an improved model of the original LDA?
3. By looking at the example of kfoldPredict,
yfit = kfoldPredict(cvLDAClassifier);
cvLDAClassifierErrError1 = mean( (yfit - LDAClassifierObject.Y).^2 )
Is that equal to use
cvLDAClassifierErrError = kfoldLoss(cvLDAClassifier);
directly?
4. Still not clear how to get the ROC of the cross-validation. By looking at an example that uses Python, I realize that I may need to write a loop for this and also calculate the average/mean of the ROC curves obtained from each fold of the cross-validation.
However,
indicesVal = crossvalind('Kfold', numFeatures, 10);
for cvIter = 1:10   % iterate over the 10 folds, not over numFeatures
    testIndices = (indicesVal == cvIter);
    trainingIndices = ~testIndices;
    % train on trainingIndices and evaluate on testIndices here
end
The number of training samples may differ between folds because crossvalind returns randomly generated indices for a K-fold cross-validation. Thus, the number of points on the ROC curve of each fold is different. Then how do we calculate the average?
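One common way to average fold curves with different numbers of points (not from this thread, just a sketch) is vertical averaging: interpolate each fold's TPR onto a shared FPR grid. Here FPRfold and TPRfold are hypothetical cell arrays holding each fold's perfcurve output:

```matlab
% Vertical averaging of per-fold ROC curves onto a common FPR grid
numFolds = 10;
fprGrid  = linspace(0, 1, 101);
tprAll   = zeros(numFolds, numel(fprGrid));
for k = 1:numFolds
    % FPRfold{k}, TPRfold{k}: per-fold curves from perfcurve (hypothetical)
    [fprU, idx]  = unique(FPRfold{k});              % interp1 requires unique x
    tprAll(k,:)  = interp1(fprU, TPRfold{k}(idx), fprGrid, 'linear');
end
meanTPR = mean(tprAll, 1);                          % averaged ROC curve
```

Alternatively, pooling the cross-validated scores from kfoldPredict into a single perfcurve call, as shown earlier in this thread, avoids the averaging question entirely.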
5. Still not sure whether it is necessary to get the ROC of the K-fold cross-validation: if the cross-validated model shows no statistically significant difference from the original model, it makes no sense to run the ROC analysis. Or should we always run the ROC analysis and test/report the statistical significance (p-value)?
Thanks again for your help.


Asked on 18 Mar 2013
Commented on 8 May 2015
