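Precision: of all samples diagnosed as diseased (1), the fraction that are actually diseased.
Recall: of all samples that are actually diseased, the fraction that are detected and diagnosed as diseased.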
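Here TP, FP, TN, and FN are the counts of true positives, false positives, true negatives, and false negatives; P = TP+FN is the number of actual positives and N = TN+FP is the number of actual negatives.
Precision = TP/(TP+FP)
Recall = TP/P = TP/(TP+FN)
Sensitivity = TP/P = TP/(TP+FN) (identical to recall)
Specificity = TN/N = TN/(TN+FP)
F1-Score = 2*Precision*Recall/(Precision+Recall)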
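As a quick check of the formulas above, here is a minimal Matlab sketch that counts TP, FP, FN, and TN from two label vectors and evaluates each metric. The vectors `actual` and `predicted` are made-up illustration data, not taken from any experiment in this post.

```matlab
% Hypothetical example labels (1 = diseased, 0 = healthy); illustration only
actual    = [1 1 1 1 0 0 0 0 0 0];
predicted = [1 1 1 0 1 0 0 0 0 0];

% Confusion-matrix counts
TP = sum(predicted == 1 & actual == 1);  % diseased, diagnosed as diseased
FP = sum(predicted == 1 & actual == 0);  % healthy, diagnosed as diseased
FN = sum(predicted == 0 & actual == 1);  % diseased, missed
TN = sum(predicted == 0 & actual == 0);  % healthy, diagnosed as healthy

precision   = TP / (TP + FP);
recall      = TP / (TP + FN);            % same value as sensitivity
specificity = TN / (TN + FP);
F1 = 2 * precision * recall / (precision + recall);

fprintf('precision=%.3f recall=%.3f specificity=%.3f F1=%.3f\n', ...
        precision, recall, specificity, F1);
```

With these ten samples the counts are TP=3, FP=1, FN=1, TN=5, so precision and recall are both 0.75, specificity is 5/6, and F1 is 0.75.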
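The cross-validation (CV) set can be used to choose the decision threshold: compute F1 on the CV set for each candidate threshold and keep the threshold with the highest F1.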
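The threshold comparison can be sketched in Matlab as a simple sweep. This is only an illustration under stated assumptions: `cvScores` (predicted probability of class 1 on the CV set), `cvLabels`, and the threshold grid are hypothetical values chosen for the example, and F1 is treated as 0 when there are no true positives.

```matlab
% Hypothetical CV-set scores (probability of class 1) and true labels
cvScores = [0.95 0.82 0.67 0.58 0.44 0.31 0.22 0.08];
cvLabels = [1    1    0    1    0    1    0    0];

thresholds = 0.1:0.1:0.9;   % candidate decision thresholds (assumed grid)
bestF1 = -1;
bestThreshold = NaN;

for t = thresholds
    predicted = cvScores >= t;            % diagnose as 1 when score >= t
    TP = sum(predicted == 1 & cvLabels == 1);
    FP = sum(predicted == 1 & cvLabels == 0);
    FN = sum(predicted == 0 & cvLabels == 1);
    if TP == 0
        f1 = 0;                           % no true positives: treat F1 as 0
    else
        p = TP / (TP + FP);
        r = TP / (TP + FN);
        f1 = 2 * p * r / (p + r);
    end
    if f1 > bestF1                        % keep the first threshold with the highest F1
        bestF1 = f1;
        bestThreshold = t;
    end
end

fprintf('best threshold = %.1f, F1 = %.3f\n', bestThreshold, bestF1);
```

On this toy data the sweep selects a threshold of 0.3 with F1 = 0.8. The chosen threshold should then be evaluated on a separate test set, since it was tuned on the CV set.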
Computing precision, recall, sensitivity, specificity, and F1-score, with a Matlab implementation:
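% Samples are labeled 0 and 1; num is the largest number of leading features to use:
% for each n = 1..num, the first n feature columns are used for classification
% Requires an SVM toolbox to be installed (the svmtrain call below uses LIBSVM-style options)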
function [sens,spec,F1,pre,rec,acc] = SEERES(train,trainclass,test,testclass,num)
acc = zeros(num,1);
sens = zeros(num,1);
spec = zeros(num,1);
F1 = zeros(num,1);
pre = zeros(num,1);
rec = zeros(num,1);
FeatureNumber = zeros(num,1);
[len,b]=size(testclass);
for n=1:num
label = trainclass;
data = train(:,1:n);
testlabel = testclass;
testdata = test(:,1:n);
model=svmtrain(label,data,'-s 0 -t 0 -b 1');% -s 0: C-SVC (the default SVM type), -t 0: linear kernel, -b 1: probability estimates