As in this issue:
TensorFlow's streaming recall and precision don't mean what they should mean
predictions = tf.argmax(logits, 1)
labels = tf.squeeze(labels)
names_to_values, names_to_updates = slim.metrics.aggregate_metric_map({
    'Accuracy': slim.metrics.streaming_accuracy(predictions, labels),
    'Precision': slim.metrics.streaming_precision(predictions, labels),
    'Recall': slim.metrics.streaming_recall(predictions, labels),
    'Recall_5': slim.metrics.streaming_recall_at_k(logits, labels, 5),
    'Recall_3': slim.metrics.streaming_recall_at_k(logits, labels, 3),
    'Recall_1': slim.metrics.streaming_recall_at_k(logits, labels, 1),
})
The results look like this:
2018-03-06 12:45:43.520961: I tensorflow/core/kernels/logging_ops.cc:79] eval/Recall_1[0.664843738]
2018-03-06 12:45:43.521368: I tensorflow/core/kernels/logging_ops.cc:79] eval/Recall[0.990521312]
2018-03-06 12:45:43.521429: I tensorflow/core/kernels/logging_ops.cc:79] eval/Recall_5[0.857031226]
2018-03-06 12:45:43.521487: I tensorflow/core/kernels/logging_ops.cc:79] eval/Precision[0.996820331]
2018-03-06 12:45:43.521537: I tensorflow/core/kernels/logging_ops.cc:79] eval/Accuracy[0.664843738]
2018-03-06 12:45:43.521584: I tensorflow/core/kernels/logging_ops.cc:79] eval/Recall_3[0.809375]
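Why are streaming_recall and streaming_precision both about 99%, while accuracy and top-1 recall are 66%?
Something here differs sharply from the usual meaning of recall and precision. Why is accuracy the same as Recall_1, and why do Recall and Recall_1 differ?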
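A possible explanation, sketched in plain Python rather than TensorFlow: the `tf.metrics.precision` and `tf.metrics.recall` documentation states that `predictions` and `labels` are cast to bool, so class 0 becomes False and every other class becomes True. If the slim streaming versions behave the same way (an assumption here), Precision and Recall only measure "class 0 versus not class 0". Recall_1, by contrast, checks whether the top-scoring class equals the label, which for one label per example is the same thing as accuracy. The helper names and the toy data below are hypothetical.

```python
def binary_precision_recall(predictions, labels):
    """Precision and recall after casting class indices to bool."""
    # Mimic the cast to bool: class 0 -> False, any other class -> True.
    pred_b = [bool(p) for p in predictions]
    label_b = [bool(l) for l in labels]
    tp = sum(1 for p, l in zip(pred_b, label_b) if p and l)
    fp = sum(1 for p, l in zip(pred_b, label_b) if p and not l)
    fn = sum(1 for p, l in zip(pred_b, label_b) if not p and l)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


def accuracy(predictions, labels):
    """Fraction of examples whose predicted class equals the label."""
    correct = sum(1 for p, l in zip(predictions, labels) if p == l)
    return correct / len(labels)


# Hypothetical 5-class toy data: most non-zero classes are confused
# with each other, but never with class 0.
labels = [0, 1, 2, 3, 4, 1, 2, 3]
predictions = [0, 2, 1, 4, 3, 1, 3, 2]

print(accuracy(predictions, labels))
print(binary_precision_recall(predictions, labels))
```

In this toy case accuracy is low while the bool-cast precision and recall are perfect, which is the same pattern as the logged numbers above.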
The question is how to update slim's eval_image_classifier.py so that it computes streaming_recall and streaming_precision for non-boolean (multi-class) labels, as well as the F1 score.
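For reference, here is a minimal sketch of what multi-class precision, recall and F1 could look like, computed from a confusion matrix in plain Python. It is not slim code and not a streaming metric; the function names are hypothetical, macro averaging is just one reasonable choice, and accumulating the counts across evaluation batches is left out.

```python
def confusion_matrix(predictions, labels, num_classes):
    """cm[true_class][predicted_class] = number of examples."""
    cm = [[0] * num_classes for _ in range(num_classes)]
    for p, l in zip(predictions, labels):
        cm[l][p] += 1
    return cm


def macro_precision_recall_f1(cm):
    """Per-class precision, recall and F1, averaged over classes."""
    n = len(cm)
    precisions, recalls, f1s = [], [], []
    for c in range(n):
        tp = cm[c][c]
        predicted = sum(cm[r][c] for r in range(n))  # column sum
        actual = sum(cm[c])                          # row sum
        p = tp / predicted if predicted else 0.0
        r = tp / actual if actual else 0.0
        f = 2 * p * r / (p + r) if (p + r) else 0.0
        precisions.append(p)
        recalls.append(r)
        f1s.append(f)
    return sum(precisions) / n, sum(recalls) / n, sum(f1s) / n


# Hypothetical 3-class toy data.
toy_labels = [0, 0, 1, 1, 2, 2]
toy_predictions = [0, 1, 1, 1, 2, 0]

toy_cm = confusion_matrix(toy_predictions, toy_labels, num_classes=3)
macro_p, macro_r, macro_f1 = macro_precision_recall_f1(toy_cm)
print(toy_cm)
print(macro_p, macro_r, macro_f1)
```

Because the confusion matrix is a table of counts, summing the matrices of successive batches and computing the ratios once at the end gives the same result as evaluating the whole set in one pass, which is what a streaming version would need.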