
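I have started evaluating a random forest classifier using precision and recall. However, even though the training and test sets are identical for the CPU and GPU implementations of the classifier, the evaluation scores they return differ. Is this by any chance a known bug in the library?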

Both code samples are below for reference.

Scikit-Learn (CPU)

from sklearn.metrics import recall_score, precision_score
from sklearn.ensemble import RandomForestClassifier

rf_cpu = RandomForestClassifier(n_estimators=5000, n_jobs=-1)
rf_cpu.fit(X_train, y_train)
rf_cpu_pred = rf_cpu.predict(X_test)

recall_score(rf_cpu_pred, y_test)
precision_score(rf_cpu_pred, y_test)

CPU Recall: 0.807186
CPU Precision: 0.82095

H2O4GPU (GPU)

from h2o4gpu.metrics import recall_score, precision_score
from h2o4gpu import RandomForestClassifier

rf_gpu = RandomForestClassifier(n_estimators=5000, n_gpus=1)
rf_gpu.fit(X_train, y_train)
rf_gpu_pred = rf_gpu.predict(X_test)

recall_score(rf_gpu_pred, y_test)
precision_score(rf_gpu_pred, y_test)

GPU Recall: 0.714286
GPU Precision: 0.809988

1 Answer


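Correction: I realized that the arguments to the precision and recall functions were in the wrong order. According to the Scikit-Learn documentation, the order is always (y_true, y_pred).

Corrected evaluation code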

recall_score(y_test, rf_gpu_pred)
precision_score(y_test, rf_gpu_pred)
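
To illustrate why the argument order matters, here is a minimal sketch in plain Python (no Scikit-Learn or H2O4GPU required). The helper `precision_recall` and the toy labels are hypothetical and only for illustration; they assume a binary problem where the positive class is 1. Swapping the two arguments turns false positives into false negatives and vice versa, so the precision and recall values trade places.

```python
def precision_recall(y_true, y_pred, positive=1):
    """Return (precision, recall) for the positive class of a binary problem.

    Hypothetical helper for illustration; it mirrors the (y_true, y_pred)
    argument order that Scikit-Learn's metric functions expect.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Toy labels (made up for this example, not the data from the question).
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 1, 0, 0, 0]

correct = precision_recall(y_true, y_pred)  # intended order
swapped = precision_recall(y_pred, y_true)  # arguments reversed

print("correct order (precision, recall):", correct)
print("swapped order (precision, recall):", swapped)
```

With the arguments reversed, the value reported as "recall" is really the precision and the other way round, so scores computed in the wrong order should not be compared across libraries until the order has been fixed.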
Answered 2018-11-13T23:33:56.190