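Yellowbrick is designed to work with scikit-learn and uses sklearn's type checking system to detect whether a model fits a particular class of machine learning problem. If the neupy PNN model implements the scikit-learn estimator API (e.g. fit() and predict()), you can use the model directly and bypass the type checking by passing the force_model=True parameter, as follows:

    visualizer = ClassificationReport(model, support=True, force_model=True)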
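However, after a quick look at the neupy documentation, it appears that this won't necessarily work: the neupy training method is named train rather than fit, the PNN model does not implement a score() method, and it does not support learned parameters with the _ suffix.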
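To make the mismatch concrete, here is a minimal sketch of a duck-typing check for the methods mentioned above. It does not use neupy or Yellowbrick: `missing_estimator_methods` and `TrainOnlyModel` are hypothetical names made up for illustration, and the toy class only mimics a model that trains through train() instead of fit().

```python
# Hypothetical helper: report which scikit-learn style methods an object lacks.
REQUIRED_METHODS = ("fit", "predict", "score")


def missing_estimator_methods(model, required=REQUIRED_METHODS):
    """Return the names in `required` that `model` does not expose as callables."""
    return [name for name in required if not callable(getattr(model, name, None))]


class TrainOnlyModel:
    """Toy stand-in (an assumption, not neupy) that trains via train()."""

    def train(self, X, y):
        # Remember the most frequent label; enough for a toy classifier
        labels = list(y)
        self.majority = max(set(labels), key=labels.count)
        return self

    def predict(self, X):
        # Predict the remembered label for every row
        return [self.majority for _ in X]


print(missing_estimator_methods(TrainOnlyModel()))  # ['fit', 'score']
```

A check like this only looks at method names. It says nothing about learned attributes such as classes_, which have to be handled separately.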
The solution is to create a lightweight wrapper around the PNN model that exposes it as a sklearn estimator. Testing on a Yellowbrick dataset, this appears to work:

    from sklearn import metrics
    from neupy import algorithms
    from sklearn.base import BaseEstimator
    from yellowbrick.datasets import load_occupancy
    from yellowbrick.classifier import ClassificationReport
    from sklearn.model_selection import train_test_split


    class PNNWrapper(algorithms.PNN, BaseEstimator):
        """
        The PNN wrapper implements BaseEstimator and allows the classification
        report to score the model and understand the learned classes.
        """

        @property
        def classes_(self):
            return self.classes

        def score(self, X_test, y_test):
            y_hat = self.predict(X_test)
            return metrics.accuracy_score(y_test, y_hat)


    # Load the binary classification dataset
    X, y = load_occupancy()
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)

    # Create and train the PNN model using the sklearn wrapper
    model = PNNWrapper(std=0.1, verbose=True, batch_size=500)
    model.train(X_train, y_train)

    # Create the classification report
    viz = ClassificationReport(
        model,
        support=True,
        classes=["not occupied", "occupied"],
        is_fitted=True,
        force_model=True,
        title="PNN"
    )

    # Score the report and show it
    viz.score(X_test, y_test)
    viz.show()

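The wrapper trains the model with train() and passes is_fitted=True so that the visualizer does not try to fit it. One possible variation is to also add a fit() method that delegates to train(). The sketch below shows that pattern on a toy stand-in rather than on neupy itself: `ToyTrainModel` and `FitAdapter` are hypothetical names, and whether the same approach lets you drop is_fitted=True with the real PNN class is untested here.

```python
class ToyTrainModel:
    """Hypothetical stand-in for a model that exposes train() and predict()."""

    def train(self, X, y):
        # Store the sorted unique labels and the most frequent one
        labels = list(y)
        self.classes = sorted(set(labels))
        self.majority = max(self.classes, key=labels.count)

    def predict(self, X):
        # Predict the most frequent training label for every row
        return [self.majority for _ in X]


class FitAdapter(ToyTrainModel):
    """Adds scikit-learn style names on top of the train-based model."""

    def fit(self, X, y):
        # Delegate to train() and return self, as scikit-learn estimators do
        self.train(X, y)
        return self

    @property
    def classes_(self):
        # Expose the learned labels under the trailing-underscore name
        return self.classes

    def score(self, X, y):
        # Plain accuracy: fraction of predictions that match the labels
        y_hat = self.predict(X)
        return sum(p == t for p, t in zip(y_hat, y)) / len(y)


model = FitAdapter().fit([[0], [1], [2], [3]], [0, 1, 1, 1])
print(model.classes_, model.score([[4], [5]], [1, 0]))  # [0, 1] 0.5
```

Returning self from fit() follows the scikit-learn convention and allows the chained call shown in the last lines.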
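Although Yellowbrick does not currently support neupy, if you're interested, it might be worth submitting an issue suggesting that neupy be added to contrib, similar to how statsmodels is implemented in Yellowbrick.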