python - Serializing an AdaBoost classifier in scikit-learn

Question

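I am using scikit-learn's AdaBoostClassifier and trying to serialize the trained classifier with cPickle so that I can save it to a database or a file, but I get an out-of-memory error. When I use marshal instead, it fails with an "unmarshallable object" error. How can I serialize this trained classifier?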

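import cPickle
from time import time
from sklearn.ensemble import AdaBoostClassifier
from sklearn.tree import DecisionTreeClassifier

def adboost_classify(X, Y):
    # Boosted ensemble of 3000 decision trees, each up to depth 10
    bdt = AdaBoostClassifier(DecisionTreeClassifier(max_depth=10),
                             algorithm="SAMME.R",
                             n_estimators=3000)
    t0 = time()
    bdt.fit(X, Y)
    t1 = time()
    # Serializes the whole fitted classifier into one in-memory string
    thebytes = cPickle.dumps(bdt)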

Thanks in advance.

score 0 · Accepted Answer

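This happens because you are trying to hold the entire serialized representation in memory. Try writing it directly to a file instead: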

# Open in binary mode: pickle output is bytes, not text
with open('adaboostpickled.tmp', 'wb') as output:
    cPickle.dump(bdt, output)
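
For completeness, here is a minimal sketch of the same idea as a save/load round trip. It assumes Python 3, where `cPickle` has been folded into the standard `pickle` module; the helper names `save_model` and `load_model` are hypothetical and not part of scikit-learn. Streaming to a file with `pickle.dump` avoids building one large byte string first, although whether that is enough to avoid the memory error depends on the size of the fitted model.

```python
import pickle


def save_model(model, path):
    """Write a picklable object (e.g. a fitted classifier) straight to a file."""
    # "wb": pickle produces bytes, so the file must be opened in binary mode.
    with open(path, "wb") as output:
        # The highest protocol is the most compact binary format available.
        pickle.dump(model, output, protocol=pickle.HIGHEST_PROTOCOL)


def load_model(path):
    """Read back an object previously written by save_model."""
    with open(path, "rb") as source:
        return pickle.load(source)
```

With the classifier from the question, this would be used as `save_model(bdt, 'adaboostpickled.tmp')` followed later by `bdt = load_model('adaboostpickled.tmp')`. Only unpickle files you trust, since loading a pickle can execute arbitrary code.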
