python - Strange SVM prediction performance in scikit-learn (SVMLIB)

Question

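I am using SVC from scikit-learn on a large 10000x1000 dataset (10000 objects with 1000 features). I have already seen in other sources that SVMLIB (that is, LIBSVM, the library behind SVC) does not scale well beyond ~10000 objects, and I do indeed observe this: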

training time for 10000 objects: 18.9s
training time for 12000 objects: 44.2s
training time for 14000 objects: 92.7s

You can imagine what happens when I try 80000. What I find very surprising, however, is that the SVM's predict() takes even more time than the training fit():

prediction time for 10000 objects (model was also trained on those objects): 49.0s
prediction time for 12000 objects (model was also trained on those objects): 91.5s
prediction time for 14000 objects (model was also trained on those objects): 141.84s

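It is trivial to make prediction run in linear time (although here it may be close to linear), and prediction is usually much faster than training. So what is going on here?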

score 2 · Accepted Answer

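Are you sure you are not including the training time in your measurement of the prediction time? Do you have a code snippet of your timings?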
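For reference, here is a minimal sketch of how fit() and predict() could be timed separately so that neither measurement includes the other. The data is synthetic and much smaller than in the question, and the sizes, the random seed, and the variable names are all assumptions made for illustration. It also prints the number of support vectors, since a kernel SVC evaluates the kernel against every support vector for each predicted sample.

```python
import time

import numpy as np
from sklearn.svm import SVC

# Hypothetical synthetic data, far smaller than the 10000x1000 set in the question.
rng = np.random.RandomState(0)
n_samples, n_features = 500, 20
X = rng.randn(n_samples, n_features)
y = (X[:, 0] + 0.5 * rng.randn(n_samples) > 0).astype(int)

clf = SVC()  # default settings; the question does not say which kernel was used

# Time training on its own.
t0 = time.perf_counter()
clf.fit(X, y)
fit_seconds = time.perf_counter() - t0

# Time prediction on its own, using the already fitted model.
t0 = time.perf_counter()
y_pred = clf.predict(X)
predict_seconds = time.perf_counter() - t0

# Total number of support vectors across both classes.
n_sv = int(clf.n_support_.sum())

print(f"fit:     {fit_seconds:.4f}s")
print(f"predict: {predict_seconds:.4f}s")
print(f"support vectors: {n_sv} of {n_samples} training objects")
```

If the two numbers still look like the ones in the question when measured this way, the support vector count is the next thing worth checking.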

answered 2013-03-29T14:03:56.227
