python - sklearn MDS crashes my kernel?

Question

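I have a 50,000 x 15 numpy matrix of continuous data. I want to use MDS (multidimensional scaling) to reduce it to 2 components so that I can visualize the data in a two-dimensional vector space. For some reason, whenever I run MDS on my data, my memory and CPU usage climb sharply and my kernel crashes, telling me it needs to be restarted. Has anyone run into a similar problem, or does anyone know what causes it?

I'm using a MacBook Air with a 125GB SSD and 4GB of RAM, and my development environment is the Spyder IDE.

Thanks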

score 3 · Accepted Answer

I recommend running MDS on a 5% random sample. Looking through the scikit-learn documentation, it seems most of the algorithms in the manifold learning module have a complexity of O(n^2). There is no specific documentation for MDS, but comparing run times I can only assume MDS is O(n^2) or worse. Too much data + inefficient algorithm + small RAM = kernel crash.

http://scikit-learn.org/stable/modules/manifold.html#manifold
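
To make the memory argument concrete: MDS works on a dense n x n dissimilarity matrix, so 50,000 rows means 50,000^2 float64 values, about 20 GB for one copy, against 4 GB of RAM. A 5% sample (2,500 rows) needs about 50 MB per copy. Below is a minimal sketch of the sampling approach; `dissimilarity_matrix_gib` and `mds_on_sample` are helper names I made up for illustration, and `n_init=1` is just my choice to keep the run cheap.

```python
import numpy as np
from sklearn.manifold import MDS


def dissimilarity_matrix_gib(n_samples, bytes_per_value=8):
    """Approximate size of one dense n x n float64 matrix, in GiB."""
    return n_samples ** 2 * bytes_per_value / 2 ** 30


def mds_on_sample(X, fraction=0.05, random_state=0):
    """Run 2-D MDS on a random subset of the rows of X.

    Returns the sampled row indices and the (n_keep, 2) embedding.
    """
    rng = np.random.default_rng(random_state)
    n_keep = max(2, int(round(fraction * X.shape[0])))
    # Sample rows without replacement so no point is embedded twice.
    idx = rng.choice(X.shape[0], size=n_keep, replace=False)
    # n_init=1: a single SMACOF run (an assumption made here for speed).
    mds = MDS(n_components=2, n_init=1, random_state=random_state)
    embedding = mds.fit_transform(X[idx])
    return idx, embedding


if __name__ == "__main__":
    # Small synthetic stand-in for the real 50,000 x 15 matrix.
    X = np.random.default_rng(0).normal(size=(2000, 15))
    idx, emb = mds_on_sample(X, fraction=0.05)
    print(emb.shape)
    print(round(dissimilarity_matrix_gib(50_000), 1), "GiB for the full data")
```

If the sample looks reasonable, you can raise `fraction` gradually and watch memory use, since the cost grows with the square of the number of rows kept.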

score 3 · Accepted Answer

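Our current MDS implementation is based on the SMACOF method, which is overly general. In many cases, PCA / SVD would probably be much faster. This is planned as a pull request.

In the meantime, you can use sklearn.decomposition.RandomizedPCA directly instead of the MDS class.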
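
A rough sketch of the PCA route, in case it helps. As far as I know, `RandomizedPCA` was deprecated and later removed from scikit-learn, and the equivalent today is `PCA` with `svd_solver="randomized"`, so that is what this uses. `pca_embed_2d` is a hypothetical helper name. PCA never builds an n x n matrix, so the full 50,000 x 15 array should fit comfortably in 4 GB of RAM.

```python
import numpy as np
from sklearn.decomposition import PCA


def pca_embed_2d(X, random_state=0):
    """Project X onto its first two principal components."""
    # svd_solver="randomized" stands in for the old RandomizedPCA class.
    pca = PCA(n_components=2, svd_solver="randomized", random_state=random_state)
    return pca.fit_transform(X)


if __name__ == "__main__":
    # Synthetic data with the same shape as in the question.
    X = np.random.default_rng(0).normal(size=(50_000, 15))
    embedding = pca_embed_2d(X)
    print(embedding.shape)
```

Keep in mind this is a substitute, not the same algorithm: PCA gives a linear projection, which matches classical (metric) MDS on Euclidean distances up to rotation and sign, but not the stress-minimizing result that SMACOF produces.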
