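I am trying to fit a GaussianMixture to a set of cat and dog images using sklearn. I pass in a numpy array of shape (50, 30000): 50 data points (25 cat images and 25 dog images), where 30000 is the number of features after converting each image to a numpy array and resizing it to (100, 100, 3). It throws a MemoryError. I have 4 GB of RAM, and about 70% of it was already in use before running this code. Can anyone suggest how to debug how much memory the GaussianMixture fit method in sklearn uses? Alternatively, could anyone provide some code to fit it in batches?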

Here is the code:

print(img_coll_cat_dog.shape)
print(img_coll_cat_dog.nbytes)
print(img_coll_cat_dog.itemsize)

Output:

(50, 30000)
12000000 bytes
8 

gmix = mixture.GaussianMixture(n_components=2, covariance_type='full')
gmix.fit(img_coll_cat_dog)

Here is the error I get:

MemoryError                               Traceback (most recent call last)
<ipython-input-32-c0370476a619> in <module>()
      1 gmix = mixture.GaussianMixture(n_components=2, covariance_type='full')
----> 2 gmix.fit(img_coll_cat_dog)

~/dl/dl3/lib/python3.5/site-packages/sklearn/mixture/base.py in fit(self, X, y)
    205 
    206             if do_init:
--> 207                 self._initialize_parameters(X, random_state)
    208                 self.lower_bound_ = -np.infty
    209 

~/dl/dl3/lib/python3.5/site-packages/sklearn/mixture/base.py in _initialize_parameters(self, X, random_state)
    155                              % self.init_params)
    156 
--> 157         self._initialize(X, resp)
    158 
    159     @abstractmethod

~/dl/dl3/lib/python3.5/site-packages/sklearn/mixture/gaussian_mixture.py in _initialize(self, X, resp)
    629 
    630         weights, means, covariances = _estimate_gaussian_parameters(
--> 631             X, resp, self.reg_covar, self.covariance_type)
    632         weights /= n_samples
    633 

~/dl/dl3/lib/python3.5/site-packages/sklearn/mixture/gaussian_mixture.py in _estimate_gaussian_parameters(X, resp, reg_covar, covariance_type)
    283                    "diag": _estimate_gaussian_covariances_diag,
    284                    "spherical": _estimate_gaussian_covariances_spherical
--> 285                    }[covariance_type](resp, X, nk, means, reg_covar)
    286     return nk, means, covariances
    287 

~/dl/dl3/lib/python3.5/site-packages/sklearn/mixture/gaussian_mixture.py in _estimate_gaussian_covariances_full(resp, X, nk, means, reg_covar)
    162     """
    163     n_components, n_features = means.shape
--> 164     covariances = np.empty((n_components, n_features, n_features))
    165     for k in range(n_components):
    166         diff = X - means[k]

MemoryError: 

Any help is much appreciated.

1 Answer

Try setting covariance_type='diag'.
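
To see why this helps, here is a rough back-of-the-envelope sketch. The traceback fails at np.empty((n_components, n_features, n_features)), so with 2 components, 30000 features and 8-byte floats the 'full' covariance array alone would need about 14.4 GB, far more than 4 GB of RAM. The helper covariance_bytes below is hypothetical (it is not part of sklearn), and the shapes it uses are the covariances_ shapes described in the scikit-learn documentation; it only estimates that one array and ignores any other temporaries allocated during fitting.

```python
# Rough size of the covariance array that GaussianMixture needs,
# by covariance_type. covariance_bytes is a hypothetical helper,
# not an sklearn function. itemsize=8 assumes float64, which matches
# the itemsize printed in the question.

def covariance_bytes(n_components, n_features, covariance_type, itemsize=8):
    """Return the approximate number of bytes for the covariance array alone."""
    shapes = {
        "full": (n_components, n_features, n_features),
        "tied": (n_features, n_features),
        "diag": (n_components, n_features),
        "spherical": (n_components,),
    }
    n_elements = 1
    for dim in shapes[covariance_type]:
        n_elements *= dim
    return n_elements * itemsize


# Numbers from the question: 2 components, 30000 features per image.
for cov_type in ("full", "tied", "diag", "spherical"):
    size = covariance_bytes(2, 30000, cov_type)
    print("{:>9}: {:,} bytes".format(cov_type, size))
```

With these numbers, 'diag' needs only about 480 KB for the same array. Note that 'tied' would still need roughly 7.2 GB, so of the built-in options only 'diag' and 'spherical' fit comfortably in 4 GB.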

answered 2018-02-23T08:27:50.630