matlab - How do I use Matlab's princomp function in the following case?

Question

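I have 10 images (18x18). I store them in an array named images[324][10], where 324 is the number of pixels in each image and 10 is the total number of images I have.

I want to use these images for a neural network, but 324 is a large number of inputs, so I would like to reduce it while keeping as much information as possible.

I have heard that you can do this with the princomp function, which implements PCA.

The problem is that I have not found any example of how to use this function, especially for my case.

If I run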

[COEFF, SCORE, latent] = princomp(images);

it runs fine, but how can I then get the array newimages[number_of_desired_features][10]?

score 5 · Accepted Answer

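PCA could be a right choice here (but not the only one). However, you should be aware that PCA does not automatically reduce the number of features in your input data. I recommend reading this tutorial: http://arxiv.org/pdf/1404.1100v1.pdf - it is the one I used to understand PCA, and it is really good for beginners.

Back to your question. An image is a vector in a 324-dimensional space. In this space, the first basis vector is the one with a single white pixel in the top-left corner, the next is the one with the next pixel white and all the others black, and so on. This is probably not the best set of basis vectors for representing this image data. PCA computes new basis vectors (the COEFF matrix: the new vectors expressed as values in the old vector space) and the new image vector values (the SCORE matrix). At that point you have not lost any data at all (the number of features has not been reduced). However, you can stop using some of the new basis vectors, because they are probably related to noise rather than to the data itself. All of this is described in detail in the tutorial.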
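To make the roles of COEFF and SCORE concrete, here is a minimal sketch that checks the change of basis numerically. It assumes princomp is available (Statistics Toolbox) and uses random placeholder data; the names X, Xc, proj_err and orth_err are only for illustration.

```matlab
X = rand(10, 324);                        % placeholder data: 10 images as rows, 324 pixels as columns
[COEFF, SCORE] = princomp(X);

% princomp centers the data first: subtract the mean of each pixel column
Xc = X - repmat(mean(X, 1), 10, 1);

% SCORE is the centered data expressed in the new basis
proj_err = max(max(abs(Xc * COEFF - SCORE)))          % close to zero

% the columns of COEFF are orthonormal, so COEFF' undoes the change of basis
orth_err = max(max(abs(COEFF' * COEFF - eye(324))))   % close to zero
```

Both values should be down at the level of floating-point round-off, which is why nothing is lost until you start dropping columns of SCORE.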

images = rand(10,324);   % princomp expects one image per row: 10 images x 324 pixels
[COEFF, SCORE] = princomp(images);
% COEFF is orthogonal, so multiplying by COEFF' maps the scores back to pixel space
reconstructed_images = SCORE * COEFF' + repmat(mean(images,1), 10, 1);
images - reconstructed_images
% as you can see, there are almost only zeros - the non-zero values are effects of small numerical errors
% this is possible because you are only switching between the sets of basis vectors used to represent the data
SCORE(:, 100:324) = 0;
% we remove features 100 to 324, leaving only the first 99
% (note: with only 10 images, columns 10 to 324 of SCORE are already zero,
% so you have to keep fewer than 9 components before any information is lost)
% obviously, you could take only the non-zero part of the matrix and use it
% somewhere else, like for your neural network
reconstructed_images_with_reduced_features = SCORE * COEFF' + repmat(mean(images,1), 10, 1);
images - reconstructed_images_with_reduced_features
% there are fewer features, but the reconstruction is still pretty good
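
To get the newimages[number_of_desired_features][10] array from the question, something along these lines should work. This is only a sketch under a few assumptions: the data is stored as 324-by-10 as in the question (random placeholder values here), k is a hypothetical number of features chosen by you, and x is a hypothetical new image that was not part of the original set.

```matlab
images = rand(324, 10);            % placeholder data in the question's layout: one image per column
k = 5;                             % number of features to keep (assumed value, pick your own)

% princomp treats rows as observations, so pass the transpose (10-by-324)
[COEFF, SCORE, latent] = princomp(images');

% keep the first k principal-component scores and transpose back
% to the features-by-images layout used in the question
newimages = SCORE(:, 1:k)';        % k-by-10

% fraction of the total variance captured by the first k components
explained = sum(latent(1:k)) / sum(latent);

% projecting a new image onto the same k components:
% subtract the training mean, then multiply by the first k columns of COEFF
x = rand(324, 1);                  % hypothetical new image as a column vector
mu = mean(images, 2)';             % 1-by-324 mean image of the training set
newfeatures = ((x' - mu) * COEFF(:, 1:k))';   % k-by-1
```

Since there are only 10 images, at most 9 components carry any variance, so values of k above 9 add nothing. As far as I know, newer MATLAB releases replace princomp with pca, which returns COEFF, SCORE and latent as its first three outputs in the same way.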
