scikit-learn - Plotting the PCA components derived from sklearn.decomposition.PCA separately in 3D space

Question

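For my project, I work with three-dimensional MRI data, where the fourth dimension represents different subjects (I use the package nilearn for this). I use sklearn.decomposition.PCA to extract a given number of principal components from my data. Now I would like to plot the components separately on a brain image; that is, I would like to show a brain image with my extracted components (in this case 2) in different colors.

Here is example code using the OASIS dataset, which can be downloaded via the nilearn API:

Mask the images using nilearn.input_data.NiftiMasker, which converts my 4-dimensional data into a 2-dimensional array (n_subjects x n_voxels).
Standardize the data matrix using StandardScaler.
Run PCA using sklearn.decomposition.PCA: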

## set workspace
import numpy as np

from nilearn.datasets import fetch_oasis_vbm
# Note: newer nilearn releases expose NiftiMasker via nilearn.maskers instead
from nilearn.input_data import NiftiMasker
from nilearn.image import index_img

from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA
from sklearn.pipeline import Pipeline

from nilearn import plotting

## Load Data  #################################################################

# take only first 30 subjects as example
oasis_dataset = fetch_oasis_vbm(n_subjects=30)
imgs = np.array(oasis_dataset['gray_matter_maps'])

## PIPELINE ###################################################################

# create a random number generator
rng = np.random.RandomState(42)

# Convert Images to 2D Data Array 
niftimasker = NiftiMasker(mask_strategy='template')

# z-standardize images
scaler = StandardScaler()

# Extract 2 Components
pca = PCA(n_components=2,
          svd_solver='full',
          random_state=rng)

# create pipeline
pipe = Pipeline([('niftimasker',niftimasker),
                 ('scaler',scaler),
                 ('pca',pca)])

# call fit_transform on pipeline
X = pipe.fit_transform(imgs)

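As far as I understand, what I obtain after running the PCA are the PCA loadings? Unfortunately, I do not understand how to get from this to two images, each containing one PCA component.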

score 0 · Accepted Answer

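To get the data back into image format, you need to call NiftiMasker.inverse_transform(). For that to work, the array you pass in must keep its voxel-space dimension (one column per voxel).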
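To make the shape requirement concrete, here is a minimal sketch in plain NumPy on random toy data. The helper `pca_svd` and the toy sizes are assumptions for illustration, and an SVD stands in for sklearn.decomposition.PCA. It shows that the scores (what fit_transform returns) have one column per component, while the component vectors have one entry per voxel, which is the shape inverse_transform expects.

```python
import numpy as np

def pca_svd(X, n_components):
    """Plain-NumPy PCA via SVD (illustrative helper, not a nilearn/sklearn API)."""
    Xc = X - X.mean(axis=0)                       # center each voxel (column)
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    scores = U[:, :n_components] * S[:n_components]   # (n_subjects, k)
    components = Vt[:n_components]                    # (k, n_voxels)
    return scores, components

rng = np.random.default_rng(0)
n_subjects, n_voxels = 30, 500                    # toy sizes, not the real mask
X = rng.normal(size=(n_subjects, n_voxels))

scores, components = pca_svd(X, n_components=2)
print(scores.shape)      # (30, 2)   -> no voxel axis left
print(components.shape)  # (2, 500)  -> one value per voxel
```

In scikit-learn terms, `scores` corresponds to the output of `pipe.fit_transform(imgs)` and `components` to `pca.components_`.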

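As written, the pipeline applies the dimensionality reduction in voxel space. In case you wanted to reduce the dimensionality of the subject space instead, you would change the following: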

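pipe = Pipeline([('niftimasker', niftimasker),
                 ('scaler', scaler),
                 # ('pca', pca)  # PCA is applied separately below
                 ])

# X has shape (n_subjects, n_voxels)
X = pipe.fit_transform(imgs)
# Transpose so that voxels are the samples, then transpose back:
# X_reduced has shape (n_components, n_voxels)
X_reduced = pca.fit_transform(X.T).T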

You would then apply the inverse transform as follows:

component_image = niftimasker.inverse_transform(X_reduced)

Then, to get each individual subject-component image, you would use index_img from nilearn.image. For example, this is the image of the first subject component:

component1_image = index_img(component_image,0)
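The transposition used in this subject-space variant can be sketched on toy data as follows. This is an illustration only: the sizes are made up, and a NumPy SVD replaces the PCA step so the example runs without nilearn or scikit-learn.

```python
import numpy as np

rng = np.random.default_rng(1)
n_subjects, n_voxels, k = 30, 400, 2          # toy sizes (assumed)
X = rng.normal(size=(n_subjects, n_voxels))   # stand-in for the masked, scaled data

# Treat voxels as samples and subjects as features by transposing.
Xt = X.T                                      # (n_voxels, n_subjects)
Xt_centered = Xt - Xt.mean(axis=0)
U, S, Vt = np.linalg.svd(Xt_centered, full_matrices=False)
voxel_scores = U[:, :k] * S[:k]               # (n_voxels, k)

# Transpose back: one row per component, one column per voxel.
X_reduced = voxel_scores.T                    # (k, n_voxels)

# Row 0 plays the role of index_img(component_image, 0) after inverse_transform.
first_map = X_reduced[0]
print(X_reduced.shape)                        # (2, 400)
```

Because every row still has n_voxels entries, an array of this shape can be mapped back to brain space with the masker.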

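However, I think you are interested in reducing the dimensionality of the voxel space. So, to preserve the voxel dimension for the inverse transform, you need to get the index of each voxel feature selected in the PCA dimensionality reduction. Keep your pipeline the way it was and do the following: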

X = pipe.fit_transform(imgs)

components = pca.components_  # shape: (n_components, n_voxels)
# For each component, take the index of the voxel with the largest absolute weight
most_important = [np.abs(components[i]).argmax() for i in range(components.shape[0])]
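As a small worked example of this selection step, the sketch below uses a made-up 2 x 5 components matrix (the numbers are invented for illustration). It picks, for each component, the single voxel with the largest absolute weight, and uses a vectorized argmax that gives the same result as the list comprehension.

```python
import numpy as np

# Toy components matrix: 2 components x 5 voxels (made-up numbers).
components = np.array([[0.1, -0.9, 0.3,  0.2, 0.1],
                       [0.4,  0.1, 0.2, -0.8, 0.3]])

# For each component (row), index of the voxel with the largest absolute weight.
most_important = np.abs(components).argmax(axis=1)
print(most_important)  # [1 3]
```

Note that this keeps only one voxel index per component, so the sign and the weights of all other voxels are discarded.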

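Then create NaN-filled arrays with x subjects and y voxels (in your case 30 x 229007):

# Derive the shape from the data rather than hard-coding 30 x 229007
n_subjects, n_voxels = X.shape[0], components.shape[1]
comp1 = np.full((n_subjects, n_voxels), np.nan)
comp2 = np.full((n_subjects, n_voxels), np.nan)
for x, y in enumerate(X):
    comp1[x, most_important[0]] = y[0]
    comp2[x, most_important[1]] = y[1]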
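The fill step can be checked on a tiny example. In this sketch the sizes, the stand-in score matrix, and the voxel indices are all hypothetical; column assignment replaces the explicit loop but writes the same values.

```python
import numpy as np

n_subjects, n_voxels = 4, 6                   # toy sizes instead of 30 x 229007
X = np.arange(8, dtype=float).reshape(n_subjects, 2)  # stand-in for PCA scores
most_important = [1, 3]                       # hypothetical voxel indices

comp1 = np.full((n_subjects, n_voxels), np.nan)
comp2 = np.full((n_subjects, n_voxels), np.nan)

# Write each subject's score into the single selected voxel column.
comp1[:, most_important[0]] = X[:, 0]
comp2[:, most_important[1]] = X[:, 1]

print(comp1[:, 1])  # [0. 2. 4. 6.]
print(comp2[:, 3])  # [1. 3. 5. 7.]
```

Every other entry stays NaN, which is why each resulting image carries only one valid voxel value per subject.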

Then apply the inverse transform to each component:

component1_image = niftimasker.inverse_transform(comp1)
component2_image = niftimasker.inverse_transform(comp2)

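You will now have 2 images, each with 30 subjects and 1 valid voxel value representing the selected component. How you aggregate the component voxels across the 30 subjects is up to you; in this case, I will use the mean image function from nilearn.image:

# mean_img is not imported in the question's code, so import it here
from nilearn.image import mean_img

mean_component1_image = mean_img(component1_image)
mean_component2_image = mean_img(component2_image)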

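Finally, plot the respective images in both cases. In the voxel-reduced version, you will see a small difference between the two images in the X dimension (second plot), but hardly any in Y and Z. I am using plot_glass_brain from nilearn.plotting: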

plotting.plot_glass_brain(mean_component1_image)
plotting.plot_glass_brain(mean_component2_image)

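To use overlays, adjust the colormap to make the result easier to visualize. For other plotting options, see this and the other nilearn plotting guides: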

https://nilearn.github.io/plotting/index.html#different-display-modes

Let me know if you have any other questions.
