python - Flatten/ravel/collapse a 3-dimensional xr.DataArray (Xarray) into 2 dimensions along an axis?

Question

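I have a dataset in which I store replicates for different classes/subtypes (I'm not sure what to call them), and then attributes for each one. Essentially, there are 5 subtypes/classes, 4 replicates for each subtype/class, and 100 measured attributes.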

Is there a method like np.ravel or np.flatten that can merge two dimensions of an Xarray object?

Here, I want to merge the dims subtype and replicates so that I end up with a 2D array (or a pd.DataFrame) of attributes vs. subtype/replicates.

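It doesn't need to use the "coord_1 | coord_2" format or anything like it. It would be useful if it kept the original coordinate names. Maybe something like groupby could do this? Groupby always confuses me, so it would be great if this were something native to xarray.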

import xarray as xr
import numpy as np

# Set up xr.DataArray
dims = (5,4,100)
DA_data = xr.DataArray(np.random.random(dims), dims=["subtype","replicates","attributes"])
DA_data.coords["subtype"] = ["subtype_%d"%_ for _ in range(dims[0])]
DA_data.coords["replicates"] = ["rep_%d"%_ for _ in range(dims[1])]
DA_data.coords["attributes"] = ["attr_%d"%_ for _ in range(dims[2])]

# DA_data.coords
# Coordinates:
#   * subtype     (subtype) <U9 'subtype_0' 'subtype_1' 'subtype_2' ...
#   * replicates  (replicates) <U5 'rep_0' 'rep_1' 'rep_2' 'rep_3'
#   * attributes  (attributes) <U7 'attr_0' 'attr_1' 'attr_2' 'attr_3' ...
# DA_data.dims
# ('subtype', 'replicates', 'attributes')

# Naive way to collapse the replicate dimension into the subtype dimension
desired_columns = list()
for subtype in DA_data.coords["subtype"]:
    for replicate in DA_data.coords["replicates"]:
        desired_columns.append(str(subtype.values) + "|" + str(replicate.values))
desired_columns
# ['subtype_0|rep_0',
#  'subtype_0|rep_1',
#  'subtype_0|rep_2',
#  'subtype_0|rep_3',
#  'subtype_1|rep_0',
#  'subtype_1|rep_1',
#  'subtype_1|rep_2',
#  'subtype_1|rep_3',
#  'subtype_2|rep_0',
#  'subtype_2|rep_1',
#  'subtype_2|rep_2',
#  'subtype_2|rep_3',
#  'subtype_3|rep_0',
#  'subtype_3|rep_1',
#  'subtype_3|rep_2',
#  'subtype_3|rep_3',
#  'subtype_4|rep_0',
#  'subtype_4|rep_1',
#  'subtype_4|rep_2',
#  'subtype_4|rep_3']

score 5 · Accepted Answer

Yes, this is exactly what .stack is for:

In [33]: stacked = DA_data.stack(desired=['subtype', 'replicates'])

In [34]: stacked
Out[34]:
<xarray.DataArray (attributes: 100, desired: 20)>
array([[ 0.54020268,  0.14914837,  0.83398895, ...,  0.25986503,
         0.62520466,  0.08617668],
       [ 0.47021735,  0.10627027,  0.66666478, ...,  0.84392176,
         0.64461418,  0.4444864 ],
       [ 0.4065543 ,  0.59817851,  0.65033094, ...,  0.01747058,
         0.94414244,  0.31467342],
       ...,
       [ 0.23724934,  0.61742922,  0.97563316, ...,  0.62966631,
         0.89513904,  0.20139552],
       [ 0.21157447,  0.43868899,  0.77488211, ...,  0.98285015,
         0.24367352,  0.8061804 ],
       [ 0.21518079,  0.234854  ,  0.18294781, ...,  0.64679141,
         0.49678393,  0.32215219]])
Coordinates:
  * attributes  (attributes) |S7 'attr_0' 'attr_1' 'attr_2' 'attr_3' ...
  * desired     (desired) object ('subtype_0', 'rep_0') ...
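
As the output shows, the new stacked dimension is appended last, so the result has shape (100, 20). The following is a minimal sketch, rebuilding the question's array so that it runs on its own, of checking that layout, flipping the orientation with .transpose, and reversing the operation with .unstack. The names flipped and restored are illustrative only.

```python
import numpy as np
import xarray as xr

# Rebuild the array from the question (random values, so output will differ)
dims = (5, 4, 100)
DA_data = xr.DataArray(
    np.random.random(dims),
    dims=["subtype", "replicates", "attributes"],
    coords={
        "subtype": ["subtype_%d" % i for i in range(dims[0])],
        "replicates": ["rep_%d" % i for i in range(dims[1])],
        "attributes": ["attr_%d" % i for i in range(dims[2])],
    },
)

stacked = DA_data.stack(desired=["subtype", "replicates"])
print(stacked.dims)   # the combined dimension comes last
print(stacked.shape)

# Put the combined dimension first if rows should be subtype/replicate pairs
flipped = stacked.transpose("desired", "attributes")

# Split the combined dimension back into its parts and restore the dim order
restored = stacked.unstack("desired").transpose(*DA_data.dims)
```

The stacked entries are ordered with subtype varying slowest, which matches the nested loop in the question.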

The resulting stacked coordinate is a pandas.MultiIndex, whose values are given by tuples:

In [35]: stacked['desired'].values
Out[35]:
array([('subtype_0', 'rep_0'), ('subtype_0', 'rep_1'),
       ('subtype_0', 'rep_2'), ('subtype_0', 'rep_3'),
       ('subtype_1', 'rep_0'), ('subtype_1', 'rep_1'),
       ('subtype_1', 'rep_2'), ('subtype_1', 'rep_3'),
       ('subtype_2', 'rep_0'), ('subtype_2', 'rep_1'),
       ('subtype_2', 'rep_2'), ('subtype_2', 'rep_3'),
       ('subtype_3', 'rep_0'), ('subtype_3', 'rep_1'),
       ('subtype_3', 'rep_2'), ('subtype_3', 'rep_3'),
       ('subtype_4', 'rep_0'), ('subtype_4', 'rep_1'),
       ('subtype_4', 'rep_2'), ('subtype_4', 'rep_3')], dtype=object)
