
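I have converted a pandas Panel to xarray, but I cannot add new items, major-axis entries, or minor-axis entries as easily as I could with the pandas Panel. The code is below: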

import numpy as np
import pandas as pd
import xarray as xr

panel = pd.Panel(np.random.randn(3, 4, 5), items=['one', 'two', 'three'],
                 major_axis=pd.date_range('1/1/2000', periods=4),
                 minor_axis=['a', 'b', 'c', 'd', 'e'])

For example, if I want to add a new item, I can do:

panel.four=pd.DataFrame(np.ones((4,5)),index=pd.date_range('1/1/2000', periods=4), columns=['a', 'b', 'c', 'd','e'])

panel.four

            a   b   c   d   e
2000-01-01  1.0 1.0 1.0 1.0 1.0
2000-01-02  1.0 1.0 1.0 1.0 1.0
2000-01-03  1.0 1.0 1.0 1.0 1.0
2000-01-04  1.0 1.0 1.0 1.0 1.0

I am having trouble adding items, or extending the major/minor axis, in xarray:

px=panel.to_xarray()

#px gives me
<xarray.DataArray (items: 3, major_axis: 5, minor_axis: 4)>

array([[[-0.440081, -0.888226,  0.158702,  2.107577],
        [ 0.917835, -0.174557,  0.501626,  0.116761],
        [ 0.406988,  1.95184 , -1.345948,  2.960774],
        [-1.905529,  0.25793 ,  0.076162,  1.954012],
        [ 0.499675,  1.87567 , -1.698771, -1.143766]],


       [[ 0.070269, -1.151737, -0.344155, -0.506383],
        [-2.199357, -0.040909,  0.491984, -0.333431],
        [-0.113155, -0.668475,  2.366683, -0.421863],
        [-0.567336, -0.302224,  1.638386, -0.038545],
        [ 0.55067 , -0.409266, -0.27916 , -0.942144]],


       [[ 1.269171, -0.151471, -0.664072,  0.269168],
        [-0.486492,  0.59632 , -0.191977,  0.22537 ],
        [ 0.069231, -0.345793, -0.450797, -2.982   ],
        [-0.42338 , -0.849736,  0.965738, -0.544596],
        [-1.455378, -0.256441, -1.204572, -0.347749]]])

Coordinates:
  * items       (items) object 'one' 'two' 'three'
  * major_axis  (major_axis) datetime64[ns] 2000-01-01 2000-01-02 2000-01-03 ...
  * minor_axis  (minor_axis) object 'a' 'b' 'c' 'd'


#how should I add a fourth item, increase/delete major axis, minor axis?

2 Answers


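xarray.DataArray is based on a single NumPy array internally, so it cannot be efficiently resized or appended to. Your best option is to build a new, larger DataArray with xarray.concat.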
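As a rough sketch of what that looks like for the major axis, the snippet below extends the array by one date. Since pd.Panel is no longer available in recent pandas releases, px is rebuilt here directly as a DataArray with the question's labels; the names new_day and longer are made up for illustration.

```python
import numpy as np
import pandas as pd
import xarray as xr

dims = ["items", "major_axis", "minor_axis"]

# Stand-in for panel.to_xarray(): a 3 x 4 x 5 DataArray with the question's labels.
px = xr.DataArray(
    np.random.randn(3, 4, 5),
    coords={
        "items": ["one", "two", "three"],
        "major_axis": pd.date_range("2000-01-01", periods=4),
        "minor_axis": ["a", "b", "c", "d", "e"],
    },
    dims=dims,
)

# The slice to append needs a length-1 major_axis and matching labels elsewhere.
new_day = xr.DataArray(
    np.zeros((3, 1, 5)),
    coords={
        "items": px["items"].values,
        "major_axis": pd.date_range("2000-01-05", periods=1),
        "minor_axis": px["minor_axis"].values,
    },
    dims=dims,
)

# concat returns a new array; px itself keeps its original four dates.
longer = xr.concat([px, new_day], dim="major_axis")
```

Because every call allocates a new array, appending in a loop gets expensive; collecting the pieces in a list and concatenating once is usually cheaper.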

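If you want to add items the way you would with a pd.Panel, the data structure you are probably looking for is xarray.Dataset. These are easiest to construct from the multi-indexed DataFrame equivalent of a Panel: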

# First, make a DataFrame with a MultiIndex
>>> df = panel.to_frame()

>>> df.head()
                       one       two     three
major      minor
2000-01-01 a      0.278958  0.676034 -1.544726
           b     -0.918150 -2.707339 -0.552987
           c      0.023479  0.175528 -0.817556
           d      1.798001 -0.142016  1.390834
           e      0.256575  0.265369 -1.829766

# Now, convert the DataFrame with a MultiIndex to xarray
>>> ds = df.to_xarray()

>>> ds
<xarray.Dataset>
Dimensions:  (major: 4, minor: 5)
Coordinates:
  * major    (major) datetime64[ns] 2000-01-01 2000-01-02 2000-01-03 2000-01-04
  * minor    (minor) object 'a' 'b' 'c' 'd' 'e'
Data variables:
    one      (major, minor) float64 0.279 -0.9182 0.02348 1.798 0.2566 2.41 ...
    two      (major, minor) float64 0.676 -2.707 0.1755 -0.142 0.2654 ...
    three    (major, minor) float64 -1.545 -0.553 -0.8176 1.391 -1.83 ...

# You can assign a DataFrame if it has the right column/index names
>>> ds['four'] = pd.DataFrame(np.ones((4,5)),
...                           index=pd.date_range('1/1/2000', periods=4, name='major'),
...                           columns=pd.Index(['a', 'b', 'c', 'd', 'e'], name='minor'))

# or just pass a tuple directly:
>>> ds['five'] = (('major', 'minor'), np.zeros((4, 5)))

>>> ds
<xarray.Dataset>
Dimensions:  (major: 4, minor: 5)
Coordinates:
  * major    (major) datetime64[ns] 2000-01-01 2000-01-02 2000-01-03 2000-01-04
  * minor    (minor) object 'a' 'b' 'c' 'd' 'e'
Data variables:
    one      (major, minor) float64 0.279 -0.9182 0.02348 1.798 0.2566 2.41 ...
    two      (major, minor) float64 0.676 -2.707 0.1755 -0.142 0.2654 ...
    three    (major, minor) float64 -1.545 -0.553 -0.8176 1.391 -1.83 ...
    four     (major, minor) float64 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 ...
    five     (major, minor) float64 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 ...
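Going the other way, here is a minimal sketch of removing an item from the Dataset and of stacking the variables back into one 3-D array. It assumes a reasonably recent xarray (drop_vars is not available in very old releases) and builds the multi-indexed DataFrame by hand rather than through panel.to_frame(); the names ds_without_two and arr are illustrative only.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Hand-built equivalent of panel.to_frame(): one row per (major, minor) pair.
index = pd.MultiIndex.from_product(
    [pd.date_range("2000-01-01", periods=4), ["a", "b", "c", "d", "e"]],
    names=["major", "minor"],
)
df = pd.DataFrame(np.random.randn(20, 3), index=index,
                  columns=["one", "two", "three"])
ds = df.to_xarray()

# Add an item by assigning a (dims, data) tuple, as in the answer above.
ds["four"] = (("major", "minor"), np.ones((4, 5)))

# Remove an item; this returns a new Dataset and leaves ds untouched.
ds_without_two = ds.drop_vars("two")

# Stack all variables into a single DataArray with a leading "items" dimension.
arr = ds.to_array(dim="items")
```

The resulting arr has the same items/major/minor layout as the original Panel, so it can be used wherever a single 3-D array is needed.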

For more on transitioning from pandas.Panel to xarray, read this section of the xarray docs: http://xarray.pydata.org/en/stable/pandas.html#transitioning-from-pandas-panel-to-xarray

answered 2017-09-30T16:18:17.467

Assignment in xarray is not as elegant as with a pandas Panel. Say we want to add a fourth item to the DataArray above. Here is how it works:

four=xr.DataArray(np.ones((1,4,5)), coords=[['four'],pd.date_range('1/1/2000', periods=4),['a', 'b', 'c', 'd','e']], 
                  dims=['items','major_axis','minor_axis'])

pxc=xr.concat([px,four],dim='items')

The same logic applies whether the operation is on items or on the major/minor axis. To delete, use:

pxc.drop(['four'], dim='items')
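In newer xarray releases the label-based form of drop is, as far as I know, deprecated in favor of drop_sel, so a sketch of the same removal might look like the following. The array pxc is rebuilt here so the snippet runs on its own, and the names trimmed, no_e and first_three are illustrative.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Stand-in for the concatenated array pxc from above (four items).
pxc = xr.DataArray(
    np.random.randn(4, 4, 5),
    coords={
        "items": ["one", "two", "three", "four"],
        "major_axis": pd.date_range("2000-01-01", periods=4),
        "minor_axis": ["a", "b", "c", "d", "e"],
    },
    dims=["items", "major_axis", "minor_axis"],
)

# Drop by label along a named dimension; each call returns a new array.
trimmed = pxc.drop_sel(items=["four"])
no_e = pxc.drop_sel(minor_axis=["e"])

# For dates, selecting the range to keep is often simpler than dropping.
first_three = pxc.sel(major_axis=slice("2000-01-01", "2000-01-03"))
```

None of these calls modify pxc in place, so the result has to be assigned to a name to be kept.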
answered 2017-09-10T13:29:18.457