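I want to select all grid cells within a latitude/longitude range and, for each grid cell, export it as a dataframe and then write it to a csv file (i.e. df.to_csv). My dataset is shown below. I can use xr.where(...) to mask out the grid cells outside my range, but I am not sure how to loop over the remaining unmasked cells. Alternatively, I tried the xr.sel functions, but they do not seem to accept expressions like ds.sel(gridlat_0>45). xr.sel_points(...) might also work, but I cannot figure out the correct syntax for the indexers in my case. Thanks in advance for your help.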

<xarray.Dataset>
Dimensions:    (time: 48, xgrid_0: 685, ygrid_0: 485)
Coordinates:
    gridlat_0  (ygrid_0, xgrid_0) float32 44.6896 44.6956 44.7015 44.7075 ...
  * ygrid_0    (ygrid_0) int64 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 ...
  * xgrid_0    (xgrid_0) int64 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 ...
  * time       (time) datetime64[ns] 2016-07-28T01:00:00 2016-07-28T02:00:00 ...
    gridlon_0  (ygrid_0, xgrid_0) float32 -129.906 -129.879 -129.851 ...
Data variables:
    u          (time, ygrid_0, xgrid_0) float64 nan nan nan nan nan nan nan ...
    gridrot_0  (time, ygrid_0, xgrid_0) float32 nan nan nan nan nan nan nan ...
    Qli        (time, ygrid_0, xgrid_0) float64 nan nan nan nan nan nan nan ...
    Qsi        (time, ygrid_0, xgrid_0) float64 nan nan nan nan nan nan nan ...
    p          (time, ygrid_0, xgrid_0) float64 nan nan nan nan nan nan nan ...
    rh         (time, ygrid_0, xgrid_0) float64 nan nan nan nan nan nan nan ...
    press      (time, ygrid_0, xgrid_0) float64 nan nan nan nan nan nan nan ...
    t          (time, ygrid_0, xgrid_0) float64 nan nan nan nan nan nan nan ...
    vw_dir     (time, ygrid_0, xgrid_0) float64 nan nan nan nan nan nan nan ...
1 Answer

The simplest approach is probably to loop over every grid point, like this:

# (optionally) create a grid dataset so we don't need to pull out all
# the data from the main dataset before looking at each point
grid = ds[['gridlat_0', 'gridlon_0']]

for i in range(ds.coords['xgrid_0'].size):
    for j in range(ds.coords['ygrid_0'].size):
        sub_grid = grid.isel(xgrid_0=i, ygrid_0=j)
        if is_valid(sub_grid.gridlat_0, sub_grid.gridlon_0):
            sub_ds = ds.isel(xgrid_0=i, ygrid_0=j)
            sub_ds.to_dataframe().to_csv(...)

Even with a 685x485 grid, looping over every point should only take a few seconds.
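
The loop above calls is_valid, which is left undefined. As a minimal sketch, assuming the selection is a plain rectangular latitude/longitude box, it could look like the following. The bounds are hypothetical placeholders, not values from the question.

```python
# Hypothetical bounding box; replace with the range you actually need.
LAT_MIN, LAT_MAX = 45.0, 50.0
LON_MIN, LON_MAX = -125.0, -115.0


def is_valid(lat, lon):
    """Return True if the point lies inside the (assumed) bounding box.

    Accepts plain numbers or 0-d xarray/NumPy values, which are
    converted to Python floats before comparing.
    """
    lat = float(lat)
    lon = float(lon)
    return LAT_MIN <= lat <= LAT_MAX and LON_MIN <= lon <= LON_MAX
```

Converting to float first keeps the return value a plain bool, so it can be used directly in the if statement of the loop.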

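Pre-filtering beforehand with ds = ds.where(..., drop=True) (available in the next xarray release, due out later this week) can speed this up significantly, but you will still run into the problem that the selected grid cells may not be representable on orthogonal axes.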
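
To illustrate the pre-filtering idea, here is a rough sketch on a tiny synthetic dataset that mimics the layout in the question. The bounds, the in_box helper, and the frames dictionary are assumptions made for this example. Because where(..., drop=True) can only trim whole rows and columns, some cells outside the range may survive the crop, so the mask is re-evaluated on the cropped grid before looping.

```python
import numpy as np
import xarray as xr

# Tiny synthetic stand-in for the real dataset (2 x 3 grid, 4 time steps).
lat = np.array([[44.0, 44.5, 45.5],
                [46.0, 46.5, 47.0]])
lon = np.array([[-130.0, -125.0, -120.0],
                [-130.0, -125.0, -120.0]])
ds = xr.Dataset(
    {"t": (("time", "ygrid_0", "xgrid_0"), np.arange(24.0).reshape(4, 2, 3))},
    coords={
        "gridlat_0": (("ygrid_0", "xgrid_0"), lat),
        "gridlon_0": (("ygrid_0", "xgrid_0"), lon),
        "ygrid_0": np.arange(2),
        "xgrid_0": np.arange(3),
        "time": np.arange("2016-07-28T01", "2016-07-28T05", dtype="datetime64[h]"),
    },
)


def in_box(d):
    # Hypothetical selection criterion; adjust the bounds as needed.
    return (d.gridlat_0 > 45) & (d.gridlon_0 > -127)


# Drop rows/columns that contain no selected cell at all.
cropped = ds.where(in_box(ds), drop=True)

# Cells outside the box can remain inside the cropped rectangle,
# so re-evaluate the mask and visit only the selected cells.
keep = in_box(cropped)
frames = {}
for j, i in np.argwhere(keep.values):
    point = cropped.isel(ygrid_0=j, xgrid_0=i)
    df = point.to_dataframe()  # one row per time step
    frames[(int(j), int(i))] = df
    # df.to_csv(...)  # write one file per grid cell here
```

Looping over np.argwhere(keep.values) visits only the selected cells, which avoids testing every grid point individually.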

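A final option, and probably the cleanest, is to use stack to collapse the two grid dimensions into one, which leaves a 2D dataset. You can then use standard selection and groupby operations along the new 'space' dimension: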

ds_stacked = ds.stack(space=['xgrid_0', 'ygrid_0'])
ds_filtered = ds_stacked.sel(space=(ds_stacked.gridlat_0 > 45))
for _, ds_one_place in ds_filtered.groupby('space'):
    ds_one_place.to_dataframe().to_csv(...)
answered 2016-08-01T22:29:04.913