amazon-s3 - Can xr.open_mfdataset be used when reading files from S3 via s3fs?

Question

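I'm trying to read multiple netCDF files at once from an S3 bucket using s3fs and xr.open_mfdataset. Is this possible?

I tried the following. The same approach works with xr.open_dataset for a single file, but not for multiple files: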

import s3fs
import xarray as xr

fs = s3fs.S3FileSystem(anon=False)
s3path = 's3://my-bucket/wind_data*'
store = s3fs.S3Map(root=s3path, s3=s3fs.S3FileSystem(), check=False)

data = xr.open_mfdataset(store, combine='by_coords')

score 2 · Accepted Answer

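I'm not sure exactly what S3Map is; the s3fs documentation isn't specific on this point.

However, I was able to get this working in a Jupyter environment using S3FileSystem.glob() and S3FileSystem.open().

Here is a code example: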

import s3fs
import xarray as xr


s3 = s3fs.S3FileSystem(anon=False)

# This generates a list of strings with filenames
s3path = 's3://your-bucket/your-folder/file_prefix*'
remote_files = s3.glob(s3path)

# Iterate through remote_files to create a fileset
fileset = [s3.open(file) for file in remote_files]

# This works
data = xr.open_mfdataset(fileset, combine='by_coords')
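
The same glob-then-open pattern can be wrapped in a small helper. This is only a sketch: the function names `list_remote_files` and `open_remote_mfdataset` are hypothetical, not part of s3fs or xarray, and it assumes an xarray engine that accepts open file objects (for example h5netcdf) is installed. Sorting the matches and failing early on an empty match are additions for illustration, not something the answer above requires.

```python
def list_remote_files(fs, pattern):
    """Return the sorted paths matching `pattern` (hypothetical helper).

    `fs` is any filesystem object with a glob() method,
    such as s3fs.S3FileSystem.
    """
    paths = sorted(fs.glob(pattern))
    if not paths:
        # Fail early with a clear message instead of passing
        # an empty list on to xarray.
        raise FileNotFoundError(f"no objects match {pattern!r}")
    return paths


def open_remote_mfdataset(fs, pattern, **kwargs):
    """Open every object matching `pattern` as one dataset (hypothetical helper)."""
    # Imported here so list_remote_files can be used without xarray installed.
    import xarray as xr

    # Open each match as a file-like object, as in the answer above.
    fileset = [fs.open(path) for path in list_remote_files(fs, pattern)]
    # Extra keyword arguments (e.g. engine=...) are forwarded to xarray.
    return xr.open_mfdataset(fileset, combine="by_coords", **kwargs)
```

It could then be called as `open_remote_mfdataset(s3fs.S3FileSystem(anon=False), 's3://your-bucket/your-folder/file_prefix*', engine='h5netcdf')`, where the bucket path is the same placeholder used above.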
