hdf5 - Can h5py load a file from a byte array in memory?

Question

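My Python code receives a byte array that holds the bytes of an HDF5 file.

I would like to read this byte array into an in-memory h5py file object without first writing it to disk. This page says I can open a memory-mapped file, but that would be a new, empty file. I want to go from the byte array to an in-memory HDF5 file, use it, and discard it, without writing to disk at any point.

Is it possible to do this with h5py? (Or with HDF5 from C, if that is the only way.)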

score 7 · Accepted Answer

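You can try wrapping the bytes in a binary I/O object (io.BytesIO) to get a file-like object and reading it through h5py:

import io
import h5py
f = io.BytesIO(YOUR_H5PY_STREAM)  # YOUR_H5PY_STREAM: the raw bytes of the HDF5 file
h = h5py.File(f, 'r')             # open the in-memory file read-only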
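
To see the answer above work end to end, here is a minimal round-trip sketch. It assumes h5py 2.9 or newer (the first release that accepts Python file-like objects) and NumPy; the helper names `make_hdf5_bytes` and `read_dataset` are hypothetical and exist only for illustration. Since no real byte stream is available here, the sketch first builds one in memory.

```python
import io

import h5py
import numpy as np


def make_hdf5_bytes() -> bytes:
    """Build a small HDF5 file entirely in memory and return its raw bytes.

    Stands in for the byte array the question receives from elsewhere.
    """
    buf = io.BytesIO()
    with h5py.File(buf, "w") as f:
        f.create_dataset("dataset", data=np.arange(10))
    return buf.getvalue()


def read_dataset(raw, name):
    """Open HDF5 bytes (bytes or bytearray) without touching disk."""
    # io.BytesIO copies the buffer; h5py reads from it like a regular file.
    with h5py.File(io.BytesIO(raw), "r") as f:
        # [()] copies the whole dataset into a NumPy array, so the data
        # stays usable after the file object is closed.
        return f[name][()]


raw = make_hdf5_bytes()
values = read_dataset(bytearray(raw), "dataset")
print(values)
```

Copying the dataset out before the `with` block ends matters: once the h5py file is closed, dataset handles obtained from it are no longer readable.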

score 2 · Accepted Answer

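You can create the h5 object from either io.BytesIO or tempfile, as shown in the official documentation at http://docs.h5py.org/en/stable/high/file.html#python-file-like-objects.

The first argument to File may be a Python file-like object, such as an io.BytesIO or tempfile.TemporaryFile instance. This is a convenient way to create temporary HDF5 files, e.g. for testing or to send over the network.

tempfile.TemporaryFile

>>> tf = tempfile.TemporaryFile()
>>> f = h5py.File(tf, 'w')

or io.BytesIO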

"""Create an HDF5 file in memory and retrieve the raw bytes

This could be used, for instance, in a server producing small HDF5
files on demand.
"""
import io
import h5py

bio = io.BytesIO()
with h5py.File(bio, 'w') as f:  # explicit 'w': h5py 3.x defaults to read-only
    f['dataset'] = range(10)

data = bio.getvalue() # data is a regular Python bytes object.
print("Total size:", len(data))
print("First bytes:", data[:10])

score 0 · Accepted Answer

The following example uses PyTables (the tables package) instead of h5py; it can also read and manipulate the HDF5 format.

import urllib.request
import tables
url = 'https://s3.amazonaws.com/<your bucket>/data.hdf5'
response = urllib.request.urlopen(url) 
h5file = tables.open_file("data-sample.h5", driver="H5FD_CORE",
                          driver_core_image=response.read(),
                          driver_core_backing_store=0)
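
The same core-driver idea can be tried without a network fetch. The sketch below is an assumption-laden illustration, not part of the original answer: it presumes a PyTables build with file-image support (`File.get_file_image()` and the `driver_core_image` parameter), and the file names and the `values` array are made up. With `driver_core_backing_store=0` the names are only labels and nothing is written to disk.

```python
import numpy as np
import tables

# Build an HDF5 image purely in memory using the core driver.
h5w = tables.open_file("in-memory-src.h5", "w", driver="H5FD_CORE",
                       driver_core_backing_store=0)
h5w.create_array("/", "values", np.arange(5))
h5w.flush()
image = h5w.get_file_image()  # raw bytes of the HDF5 file
h5w.close()

# Re-open those bytes as a read-only in-memory file, as in the answer above.
h5r = tables.open_file("in-memory-copy.h5", "r", driver="H5FD_CORE",
                       driver_core_image=image,
                       driver_core_backing_store=0)
data = h5r.root.values.read()  # copy the array out before closing
h5r.close()
print(data)
```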
