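If you are only reading basic arrays and structs, see vikrantt's answer on a similar post. However, if you are working with a Matlab table, then IMHO the best solution is to avoid the save option altogether.
I created a simple helper function that converts a Matlab table to a standard HDF5 file, and another helper function in Python that extracts the data into a Pandas DataFrame.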
Matlab helper function
function table_to_hdf5(T, path, group)
%TABLE_TO_HDF5 Save a Matlab table in an hdf5 file format
%
%   TABLE_TO_HDF5(T) Saves the table T to the HDF5 file inputname.h5 at the root ('/')
%   group, where inputname is the name of the input argument for T
%
%   TABLE_TO_HDF5(T, path) Saves the table T to the HDF5 file specified by path at the
%   root ('/') group.
%
%   TABLE_TO_HDF5(T, path, group) Saves the table T to the HDF5 file specified by path
%   at the group specified by group.
%
%%%
    if nargin < 2
        path = [inputname(1),'.h5'];  % default file name to input argument
    end
    if nargin < 3
        group = '';  % We will prepend '/' later, so this is effectively root
    end
    for field = T.Properties.VariableNames
        % Prepare to write
        field = field{:};
        dataset_name = [group '/' field];
        data = T.(field);
        if ischar(data) || isstring(data)
            warning('String columns not supported. Skipping...')
            continue
        end
        % Write the data
        h5create(path, dataset_name, size(data))
        h5write(path, dataset_name, data)
    end
end
Python helper function
import pandas as pd
import h5py


def h5_to_df(path, group = '/'):
    """
    Load an hdf5 file into a pandas DataFrame
    """
    df = pd.DataFrame()
    with h5py.File(path, 'r') as f:
        data = f[group]
        for k,v in data.items():
            if v.shape[0] > 1:  # Multiple column field
                for i in range(v.shape[0]):
                    k_new = f'{k}_{i}'
                    df[k_new] = v[i]
            else:
                df[k] = v[0]
    return df
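To see why the helper tests v.shape[0], it helps to look at how the data appears from the Python side. Matlab stores arrays column-major while HDF5 is row-major, so an N-by-1 table column should show up in h5py with shape (1, N), and an N-by-k column with shape (k, N). The sketch below is only an illustration under that assumption: it does not call Matlab, the file written by write_example_file is a hypothetical stand-in for the output of table_to_hdf5, and the dataset names x and xy are made up. The reader is repeated so the example runs on its own.

```python
import os
import tempfile

import h5py
import numpy as np
import pandas as pd


def h5_to_df(path, group='/'):
    """Same logic as the helper above, repeated so this sketch is self-contained."""
    df = pd.DataFrame()
    with h5py.File(path, 'r') as f:
        for k, v in f[group].items():
            if v.shape[0] > 1:  # multi-column field: one DataFrame column per row
                for i in range(v.shape[0]):
                    df[f'{k}_{i}'] = v[i]
            else:               # single-column field stored with shape (1, N)
                df[k] = v[0]
    return df


def write_example_file(path):
    """Hypothetical stand-in for a file produced by table_to_hdf5."""
    with h5py.File(path, 'w') as f:
        # Assumed layout of a 3-by-1 Matlab column: shape (1, 3) in h5py
        f.create_dataset('x', data=np.array([[1.0, 2.0, 3.0]]))
        # Assumed layout of a 3-by-2 Matlab column: shape (2, 3) in h5py
        f.create_dataset('xy', data=np.array([[1.0, 2.0, 3.0],
                                              [4.0, 5.0, 6.0]]))


path = os.path.join(tempfile.mkdtemp(), 'example.h5')
write_example_file(path)
df = h5_to_df(path)
print(df)
```

The single-column dataset becomes the column x, and the two-column dataset is split into xy_0 and xy_1, each with three rows.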
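Important notes
- This only works for numeric data. If you know how to add string data, please leave a comment.
- This will create the file if it does not already exist.
- This will crash if the data already exists in the file. You will want to add logic to handle those cases as you see fit.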