matlab - What is the best way to store a 16 × (2^20) matrix in MATLAB?

Question

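I am thinking of writing the data to a file. Does anyone have an example of how to write a large amount of data to a file?

Edit: Most of the elements in the matrix are zeros, and the rest are uint32. As @Jonas suggested, I guess the simplest approach, save() and load(), will work.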

score 6 · Accepted Answer

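I guess nobody saw the edit about the zeros :)

If they are mostly zeros, you should convert the matrix to its sparse representation and then save it. You can do that with the sparse function.

Code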

z = zeros(10000,10000);
z(123,456) = 1;
whos z
z = sparse(z);
whos z

Output

Name          Size                   Bytes  Class     Attributes

  z         10000x10000            800000000  double  

Name          Size               Bytes  Class     Attributes

  z         10000x10000            40016  double    sparse

I don't think the sparse implementation is designed to handle uint32.
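
One possible workaround is sketched below, assuming it is acceptable to hold the values as double while they are in sparse form. MATLAB's sparse function accepts only double or logical data, and every uint32 value is exactly representable as a double, so the round trip should be lossless. The file name sparseData.mat and the sample values are made up for illustration.

```matlab
%# Hypothetical data: a mostly-zero uint32 matrix of the size in the question
data = zeros(16, 2^20, 'uint32');
data(3, 12345) = 4000000000;   %# close to the top of the uint32 range
data(16, 2^20) = 7;

%# sparse() takes double (or logical) input, so convert before calling it
s = sparse(double(data));
save('sparseData.mat', 's');

%# Load the file and convert back to a full uint32 matrix
loaded = load('sparseData.mat');
restored = uint32(full(loaded.s));

%# Check that the round trip preserved every element
roundTripOk = isequal(restored, data);
disp(roundTripOk)
```

The temporary double(data) copy takes about 128 MB for this size; building the sparse matrix from the output of find(data) instead would avoid that copy.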

score 3 · Accepted Answer

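If you are concerned with keeping the data file as small as possible, here are some suggestions:

Write the data to a binary file (i.e. using FWRITE) instead of a text file (i.e. using FPRINTF).
If your data contains only integer values, convert it to or save it as a signed or unsigned integer type instead of the default double-precision type MATLAB uses.
If your data contains floating-point values, but you don't need the range or resolution of the default double-precision type, convert it to or save it as a single-precision type.
If your data is sufficiently sparse (i.e. there are many more zeros than non-zeros in your matrix), you can use the FIND function to get the row and column indices of the non-zero values, then save just those to your file.

Here are a couple of examples to illustrate: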

data = double(rand(16,2^20) <= 0.00001);  %# A large but very sparse matrix

%# Writing the values as type double:
fid = fopen('data_double.dat','w');  %# Open the file
fwrite(fid,size(data),'uint32');     %# Write the matrix size (2 values)
fwrite(fid,data,'double');           %# Write the data as type double
fclose(fid);                         %# Close the file

%# Writing the values as type uint8:
fid = fopen('data_uint8.dat','w');  %# Open the file
fwrite(fid,size(data),'uint32');    %# Write the matrix size (2 values)
fwrite(fid,data,'uint8');           %# Write the data as type uint8
fclose(fid);                        %# Close the file

%# Writing out only the non-zero values:
[rowIndex,columnIndex,values] = find(data);  %# Get the row and column indices
                                             %#   and the non-zero values
fid = fopen('data_sparse.dat','w');  %# Open the file
fwrite(fid,numel(values),'uint32');  %# Write the length of the vectors (1 value)
fwrite(fid,rowIndex,'uint32');       %# Write the row indices
fwrite(fid,columnIndex,'uint32');    %# Write the column indices
fwrite(fid,values,'uint8');          %# Write the non-zero values
fclose(fid);                         %# Close the file

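The files created above will differ drastically in size. The file 'data_double.dat' will be about 131,073 KB, 'data_uint8.dat' will be about 16,385 KB, and 'data_sparse.dat' will be less than 2 KB.

Note that I also wrote the data/vector sizes to the files so that the data can be read back in (using FREAD) and reshaped properly. Also note that if I had not supplied the 'double' or 'uint8' argument to FWRITE, it would have fallen back to its default precision of 'uint8' and written each data value using only 8 bits, which loses nothing here since the values are all 0 and 1.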
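
As a rough sketch of the read-back step, the code below writes a small stand-in matrix in the same layout as 'data_sparse.dat' and then rebuilds it with FREAD. Two things here are assumptions rather than part of the answer: the demo file name data_sparse_demo.dat, and the extra two-value size header, which is added because the sparse layout above stores only the vector length and not the matrix dimensions.

```matlab
%# Small stand-in for the answer's matrix (hypothetical values)
data = zeros(16,1000);
data(2,5) = 1;
data(16,1000) = 1;

%# Write the non-zero entries, preceded by the matrix size (an addition
%#   made for this sketch) and the number of entries
[rowIndex,columnIndex,values] = find(data);
fid = fopen('data_sparse_demo.dat','w');
fwrite(fid,size(data),'uint32');     %# Matrix size (2 values)
fwrite(fid,numel(values),'uint32');  %# Length of the vectors (1 value)
fwrite(fid,rowIndex,'uint32');
fwrite(fid,columnIndex,'uint32');
fwrite(fid,values,'uint8');
fclose(fid);

%# Read everything back in the same order it was written
fid = fopen('data_sparse_demo.dat','r');
dims = fread(fid,2,'uint32');        %# Matrix size
n = fread(fid,1,'uint32');           %# Number of non-zero entries
r = fread(fid,n,'uint32');           %# Row indices
c = fread(fid,n,'uint32');           %# Column indices
v = fread(fid,n,'uint8');            %# Non-zero values
fclose(fid);

%# Rebuild the full matrix from the index/value triplets
restored = full(sparse(r,c,v,dims(1),dims(2)));
disp(isequal(restored,data))
```

For the full 16-by-2^20 case it may be preferable to leave the result sparse and drop the call to full.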

score 2 · Accepted Answer

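How is the data generated? How do you need to access it?

If I calculated correctly, the variable is less than 200 MB even if it is all doubles (16 × 2^20 × 8 bytes = 128 MB). So if you only need to access it from MATLAB, you can easily save and load it as a single .mat file.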

%# create data
data = zeros(16,2^20);

%# save data
save('myFile.mat','data');

%# clear data to test everything works
clear data

%# load data
load('myFile.mat')
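
To see what ended up on disk without loading the data back, something along these lines may help. It is only a sketch: it saves the same all-zero test matrix again so that it runs on its own, and the size reported will depend on the MAT-file version and its compression.

```matlab
%# Recreate and save the test matrix so this snippet is self-contained
data = zeros(16,2^20);
save('myFile.mat','data');

%# List the variables stored in the file without loading them
whos('-file','myFile.mat')

%# Query the size of the file on disk
info = dir('myFile.mat');
fprintf('File size on disk: %d bytes\n', info.bytes);
```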
