如果只有 1 和 0,并且填充因子不是特别小,最好的办法是将数字存储为二进制数;如果您需要原始尺寸,请单独保存。我扩展了代码,更清楚地显示了中间步骤,还显示了不同数组所需的存储量。注意 - 我将您的数据重新调整为 13x10 数组,因为它显示得更好。
number_1 = [0 0 0 0 0 0 0 0 0 0 ...
0 0 1 1 1 1 0 0 0 0 ...
0 1 1 0 1 1 0 0 0 0 ...
0 1 1 0 1 1 0 0 0 0 ...
0 0 0 0 1 1 0 0 0 0 ...
0 0 0 0 1 1 0 0 0 0 ...
0 0 0 0 1 1 0 0 0 0 ...
0 0 0 0 1 1 0 0 0 0 ...
0 0 0 0 1 1 0 0 0 0 ...
0 0 0 0 1 1 0 0 0 0 ...
0 0 0 0 1 1 0 0 0 0 ...
0 1 1 1 1 1 1 1 1 0 ...
0 0 0 0 0 0 0 0 0 0];
n1matrix = reshape(number_1, 10, [])'; % make it nicer to display;
% transpose because data is stored column-major (row index changes fastest).
disp('the original data in 13 rows of 10:');
disp(n1matrix);
% create a matrix with 8 rows and enough columns
n1 = numel(number_1);
nc = ceil(n1/8); % "enough columns"
npad = zeros(8, nc);
npad(1:n1) = number_1; % fill the first n1 elements: the rest is zero
binVec = 2.^(7-(0:7)); % 128, 64, 32, 16, 8, 4, 2, 1 ... powers of two
compressed1 = uint8(binVec * npad); % 128 * bit 1 + 64 * bit 2 + 32 * bit 3...
% showing what we did...
disp('Organizing into groups of 8, and calculated their decimal representation:')
for ii = 1:nc
fprintf(1,'%d ', npad(:, ii));
fprintf(1, '= %d\n', compressed1(ii));
end
% now the inverse operation: using dec2bin to turn decimals into binary
% this function returns strings, so some further processing is needed
% original code used de2bi (no typo) but that requires a communications toolbox
% like this the code is more portable
decompressed = dec2bin(compressed1);
disp('the string representation of the numbers recovered:');
disp(decompressed); % this looks a lot like the data in groups of 8, but it's a string
% now we turn them back into the original array
% remember it is a string right now, and the values are stored
% in column-major order so we need to transpose
recovered = ('1'==decompressed'); % all '1' characters become logical 1
display(recovered);
% alternative solution #1: use logical array
compressed2 = (n1matrix==1);
display(compressed2);
recovered = double(compressed2); % looks just the same...
% other suggestions 1: use find
compressed3 = find(n1matrix); % fewer elements, but each element is 8 bytes
compressed3b = uint8(compressed); % if you know you have fewer than 256 elements
% or use `sparse`
compressed4 = sparse(n1matrix);
% or use logical sparse:
compressed5 = sparse((n1matrix==1));
whos number_1 comp*
the original data in 13 rows of 10:
0 0 0 0 0 0 0 0 0 0
0 0 1 1 1 1 0 0 0 0
0 1 1 0 1 1 0 0 0 0
0 1 1 0 1 1 0 0 0 0
0 0 0 0 1 1 0 0 0 0
0 0 0 0 1 1 0 0 0 0
0 0 0 0 1 1 0 0 0 0
0 0 0 0 1 1 0 0 0 0
0 0 0 0 1 1 0 0 0 0
0 0 0 0 1 1 0 0 0 0
0 0 0 0 1 1 0 0 0 0
0 1 1 1 1 1 1 1 1 0
0 0 0 0 0 0 0 0 0 0
Organizing into groups of 8, and their decimal representation:
0 0 0 0 0 0 0 0 = 0
0 0 0 0 1 1 1 1 = 15
0 0 0 0 0 1 1 0 = 6
1 1 0 0 0 0 0 1 = 193
1 0 1 1 0 0 0 0 = 176
0 0 0 0 1 1 0 0 = 12
0 0 0 0 0 0 1 1 = 3
0 0 0 0 0 0 0 0 = 0
1 1 0 0 0 0 0 0 = 192
0 0 1 1 0 0 0 0 = 48
0 0 0 0 1 1 0 0 = 12
0 0 0 0 0 0 1 1 = 3
0 0 0 0 0 0 0 0 = 0
1 1 0 0 0 0 0 1 = 193
1 1 1 1 1 1 1 0 = 254
0 0 0 0 0 0 0 0 = 0
0 0 0 0 0 0 0 0 = 0
the string representation of the numbers recovered:
00000000
00001111
00000110
11000001
10110000
00001100
00000011
00000000
11000000
00110000
00001100
00000011
00000000
11000001
11111110
00000000
00000000
compressed2 =
0 0 0 0 0 0 0 0 0 0
0 0 1 1 1 1 0 0 0 0
0 1 1 0 1 1 0 0 0 0
0 1 1 0 1 1 0 0 0 0
0 0 0 0 1 1 0 0 0 0
0 0 0 0 1 1 0 0 0 0
0 0 0 0 1 1 0 0 0 0
0 0 0 0 1 1 0 0 0 0
0 0 0 0 1 1 0 0 0 0
0 0 0 0 1 1 0 0 0 0
0 0 0 0 1 1 0 0 0 0
0 1 1 1 1 1 1 1 1 0
0 0 0 0 0 0 0 0 0 0
recovered =
0 0 0 0 0 0 0 0 0 0
0 0 1 1 1 1 0 0 0 0
0 1 1 0 1 1 0 0 0 0
0 1 1 0 1 1 0 0 0 0
0 0 0 0 1 1 0 0 0 0
0 0 0 0 1 1 0 0 0 0
0 0 0 0 1 1 0 0 0 0
0 0 0 0 1 1 0 0 0 0
0 0 0 0 1 1 0 0 0 0
0 0 0 0 1 1 0 0 0 0
0 0 0 0 1 1 0 0 0 0
0 1 1 1 1 1 1 1 1 0
0 0 0 0 0 0 0 0 0 0
Name Size Bytes Class Attributes
compressed1 1x17 17 uint8
compressed2 13x10 130 logical
compressed3 34x1 272 double
compressed3b 34x1 34 uint8
compressed4 13x10 632 double sparse
compressed5 13x10 394 logical sparse
number_1 1x130 1040 double
如您所见,原始数组占用 1040 个字节;压缩数组需要 17。你得到了几乎 64 倍的压缩(不完全是因为 132 不是 8 的倍数);只有非常稀疏的数据集才能通过其他方式更好地压缩。唯一接近(而且速度非常快)的是
compressed3b = uint8(find(number_1));
在 34 字节时,它绝对是小型数组(< 256 个元素)的竞争者。
注意 - 当您在 Matlab 中保存数据(使用save(fileName, 'variableName')
)时,会自动进行一些压缩。这导致了一个有趣且令人惊讶的结果。当您获取上述每个变量并使用 Matlab 将它们保存到文件save
时,以字节为单位的文件大小变为:
number_1 195
compressed1 202
compressed2 213
compressed3 219
compressed3b 222
compressed4 256
compressed5 252
另一方面,如果您自己创建一个二进制文件,使用
fid = fopen('myFile.bin', 'wb');
fwrite(fid, compressed1)
fclose(fid)
默认情况下会写入uint8
,因此文件大小为 130、17、130、34、34 - 不能以这种方式写入稀疏数组。它仍然显示具有最佳压缩的“复杂”压缩。