根据这个问题,我应该尝试使用 Preallocation is Matlab。
现在我有一种情况,我无法计算要预分配的矩阵的确切大小。我能猜出大小。
假设矩阵的实际大小是 100,但我不知道。嘘
哪种方案更有效:
- 我应该奢侈吗?我猜是一个大矩阵,最后我删除了多余的行。
- 我应该小气吗?我猜尺寸很小,如果错了,我会添加新行。
谢谢。
根据这个问题,我应该尝试使用 Preallocation is Matlab。
现在我有一种情况,我无法计算要预分配的矩阵的确切大小。我能猜出大小。
假设矩阵的实际大小是 100,但我不知道。嘘
哪种方案更有效:
谢谢。
在我看来,答案比@natan 描述的要复杂一些。我认为他的回答没有考虑两个因素:
可能需要的内存副本:当您低估矩阵大小并重新分配它时,应将其所有旧值复制到新分配的位置。
内存块的连续性:有时 Matlab 能够在旧矩阵的末尾连续分配新内存。原则上,在这种情况下,不需要将旧值复制到新位置 - 因为它与旧值相同,只是更大。但是,如果将行添加到 2D 矩阵,即使在这种情况下也需要复制内容,因为 Matlab 在内存中以行优先方式存储矩阵。
所以,我的回答是这样的:
首先,您究竟不知道矩阵的大小:如果您知道一维 - 将其设为矩阵的行数,因此您只需要更改列数。这样,如果您已经存储的数据需要被复制,它将被复制到更大的块中。
其次,这取决于您拥有多少可用 RAM。如果您不缺 RAM,那么高估也没有错。
但是,如果您缺少 RAM,请考虑低估。但是当你重新分配时,在每次迭代时增加新的块大小:
BASIC_SIZE = X; % first estimate
NEW_SIZE = Y; % if need more, add this amount
factor = 2;
arr = zeros( m, BASIC_SIZE ); % first allocation, assuming we know number of rows
while someCondition
% process arr ...
if needMoreCols
arr(:, size(arr,2) + (1:NEW_SIZE) ) = 0; % allocate another block
NEW_SIZE = round(NEW_SIZE * factor); % it seems like we are off in estimation, try larger chunk next time factor should be > 1
end
end
arr = arr(:, 1:actualNumOfCols ); % resize to actual size, discard unnecessary columns
+1 for the interesting question.
EDITED Answer: From a little experimental study at first it seems better to add rows later, but it now seems more efficient overesrimated and preallocate again when you have the info of the correct size . I started with matrix size 3000 and guessed an error of 10% in the size estimation, see below:
clear all
clc
guess_size=3000;
m=zeros(guess_size);
%1. oops overesrimated, take out rows
tic
m(end-300:end,:)=[];
toc
%1b. oops overesrimated, preallocate again
tic
m=zeros(guess_size-300,guess_size);
toc
%2. oops overesrimated, take out cols
m=zeros(guess_size);
tic
m(:,end-300:end)=[];
toc
%2b. oops overesrimated, preallocate again
m=zeros(guess_size);
tic
m=zeros(guess_size,guess_size-300);
toc
%3. oops underesrimated, add rows
m=zeros(guess_size);
tic
m=zeros(guess_size+300,guess_size);
toc
%4. oops underesrimated, add cols
m=zeros(guess_size);
tic
m=zeros(guess_size,guess_size+300);
toc
Elapsed time is 0.041893 seconds.
Elapsed time is 0.026925 seconds.
Elapsed time is 0.041818 seconds.
Elapsed time is 0.023425 seconds.
Elapsed time is 0.027523 seconds.
Elapsed time is 0.029509 seconds.
Option 2b and 1b are slightly faster than underestimating, so if you can, better overestimate and then preallocate again. It is never efficient to delete rows from an array. Also adding columns seems slightly more efficient, but this is just a quick and dirty job. See @Shai detailed answer for the inner workings...
除了其他教育性答案外,简短的简短版本:有三种情况: