
I have a large dataset in a <1x43 cell> array. The data are very large; to give some of the cell sizes, 5 are <1x327680 double> and 11 are <1x1376256 double>.

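I am trying to perform a resampling operation, for which I have a function (the function code is shown below). I want to take an entire cell from the array, resample it, and store the result back in the same array location or in a different one.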

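However, I get the following error at line 19 of the Resample function:

"Error using zeros. Maximum variable size allowed by the program is exceeded. Error in Resample (line 19): obj = zeros(t,1);"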

When I comment out line 19, I get an out-of-memory error instead.

Is there a more efficient way to manipulate a dataset this large?

Thank you.

Actual code:

%% To load each ".dat" file for the 51 attributes to an array.

a = dir('*.dat');

for i = 1:length(a)
eval(['load ' a(i).name ' -ascii']);
end

attributes = length(a);

% Scan folder for number of ".dat" files
datfiles = dir('*.dat'); 

% Count Number of ".dat" files
numfiles = length(datfiles); 

% Read files in to MATLAB
for i = 1:1:numfiles
    A{i} = csvread(datfiles(i).name);
end

% Remove discarded variables
ind = [1 22 23 24 25 26 27 32]; % Variables to be removed.
A(ind) = [];

% Reshape all the data into columns - (n x 1) 
for i = 1:1:length(A)
    temp = A{1,i};
    [x,y] = size(temp);
    if x == 1 && y ~= 1
        temp = temp';
        A{1,i} = temp;
    end
end

% Retrieves the frequency data for the attributes from Excel spreadsheet
frequency = xlsread('C:\Users\aajwgc\Documents\MATLAB\Research Work\Data\testBig\frequency');

% Removing recorded frequency for discarded variables
frequency(ind) = [];

% Upsampling all the attributes to desired frequency
prompt = {'Frequency (Hz):'};
dlg_title = 'Enter desired output frequency for all attributes';
num_lines = 1;
def = {'50'};
answer= inputdlg(prompt,dlg_title,num_lines,def);
OutFreq = str2num(answer{1});

m = 1; 
n = length(frequency);
A_resampled = cell(m,n);
A_resampled(:) = {''};

for i = length(frequency);
    raw = cell2mat(A(1,i));
    temp= Resample(raw, frequency(i,:), OutFreq);
     A_resampled{i} = temp(i);
end

Resample function:

function obj = Resample(InputData, InFreq, OutFreq, varargin)
%% Preliminary setup
% Allow for selective down-sizing by specifying type
type = 'mean'; %default to the mean/average

if size(varargin,2) > 0
    type = varargin{1};
end

% Determine the necessary resampling factor
factor = OutFreq / InFreq;

%% No refactoring required
if (factor == 1)
    obj = InputData;
%% Up-Sampling required
elseif (factor > 1)
    t = factor * numel(InputData(1:end));
    obj = zeros(t,1); % <-- line 19, where I get the error message

    for i = 1:factor:t
        y = ((i-1) / factor) + 1;
        z = InputData(y);
        obj(i:i+factor) = z;
    end
%% Down-Sampling required
elseif (factor < 1)    
    t = numel(InputData(1:end));
    t = floor(t * factor);
    obj = zeros(t,1);
    factor = int32(1/factor);

    if  strcmp(type,'mean') %default is mean (process first)
        for i = 1:t
            y = (factor * (i-1)) + 1;
            obj(i) = mean(InputData(y:y+factor-1));
        end    
    elseif strcmp(type,'min')
        for i = 1:t
            y = (factor * (i-1)) + 1;
            obj(i) = min(InputData(y:y+factor-1));
        end 
    elseif strcmp(type,'max')
        for i = 1:t
            y = (factor * (i-1)) + 1;
            obj(i) = max(InputData(y:y+factor-1));
        end 
    elseif strcmp(type,'mode')
        for i = 1:t
            y = (factor * (i-1)) + 1;
            obj(i) = mode(InputData(y:y+factor-1));
        end 
    elseif strcmp(type,'sum')
        for i = 1:t
            y = (factor * (i-1)) + 1;
            obj(i) = sum(InputData(y:y+factor-1));
        end   
    elseif strcmp(type,'single')
        for i = 1:t
            y = (factor * (i-1)) + 1;
            obj(i) = InputData(y);
        end
    else
        obj = NaN;
    end
else
    obj = NaN;
end

1 Answer

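If you have the DSP System Toolbox, you could use, for example, the dsp.FIRInterpolator System object (http://www.mathworks.co.uk/help/dsp/ref/dsp.firinterpolatorclass.html) and call its step() function repeatedly, to avoid processing all the data at once.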
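A minimal sketch of that frame-by-frame approach, assuming the DSP System Toolbox is installed and the upsampling factor is an integer. The factor `L`, the frame length `frameLen`, and the random test vector `x` are illustrative placeholders, not values from the question. The frame length is chosen to divide the signal length evenly, because a System object may reject a change of input size between calls.

```matlab
% Frame-by-frame interpolation with dsp.FIRInterpolator (DSP System Toolbox).
% L, frameLen and x are illustrative assumptions, not values from the question.
L        = 5;                  % integer upsampling factor (hypothetical)
frameLen = 4096;               % samples per frame; 327680 = 80 * 4096
x        = randn(327680, 1);   % stand-in for one cell of the data, as a column

assert(mod(numel(x), frameLen) == 0, 'frameLen must divide the signal length');

interp = dsp.FIRInterpolator(L);   % default anti-imaging filter for factor L
y      = zeros(L * numel(x), 1);   % the output still has to fit in memory

for k = 1:frameLen:numel(x)
    frame    = x(k : k + frameLen - 1);
    outFirst = L * (k - 1) + 1;
    outLast  = L * (k + frameLen - 1);
    % The object keeps its filter state between calls, so consecutive
    % frames join up as if the whole signal had been filtered at once.
    y(outFirst:outLast) = step(interp, frame);
end
```

If even the output vector is too large to hold, each processed frame could be written to disk inside the loop instead of being stored in `y`. In newer MATLAB releases the object can also be called directly, as `interp(frame)`, in place of `step(interp, frame)`.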

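As an aside, up- and downsampling (interpolation and decimation) are more complicated concepts than you might think; in the most general sense, both require some form of filtering to remove the artifacts that these processes produce.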

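You could design these filters yourself and convolve your signal with them, but doing that kind of filter design requires a solid grounding in signal processing. If you want to go down that route, I would suggest picking up a textbook, since it is easy to get wrong without a reference text.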
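To make the "design a filter and convolve" route concrete, here is a rough sketch in base MATLAB (no toolboxes) of integer-factor interpolation: insert zeros between samples, then low-pass filter with a Hamming-windowed sinc. The factor `L`, the filter half-length, and the test signal are assumptions chosen for illustration; a real design would pick the filter length and window to suit the data.

```matlab
% Interpolation by zero-stuffing followed by a windowed-sinc low-pass filter.
% All parameter values here are illustrative assumptions.
L = 4;                                   % integer upsampling factor
x = sin(2*pi*0.01*(0:199)');             % slowly varying test signal (column)

% 1) Zero-stuffing: put L-1 zeros between consecutive samples.
xz = zeros(L * numel(x), 1);
xz(1:L:end) = x;

% 2) Low-pass filter with cutoff pi/L and passband gain L:
%    h(n) = sin(pi*n/L) / (pi*n/L), tapered by a Hamming window.
halfLen = 8 * L;                         % filter spans 8 input samples per side
n  = (-halfLen:halfLen)';
h  = ones(size(n));                      % h(0) = 1
nz = (n ~= 0);
h(nz) = sin(pi * n(nz) / L) ./ (pi * n(nz) / L);
w  = 0.54 + 0.46 * cos(pi * n / halfLen);   % Hamming window centred on n = 0
h  = h .* w;

% 3) Convolve; 'same' keeps the output aligned with xz.
y = conv(xz, h, 'same');
```

Because this filter is zero at every nonzero multiple of `L`, the original samples pass through unchanged and only the in-between values are filled in. The first and last few output samples are less accurate, since the filter runs off the ends of the data there.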

Answered 2013-03-21T15:36:41.667