Suppose I have a dataset:

Jday = datenum('2009-01-01 00:00','yyyy-mm-dd HH:MM'):1/24:...
    datenum('2009-01-05 23:00','yyyy-mm-dd HH:MM');
DateV = datevec(Jday);
DateV(4,:) = [];
DateV(15,:) = [];
DateV(95,:) = [];

Dat = rand(size(DateV,1),1);   % one value per remaining row of DateV

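How can I remove every day that has fewer than 24 measurements? For example, the first day has only 22 measurements, so I need to remove that entire day. How can I repeat this for all of the arrays?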

2 Answers

A rather long answer, but I think it should be useful. I would use containers.Map to do this. There may be a faster way, but this one should do for now.

Jday = datenum('2009-01-01 00:00','yyyy-mm-dd HH:MM'):1/24:...
    datenum('2009-01-05 23:00','yyyy-mm-dd HH:MM');

DateV = datevec(Jday);
DateV(4,:) = [];
DateV(15,:) = [];
DateV(95,:) = [];


% create a map from date string to measurement count
dateMap = containers.Map();

% count measurements in each date (i.e. first three columns of DateV)
for rowi = 1:size(DateV,1)

    dateRow = DateV(rowi, :);
    dateStr = num2str(dateRow(1:3));

    if ~isKey(dateMap, dateStr)
        % first measurement seen for this date: start its counter at 1
        dateMap(dateStr) = 1;
        continue;
    end
    % increment the measurement counter for this date
    dateMap(dateStr) = dateMap(dateStr) + 1;
end


% get the dates (the keys of the map)
dateStrSet = keys(dateMap);

for keyi = 1:numel(dateStrSet)

    dateStr = dateStrSet{keyi};

    % get number of measurements in a given date
    numOfmeasurements = dateMap(dateStr);

    % if fewer than 24, do something about it, e.g. save the date
    % for later removal from DateV
    if numOfmeasurements < 24
        fprintf(1, 'This date has less than 24 measurement: %s\n', dateStr);
    end
end

The result is:

This date has less than 24 measurement: 2009     1     1
This date has less than 24 measurement: 2009     1     5
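
To go from the printed list to actually removing those days, one possibility is to turn the same map into a logical mask over the rows of DateV. This is a minimal sketch and not part of the original answer. It assumes that Dat holds one value per remaining row of DateV, and keepRow is a name introduced here purely for illustration.

```matlab
% Rebuild the example data so this block runs on its own
Jday = datenum('2009-01-01 00:00','yyyy-mm-dd HH:MM'):1/24:...
    datenum('2009-01-05 23:00','yyyy-mm-dd HH:MM');
DateV = datevec(Jday);
DateV(4,:) = [];
DateV(15,:) = [];
DateV(95,:) = [];
Dat = rand(size(DateV,1),1);   % assumption: one value per remaining row

% Count measurements per day, as in the loop above
dateMap = containers.Map('KeyType','char','ValueType','double');
for rowi = 1:size(DateV,1)
    dateStr = num2str(DateV(rowi,1:3));
    if isKey(dateMap, dateStr)
        dateMap(dateStr) = dateMap(dateStr) + 1;
    else
        dateMap(dateStr) = 1;
    end
end

% keepRow (hypothetical name) is true for rows that belong to a complete day
keepRow = false(size(DateV,1),1);
for rowi = 1:size(DateV,1)
    keepRow(rowi) = dateMap(num2str(DateV(rowi,1:3))) >= 24;
end

% Apply the same mask to every array that shares DateV's row order
DateV = DateV(keepRow,:);
Dat   = Dat(keepRow,:);
```

With the example data this should leave 72 rows, covering 2 to 4 January 2009. Any other array that is aligned row for row with DateV can be filtered with the same keepRow mask.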
answered 2013-07-11T01:34:02.940

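A fast solution is to group the rows by year, month and day with unique(), count the observations per day with accumarray(), and then exclude the days with fewer than 24 observations in two steps of logical indexing: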

% Count observations per day
[unDate,~,subs] = unique(DateV(:,1:3),'rows');
counts = [unDate accumarray(subs,1)]
counts =
        2009           1           1          22
        2009           1           2          24
        2009           1           3          24
        2009           1           4          24
        2009           1           5          23
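
For readers unfamiliar with this pattern: the third output of unique(...,'rows') gives each row the index of the unique row it matches, and accumarray(subs,1) adds a 1 into the bin for that index, which yields one count per group. A small sketch with made-up dates (the matrix toy and the variable n are purely illustrative):

```matlab
% Toy data: three rows on 1 Jan 2009, two rows on 2 Jan 2009
toy = [2009 1 1; 2009 1 1; 2009 1 2; 2009 1 1; 2009 1 2];

% subs(k) is the row of unDate that toy(k,:) matches
[unDate,~,subs] = unique(toy,'rows');

% Add 1 into bin subs(k) for every row k, giving one count per unique date
n = accumarray(subs,1);

% Show each unique date next to its count
disp([unDate n])
```

Here subs is [1;1;2;1;2], so the first date collects three ones and the second collects two.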

Then apply the criterion to the counts and retrieve the logical index:

% index only those that meet criteria
idxC = counts(:,end) == 24
idxC =
      0
      1
      1
      1
      0

% keep those which meet criteria (optional, for visual inspection)
counts(idxC,:)
ans =
        2009           1           2          24
        2009           1           3          24
        2009           1           4          24

Finally, use ismember() in a second round of logical indexing to find the rows of Dat that fall on the selected days:

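% rows of DateV whose day is among the selected counts
% (Dat is assumed to have one row per row of DateV)
idxDat = ismember(subs,find(idxC))
Dat(idxDat,:)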
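
Since subs maps every row of DateV to its day, indexing idxC with subs should give the same row mask as the ismember() call, and that mask can then be applied to each array in turn. The following is a sketch under the assumption that Jday, DateV and Dat are kept aligned row for row, which means dropping the same three hours from Jday and sizing Dat to match. The names removeIdx and keep are introduced here for illustration, and >= 24 is used instead of == 24 as a slightly more defensive variant.

```matlab
Jday = datenum('2009-01-01 00:00','yyyy-mm-dd HH:MM'):1/24:...
    datenum('2009-01-05 23:00','yyyy-mm-dd HH:MM');

% Drop three hours one at a time, mirroring the deletions in the question
for removeIdx = [4 15 95]
    Jday(removeIdx) = [];
end
DateV = datevec(Jday);
Dat = rand(numel(Jday),1);     % assumption: one value per time stamp

% Count observations per day and flag the complete days
[~,~,subs] = unique(DateV(:,1:3),'rows');
idxC = accumarray(subs,1) >= 24;

% keep(k) is true when row k belongs to a complete day
keep = idxC(subs);

% Filter all arrays with the same mask so they stay aligned
Jday  = Jday(keep);
DateV = DateV(keep,:);
Dat   = Dat(keep,:);
```

Applying one mask to all three arrays keeps the time stamps, the date vectors and the data in step with each other.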
answered 2013-07-11T08:28:35.533