matlab - MATLAB: Repeating elements in a vector according to the timestamp increments

Question

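I have a matrix in which one column contains data (one sample per second) and another column contains timestamps in seconds. For some seconds the data did not change compared with the previous sample, so those seconds do not appear in the vector. I would like to apply a function (for example a simple average) over time intervals (for example 30 seconds). To do that, I have to account for the missing seconds. What is the best way to do this?

First create a matrix with the repeated elements (I would also like to include the correct timestamps for the missing seconds, which is the hardest part) and only then compute the average;

or

Use a loop (the worst way, I think) that computes the average while inserting the missing samples;

Thanks in advance!

PS: or is it possible to apply a function to the data that detects the missing samples and fills them in automatically (by repetition)?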

score 2 · Accepted Answer

You can do a weighted average, using a combination of diff and sum to include the "missing" entries:

% time step
step = 1;

% Example data (with repeated elements)
A = [...
     1 80.6
     2 79.8
     3 40.3
     4 40.3
     5 81.9
     6 83.6
     7 83.7
     8 95.4
     9 14.8
    10 14.8
    11 14.8
    12 14.8
    13 14.8
    14 44.3];

% Example data, with the repeated elements removed
B = [...
     1 80.6
     2 79.8
     3 40.3     
     5 81.9
     6 83.6
     7 83.7
     8 95.4
     9 14.8    
    14 44.3];

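% The correct value (mean of the full data)
M1 = mean(A(:,2))

% The same value from the pruned data: weight each sample by the number
% of time steps until the next one (the last sample counts once)
D = diff(B(:,1));
W = [D/step; 1];
M2 = sum(W .* B(:,2))/sum(W)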

Results:

M1 =
    5.027857142857141e+001
M2 =
    5.027857142857143e+001
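
The two values agree because each retained row stands for a run of identical samples. As a short worked note (the symbols t_i, v_i, w_i and Δ are notation introduced here, not part of the original answer): with timestamps t_i, values v_i, n retained rows, and time step Δ, and assuming the last row counts exactly once,

```latex
w_i = \frac{t_{i+1} - t_i}{\Delta} \quad (i < n), \qquad w_n = 1,
\qquad
\bar{v} = \frac{\sum_{i=1}^{n} w_i \, v_i}{\sum_{i=1}^{n} w_i},
\qquad
\sum_{i=1}^{n} w_i = \frac{t_n - t_1}{\Delta} + 1 .
```

For the example data the weights sum to (14 - 1)/1 + 1 = 14, which is the number of rows in A.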

Alternatively, you can re-create the full vector A from the abbreviated B by run-length decoding. You can do this efficiently like so:

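% Run length of each retained sample (the last one counts once)
W = [diff(B(:,1))/step; 1];
% Logical mask marking where each run starts in the expanded vector
idx = false(1, sum(W)+1);
idx(cumsum([1; W(W>0)])) = true;

A_new = [ (B(1,1):step:B(end,1)).'  B(cumsum(idx(1:end-1)),2) ];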
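
As a possible alternative on newer MATLAB releases (R2015a or later), `repelem` can do the same expansion more directly. Once the full per-second vector exists, per-window averages like the 30-second ones in the question can be computed with `accumarray`. The sketch below assumes sorted, strictly increasing timestamps that are multiples of `step`, and that the last sample counts once. The names `t_full`, `v_full`, `winLen`, `winIdx` and `winMeans` are illustrative, and the window length is set to 5 only because the example data covers 14 seconds.

```matlab
step = 1;                       % sampling interval in seconds
B = [ 1 80.6;  2 79.8;  3 40.3;  5 81.9;  6 83.6; ...
      7 83.7;  8 95.4;  9 14.8; 14 44.3];   % pruned data: [timestamp value]

% Number of samples each retained row stands for (last row counts once)
W = [diff(B(:,1))/step; 1];

% Expand the values by their run lengths and rebuild the timestamps
t_full = (B(1,1):step:B(end,1)).';
v_full = repelem(B(:,2), W);
A_new  = [t_full v_full];

% Average over consecutive windows of winLen samples (assumed length)
winLen   = 5;
winIdx   = ceil((1:numel(v_full)).' / winLen);   % window number per sample
winMeans = accumarray(winIdx, v_full, [], @mean);
```

Note that the final window is averaged over fewer samples when the data length is not a multiple of `winLen`.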

score 1 · Accepted Answer

You can give each sample a weight that reflects how many samples it actually represents. Such weights can be computed with diff:

data = [1 1; 0 2; 3 5; 4 7]; % Example data. Second column is timestamp

weights = diff([data(:,2); data(end,2)+1]); % We assume the last sample
% only represents itself
result = sum(data(:,1).*weights)/sum(weights);
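
One way to convince yourself that the weights are right is to expand the data explicitly and compare against a plain mean. This is only a sanity-check sketch: it assumes a time step of 1 second, as the example does, and uses `repelem` (R2015a or later). The names `expanded` and `check` are illustrative.

```matlab
data = [1 1; 0 2; 3 5; 4 7];                  % [value timestamp], as above
weights = diff([data(:,2); data(end,2)+1]);   % run lengths: [1; 3; 2; 1]
result = sum(data(:,1).*weights)/sum(weights);

% Repeat each value by its run length, then take the ordinary mean
expanded = repelem(data(:,1), weights);       % [1; 0; 0; 0; 3; 3; 4]
check = mean(expanded);

% Both routes should agree up to floating-point rounding
assert(abs(result - check) < 1e-12)
```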
