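I have a simple matrix with repeated values in some columns. I need to group the data by name and week and sum the prices spent on each day within a given week. Here is an example: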

 name day  week  price
 John 12   12    200
 John 14   12    70
 John 25   13    150
 John 1    14    10
 Ann  13   12    100
 Ann  15   12    100
 Ann  20   13    50

The desired output would be:

  name week sum
  John 12   270
  John 13   150
  John 14   10
  Ann  12   200
  Ann  13   50

Is there a good way to do this? I used for loops, but I am not sure that is the best approach:

names = unique(data(:,1));   % unique names in the data
n = size(names, 1);          % number of unique names
m = size(data, 1);           % total number of rows
result = [];                 % output matrix (not called sum, which would shadow the built-in)
s = 1;                       % next free row of the output matrix
for i = 1:n
    temp = [];               % temporary matrix holding the rows of one name
    k = 1;
    for j = 1:m
        if names(i) == data(j,1)      % collect every row that has the current name
            temp(k,:) = data(j,:);
            k = k + 1;
        end
    end
    count = 0;
    for l = 1:size(temp,1)            % walk through the rows of one name (e.g. John)
        count = count + temp(l,4);    % add the price (4th column) to the running total
        % The data is sorted by name and date, so a week ends on the last row
        % or when the next row has a different week (3rd column).
        if l == size(temp,1) || temp(l,3) ~= temp(l+1,3)
            result(s, 1:3) = [names(i) temp(l,3) count];   % write name, week, sum
            count = 0;
            s = s + 1;
        end
    end
end

2 Answers

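Use accumarray. It groups and aggregates values in exactly this way. You can use the third output argument of unique(data(:,1)) to get numeric indices to pass as the subs argument of accumarray. See doc accumarray for details.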
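
As a rough sketch of that idea, applied to the sample data from the question: this assumes the names are stored as numeric IDs (1 for John, 2 for Ann, an encoding chosen here purely for illustration) so that everything fits in one numeric matrix. Because the grouping is by both name and week, it calls unique with the 'rows' option on the two key columns rather than on data(:,1) alone.

```matlab
% Sample data from the question: [name day week price]
% Assumption: names are encoded as numeric IDs, 1 = John, 2 = Ann
data = [1 12 12 200;
        1 14 12  70;
        1 25 13 150;
        1  1 14  10;
        2 13 12 100;
        2 15 12 100;
        2 20 13  50];

% One group per unique (name, week) pair.
% idx(r) is the group number that row r of data belongs to.
[groups, ~, idx] = unique(data(:, [1 3]), 'rows');

% Sum the price column (4th column) within each group
totals = accumarray(idx, data(:, 4));

% Combine into [name week sum]
result = [groups totals];
```

For the sample data, result comes out as [1 12 270; 1 13 150; 1 14 10; 2 12 200; 2 13 50]. The rows are sorted by name ID and then by week, so the order of the names follows the chosen encoding.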

answered 2012-03-28T20:38:58.387

Probably the easiest way is to use the GRPSTATS function from the Statistics Toolbox. You have to combine name and week first to generate the groups:

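% name and week are assumed to hold text (e.g. cell arrays of strings);
% a numeric week would have to be converted to text before strcat
[name_week priceSum] = grpstats(price, strcat(name(:), '@', week(:)), {'gname','sum'});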
answered 2012-03-28T20:41:26.273