
I recorded the duration of 5 types of nursing care, and I now have 5 groups with different sample sizes, simply because type 1 care took place more often.

So when I run [P,ANOVATAB,STATS]=kruskalwallis([rand(10,1) rand(30,1)])

I get: Error using horzcat. CAT arguments dimensions are not consistent.

Why do unequal sample sizes matter and what should I do instead?


1 Answer


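The error comes from the concatenation, not from kruskalwallis itself: [rand(10,1) rand(30,1)] tries to place a 10-row column next to a 30-row column, and horizontal concatenation requires the same number of rows. Unequal sample sizes are not a problem for the test. According to the kruskalwallis documentation, you can pass it two vectors of the same length: one containing all the data, and another containing the group index of each data point. So, if you were to call the two sets of data in your example groups 1 and 2, you could do: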

data1 = rand(10,1);
data2 = rand(30,1);
% Concatenation with a ; in between because these are **column** vectors
allData = [data1; data2]; 
groups = [ones(size(data1)); 2 * ones(size(data2))];
[P,ANOVATAB,STATS] = kruskalwallis(allData, groups);

Please have a read of the documentation for Creating and Concatenating Matrices as well.
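
To make the concatenation point concrete, here is a minimal sketch using the same made-up data as above. The variable names `failed` and `counts` are only for illustration. It shows that the horizontal form errors out, that the vertical form works, and that the group labels line up with the data:

```matlab
% Made-up sample data standing in for two care types
data1 = rand(10, 1);   % 10 observations
data2 = rand(30, 1);   % 30 observations

% Horizontal concatenation needs equal row counts, so this throws an error
try
    bad = [data1 data2]; %#ok<NASGU>
    failed = false;
catch
    failed = true;
end

% Vertical concatenation only needs equal column counts, so this works
allData = [data1; data2];
groups  = [ones(size(data1)); 2 * ones(size(data2))];

% Count the observations per group to confirm the labels match the data
counts = accumarray(groups, 1);   % one row per group
```

If `counts` does not match the sample sizes you expect, the grouping vector was built incorrectly and the test result would not be meaningful.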

If you want to get a bit more fancy and a bit more general (e.g., in the case where you don't know how many groups you have until run time), you could use a cell array to initially store your data groups, like so:

% Initialise cell array with differently dimensioned data
xc{1} = rand(100, 1);
xc{2} = rand(1, 30);

% Reshape it all to column vectors and concatenate
allData = cellfun(@(x)x(:), xc, 'UniformOutput', false);
allData = vertcat(allData{:});

% Generate group indices for each set of data as column vectors and
% concatenate
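groups = arrayfun(@(k) k * ones(numel(xc{k}), 1), 1:numel(xc), 'UniformOutput', false);
groups = vertcat(groups{:});

% Run the test on the stacked data and its group indices
[P,ANOVATAB,STATS] = kruskalwallis(allData, groups);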

As the code comments note, this will also work if your data sets have different orientations (in this example one is a column vector and one is a row vector), because each one is reshaped to a column before concatenation.
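
Another option, if you would rather keep the matrix form of the call, is to pad the shorter groups with NaN. This is only a sketch: it relies on kruskalwallis treating each column of a matrix as one group and ignoring NaN values as missing data, and the names `n` and `padded` are just illustrative.

```matlab
% Made-up groups of unequal size and mixed orientation
xc = {rand(100, 1), rand(1, 30)};

% Pad every group with NaN up to the length of the longest group
n = cellfun(@numel, xc);
padded = NaN(max(n), numel(xc));
for k = 1:numel(xc)
    padded(1:n(k), k) = xc{k}(:);
end

% Each column is one group; the NaN padding is treated as missing data
[P, ANOVATAB, STATS] = kruskalwallis(padded);
```

The grouping-vector approach is usually easier to read when group sizes differ a lot, since the padded matrix is mostly NaN in that case.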

Answered 2013-03-30T10:20:27.273