matlab - Joining data from different cell arrays in Matlab

Question

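I have data in cell array format in Matlab, where the columns represent different items. The cell arrays have different columns, as in the following example: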

a = {'A', 'B', 'C' ; 1, 1, 1; 2, 2, 2 }

a =

'A'    'B'    'C'
[1]    [1]    [1]
[2]    [2]    [2]

b = {'C', 'D'; 3, 3; 4, 4}

b =

'C'    'D'
[3]    [3]
[4]    [4]

I would like to be able to join the different cell arrays in the following way:

c =

'A'    'B'    'C'    'D'
[1]    [1]    [1]    [NaN]
[2]    [2]    [2]    [NaN]
[NaN]  [NaN]  [3]    [3]
[NaN]  [NaN]  [4]    [4]

In the real data I have hundreds of columns and several rows, so creating the new cell array by hand is not an option for me.

score 3 · Accepted Answer

I'm assuming you simply want to join the two arrays based on their first row.

% get the list of all keys
keys = unique([a(1,:) b(1,:)]);

lena = size(a,1)-1;  lenb = size(b,1)-1;

% allocate space for the joined array
joined = cell(lena+lenb+1, length(keys));

joined(1,:) = keys;

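% add a: look up where each column header of a sits in keys, so the
% assignment is correct even if the headers of a are not in sorted order
[~, colsA] = ismember(a(1,:), keys);
joined(2:(lena+1), colsA) = a(2:end,:);

% add b: same lookup for the headers of b
[~, colsB] = ismember(b(1,:), keys);
joined((lena+2):(lena+lenb+1), colsB) = b(2:end,:);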

This will give you the joined array, except that it has empty cells instead of NaNs. I hope that's OK.
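If NaNs are required, the empty cells can be replaced after the fact. This is a minimal sketch, assuming that an empty cell always means "no data" (a cell that legitimately holds an empty value such as '' would be overwritten too). The small `joined` array here is a hypothetical stand-in for the result of the code above.

```matlab
% Hypothetical stand-in for the joined array, with empty cells as gaps
joined = {'A', 'B', 'C', 'D';
           1,   1,   1,  [];
          [],  [],   3,   3};

% Logical mask of empty cells, same size as joined
isGap = cellfun(@isempty, joined);

% Scalar-cell assignment puts NaN into every masked cell
joined(isGap) = {NaN};
```

An alternative is to pre-fill right after allocation with `joined(:) = {NaN};`, before the header row and the data are copied in.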

score 3 · Accepted Answer

If you're willing to store your data in dataset arrays (or to convert them to dataset arrays for this purpose), you can do the following:

>> d1
d1 = 
    A    B    C
    1    1    1
    2    2    2
>> d2
d2 = 
    C    D
    3    3
    4    4
>> join(d1,d2,'Keys','C','type','outer','mergekeys',true)
ans = 
    A      B      C    D  
      1      1    1    NaN
      2      2    2    NaN
    NaN    NaN    3      3
    NaN    NaN    4      4
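The same idea can be sketched with the `table` type, which is not used in the answer above and needs R2013b or later. The sketch assumes that every data column is numeric and that `C` is the only shared column; the variable names `ta`, `tb`, `tc` and `c` are made up for illustration.

```matlab
% Sample inputs from the question
a = {'A', 'B', 'C'; 1, 1, 1; 2, 2, 2};
b = {'C', 'D'; 3, 3; 4, 4};

% First row becomes the variable names, the remaining rows the data
ta = cell2table(a(2:end,:), 'VariableNames', a(1,:));
tb = cell2table(b(2:end,:), 'VariableNames', b(1,:));

% Full outer join on the shared column C, merged into a single key variable;
% rows without a match get NaN in the numeric columns of the other table
tc = outerjoin(ta, tb, 'Keys', 'C', 'MergeKeys', true);

% Optional: convert back to a cell array with a header row
c = [tc.Properties.VariableNames; table2cell(tc)];
```

Note that a join matches rows on the key values, so rows of `a` and `b` that share a value of `C` would be merged into one row rather than stacked.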

score 1 · Accepted Answer

Here is my solution, adapted from an older answer of mine to a similar question (simply transpose rows/columns):

%# input cell arrays
a = {'A', 'B', 'C' ; 1, 1, 1; 2, 2, 2 };
b = {'C', 'D'; 3, 3; 4, 4};

%# transpose rows/columns
a = a'; b = b';

%# get all key values, and convert them to indices starting at 1
[allKeys,~,ind] = unique( [a(:,1);b(:,1)] );
indA = ind(1:size(a,1));
indB = ind(size(a,1)+1:end);

%# merge the two datasets (key,value1,value2)
c = cell(numel(allKeys), size(a,2)+size(b,2)-1);
c(:) = {NaN};                         %# fill with NaNs
c(:,1) = allKeys;                     %# available keys from both
c(indA,2:size(a,2)) = a(:,2:end);     %# insert 1st dataset values
c(indB,size(a,2)+1:end) = b(:,2:end); %# insert 2nd dataset values

Here is the result (transposed to match original orientation):

>> c'
ans = 
    'A'      'B'      'C'    'D'  
    [  1]    [  1]    [1]    [NaN]
    [  2]    [  2]    [2]    [NaN]
    [NaN]    [NaN]    [3]    [  3]
    [NaN]    [NaN]    [4]    [  4]

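Also, here is a solution using the DATASET class from the Statistics Toolbox. It starts from the original, untransposed a and b:

%# a and b are the original cell arrays, before the transpose above
aa = dataset([cell2mat(a(2:end,:)) a(1,:)])
bb = dataset([cell2mat(b(2:end,:)) b(1,:)])
cc = join(aa,bb, 'Keys',{'C'}, 'type','fullouter', 'MergeKeys',true)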

which gives:

cc = 
    A      B      C    D  
      1      1    1    NaN
      2      2    2    NaN
    NaN    NaN    3      3
    NaN    NaN    4      4
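For more than two cell arrays, the row-stacking idea from the question can be wrapped in a loop. This is only a sketch under stated assumptions: each array has its headers in the first row, the headers are unique within an array, and the list `arrays` and the other variable names are hypothetical.

```matlab
% Sample inputs from the question, collected in a list
a = {'A', 'B', 'C'; 1, 1, 1; 2, 2, 2};
b = {'C', 'D'; 3, 3; 4, 4};
arrays = {a, b};

% Union of all headers, sorted
headers = cellfun(@(x) x(1,:), arrays, 'UniformOutput', false);
keys = unique([headers{:}]);

% Number of data rows contributed by each array
nrows = cellfun(@(x) size(x,1) - 1, arrays);

% Allocate the result, pre-filled with NaN, and write the header row
joined = cell(1 + sum(nrows), numel(keys));
joined(:) = {NaN};
joined(1,:) = keys(:)';

% Copy each array's data rows into the columns matching its headers
row = 2;
for k = 1:numel(arrays)
    x = arrays{k};
    [~, cols] = ismember(x(1,:), keys);
    joined(row:row+nrows(k)-1, cols) = x(2:end,:);
    row = row + nrows(k);
end
```

Because the column positions come from `ismember`, the headers of each input may appear in any order.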
