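I have 2 data frames containing the serial numbers of products that people bought, ranked by how often each product was purchased. The first column is the custId (shown as id below), and the next 5 columns are serial numbers, ordered from left to right by the number of items purchased.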
df1
id col1 col2 col3 col4 col5
1 1 4742 927 7889 NA NA
2 2 4964 9295 9174 228 9470
3 3 5834 7758 NA NA NA
4 4 2802 9984 323 NA NA
5 5 179 198 3996 6801 7561
6 6 7755 1252 9684 9940 NA
df2
id col6 col7 col8 col9 col10
1 1 1816 6686 NA NA NA
2 2 6141 9728 6981 3089 5674
3 3 5659 3931 5022 4361 9264
4 4 3210 2488 9939 7543 7757
5 5 9213 1372 4374 7962 4983
6 6 3451 5646 6069 NA NA
I am trying to merge them into a single set of 5 serial numbers per customer, like this:
id col1 col2 col3 col4 col5
1 1 4742 927 7889 1816 6686
2 2 4964 9295 9174 228 9470
3 3 5834 7758 5022 4361 9264
4 4 2802 9984 323 7543 7757
5 5 179 198 3996 6801 7561
6 6 7755 1252 9684 9940 3451
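I have a couple of questions.
1) How do I find the unique values within a row?
2) How do I preserve the left-to-right order within each row?
Any suggestions?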
> dput(df1)
structure(list(id = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10), col1 = c(4742,
4964, 5834, 2802, 179, 7755, 6467, 8671, 2910, 150), col2 = c(927,
9295, 7758, 9984, 198, 1252, 1664, 5242, 6995, 3875), col3 = c(7889,
9174, NA, 323, 3996, 9684, 1150, 2973, 9948, 8598), col4 = c(NA,
228, NA, NA, 6801, 9940, 854, 4744, 4006, 3196), col5 = c(NA,
9470, NA, NA, 7561, NA, 4342, 1791, 286, 7425)), .Names = c("id",
"col1", "col2", "col3", "col4", "col5"), row.names = c(NA, -10L
), class = "data.frame")
> dput(df2)
structure(list(id = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10), col6 = c(1816,
6141, 5659, 3210, 9213, 3451, 2440, 5706, 5281, 7110), col7 = c(6686,
9728, 3931, 2488, 1372, 5646, 2641, 7851, 5581, 5775), col8 = c(NA,
6981, 5022, 9939, 4374, 6069, 7525, 4927, 9767, 1331), col9 = c(NA,
3089, 4361, 7543, 7962, NA, 7526, 4215, 9923, 9887), col10 = c(NA,
5674, 9264, 7757, 4983, NA, 9996, 5886, 9546, 9419)), .Names = c("id",
"col6", "col7", "col8", "col9", "col10"), row.names = c(NA, -10L
), class = "data.frame")
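
For reference, here is a minimal base R sketch of one rule that reproduces the desired output above for the six rows shown. The rule is an inference from that table, not something confirmed: a gap in df1 is first filled from the same position in df2, and any gap still left takes df2's unused serial numbers from left to right. The helper name fill_row is hypothetical, and the data is retyped from the first six rows of the printed frames.

```r
# First six rows of df1 and df2, retyped from the printed data frames above.
df1 <- data.frame(
  id   = 1:6,
  col1 = c(4742, 4964, 5834, 2802, 179, 7755),
  col2 = c(927, 9295, 7758, 9984, 198, 1252),
  col3 = c(7889, 9174, NA, 323, 3996, 9684),
  col4 = c(NA, 228, NA, NA, 6801, 9940),
  col5 = c(NA, 9470, NA, NA, 7561, NA)
)
df2 <- data.frame(
  id    = 1:6,
  col6  = c(1816, 6141, 5659, 3210, 9213, 3451),
  col7  = c(6686, 9728, 3931, 2488, 1372, 5646),
  col8  = c(NA, 6981, 5022, 9939, 4374, 6069),
  col9  = c(NA, 3089, 4361, 7543, 7962, NA),
  col10 = c(NA, 5674, 9264, 7757, 4983, NA)
)

# Hypothetical helper: fill the NA slots of row `a` (from df1) using row `b` (from df2).
# ASSUMED rule, inferred from the desired output: same position first, then leftovers.
fill_row <- function(a, b) {
  b[b %in% a] <- NA        # ignore serials the row already holds, so values stay unique
  b[duplicated(b)] <- NA   # ignore repeats inside b itself

  # Step 1: take the value at the same position in b, if there is one.
  gaps <- is.na(a)
  a[gaps] <- b[gaps]

  # Step 2: slots that are still empty take b's unused values, left to right.
  gaps  <- is.na(a)
  spare <- b[!is.na(b) & !(b %in% a)]
  n     <- min(sum(gaps), length(spare))
  a[which(gaps)[seq_len(n)]] <- spare[seq_len(n)]
  a
}

m1 <- as.matrix(df1[-1])
m2 <- as.matrix(df2[match(df1$id, df2$id), -1])  # line df2 up with df1 by id

res <- t(sapply(seq_len(nrow(m1)), function(i) fill_row(m1[i, ], m2[i, ])))
out <- data.frame(id = df1$id, res)
print(out)
```

A simpler reading would be to take the first 5 unique non-NA values of each df1 row followed by the matching df2 row, using unique() on the concatenated values. That keeps the order and uniqueness asked about in questions 1 and 2, but for ids 3 and 4 it would give different serial numbers than the desired table above, which is why the sketch uses the two-step rule instead.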