
I have a data table named tmp.df.lhs.denorm; its first 2 rows are shown below:

    > dput(tmp.df.lhs.denorm[1:2])
structure(list(rules = c("{} => {Dental anesthetic products-Injectables cartridges|2288210-Septocaine Cart 4% w/EPI}", 
"{Dental small equipment-Water distiller parts & acc|5528005-EzeeKleen 2.5HD UV Lamp1,Dental small equipment-Water distiller parts & acc|5528005-EzeeKleen 2.5HD UV Lamp2} => {Dental small equipment-Water distiller parts & acc|5528004-EzeeKleen 2.5HD RO Membra}"
), support = c(0.501710236989983, 0.000610798924993892), confidence = c(0.501710236989983, 
1), lift = c(1, 1637.2), rule.id = 1:2, lhs_1 = c(NA, "Dental small equipment-Water distiller parts & acc|5528005-EzeeKleen 2.5HD UV Lamp1"
), lhs_2 = c(NA, "Dental small equipment-Water distiller parts & acc|5528005-EzeeKleen 2.5HD UV Lamp2"
)), .Names = c("rules", "support", "confidence", "lift", "rule.id", 
"lhs_1", "lhs_2"), class = c("data.table", "data.frame"), row.names = c(NA, 
-2L), .internal.selfref = <pointer: 0x0000000007120788>)

Note that the columns lhs_1 and lhs_2 are the result of a string split on the rules column.

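My problem is that, for different data, the rules column may contain a different number of comma-separated items, so I could end up with 3 columns (lhs_1, lhs_2 and lhs_3) or more, depending on how many commas appear in the rules column. The solution I have in mind is to fix the number of lhs_* columns (a parameter in my code, say 6), so that this particular example, tmp.df.lhs.denorm, would be padded with 4 extra empty columns named lhs_3, lhs_4, lhs_5 and lhs_6. Any help is appreciated.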


1 Answer


I found a workaround:

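library(data.table)

# Empty template table declaring every column the final result should have,
# including the lhs_* columns that the real data may be missing.
tmp.df.lhs.denorm.art <- data.table(rules = character(),
                                    support = numeric(),
                                    confidence = numeric(),
                                    lift = numeric(),
                                    rule.id = integer(),
                                    lhs_1 = character(),
                                    lhs_2 = character(),
                                    lhs_3 = character(),
                                    lhs_4 = character(),
                                    lhs_5 = character(),
                                    lhs_6 = character())

# fill = TRUE matches columns by name and fills the missing ones with NA.
tmp.df.lhs.denorm.complete <- rbindlist(list(tmp.df.lhs.denorm, tmp.df.lhs.denorm.art), fill = TRUE)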
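
The template approach above hard-codes six lhs_* columns. If the number of columns is meant to be a parameter, as the question suggests, one possible variation is to add only the columns that are missing. The following is a minimal sketch, not a tested drop-in: the helper name pad_lhs_cols, its arguments n_lhs and prefix, and the toy table are all hypothetical and only stand in for the real data.

```r
library(data.table)

# Hypothetical helper: make sure dt has columns lhs_1 .. lhs_<n_lhs>,
# adding any that are missing as all-NA character columns.
pad_lhs_cols <- function(dt, n_lhs = 6L, prefix = "lhs_") {
  dt <- copy(dt)                              # do not modify the caller's table by reference
  wanted <- paste0(prefix, seq_len(n_lhs))    # lhs_1, lhs_2, ..., lhs_<n_lhs>
  to_add <- setdiff(wanted, names(dt))        # only the columns not yet present
  if (length(to_add) > 0L) {
    dt[, (to_add) := NA_character_]           # new columns are appended on the right
  }
  dt
}

# Toy stand-in for tmp.df.lhs.denorm (assumed shape, not the real data)
toy <- data.table(rule.id = 1:2,
                  lhs_1 = c(NA, "item A"),
                  lhs_2 = c(NA, "item B"))

padded <- pad_lhs_cols(toy, n_lhs = 6L)
print(names(padded))
```

Unlike the rbindlist() template, this sketch does not check the types of the columns that already exist, and it leaves any lhs_* columns beyond n_lhs in place rather than dropping them.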
answered 2016-12-25T12:29:00.150