
(The linked Vlookup thread does not answer this question.)

I'm looking for a way to replace values in one data frame (DF2) with values from another data frame (DF1), where DF2 contains duplicate entries that I want to keep.


As a made-up example:

Suppose I have two data frames. One of them, called DF1, contains the correct number of umbrellas for hotels on different dates.

We have line items for Hilton_A on several dates (2013-05-27 and 2013-06-03 in the dput below), each with its umbrella count. The same goes for Hilton_B and Hilton_C.

Here is the dput of DF1, the reference data frame:

structure(list(Date = structure(c(15852, 15859, 15852, 15859, 
15852, 15859, 15852), class = "Date"), Hotel = structure(c(1L, 
1L, 2L, 2L, 3L, 3L, 4L), .Label = c("Hilton_A", "Hilton_B", "Hilton_C", 
"Hilton_D"), class = "factor"), Umbrellas = c(9340L, 6401L, 9089L, 
7716L, 5542L, 5565L, 8158L), datename = c("2013-05-27_Hilton_A", 
"2013-06-03_Hilton_A", "2013-05-27_Hilton_B", "2013-06-03_Hilton_B", 
"2013-05-27_Hilton_C", "2013-06-03_Hilton_C", "2013-05-27_Hilton_D"
)), .Names = c("Date", "Hotel", "Umbrellas", "datename"), row.names = c(NA, 
-7L), class = "data.frame")

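DF2 contains information for many other hotels on various dates, as well as for the Hilton hotels in DF1. The problem is that the umbrella counts in DF2 are wrong for the Hiltons, and I need to replace them with the counts from DF1.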

Here is the dput of DF2, which contains the incorrect Hilton counts along with some other data that I don't want to touch:

structure(list(Date = structure(c(15845, 15852, 15859, 15852, 
15859, 15845, 15859, 15845, 15845, 15852, 15845, 15845, 15882
), class = "Date"), Hotel = structure(c(1L, 1L, 1L, 1L, 1L, 2L, 
2L, 2L, 3L, 4L, 5L, 6L, 7L), .Label = c("Hilton_A", "Hilton_B", 
"Hilton_C", "Hilton_D", "RedRoof_A", "RedRoof_D", "Sheraton_D"
), class = "factor"), Umbrellas = c(263L, 287L, 258L, 110L, 234L, 
212L, 265L, 542L, 81L, 51L, 162L, 232L, 493L), datename = c("2013-05-20_Hilton_A", 
"2013-05-27_Hilton_A", "2013-06-03_Hilton_A", "2013-05-27_Hilton_A", 
"2013-06-03_Hilton_A", "2013-05-20_Hilton_B", "2013-06-03_Hilton_B", 
"2013-05-20_Hilton_B", "2013-05-20_Hilton_C", "2013-05-27_Hilton_D", 
"2013-05-20_RedRoof_A", "2013-05-20_RedRoof_D", "2013-06-26_Sheraton_D"
)), .Names = c("Date", "Hotel", "Umbrellas", "datename"), row.names = c(NA, 
-13L), class = "data.frame")

Normally this would work:

DF2$Umbrellas <- replace(DF2$Umbrellas, DF2$datename %in% DF1$datename, DF1$Umbrellas)

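(where `datename` is just the date and the hotel concatenated, since the same hotel has information for several dates and the combined key makes each DF1 entry unique)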
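For reference, a key like `datename` could be built with `paste()`. This is a minimal sketch on a hypothetical two-row data frame (`toy` is not part of the question's data), assuming the key is simply the date and the hotel joined by an underscore, as the dput output suggests.

```r
# Hypothetical example data, not taken from DF1 or DF2
toy <- data.frame(
  Date  = as.Date(c("2013-05-27", "2013-06-03")),
  Hotel = c("Hilton_A", "Hilton_B")
)

# paste() converts the Date column to "YYYY-MM-DD" text before joining
toy$datename <- paste(toy$Date, toy$Hotel, sep = "_")
toy$datename
```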

But DF2 actually has multiple observations for some hotel and date combinations, and I want to keep all of them (for example, Hilton_A on 5/27 appears twice in DF2).

So when I try to replace the umbrella counts in DF2 with those from DF1, I get this warning:

Warning message:
In replace(DF2$Umbrellas, DF2$hoteldatename %in% DF1$hoteldatename ,  :
  number of items to replace is not a multiple of replacement length

And the numbers are all wrong.

Does anyone know what is going on here, and how I can get the numbers from DF1 to replace all of the applicable observations in DF2?


1 Answer

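# df1 and df2 are the question's DF1 and DF2; work on a copy so df2 stays intact
df3 <- df2
# look up each df2 key in df1; rows without a match become NA
df3$Umbrellas <- df1$Umbrellas[match(df2$datename, df1$datename)]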
> df3
         Date      Hotel Umbrellas              datename
1  2013-05-20   Hilton_A        NA   2013-05-20_Hilton_A
2  2013-05-27   Hilton_A      9340   2013-05-27_Hilton_A
3  2013-06-03   Hilton_A      6401   2013-06-03_Hilton_A
4  2013-05-27   Hilton_A      9340   2013-05-27_Hilton_A
5  2013-06-03   Hilton_A      6401   2013-06-03_Hilton_A
6  2013-05-20   Hilton_B        NA   2013-05-20_Hilton_B
7  2013-06-03   Hilton_B      7716   2013-06-03_Hilton_B
8  2013-05-20   Hilton_B        NA   2013-05-20_Hilton_B
9  2013-05-20   Hilton_C        NA   2013-05-20_Hilton_C
10 2013-05-27   Hilton_D      8158   2013-05-27_Hilton_D
11 2013-05-20  RedRoof_A        NA  2013-05-20_RedRoof_A
12 2013-05-20  RedRoof_D        NA  2013-05-20_RedRoof_D
13 2013-06-26 Sheraton_D        NA 2013-06-26_Sheraton_D
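
The NA rows above follow from how `match()` works: for each element of its first argument it returns the position of the first matching element in the second, or NA when there is none, so repeated keys in `df2` each receive the same lookup position. A minimal sketch with hypothetical vectors (`keys`, `ref_keys`, and `ref_vals` are illustrative names, not from the question):

```r
# Hypothetical lookup table: one value per unique key
ref_keys <- c("a", "b", "c")
ref_vals <- c(10L, 20L, 30L)

# Keys to look up, including a duplicate ("b") and an unknown key ("z")
keys <- c("b", "z", "b", "a")

pos <- match(keys, ref_keys)  # positions in ref_keys, NA where no match
pos                           # 2 NA 2 1
ref_vals[pos]                 # 20 NA 20 10
```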

# where df1 had no matching key (NA), fall back to the original value from df2
df3$Umbrellas <- ifelse(is.na(df3$Umbrellas), df2$Umbrellas, df3$Umbrellas)
> df3
         Date      Hotel Umbrellas              datename
1  2013-05-20   Hilton_A       263   2013-05-20_Hilton_A
2  2013-05-27   Hilton_A      9340   2013-05-27_Hilton_A
3  2013-06-03   Hilton_A      6401   2013-06-03_Hilton_A
4  2013-05-27   Hilton_A      9340   2013-05-27_Hilton_A
5  2013-06-03   Hilton_A      6401   2013-06-03_Hilton_A
6  2013-05-20   Hilton_B       212   2013-05-20_Hilton_B
7  2013-06-03   Hilton_B      7716   2013-06-03_Hilton_B
8  2013-05-20   Hilton_B       542   2013-05-20_Hilton_B
9  2013-05-20   Hilton_C        81   2013-05-20_Hilton_C
10 2013-05-27   Hilton_D      8158   2013-05-27_Hilton_D
11 2013-05-20  RedRoof_A       162  2013-05-20_RedRoof_A
12 2013-05-20  RedRoof_D       232  2013-05-20_RedRoof_D
13 2013-06-26 Sheraton_D       493 2013-06-26_Sheraton_D
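
As for the warning in the question: `replace(x, list, values)` assigns `values` by position, not by key. With a logical index it fills the selected rows with the elements of `DF1$Umbrellas` in order, regardless of which hotel and date they belong to. In the question, 6 rows of DF2 match while DF1 supplies 7 values, which is what triggers the warning. The sketch below reproduces the misalignment on a hypothetical subset of the question's values (the data frames `ref` and `tgt` and their abbreviated keys are assumptions made for illustration). It then shows a possible variant of the `match()` approach that overwrites only the matched rows, so no intermediate NA values need patching.

```r
# Hypothetical reference table (stands in for DF1), keys abbreviated
ref <- data.frame(
  datename  = c("05-27_A", "06-03_A", "05-27_B"),
  Umbrellas = c(9340L, 6401L, 9089L),
  stringsAsFactors = FALSE
)

# Hypothetical target table (stands in for DF2) with a duplicated key
tgt <- data.frame(
  datename  = c("05-20_A", "05-27_A", "06-03_A", "05-27_A", "05-20_R"),
  Umbrellas = c(263L, 287L, 258L, 110L, 162L),
  stringsAsFactors = FALSE
)

# Positional replacement: the three selected rows receive ref$Umbrellas in
# order, so the second "05-27_A" row wrongly gets the "05-27_B" value (9089)
hit_logical <- tgt$datename %in% ref$datename
bad <- replace(tgt$Umbrellas, hit_logical, ref$Umbrellas)
bad   # 263 9340 6401 9089 162

# Key-based replacement: look up each row's position in ref, then overwrite
# only the rows that matched
idx <- match(tgt$datename, ref$datename)
hit <- !is.na(idx)
tgt$Umbrellas[hit] <- ref$Umbrellas[idx[hit]]
tgt$Umbrellas   # 263 9340 6401 9340 162
```

In this small example the number of selected rows happens to equal the number of replacement values, so `replace()` raises no warning even though the result is misaligned. That silence is one more reason to prefer a key-based lookup.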
answered 2013-10-26T23:55:19.840