I am trying to generate sentences from a data frame. Here is the data frame:

# Code
mycode <- c("AAABBB", "AAABBB", "AAACCC", "AAABBD")
mycode <- sample(mycode, 20, replace = TRUE)

# Date
mydate <- c("2016-10-17", "2016-10-18", "2016-10-19", "2016-10-20")
mydate <- sample(mydate, 20, replace = TRUE)

# Resort
myresort <- c("GB", "IE", "GR", "DK")
myresort <- sample(myresort, 20, replace = TRUE)

# Number of holidaymakers
HolidayMakers <- sample(1000, 20, replace = TRUE)

mydf <- data.frame(mycode,
                  mydate,
                  myresort,
                  HolidayMakers)

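Taking one value of mycode as an example, I want to create a sentence like "For the code mycode the biggest destinations are myresort where the top days of visiting were mydate with a total of HolidayMakers".

Assume there are multiple rows per code. Rather than one sentence for each mydate and myresort, I want a single sentence per code, something like:

"For the code AAABBB the biggest destinations are GB, GR, DK, IE where the top days of visiting were 2016-10-17, 2016-10-18, 2016-10-19 with a total of 650"

Here 650 is the sum of HolidayMakers across all of those countries on those days for that mycode.

Can anyone help?

Thanks for your time.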

1 Answer

You could try:

library(dplyr)
res <- mydf %>%
  group_by(mycode) %>%
  summarise(d = toString(unique(mydate)), 
            r = toString(unique(myresort)), 
            h = sum(HolidayMakers)) %>%
  mutate(s = paste("For the code", mycode, 
                   "the biggest destinations are", r, 
                   "where the top days of visiting were", d, 
                   "with a total of", h))

This gives:

> res$s

#[1] "For the code AAABBB the biggest destinations are GB, GR, IE, DK 
#     where the top days of visiting were 2016-10-17, 2016-10-18, 
#     2016-10-20, 2016-10-19 with a total of 6577"
#[2] "For the code AAABBD the biggest destinations are IE 
#     where the top days of visiting were 2016-10-17, 2016-10-18 
#     with a total of 1925"                                    
#[3] "For the code AAACCC the biggest destinations are IE, GR, DK 
#     where the top days of visiting were 2016-10-20, 2016-10-17, 
#     2016-10-19, 2016-10-18 with a total of 2878"    

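Note: since you gave no guidance on how the "top days of visiting" should be determined, I simply included every day that appears for each code. You can easily edit the above to suit your actual situation.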
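As one possible way to narrow this down, here is a minimal sketch that assumes "top days" means the dates with the highest total HolidayMakers per code. That definition is not in the question, and the cut-off `top_n_days`, the helper names `top_days` and `res_top`, and the small stand-in data frame are made up for illustration. It relies on `slice_max()` and the `.groups` argument of `summarise()`, which need dplyr 1.0.0 or later.

```r
library(dplyr)

# Small deterministic stand-in for mydf (values invented for illustration)
mydf <- data.frame(
  mycode = c("AAABBB", "AAABBB", "AAABBB", "AAABBB", "AAACCC", "AAACCC"),
  mydate = c("2016-10-17", "2016-10-17", "2016-10-18", "2016-10-19",
             "2016-10-17", "2016-10-20"),
  myresort = c("GB", "IE", "GB", "DK", "GR", "GR"),
  HolidayMakers = c(100, 200, 50, 400, 30, 70),
  stringsAsFactors = FALSE
)

# Hypothetical cut-off: keep the 2 busiest dates per code
top_n_days <- 2

# Total holidaymakers per code and date, then keep the busiest dates per code
top_days <- mydf %>%
  group_by(mycode, mydate) %>%
  summarise(day_total = sum(HolidayMakers), .groups = "drop") %>%
  group_by(mycode) %>%
  slice_max(day_total, n = top_n_days, with_ties = FALSE) %>%
  ungroup()

# Keep only rows that fall on those dates, then build one sentence per code
res_top <- mydf %>%
  semi_join(top_days, by = c("mycode", "mydate")) %>%
  group_by(mycode) %>%
  summarise(d = toString(sort(unique(mydate))),
            r = toString(unique(myresort)),
            h = sum(HolidayMakers)) %>%
  mutate(s = paste("For the code", mycode,
                   "the biggest destinations are", r,
                   "where the top days of visiting were", d,
                   "with a total of", h))

res_top$s
```

With this definition the resorts and the total are restricted to the selected dates, so `h` is the sum over the top days only. If the total should still cover every day, compute `h` from the unfiltered data instead.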

answered 2016-10-26T11:06:11.647