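I'm working with country-specific data and need to assign countries to predefined country groups. I wrote the code below. Is there a more efficient way to script this, so that I don't have to add every new country to the NON-CORE part each time one shows up in the database? It sounds like an if/else, but I don't know how to code it.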

library(data.table)
data <- data.table(data)
# Key on Region.Group so that data[list(...)] below joins on that column
setkey(data, Region.Group)
data[list(c(
  "Australia",
  "Bangladesh",
  "Cambodia",
  "Estonia",
  "Finland",
  "France",
  "India",
  "Indonesia",
  "Korea",
  "Lithuania",
  "Malaysia",
  "Middle East",
  "Norway",
  "Philippines",
  "Poland",
  "Russia",
  "Spain",
  "Sri Lanka",
  "Sweden",
  "Switzerland",
  "TAT Region",
  "Thailand",
  "Ukraine",
  "Vietnam",
  "New Zealand",
  "Israel",
  "Myanmar",
  "Pakistan",
  "Portugal",
  "Turkey")), Core:="NON-CORE"]
data[list(c(
  "Belgium",
  "Netherlands")), Core:="Benelux"]
data[list(c(
  "China Group")), Core:="China"]
data[list(c(
  "Germany")), Core:="Germany"]
data[list(c(
  "Hong Kong Group")), Core:="Hong Kong"]
data[list(c(
  "Italy")), Core:="Italy"]
data[list(c(
  "Japan")), Core:="Japan"]
data[list(c(
  "North America Central",
  "North America East",
  "North America North",
  "North America South",
  "North America West")), Core:="N.America"]
data[list(c(
  "Singapore")), Core:="Singapore"]
data[list(c(
  "Taiwan")), Core:="Taiwan"]
data[list(c(
  "United Kingdom")), Core:="UK"]

1 Answer

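I think you will need to put each country into the right group at some point anyway. How about a list (shortened here) in which we don't bother listing the NON-CORE countries: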

coregroup <- list(
    Benelux     =   c("Belgium","Netherlands"),
    Germany     =   "Germany"
)

You can then build a data.table from this list:

dt_coregroup <- data.table(
    # repeat each group name once per region in that group
    Core=rep(names(coregroup),sapply(coregroup,length)),
    Region.Group=unlist(coregroup)
)
#       Core Region.Group
# 1: Benelux      Belgium
# 2: Benelux  Netherlands
# 3: Germany      Germany
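
Because this lookup table is joined onto the data in the next step, a region listed twice in it would duplicate rows in the result. A minimal sketch of a sanity check, assuming the shortened coregroup list from above (the check itself is an addition, not part of the original approach):

```r
library(data.table)

# Shortened list of core groups, as above
coregroup <- list(
    Benelux = c("Belgium", "Netherlands"),
    Germany = "Germany"
)

dt_coregroup <- data.table(
    Core = rep(names(coregroup), sapply(coregroup, length)),
    Region.Group = unlist(coregroup, use.names = FALSE)
)

# anyDuplicated() returns 0 when every Region.Group appears exactly once;
# stopifnot() raises an error otherwise
stopifnot(anyDuplicated(dt_coregroup$Region.Group) == 0)
```

If the check fails, the offending region can be found with dt_coregroup[duplicated(Region.Group)].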

And merge it back onto your original data. I made up some nonsense data and named it dt_start, since data is already the name of an R function.

dt_start <- data.table(Region.Group=c("Germany","Belgium","Australia"),Period=rep("2013Q1",3),Qty1=1:3)
setkey(dt_start,Region.Group)
setkey(dt_coregroup,Region.Group)

# keeps every row of dt_start; regions without a match get Core = NA
dt_new <- dt_coregroup[dt_start]
#    Region.Group    Core Period Qty1
# 1:    Australia      NA 2013Q1    3
# 2:      Belgium Benelux 2013Q1    2
# 3:      Germany Germany 2013Q1    1

Finally, as a last step, we assign any ungrouped countries to NON-CORE:

dt_new[is.na(Core),Core:="NON-CORE"]
#    Region.Group     Core Period Qty1
# 1:    Australia NON-CORE 2013Q1    3
# 2:      Belgium  Benelux 2013Q1    2
# 3:      Germany  Germany 2013Q1    1
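
A possible variation, sketched here under the assumption that your data.table version supports the on= argument: set the NON-CORE default first, then overwrite the matched rows with an update join. This is closer to the if/else the question asks about and keeps the original row order. The names dt and lookup are hypothetical stand-ins for dt_start and dt_coregroup.

```r
library(data.table)

# Hypothetical sample data, mirroring dt_start above
dt <- data.table(Region.Group = c("Germany", "Belgium", "Australia"),
                 Period = rep("2013Q1", 3),
                 Qty1 = 1:3)

# Shortened lookup table: core regions only
lookup <- data.table(Region.Group = c("Belgium", "Netherlands", "Germany"),
                     Core = c("Benelux", "Benelux", "Germany"))

# The "else" branch: every row starts out as NON-CORE
dt[, Core := "NON-CORE"]

# The "if" branch: rows whose Region.Group appears in lookup take its Core
# value; i.Core refers to the Core column of lookup
dt[lookup, on = "Region.Group", Core := i.Core]
```

With this approach a new country needs no code change unless it belongs to a core group, in which case only the lookup table grows.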
answered 2013-05-18T03:29:56.520