r - Reshape, aggregate/concatenate strings

Question

I am aggregating a dataset in country-year format:

melted <- melt(data, id = c("ccode.a","year"))

data.fix <- function(x) c(max = max(x), sum = sum(x), min = min(x),
                          newcol = paste(x, sep = ","))
casted <- cast(melted, ccode.a + year ~ ..., data.fix)

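I would like to concatenate conflictID.a, so that whenever multiple rows are aggregated into a single row, I get all of the aggregated conflictID.a values.

Here is some sample data: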

dput(tail(subset(data, select=c(ccode.a,year,onset,conflictID.a)), 100))

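I have also artificially modified the data to reproduce the problem. So there are two cases where 2 or more rows have the same year and ccode.a values but different conflictID.a values, and I would like to concatenate those values when aggregating by ccode.a and year.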

structure(list(ccode.a = c(2L, 2L, 2L, 2L, 2L, 2L, 2L, 41L, 41L, 
41L, 52L, 52L, 70L, 70L, 90L, 90L, 90L, 90L, 90L, 90L, 90L, 90L, 
90L, 90L, 90L, 90L, 90L, 90L, 90L, 90L, 90L, 90L, 90L, 90L, 90L, 
92L, 92L, 92L, 92L, 92L, 92L, 92L, 92L, 92L, 92L, 92L, 92L, 92L, 
93L, 93L, 93L, 93L, 93L, 93L, 93L, 93L, 93L, 93L, 93L, 93L, 95L, 
95L, 100L, 100L, 100L, 100L, 100L, 100L, 100L, 100L, 100L, 100L, 
100L, 100L, 100L, 100L, 100L, 100L, 100L, 100L, 100L, 100L, 100L, 
100L, 100L, 100L, 100L, 100L, 100L, 100L, 100L, 100L, 100L, 100L, 
100L, 100L, 101L, 101L, 115L, 130L), year = c(2001, 2001, 2001, 
2005, 2006, 2007, 2008, 1989, 1991, 2004, 1990, 1990, 1994, 1996, 
1975, 1976, 1977, 1978, 1979, 1980, 1981, 1982, 1983, 1984, 1985, 
1986, 1987, 1988, 1989, 1990, 1991, 1992, 1993, 1994, 1995, 1979, 
1980, 1981, 1982, 1983, 1984, 1985, 1986, 1987, 1988, 1989, 1990, 
1991, 1977, 1978, 1979, 1982, 1983, 1984, 1985, 1986, 1987, 1988, 
1989, 1990, 1989, 1989, 1975, 1976, 1977, 1978, 1979, 1980, 1981, 
1982, 1983, 1984, 1985, 1986, 1987, 1988, 1989, 1990, 1991, 1992, 
1993, 1994, 1995, 1996, 1997, 1998, 1999, 2000, 2001, 2002, 2003, 
2004, 2005, 2006, 2007, 2008, 1982, 1982, 1982, 1995), onset = c(1, 
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1), conflictID.a = c(224L, 
224L, 224L, 224L, 224L, 224L, 224L, 186L, 186L, 186L, 183L, 183L, 
205L, 205L, 36L, 36L, 36L, 36L, 36L, 36L, 36L, 36L, 36L, 36L, 
36L, 36L, 36L, 36L, 36L, 36L, 36L, 36L, 36L, 36L, 36L, 120L, 
120L, 120L, 120L, 120L, 120L, 120L, 120L, 120L, 120L, 120L, 120L, 
120L, 140L, 140L, 140L, 140L, 140L, 140L, 140L, 140L, 140L, 140L, 
140L, 140L, 173L, 172L, 92L, 92L, 92L, 92L, 92L, 92L, 92L, 92L, 
92L, 92L, 92L, 92L, 92L, 92L, 92L, 92L, 92L, 92L, 92L, 92L, 92L, 
92L, 92L, 92L, 92L, 92L, 92L, 92L, 92L, 92L, 92L, 92L, 92L, 92L, 
80L, 80L, 162L, 208L)), .Names = c("ccode.a", "year", "onset", 
"conflictID.a"), row.names = c(127L, 128L, 130L, 131L, 132L, 
133L, 134L, 277L, 279L, 292L, 395L, 396L, 452L, 454L, 494L, 495L, 
496L, 497L, 498L, 499L, 500L, 501L, 502L, 503L, 504L, 505L, 506L, 
507L, 508L, 509L, 510L, 511L, 512L, 513L, 514L, 566L, 567L, 568L, 
569L, 570L, 571L, 572L, 573L, 574L, 575L, 576L, 577L, 578L, 598L, 
599L, 600L, 603L, 604L, 605L, 606L, 607L, 608L, 609L, 610L, 611L, 
678L, 679L, 699L, 700L, 701L, 702L, 703L, 704L, 705L, 706L, 707L, 
708L, 709L, 710L, 711L, 712L, 713L, 714L, 715L, 716L, 717L, 718L, 
719L, 720L, 721L, 722L, 723L, 724L, 725L, 726L, 727L, 728L, 729L, 
730L, 731L, 732L, 740L, 750L, 812L, 854L), class = "data.frame")

score 2 · Accepted Answer

You don't need reshape for this; plain aggregate will do.

# All aggregated values
aggregate(data$conflictID.a,by=list(data$ccode.a,data$year),c)
# Just unique values
aggregate(data$conflictID.a,by=list(data$ccode.a,data$year),unique)
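If a single comma-separated string per ccode.a/year group is what is wanted, one possible sketch is below. It is not part of the original answer. The `toy` data frame holds a handful of rows copied from the sample data in the question, and `collapse_ids` is a hypothetical helper name. Note that `paste(x, sep = ",")` returns one string per element, while `collapse = ","` joins the elements into one string.

```r
# A few rows copied from the question's sample data (same column names)
toy <- data.frame(
  ccode.a      = c(95L, 95L, 101L, 101L, 115L),
  year         = c(1989, 1989, 1982, 1982, 1982),
  conflictID.a = c(173L, 172L, 80L, 80L, 162L)
)

# Hypothetical helper: join the distinct IDs of a group into one string.
# collapse = "," merges the elements; sep = "," alone would not.
collapse_ids <- function(x) paste(unique(x), collapse = ",")

# One row per ccode.a/year combination, with the IDs concatenated
res <- aggregate(conflictID.a ~ ccode.a + year, data = toy, FUN = collapse_ids)
res
```

Drop the `unique()` call to keep repeated IDs in the string as well.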
