Say I have this data.frame:
data <- data.frame(foo = c(1, 1, 2, 2),
                   bar = c(10, 10, 10, 20),
                   baz = c(1, 2, 3, 4),
                   qux = c(5, 6, 7, 8))
I want to group it by the foo and bar columns to arrive at this:
expected <- list(
  data.frame(foo = c(1, 1),
             bar = c(10, 10),
             baz = c(1, 2),
             qux = c(5, 6)),
  data.frame(foo = 2,
             bar = 10,
             baz = 3,
             qux = 7),
  data.frame(foo = 2,
             bar = 20,
             baz = 4,
             qux = 8)
)
I can generate a frame with a row for each group, but I couldn't find a MATCH function: something that, given an input frame with columns foo, bar, baz, qux and a filter frame with columns foo, bar, returns the rows of the input frame whose foo and bar values match those of the filter frame.
groups <- unique(data[c("foo","bar")])
MATCH(data, groups[1,]) == expected[[1]]
MATCH(data, groups[2,]) == expected[[2]]
MATCH(data, groups[3,]) == expected[[3]]
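The behaviour described for MATCH could be sketched with base R's merge, which performs an inner join on the shared key columns. This is only a minimal sketch: match_rows is a hypothetical helper name, not an existing function, and merge places the key columns first, sorts the result, and resets row names, so the output is only guaranteed to equal the expected frames in content.

```r
# Sample data from the question
data <- data.frame(foo = c(1, 1, 2, 2),
                   bar = c(10, 10, 10, 20),
                   baz = c(1, 2, 3, 4),
                   qux = c(5, 6, 7, 8))
groups <- unique(data[c("foo", "bar")])

# match_rows is a hypothetical helper name, not a base R function.
# It keeps the rows of `data` whose key columns equal those in `key`
# (an inner join on the columns that `key` contains).
match_rows <- function(data, key) {
  merge(data, key, by = names(key))
}

first_group <- match_rows(data, groups[1, ])
```

Here groups[1, ] selects the first distinct foo/bar combination by position, so first_group should hold the two rows with foo = 1 and bar = 10.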
Or a higher-level GROUP function that just returns a list of frames, one for each distinct combination of values in the given columns:
GROUP(data, by=c("foo","bar")) == expected
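Something close to the GROUP described here may be possible with base R's split, which accepts a list of grouping columns. The sketch below assumes that drop = TRUE is wanted so that combinations that never occur (such as foo = 1, bar = 20) are left out; group_by_cols is a hypothetical wrapper name. Note that the resulting frames keep their original row names, so they match the expected frames in content rather than being identical to them.

```r
# Sample data from the question
data <- data.frame(foo = c(1, 1, 2, 2),
                   bar = c(10, 10, 10, 20),
                   baz = c(1, 2, 3, 4),
                   qux = c(5, 6, 7, 8))

# group_by_cols is a hypothetical wrapper name, not a base R function.
# split() builds one data.frame per combination of the `by` columns;
# drop = TRUE discards combinations with no rows.
group_by_cols <- function(data, by) {
  split(data, data[by], drop = TRUE)
}

grouped <- group_by_cols(data, by = c("foo", "bar"))
```

The list is named after the value combinations; unname(grouped) would give a plain list if the names are not wanted.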
The closest I came to that is:
out <- aggregate(. ~ foo + bar, data, list)
where the cells in baz and qux are lists:
> out
  foo bar  baz  qux
1   1  10 1, 2 5, 6
2   2  10    3    7
3   2  20    4    8
> class(out[,"baz"])
[1] "list"
So each group is a row in out, but how do I unfold this again so that out[1,] becomes a data.frame with two rows, like expected[[1]]?
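One way the unfolding might be sketched is to rebuild a data.frame per row of out, letting data.frame recycle the scalar foo and bar values against the vectors stored in the list cells. This is a sketch under the assumption that the columns are exactly foo, bar, baz and qux, as in the example; it hard-codes those names and is not a general solution.

```r
# Sample data and aggregate call from the question
data <- data.frame(foo = c(1, 1, 2, 2),
                   bar = c(10, 10, 10, 20),
                   baz = c(1, 2, 3, 4),
                   qux = c(5, 6, 7, 8))
out <- aggregate(. ~ foo + bar, data, list)

# For each row i of `out`, expand the list cells baz and qux back into
# columns; the single foo and bar values are recycled to their length.
unfolded <- lapply(seq_len(nrow(out)), function(i) {
  data.frame(foo = out$foo[i],
             bar = out$bar[i],
             baz = out$baz[[i]],
             qux = out$qux[[i]])
})
```

With this, unfolded[[1]] would be the two-row frame for foo = 1, bar = 10.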