I have a data set that looks something like this:
     id1  id2   size
1   5400 5505      7
2   5033 5458      1
3   5452 2873     24
4   5452 5213      2
5   5452 4242     26
6   4823 4823      4
7   5505 5400     11
Where id1 and id2 are unique nodes in a graph, and size is a value assigned to the directed edge connecting them from id1 to id2. This data set is fairly large (a little over 2 million rows). What I would like to do is sum the size column, grouped by unordered node pairs of id1 and id2. For example, in the first row, we have id1=5400 and id2=5505. There exists another row in the data frame where id1=5505 and id2=5400. In the grouped data, the sum of the size columns for these two rows would be added to a single row. So in other words I want to summarize the data where I'm grouping on an (unordered) set of (id1,id2). I've found a way to do this using apply with a custom function that checks for the reversed column pair in the full data set, but this works excruciatingly slow. Does anyone know of a way to do this another way, perhaps with plyr or with something in the base packages that would be more efficient?