I am looking for a faster way to perform the operation below. The real dataset has more than 1M rows; the simplified example here only illustrates the task.
To create the data table:
library(data.table)
dt <- data.table(name=c("john","jill"), a1=c(1,4), a2=c(2,5), a3=c(3,6),
                 b1=c(10,40), b2=c(20,50), b3=c(30,60))
colGroups <- c("a","b") # Column-group prefixes: columns starting with "a", and with "b"
Original Dataset
-----------------------------------
name  a1  a2  a3  b1  b2  b3
john   1   2   3  10  20  30
jill   4   5   6  40  50  60
The dataset above is transformed so that 2 new rows are added for each unique name. In each successive row, the values within each group of columns are shifted left by one position, independently of the other groups, and the vacated positions on the right are filled with 0. (This example uses only the "a" and "b" columns, but the real data has many more groups.)
Transformed Dataset
-----------------------------------
name  a1  a2  a3  b1  b2  b3
john   1   2   3  10  20  30   # First row for John
john   2   3   0  20  30   0   # "a" values left-shifted, "b" values left-shifted
john   3   0   0  30   0   0   # Same as above, left-shifted again
jill   4   5   6  40  50  60   # Repeated for Jill
jill   5   6   0  50  60   0
jill   6   0   0  60   0   0
And so on for every name. Because the dataset is extremely large, I am looking for an efficient way to implement this.
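To pin down the expected behaviour, here is a minimal sketch of the transformation. It is not benchmarked and is not claimed to be the fastest approach. The helper name `shift_groups` and the temporary columns `row_id` and `shift` are hypothetical names introduced for illustration. The sketch assumes that every group has the same number of columns, named `<prefix>1` to `<prefix>n`, that each name appears on exactly one row, and that the vacated positions are filled with 0.

```r
library(data.table)

dt <- data.table(name = c("john", "jill"),
                 a1 = c(1, 4), a2 = c(2, 5), a3 = c(3, 6),
                 b1 = c(10, 40), b2 = c(20, 50), b3 = c(30, 60))
colGroups <- c("a", "b")

# Hypothetical helper (not part of data.table).
# Assumes each group has columns <prefix>1 .. <prefix>n with the same n,
# and that dt has no columns called "row_id" or "shift".
shift_groups <- function(dt, groups, fill = 0) {
  # Number of columns per group, taken from the first group
  n <- sum(grepl(paste0("^", groups[1], "[0-9]+$"), names(dt)))

  # Build one full copy of the table per shift amount k = 0 .. n-1.
  # Each assignment works on a whole column, so there is no per-row loop.
  pieces <- lapply(seq_len(n) - 1L, function(k) {
    res <- copy(dt)
    for (g in groups) {
      cols <- paste0(g, seq_len(n))
      for (j in seq_len(n)) {
        # Column j takes the values of column j + k, or `fill` once past the end
        value <- if (j + k <= n) dt[[cols[j + k]]] else rep(fill, nrow(dt))
        set(res, j = cols[j], value = value)
      }
    }
    # Temporary keys used only to restore the original row order
    set(res, j = "row_id", value = seq_len(nrow(dt)))
    set(res, j = "shift", value = rep(k, nrow(dt)))
    res
  })

  out <- rbindlist(pieces)
  setorderv(out, c("row_id", "shift"))     # group the shifted rows under each name
  out[, c("row_id", "shift") := NULL]      # drop the temporary keys
  out[]
}

result <- shift_groups(dt, colGroups)
print(result)
```

The sketch makes n copies of the table, so memory use grows with the number of columns per group; whether that is acceptable for more than 1M rows would need to be measured.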
Thanks in advance.