Suppose I have this:
>>> x = pandas.DataFrame([[1.0, 2.0, 3.0], [3, 4, 5]], columns=["A", "B", "C"])
>>> print x
   A  B  C
0  1  2  3
1  3  4  5
Now I want to normalize x by row, that is, divide each row by its sum. As described in this question, this can be achieved with x = x.div(x.sum(axis=1), axis=0). However, this creates a new DataFrame. If my DataFrame is large, creating this new DataFrame can consume a lot of memory, even though I immediately assign it to the original name.
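To make the concern concrete, here is a minimal sketch showing that div returns a separate object and leaves the original untouched. The name normalized is only for illustration and is not part of the original code.

```python
import pandas as pd

x = pd.DataFrame([[1.0, 2.0, 3.0], [3.0, 4.0, 5.0]], columns=["A", "B", "C"])

# div builds and returns a new DataFrame; x itself is not modified.
normalized = x.div(x.sum(axis=1), axis=0)

# Two distinct objects exist until the old one is released.
print(normalized is x)
# The original still holds the unnormalized values.
print(x.loc[0, "A"])
```

Rebinding the result to x only drops the old frame after the new one has been fully built, so both exist in memory at the same time for a while.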
Is there an efficient way to perform this operation in place? I want something like x.idiv() that provides the axis option of div but updates x in place. For this specific case I need division, but it would also be nice to have similar in-place versions of all the basic operations.
(I can update it in place by iterating over it row-wise and assigning each normalized row back into the original, but this is slow, and I'm looking for a more efficient solution.)
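For reference, the row-wise workaround mentioned above might look roughly like this. It is a minimal sketch, and the loop variable names label and row are only for illustration.

```python
import pandas as pd

x = pd.DataFrame([[1.0, 2.0, 3.0], [3.0, 4.0, 5.0]], columns=["A", "B", "C"])

# Normalize one row at a time and write each result back into x.
# This avoids building a second full-size DataFrame, but the Python-level
# loop makes it slow when there are many rows.
for label in x.index:
    row = x.loc[label]
    x.loc[label] = row / row.sum()

print(x.sum(axis=1))
```

Each iteration allocates only a single row-sized Series, which is why this keeps peak memory low at the cost of speed.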