python - Efficient 2d cumsum

Question

Say I have an array like this

>>> a = np.arange(1,9).reshape((1,-1))
>>> a
array([[1, 2, 3, 4, 5, 6, 7, 8]])

and I want to create, for each of the items in a, a "cumsum of the next 4 items". That is, my expected output is

1,       2,      3, 4, 5, 6, 7, 8
1+2,     2+3,     ...
1+2+3    2+3+4    ...
1+2+3+4  2+3+4+5  ...

i.e. a matrix that contains

1, 2, 3, 4, 5, 0, 0, 0
3, 5, 7, 9, 11,0, 0, 0
6, 9, 12,15,18,0, 0, 0
10,14,18,22,26,0, 0, 0

Since the cumsum operation cannot be correctly done for the last 3 items, I expect a 0 there. I know how to do a single cumsum. In fact, the arrays are

a[0, :4].cumsum().reshape((-1,1)); a[0, 1:5].cumsum().reshape((-1,1)); ...

stacked horizontally. However, I don't know how to do this efficiently. What would be a nice vectorized NumPy way of doing this? I'm also open to SciPy packages, as long as they beat NumPy in terms of efficiency or readability.

score 2 · Accepted Answer

You can do your calculations efficiently using a simpler, one-dimensional variant of a technique called a summed-area table, also known as an integral image in image processing applications. First you calculate and store your summed-area table, which is a complete cumsum of your row with a 0 added in front:

import numpy as np

a = np.arange(1, 8)
cs = np.concatenate(([0], np.cumsum(a)))  # cs[i] is the sum of the first i items

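Since cs[i] is the sum of the first i items, cs[i+n] - cs[i] is the sum of the n items starting at position i. You can therefore create each of your "cumsum of the next n items" rows as cs[n:] - cs[:-n]: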

>>> for n in range(1, 5):
...     print(n, '-->', (cs[n:] - cs[:-n])[:4])
...
1 --> [1 2 3 4]
2 --> [3 5 7 9]
3 --> [ 6  9 12 15]
4 --> [10 14 18 22]

You'll need to properly arrange them in the shape you want, but once the original calculation is done, you can compute each item of your output with a single subtraction, which is about as efficient as it can get.
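As a rough sketch of that arranging step: the helper name windowed_cumsum and the zero-padded layout are assumptions of mine, taken from the output asked for in the question. The loop runs once per row (the window size), not once per element, and each row is filled with one subtraction:

```python
import numpy as np

def windowed_cumsum(a, window):
    """Row k holds the sum of the k+1 items starting at each position.

    Columns whose full window would run past the end of `a` stay at 0.
    (Hypothetical helper, not part of NumPy.)
    """
    a = np.asarray(a)
    cs = np.concatenate(([0], np.cumsum(a)))   # cs[i] == a[:i].sum()
    n_valid = max(a.size - window + 1, 0)      # positions with a full window
    out = np.zeros((window, a.size), dtype=cs.dtype)
    for n in range(1, window + 1):
        # the sum of a[i:i+n] is cs[i+n] - cs[i]
        out[n - 1, :n_valid] = (cs[n:] - cs[:-n])[:n_valid]
    return out

result = windowed_cumsum(np.arange(1, 9), 4)
print(result)
```

For the 8-item array 1..8 and a window of 4, this should reproduce the 4x8 matrix from the question, with zeros in the last three columns.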

score 1 · Accepted Answer

One possible way would be to use a rolling window approach combined with cumsum().

For example:

import numpy as np
from numpy.lib.stride_tricks import as_strided

a = np.arange(1, 9) # the starting array
slice_length = 4

Then you could write:

arr = as_strided(a, (slice_length, len(a)), (a.strides[0], a.strides[0])).cumsum(axis=0)

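This gets you most of the way there. Be aware that the last slice_length - 1 columns of the strided view involve reads past the end of a's buffer, so the sums in those columns are meaningless. To replace them with 0 values, you can use slice and assign to get your desired output: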

arr[:, (1-slice_length):] = 0

Then you have the array:

>>> arr
array([[ 1,  2,  3,  4,  5,  0,  0,  0],
       [ 3,  5,  7,  9, 11,  0,  0,  0],
       [ 6,  9, 12, 15, 18,  0,  0,  0],
       [10, 14, 18, 22, 26,  0,  0,  0]])

I don't know if there is any way to produce exactly your desired output with one single vectorised method in NumPy (i.e. without the slicing). (accumulateat, a bit like reduceat, might be an interesting thing to add to NumPy's ufuncs...)
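If reading outside the array's buffer is a concern, a variant of the same rolling window idea can be sketched with sliding_window_view, which only builds in-bounds windows. This assumes NumPy 1.20 or later (where that function was added) and swaps the slice assignment for explicit zero padding with np.pad; it is a sketch, not a drop-in replacement for the code above:

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view  # assumes NumPy >= 1.20

a = np.arange(1, 9)  # the starting array
slice_length = 4

# windows[i] is a[i:i+slice_length]; shape (5, 4), every read is in bounds
windows = sliding_window_view(a, slice_length)

# cumsum inside each window, then transpose so row k holds the (k+1)-item sums
valid = windows.cumsum(axis=1).T

# append slice_length - 1 zero columns for positions without a full window
arr = np.pad(valid, ((0, 0), (0, slice_length - 1)))
print(arr)
```

The trade-off is an extra padding step in exchange for never touching memory that does not belong to a.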

score 0 · Accepted Answer

You can use broadcasting like so -

In [53]: a
Out[53]: array([ 4, 13,  4, 18,  1,  2, 11, 15])

In [54]: WSZ = 4 # Window size

In [55]: idx = np.arange(WSZ)[:,None] + np.arange(a.size-WSZ+1) # Broadcasted indices

In [56]: a[idx].cumsum(axis=0) # Index into "a" & perform cumsum along axis-0
Out[56]: 
array([[ 4, 13,  4, 18,  1],
       [17, 17, 22, 19,  3],
       [21, 35, 23, 21, 14],
       [39, 36, 25, 32, 29]], dtype=int32)

Pad with zeros if needed -

In [57]: np.pad(a[idx].cumsum(0),((0,0),(0,WSZ-1)),'constant',constant_values=0)
Out[57]: 
array([[ 4, 13,  4, 18,  1,  0,  0,  0],
       [17, 17, 22, 19,  3,  0,  0,  0],
       [21, 35, 23, 21, 14,  0,  0,  0],
       [39, 36, 25, 32, 29,  0,  0,  0]], dtype=int32)
