I have a numpy array:

a = [3., 0., 4., 2., 0., 0., 0.]

I would like a new array created from this, where each non-zero element is replaced by that many zeros and each run of consecutive zeros is replaced by a single number equal to the length of the run, i.e.:

b = [0., 0., 0., 1., 0., 0., 0., 0., 0., 0., 3.]

Looking for a vectorized way to do this as the array will have > 1 million elements. Any help much appreciated.
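
To pin down the intended transformation, here is a minimal loop-based sketch of it. The name `expand_reference` is made up for illustration. It is not vectorized and is likely too slow for arrays of this size, but it can serve as a reference when checking a faster version.

```python
import numpy as np

def expand_reference(a):
    """Loop-based reference: non-zero v -> v zeros, run of k zeros -> k."""
    out = []
    run = 0  # length of the current run of zeros
    for v in a:
        if v == 0:
            run += 1
        else:
            if run:
                # A run of zeros just ended: emit its length
                out.append(float(run))
                run = 0
            # Emit as many zeros as the non-zero value says
            out.extend([0.0] * int(v))
    if run:
        # The array ended inside a run of zeros
        out.append(float(run))
    return np.array(out)

a = np.array([3., 0., 4., 2., 0., 0., 0.])
b = expand_reference(a)
print(b)
```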

2 Answers

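This should do the trick. Roughly, it works by: 1) finding all runs of consecutive zeros and counting them, 2) computing the size of the output array and initializing it with zeros, 3) putting the counts from part 1 in the correct places.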

import numpy as np

def cz(a):
    a = np.asarray(a, int)

    # Find where sequences of zeros start and end
    wz = np.zeros(len(a) + 2, dtype=bool)
    wz[1:-1] = a == 0
    change = wz[1:] != wz[:-1]
    edges = np.where(change)[0]
    # Take the difference to get the number of zeros in each sequence
    consecutive_zeros = edges[1::2] - edges[::2]

    # Figure out where to put consecutive_zeros
    idx = a.cumsum()
    n = idx[-1] if len(idx) > 0 else 0
    idx = idx[edges[::2]]
    idx += np.arange(len(idx))

    # Create output array and populate with values for consecutive_zeros
    out = np.zeros(len(consecutive_zeros) + n)
    out[idx] = consecutive_zeros
    return out
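
To make the intermediate steps concrete, here is a sketch that traces the same edge-detection idea on the example array from the question. The names `starts`, `stops`, `run_lengths` and `targets` are introduced only for illustration.

```python
import numpy as np

a = np.asarray([3., 0., 4., 2., 0., 0., 0.], int)

# Pad the zero mask with False on both sides so every run has a start and an end
wz = np.zeros(len(a) + 2, dtype=bool)
wz[1:-1] = a == 0

# Positions where the mask flips: even entries are run starts, odd entries are run ends
edges = np.where(wz[1:] != wz[:-1])[0]    # [1, 2, 4, 7]
starts, stops = edges[::2], edges[1::2]   # runs are a[1:2] and a[4:7]
run_lengths = stops - starts              # [1, 3]

# The cumulative sum at a run start is the number of zeros written before that run;
# each earlier run also occupies one output slot, hence the arange offset
targets = a.cumsum()[starts] + np.arange(len(starts))  # [3, 10]

print(edges, run_lengths, targets)
```

The run lengths 1 and 3 end up at output positions 3 and 10, which matches the expected `b`.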
Answered 2013-10-17T00:57:37.807

For some variety:

import numpy as np

#np.int was removed in recent NumPy releases; the builtin int is equivalent here
a = np.array([3., 0., 4., 2., 0., 0., 0.], dtype=int)

inds = np.cumsum(a)

#Find first occurrences and values thereof.
uvals,zero_pos = np.unique(inds,return_index=True)
zero_pos = np.hstack((zero_pos,a.shape[0]))+1

#Gets zero lengths
values =  np.diff(zero_pos)-1
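mask = (values != 0)

#Ignore positions not followed by any zeros (consecutive non-zero values)
#Note: a run of zeros at the very start of a is not handled by this approach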
zero_inds = uvals[mask]
zero_inds += np.arange(zero_inds.shape[0])

#Create output array and apply zero values
out = np.zeros(inds[-1] + zero_inds.shape[0])
out[zero_inds] = values[mask]

out
[ 0.  0.  0.  1.  0.  0.  0.  0.  0.  0.  3.]

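The main variation is that we can use np.unique to find the first occurrence of each value in the array, provided that it is monotonically increasing (which the cumulative sum of non-negative values is).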
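
As a small illustration of that point, the sketch below shows what `np.unique` returns for the cumulative sum of the example array. The `return_counts` variant is an alternative shown for comparison and is not part of the answer's code. Like the code above, it assumes the array does not start with a zero.

```python
import numpy as np

a = np.array([3, 0, 4, 2, 0, 0, 0])
inds = np.cumsum(a)  # [3, 3, 7, 9, 9, 9, 9], never decreases

# return_index gives the index of the first occurrence of each unique value
uvals, first = np.unique(inds, return_index=True)
# uvals -> [3, 7, 9], first -> [0, 2, 3]

# return_counts gives how often each value occurs; every repeat after the
# first occurrence corresponds to a zero in a
_, counts = np.unique(inds, return_counts=True)
zeros_after = counts - 1  # [1, 0, 3]

print(uvals, first, zeros_after)
```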

Answered 2013-10-17T01:57:12.700