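I have an interesting puzzle. Suppose you have a 2D numpy array in which each row corresponds to a measurement event and each column to a different measured variable. One additional column of this array specifies the day on which the measurement was taken. The rows are sorted by timestamp, and there are several (or many) measurements per day. The goal is to identify the rows that start a new day and subtract their values from the subsequent rows of the same day.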

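I solve this with a loop that iterates over the days, builds a boolean vector to select the matching rows, and then subtracts the first selected row. This works, but it feels inelegant. Is there a better way to do it?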
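For reference, the loop approach described above might look roughly like the sketch below. The function name `subtract_first_row_loop`, the tiny `sample` array, and the assumption that the day sits in column 0 are illustrative only and are not taken from the original code.

```python
import numpy as np

def subtract_first_row_loop(data):
    """Loop baseline: for each day, subtract that day's first row from all of its rows."""
    result = data.copy()
    for day in np.unique(data[:, 0]):
        mask = data[:, 0] == day       # boolean vector selecting this day's rows
        first = data[mask][0, 1:]      # measurements of the day's first row
        result[mask, 1:] -= first      # the day column (column 0) is left untouched
    return result

# Hypothetical mini example: day in column 0, two measurement columns.
sample = np.array([[1, 1, 2],
                   [1, 3, 4],
                   [2, 7, 8]])
looped = subtract_first_row_loop(sample)
```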

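Here is a small example. The lines below define a matrix whose first column is the day and whose remaining two columns are the measurements: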

import numpy as np

before = np.array([[ 1,  1,  2],
                   [ 1,  3,  4],
                   [ 1,  5,  6],
                   [ 2,  7,  8],
                   [ 3,  9, 10],
                   [ 3, 11, 12],
                   [ 3, 13, 14]])

At the end of the process, I would like to see the following array:

array([[1, 0, 0],
       [1, 2, 2],
       [1, 4, 4],
       [2, 0, 0],
       [3, 0, 0],
       [3, 2, 2],
       [3, 4, 4]])

P.S. Please help me find a better, more informative title for this post. I'm out of ideas.

1 Answer

numpy.searchsorted is a convenient function for this:

In : before
Out:
array([[ 1,  1,  2],
       [ 1,  3,  4],
       [ 1,  5,  6],
       [ 2,  7,  8],
       [ 3,  9, 10],
       [ 3, 11, 12],
       [ 3, 13, 14]])

In : diff = before[before[:,0].searchsorted(before[:,0])]

In : diff[:,0] = 0

In : before - diff
Out:
array([[1, 0, 0],
       [1, 2, 2],
       [1, 4, 4],
       [2, 0, 0],
       [3, 0, 0],
       [3, 2, 2],
       [3, 4, 4]])

Longer explanation

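If you take the first column and search it for its own values, you get, for each row, the index of the first row with the same day value (searchsorted defaults to side='left', and the column is already sorted):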

In : before
Out:
array([[ 1,  1,  2],
       [ 1,  3,  4],
       [ 1,  5,  6],
       [ 2,  7,  8],
       [ 3,  9, 10],
       [ 3, 11, 12],
       [ 3, 13, 14]])

In : before[:,0].searchsorted(before[:,0])
Out: array([0, 0, 0, 3, 4, 4, 4])
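To see why this returns the first row of each day, it may help to compare the default side='left' with side='right'. The snippet below is a small sketch; the standalone `days` array is just the first column of the example written out by hand.

```python
import numpy as np

# First column of the example array (must be sorted for searchsorted).
days = np.array([1, 1, 1, 2, 3, 3, 3])

# Default side='left': index of the first element >= each value,
# i.e. the first row of that day.
left = days.searchsorted(days)

# side='right': index just past the last element equal to each value,
# i.e. the first row of the following day.
right = days.searchsorted(days, side='right')
```

Only the side='left' result is useful here, since it points at the row whose values should be subtracted.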

You can then use this to construct the matrix that you will subtract by indexing:

In : diff = before[before[:,0].searchsorted(before[:,0])]

In : diff
Out:
array([[ 1,  1,  2],
       [ 1,  1,  2],
       [ 1,  1,  2],
       [ 2,  7,  8],
       [ 3,  9, 10],
       [ 3,  9, 10],
       [ 3,  9, 10]])

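You need to set the first column of diff to 0 so that the day values themselves are not subtracted. Because indexing with an integer array returns a copy, this does not modify before.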

In : diff[:,0] = 0

In : diff
Out:
array([[ 0,  1,  2],
       [ 0,  1,  2],
       [ 0,  1,  2],
       [ 0,  7,  8],
       [ 0,  9, 10],
       [ 0,  9, 10],
       [ 0,  9, 10]])

Finally, subtract the two arrays to get the desired output:

In : before - diff
Out:
array([[1, 0, 0],
       [1, 2, 2],
       [1, 4, 4],
       [2, 0, 0],
       [3, 0, 0],
       [3, 2, 2],
       [3, 4, 4]])
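
Putting the steps together, a minimal self-contained sketch could look like the following. The function name `subtract_day_start` and the `day_col` parameter are hypothetical names introduced for illustration; the code assumes the rows are already sorted by day, as stated in the question.

```python
import numpy as np

def subtract_day_start(data, day_col=0):
    """Subtract each day's first row from all rows of that day.

    Assumes `data` is a 2D array whose `day_col` column is sorted.
    """
    days = data[:, day_col]
    first_idx = days.searchsorted(days)  # index of the first row of each row's day
    diff = data[first_idx]               # integer-array indexing returns a copy
    diff[:, day_col] = 0                 # keep the day column unchanged
    return data - diff

before = np.array([[ 1,  1,  2],
                   [ 1,  3,  4],
                   [ 1,  5,  6],
                   [ 2,  7,  8],
                   [ 3,  9, 10],
                   [ 3, 11, 12],
                   [ 3, 13, 14]])
after = subtract_day_start(before)
```

The input array is left unmodified, since all changes are made on the copy produced by the indexing step.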
Answered 2012-07-18T10:14:58.787