9

Is there a smart and simple way to combine two slice operations into one?

Say I have something like

arange(1000)[::2][10:20]
>>> array([20, 22, 24, 26, 28, 30, 32, 34, 36, 38])

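Of course this is not a problem in this example, but if the array is very large I would very much like to avoid creating the intermediate array (or is there none?). I believe it should be possible to combine the two slices, but maybe I am overlooking something. So the idea would be something like this: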

arange(1000)[ slice(None,None,2) + slice(10,20,None) ]

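This does not work, of course, but it is what I would like to do. Is there anything that combines slice objects? (Despite my efforts, I did not find anything.)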

4

5 Answers

5
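  1. You can wrap slice in a small class to make this kind of slice composition possible (the built-in slice type itself cannot be subclassed). Just define __add__ (or __mul__; mathematicians would surely prefer the * notation for composition). It does involve some math, though. By the way, you could make a nice Python package out of this ;-)
  2. As bheklilr said, slicing costs next to nothing in NumPy, because basic slicing returns a view rather than a copy. So you can stick with a simple solution, such as a list of slices.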
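
To make the second point concrete, here is a minimal sketch (assuming NumPy is installed and that A is a plain 1-D array created with np.arange) showing that chained basic slices are views onto the same buffer, so the data is not copied along the way:

```python
import numpy as np

A = np.arange(1000)

# Each basic slice returns a view; only a small array header is created.
v = A[::2][10:20]

# The chained view still points into A's memory.
print(np.shares_memory(A, v))  # True

# Writing through the view changes the original array.
v[0] = -1
print(A[20])  # -1
```

This holds for basic slicing only. Indexing with a list or a boolean mask (fancy indexing) returns a copy.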

P.S. More generally, chained slices can make code prettier and clearer. Even a simple choice between the following lines:

v = A[::2][10:20]
v = A[20:40][::2]
v = A[20:40:2]

can closely reflect the program logic and make the code self-documenting.
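
As a quick check that the three spellings above select the same elements, the sketch below uses a built-in range as a stand-in for A (an assumption for illustration; with a NumPy array you would compare the results with np.array_equal instead):

```python
A = range(1000)  # stand-in for the array

a = A[::2][10:20]   # every second element, then items 10..19 of those
b = A[20:40][::2]   # elements 20..39, then every second one
c = A[20:40:2]      # the same selection as a single slice

print(list(a) == list(b) == list(c))  # True
print(list(c))  # [20, 22, 24, 26, 28, 30, 32, 34, 36, 38]
```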

One more example: if you have a flat NumPy array and want to extract a subarray of length length starting at position position, you can write

v = A[position : position + length]

or

v = A[position:][:length]

Decide for yourself which option looks better. ;-)
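
One caveat when choosing between the two forms: they agree for a non-negative position, but they can differ when position is negative, because position + length may reach or cross zero. A small sketch with a plain list (the values are made up for illustration):

```python
A = list(range(10))
length = 3

# Non-negative position: both forms give the same result.
position = 4
print(A[position : position + length])  # [4, 5, 6]
print(A[position:][:length])            # [4, 5, 6]

# Negative position: position + length == 0, so the first form is empty.
position = -3
print(A[position : position + length])  # []
print(A[position:][:length])            # [7, 8, 9]
```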

answered 2014-04-02T20:07:12.237
1

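As @Tigran said, slicing costs next to nothing when working with NumPy arrays. In general, however, we can combine two consecutive slices using the information from slice.indices, which

retrieve[s] the start, stop, and step indices from the slice object slice, assuming a sequence of length length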

We can reduce

x[slice1][slice2]

to

x[combined]

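The first slice returns a new object, which is then sliced by the second slice. So we also need the length of the data object (the length of its first dimension) to combine the slices correctly.

So, we can write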
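
To illustrate why the length matters, here is a short sketch of what slice.indices returns, and of how the same pair of slices resolves to different combined slices for sequences of different lengths (the lengths 10 and 5 are arbitrary choices):

```python
# slice.indices(length) resolves None and negative values for a given length.
print(slice(None, None, 2).indices(1000))  # (0, 1000, 2)
print(slice(None, None, -1).indices(10))   # (9, -1, -1)
print(slice(-3, None).indices(10))         # (7, 10, 1)

# x[::-1][:3] takes the last three items in reverse order, so the
# equivalent single slice depends on len(x).
x10 = list(range(10))
x5 = list(range(5))
print(x10[::-1][:3] == x10[9:6:-1])  # True
print(x5[::-1][:3] == x5[4:1:-1])    # True
print(x5[::-1][:3] == x5[9:6:-1])    # False
```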

def slice_combine(slice1, slice2, length):
    """
    returns a slice that is a combination of the two slices.
    As in 
      x[slice1][slice2]
    becomes
      combined_slice = slice_combine(slice1, slice2, len(x))
      x[combined_slice]

    :param slice1: The first slice
    :param slice2: The second slice
    :param length: The length of the first dimension of data being sliced. (eg len(x))
    """

    # First get the step sizes of the two slices.
    slice1_step = (slice1.step if slice1.step is not None else 1)
    slice2_step = (slice2.step if slice2.step is not None else 1)

    # The final step size
    step = slice1_step * slice2_step

    # Use slice1.indices to get the actual indices returned from slicing with slice1
    slice1_indices = slice1.indices(length)

    # We calculate the length of the first slice
    slice1_length = (abs(slice1_indices[1] - slice1_indices[0]) - 1) // abs(slice1_indices[2])

    # If we step in the same direction as the start,stop, we get at least one datapoint
    if (slice1_indices[1] - slice1_indices[0]) * slice1_step > 0:
        slice1_length += 1
    else:
        # Otherwise, The slice is zero length.
        return slice(0,0,step)

    # Use the length after the first slice to get the indices returned from a
    # second slice starting at 0.
    slice2_indices = slice2.indices(slice1_length)

    # if the final range length = 0, return
    if not (slice2_indices[1] - slice2_indices[0]) * slice2_step > 0:
        return slice(0,0,step)

    # We shift slice2_indices by the starting index in slice1 and the 
    # step size of slice1
    start = slice1_indices[0] + slice2_indices[0] * slice1_step
    stop = slice1_indices[0] + slice2_indices[1] * slice1_step

    # slice.indices will return -1 as the stop index when slice.stop should be set to None.
    if start > stop:
        if stop < 0:
            stop = None

    return slice(start, stop, step)

Then, let's run some tests

import sys
import numpy as np

# Make a 1D dataset
x = np.arange(100)
l = len(x)

# Make a (100, 10) dataset
x2 = np.arange(1000)
x2 = x2.reshape((100,10))
l2 = len(x2)

# Test indices and steps
indices = [None, -1000, -100, -99, -50, -10, -1, 0, 1, 10, 50, 99, 100, 1000]
steps = [-1000, -99, -50, -10, -3, -2, -1, 1, 2, 3, 10, 50, 99, 1000]
indices_l = len(indices)
steps_l = len(steps)

count = 0
total = 2 * indices_l**4 * steps_l**2
for i in range(indices_l):
    for j in range(indices_l):
        for k in range(steps_l):
            for q in range(indices_l):
                for r in range(indices_l):
                    for s in range(steps_l):
                        # Print the progress. There are a lot of combinations.
                        if count % 5197 == 0:
                            sys.stdout.write("\rPROGRESS: {0:,}/{1:,} ({2:.0f}%)".format(count, total, float(count) / float(total) * 100))
                            sys.stdout.flush()

                        slice1 = slice(indices[i], indices[j], steps[k])
                        slice2 = slice(indices[q], indices[r], steps[s])

                        combined = slice_combine(slice1, slice2, l)
                        combined2 = slice_combine(slice1, slice2, l2)
                        np.testing.assert_array_equal(x[slice1][slice2], x[combined], 
                            err_msg="For 1D, slice1: {0},\tslice2: {1},\tcombined: {2}\tCOUNT: {3}".format(slice1, slice2, combined, count))
                        np.testing.assert_array_equal(x2[slice1][slice2], x2[combined2], 
                            err_msg="For 2D, slice1: {0},\tslice2: {1},\tcombined: {2}\tCOUNT: {3}".format(slice1, slice2, combined2, count))

                        # 2 tests per loop
                        count += 2

print("\n-----------------")
print("All {0:,} tests passed!".format(count))

Thankfully, we get

All 15,059,072 tests passed!

answered 2014-11-06T15:23:38.893
1

In Python 3, the built-in range object can do the computation for you, without being expanded to fill memory:

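def combine_slices(length, *slices):
    r = range(length)   # length of array being sliced
    for s in slices:
        r = r[s]
    if not r:
        # an empty range may carry negative bounds that a slice would misread
        return slice(0, 0, r.step)
    # a negative stop means "run past index 0"; a slice spells that as None
    stop = r.stop if r.stop >= 0 else None
    return slice(r.start, stop, r.step)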

arr = range(-2**48, 2**48)   # simulate a huge array
s = combine_slices(len(arr), slice(2**48,None), slice(None,None,2), slice(10,20,None))

print(arr[s] == arr[2**48:][::2][10:20])    # => True
print(list(arr[s]))     # => [20, 22, 24, 26, 28, 30, 32, 34, 36, 38]
print(s)    # => slice(281474976710676, 281474976710696, 2)
answered 2021-09-25T06:30:21.527
0

You can use islice; it may not be faster, but it avoids the intermediate objects by working lazily, like a generator:

arange = range(1000)

from itertools import islice
islice(islice(arange, None, None, 2), 10, 20)

%timeit list(islice(islice(arange, None, None, 2), 10, 20))
100000 loops, best of 3: 2 us per loop

%timeit arange[::2][10:20]
100000 loops, best of 3: 2.64 us per loop

So, it is a bit faster.
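
One point worth keeping in mind with this approach: islice avoids building intermediate containers, but it cannot jump ahead, so it still iterates over the skipped items. The sketch below counts how many items are pulled from a source generator (counting_source and stats are hypothetical helpers written for this illustration):

```python
from itertools import islice

stats = {"consumed": 0}

def counting_source(n):
    """Yield 0..n-1 and count how many items were actually requested."""
    for i in range(n):
        stats["consumed"] += 1
        yield i

result = list(islice(islice(counting_source(1000), None, None, 2), 10, 20))

print(result)             # [20, 22, 24, 26, 28, 30, 32, 34, 36, 38]
print(stats["consumed"])  # well below 1000, but more than the 10 items returned
```

By contrast, a single combined slice such as [20:40:2] on a sequence or array goes straight to the wanted positions.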

answered 2013-10-08T20:32:42.387
-1

very simple:

arange(1000)[20:40:2]

should do

answered 2013-10-08T20:19:09.280