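As @Tigran said, slicing a NumPy array costs essentially nothing, because basic slicing returns a view rather than a copy. In general, however, we can combine two slices into one using the information from slice.indices, which
"retrieve[s] the start, stop, and step indices from the slice object slice assuming a sequence of length length".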
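As a quick illustration of what slice.indices returns, here is a minimal sketch. The variable names are only illustrative; the length of 10 is an arbitrary choice.

```python
# slice.indices(length) resolves None and negative values against a
# concrete sequence length and returns a (start, stop, step) tuple.
reversed_every_other = slice(None, None, -2)
last_three = slice(-3, None)
clipped = slice(2, 1000)

print(reversed_every_other.indices(10))  # (9, -1, -2)
print(last_three.indices(10))            # (7, 10, 1)
print(clipped.indices(10))               # (2, 10, 1)

# The resolved triple can be passed straight to range() to list the
# positions that the slice selects.
positions = list(range(*reversed_every_other.indices(10)))
print(positions)                         # [9, 7, 5, 3, 1]
```

Note the -1 stop in the first result: it means "run past index 0", which matters later because a literal -1 in a slice would instead refer to the last element.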
We can reduce
x[slice1][slice2]
to
x[combined]
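The first slice returns a new object, which the second slice then slices. To combine the two slices correctly we therefore also need the length of the data object (the length of its first dimension).
So, we can write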
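To see why the length matters, consider the same pair of slices applied to sequences of different lengths. The helper name resolve_chain below is hypothetical and exists only for this illustration.

```python
def resolve_chain(slice1, slice2, length):
    """Return the positions selected by x[slice1][slice2] when len(x) == length."""
    return list(range(length))[slice1][slice2]

slice1 = slice(None, None, 2)   # every other element
slice2 = slice(-1, None)        # the last element of that result

print(resolve_chain(slice1, slice2, 10))  # [8]
print(resolve_chain(slice1, slice2, 11))  # [10]
```

The same two slices pick index 8 in one case and index 10 in the other, so the combined slice has to be computed for a specific length.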
def slice_combine(slice1, slice2, length):
    """
    returns a slice that is a combination of the two slices.
    As in
      x[slice1][slice2]
    becomes
      combined_slice = slice_combine(slice1, slice2, len(x))
      x[combined_slice]
    :param slice1: The first slice
    :param slice2: The second slice
    :param length: The length of the first dimension of data being sliced. (eg len(x))
    """
    # First get the step sizes of the two slices.
    slice1_step = (slice1.step if slice1.step is not None else 1)
    slice2_step = (slice2.step if slice2.step is not None else 1)
    # The final step size
    step = slice1_step * slice2_step
    # Use slice1.indices to get the actual indices returned from slicing with slice1
    slice1_indices = slice1.indices(length)
    # We calculate the length of the first slice
    slice1_length = (abs(slice1_indices[1] - slice1_indices[0]) - 1) // abs(slice1_indices[2])
    # If we step in the same direction as the start,stop, we get at least one datapoint
    if (slice1_indices[1] - slice1_indices[0]) * slice1_step > 0:
        slice1_length += 1
    else:
        # Otherwise, The slice is zero length.
        return slice(0, 0, step)
    # Use the length after the first slice to get the indices returned from a
    # second slice starting at 0.
    slice2_indices = slice2.indices(slice1_length)
    # if the final range length = 0, return
    if not (slice2_indices[1] - slice2_indices[0]) * slice2_step > 0:
        return slice(0, 0, step)
    # We shift slice2_indices by the starting index in slice1 and the
    # step size of slice1
    start = slice1_indices[0] + slice2_indices[0] * slice1_step
    stop = slice1_indices[0] + slice2_indices[1] * slice1_step
    # slice.indices will return -1 as the stop index when slice.stop should be set to None.
    if start > stop and stop < 0:
        stop = None
    return slice(start, stop, step)
Then, let's run some tests
import sys
import numpy as np
# Make a 1D dataset
x = np.arange(100)
l = len(x)
# Make a (100, 10) dataset
x2 = np.arange(1000)
x2 = x2.reshape((100, 10))
l2 = len(x2)
# Test indices and steps
indices = [None, -1000, -100, -99, -50, -10, -1, 0, 1, 10, 50, 99, 100, 1000]
steps = [-1000, -99, -50, -10, -3, -2, -1, 1, 2, 3, 10, 50, 99, 1000]
indices_l = len(indices)
steps_l = len(steps)
count = 0
total = 2 * indices_l**4 * steps_l**2
for i in range(indices_l):
    for j in range(indices_l):
        for k in range(steps_l):
            for q in range(indices_l):
                for r in range(indices_l):
                    for s in range(steps_l):
                        # Print the progress. There are a lot of combinations.
                        if count % 5197 == 0:
                            sys.stdout.write("\rPROGRESS: {0:,}/{1:,} ({2:.0f}%)".format(
                                count, total, float(count) / float(total) * 100))
                            sys.stdout.flush()
                        slice1 = slice(indices[i], indices[j], steps[k])
                        slice2 = slice(indices[q], indices[r], steps[s])
                        combined = slice_combine(slice1, slice2, l)
                        combined2 = slice_combine(slice1, slice2, l2)
                        np.testing.assert_array_equal(
                            x[slice1][slice2], x[combined],
                            err_msg="For 1D, slice1: {0},\tslice2: {1},\tcombined: {2}\tCOUNT: {3}".format(
                                slice1, slice2, combined, count))
                        np.testing.assert_array_equal(
                            x2[slice1][slice2], x2[combined2],
                            err_msg="For 2D, slice1: {0},\tslice2: {1},\tcombined: {2}\tCOUNT: {3}".format(
                                slice1, slice2, combined2, count))
                        # 2 tests per loop
                        count += 2
print("\n-----------------")
print("All {0:,} tests passed!".format(count))
Thankfully, we get
All 15,059,072 tests passed!
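As a cross-check, the same composition can be sketched another way by letting Python's range do the arithmetic. This is an alternative technique, not the function above: it assumes Python 3, where slicing a range returns another range, and the name slice_combine_via_range is hypothetical.

```python
def slice_combine_via_range(slice1, slice2, length):
    """Compose two slices by slicing range(length) twice (Python 3 only)."""
    r = range(length)[slice1][slice2]
    if len(r) == 0:
        # An empty range may carry start/stop values that are not safe to
        # reuse in a slice, so return a canonical empty slice instead.
        return slice(0, 0, 1)
    # A negative stop on a non-empty result means "run past index 0";
    # in a slice that must be spelled None, since -1 would mean "last element".
    stop = r.stop if r.stop >= 0 else None
    return slice(r.start, stop, r.step)


x = list(range(10))
s1, s2 = slice(None, None, -3), slice(1, None)
combined = slice_combine_via_range(s1, s2, len(x))
print(combined)                      # slice(6, None, -3)
print(x[combined] == x[s1][s2])      # True
```

Both approaches rest on the same idea: resolve the slices against a known length first, then express the surviving positions as a single start, stop, and step.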