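What's causing the problem
The error occurs because you're trying to index beyond the bounds of the sediment_transport grid (e.g. the i+1 and j+1 portions). Right now, you're trying to get a value that doesn't exist when you're at the border of the grid. Also, it doesn't raise an error, but when you're at i=0 or j=0 you're currently grabbing the opposite edge (because of the i-1 and j-1 portions).
You mentioned that you wanted the values of elevation_change to be 0 at the boundaries (which certainly seems reasonable). Another common boundary condition is to "wrap" the values around and grab a value from the opposite edge. It probably doesn't make much sense in this case, but I'll show it in a couple of examples because it's easy to implement with some methods.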
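To make the asymmetry concrete, here is a minimal sketch on a made-up 3 x 3 grid (the name grid is purely illustrative and not from the original code). An index one past the last row raises IndexError, while an index of -1 silently returns the opposite edge.

```python
import numpy as np

# Hypothetical 3 x 3 grid, only for illustration.
grid = np.arange(9).reshape(3, 3)

# i - 1 at i = 0 becomes -1, which numpy treats as "last row".
wrapped = grid[0 - 1, 0]   # same element as grid[2, 0]

# i + 1 at the last row is out of bounds and raises IndexError.
try:
    grid[2 + 1, 0]
    raised = False
except IndexError:
    raised = True

print(wrapped, raised)
```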
Tempting, but incorrect
It's tempting to just catch the exception and set the value to 0. For example:
for [i, j], flow in np.ndenumerate(flow_direction_np):
    try:
        if flow == 32:
            ...
        elif ...
            ...
    except IndexError:
        elevation_change[i, j] = 0
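However, this approach is actually incorrect. Negative indices are valid and will return the opposite edge of the grid. So this would effectively implement a "zero" boundary condition on the right and bottom edges of the grid, and a "wrap-around" boundary condition on the left and top edges.
Padding with zeros
In the case of "zero" boundary conditions, there's a very simple way to avoid indexing problems: pad the sediment_transport grid with zeros. This way, if we index beyond the edge of the original grid, we'll get a 0. (Or whatever constant value you'd like to pad the array with.)
Side note: This is the perfect place to use numpy.pad. However, it was added in v1.7. I'm going to skip using it here, because the OP mentions ArcGIS, and Arc doesn't ship with an up-to-date version of numpy.
For example: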
rows, cols = sediment_transport.shape
padded_transport = np.zeros((rows + 2, cols + 2), float)
padded_transport[1:-1, 1:-1] = sediment_transport
# The two lines above could be replaced with:
#padded_transport = np.pad(sediment_transport, 1, mode='constant')
for [i, j], flow in np.ndenumerate(flow_direction):
    # Need to take into account the offset in the "padded_transport"
    r, c = i + 1, j + 1
    if flow == 32:
        elevation_change[i, j] = padded_transport[r - 1, c - 1]
    elif flow == 16:
        elevation_change[i, j] = padded_transport[r, c - 1]
    elif flow == 8:
        elevation_change[i, j] = padded_transport[r + 1, c - 1]
    elif flow == 4:
        elevation_change[i, j] = padded_transport[r + 1, c]
    elif flow == 64:
        elevation_change[i, j] = padded_transport[r - 1, c]
    elif flow == 128:
        elevation_change[i, j] = padded_transport[r - 1, c + 1]
    elif flow == 1:
        elevation_change[i, j] = padded_transport[r, c + 1]
    elif flow == 2:
        elevation_change[i, j] = padded_transport[r + 1, c + 1]
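If a numpy recent enough to include numpy.pad (v1.7 or later) is available, switching boundary conditions is mostly a matter of changing the mode argument. The sketch below uses a made-up 2 x 2 grid (the array names are illustrative only) to show three of the modes side by side.

```python
import numpy as np

# Hypothetical 2 x 2 grid, only for illustration.
grid = np.array([[1, 2],
                 [3, 4]])

zero_padded = np.pad(grid, 1, mode='constant')  # border of zeros
wrap_padded = np.pad(grid, 1, mode='wrap')      # border taken from the opposite edge
edge_padded = np.pad(grid, 1, mode='edge')      # border repeats the nearest cell

print(zero_padded)
print(wrap_padded)
print(edge_padded)
```

In each case the original values sit at padded[1:-1, 1:-1], so the r, c = i + 1, j + 1 offset used above applies unchanged.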
DRY (Don't Repeat Yourself)
We can write this code a bit more compactly using a dict:
elevation_change = np.zeros_like(sediment_transport)
nrows, ncols = flow_direction.shape
lookup = {32: (-1, -1),
          16: (0, -1),
          8: (1, -1),
          4: (1, 0),
          64: (-1, 0),
          128: (-1, 1),
          1: (0, 1),
          2: (1, 1)}
padded_transport = np.zeros((nrows + 2, ncols + 2), float)
padded_transport[1:-1, 1:-1] = sediment_transport
for [i, j], flow in np.ndenumerate(flow_direction):
    # Need to take into account the offset in the "padded_transport"
    r, c = i + 1, j + 1
    # This also allows for flow_direction values not listed above...
    dr, dc = lookup.get(flow, (0, 0))
    elevation_change[i, j] = padded_transport[r + dr, c + dc]
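At this point, padding the original array is a bit superfluous. Implementing different boundary conditions by padding is very easy if you use numpy.pad, but we could just write the logic out directly: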
elevation_change = np.zeros_like(sediment_transport)
nrows, ncols = flow_direction.shape
lookup = {32: (-1, -1),
          16: (0, -1),
          8: (1, -1),
          4: (1, 0),
          64: (-1, 0),
          128: (-1, 1),
          1: (0, 1),
          2: (1, 1)}
for [i, j], flow in np.ndenumerate(flow_direction):
    dr, dc = lookup.get(flow, (0, 0))
    r, c = i + dr, j + dc
    if not (0 <= r < nrows and 0 <= c < ncols):
        elevation_change[i, j] = 0
    else:
        elevation_change[i, j] = sediment_transport[r, c]
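If the wrap-around boundary condition were wanted in this loop-based form instead, one possible variation is to take the neighbor indices modulo the grid shape. This is a sketch rather than part of the original approach; the function name lookup_wrap, the constant LOOKUP, and the tiny test arrays are assumptions made for illustration.

```python
import numpy as np

LOOKUP = {32: (-1, -1), 16: (0, -1), 8: (1, -1), 4: (1, 0),
          64: (-1, 0), 128: (-1, 1), 1: (0, 1), 2: (1, 1)}


def lookup_wrap(flow_direction, sediment_transport):
    """Loop-based version with wrap-around boundaries (hypothetical helper)."""
    elevation_change = np.zeros_like(sediment_transport)
    nrows, ncols = flow_direction.shape
    for (i, j), flow in np.ndenumerate(flow_direction):
        dr, dc = LOOKUP.get(int(flow), (0, 0))
        # Python's % returns a non-negative result for a positive modulus,
        # so -1 maps to the last row/column and nrows maps back to 0.
        r, c = (i + dr) % nrows, (j + dc) % ncols
        elevation_change[i, j] = sediment_transport[r, c]
    return elevation_change


# Made-up 2 x 2 inputs where every neighbor lies across an edge.
flow = np.array([[32, 1],
                 [4, 2]])
sed = np.array([[10.0, 20.0],
                [30.0, 40.0]])
result = lookup_wrap(flow, sed)
print(result)
```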
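"Vectorizing" the calculation
For reasons I won't delve into here, iterating through numpy arrays in Python is rather slow. Therefore, there are more efficient ways to implement this in numpy. The trick is to use numpy.roll along with boolean indexing.
For "wrap-around" boundary conditions, it's as simple as: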
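elevation_change = np.zeros_like(sediment_transport)
nrows, ncols = flow_direction.shape
lookup = {32: (-1, -1),
          16: (0, -1),
          8: (1, -1),
          4: (1, 0),
          64: (-1, 0),
          128: (-1, 1),
          1: (0, 1),
          2: (1, 1)}
for value, (row, col) in lookup.items():
    mask = flow_direction == value
    # Roll the data by the opposite offset so that
    # shifted[i, j] == sediment_transport[i + row, j + col], wrapping at the
    # edges. Rolling the mask instead would reorder the selected values
    # whenever a cell's neighbor wraps across an edge.
    shifted = np.roll(sediment_transport, -row, 0)
    shifted = np.roll(shifted, -col, 1)
    elevation_change[mask] = shifted[mask]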
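As a self-contained sketch of the wrap-around case (the names vectorized_wrap, reference_wrap, and LOOKUP are assumptions added for illustration), the version below rolls the data array by the opposite offset and then selects with the unshifted mask. That keeps each selected value lined up with the cell it is assigned to, even for cells whose neighbor wraps across an edge. A plain loop with modulo indexing serves as a reference to check against.

```python
import numpy as np

LOOKUP = {32: (-1, -1), 16: (0, -1), 8: (1, -1), 4: (1, 0),
          64: (-1, 0), 128: (-1, 1), 1: (0, 1), 2: (1, 1)}


def vectorized_wrap(flow_direction, sediment_transport):
    """Wrap-around boundaries using np.roll on the data (hypothetical helper)."""
    elevation_change = np.zeros_like(sediment_transport)
    for value, (row, col) in LOOKUP.items():
        mask = flow_direction == value
        # After rolling by (-row, -col), neighbor[i, j] holds
        # sediment_transport[i + row, j + col], wrapping at the edges.
        neighbor = np.roll(sediment_transport, -row, axis=0)
        neighbor = np.roll(neighbor, -col, axis=1)
        elevation_change[mask] = neighbor[mask]
    return elevation_change


def reference_wrap(flow_direction, sediment_transport):
    """Plain-loop reference using modulo indexing, for checking only."""
    out = np.zeros_like(sediment_transport)
    nrows, ncols = flow_direction.shape
    for (i, j), flow in np.ndenumerate(flow_direction):
        dr, dc = LOOKUP.get(int(flow), (0, 0))
        out[i, j] = sediment_transport[(i + dr) % nrows, (j + dc) % ncols]
    return out


# Small random inputs with a fixed seed, generated only for this check.
rng = np.random.RandomState(0)
flow = 2 ** rng.randint(0, 8, (6, 5))
sed = rng.random_sample((6, 5))
agree = np.allclose(vectorized_wrap(flow, sed), reference_wrap(flow, sed))
print(agree)
```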
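If you're not familiar with numpy, this probably looks a bit like Greek. There are two parts to this. The first is using boolean indexing. As a quick example of what this does: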
In [1]: import numpy as np
In [2]: x = np.arange(9).reshape(3,3)
In [3]: x
Out[3]:
array([[0, 1, 2],
       [3, 4, 5],
       [6, 7, 8]])
In [4]: mask = np.array([[False, False, True],
   ...:                  [True, False, False],
   ...:                  [True, False, False]])
In [5]: x[mask]
Out[5]: array([2, 3, 6])
As you can see, if we index an array with a boolean grid of the same shape, the values where it is True are returned. Similarly, you can set values this way.
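A short sketch of setting values through a boolean mask, reusing the same x and mask as in the session above (the copies y and z are illustrative names). A scalar is broadcast to every selected cell, while a sequence is assigned one value per selected cell in row-major order.

```python
import numpy as np

x = np.arange(9).reshape(3, 3)
mask = np.array([[False, False, True],
                 [True, False, False],
                 [True, False, False]])

y = x.copy()
y[mask] = -1              # one scalar for every selected cell

z = x.copy()
z[mask] = [20, 30, 60]    # one value per selected cell, in row-major order

print(y)
print(z)
```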
The next trick is numpy.roll. This shifts the values by a given amount in one direction. They "wrap around" at the edges.
In [1]: import numpy as np
In [2]: np.array([[0,0,0],[0,1,0],[0,0,0]])
Out[2]:
array([[0, 0, 0],
       [0, 1, 0],
       [0, 0, 0]])
In [3]: x = _
In [4]: np.roll(x, 1, axis=0)
Out[4]:
array([[0, 0, 0],
       [0, 0, 0],
       [0, 1, 0]])
In [5]: np.roll(x, 1, axis=1)
Out[5]:
array([[0, 0, 0],
       [0, 0, 1],
       [0, 0, 0]])
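Hopefully that makes a bit of sense, at any rate.
To implement the "zero" boundary conditions (or arbitrary boundary conditions using numpy.pad), we'd do something like this: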
def vectorized(flow_direction, sediment_transport):
    elevation_change = np.zeros_like(sediment_transport)
    nrows, ncols = flow_direction.shape
    lookup = {32: (-1, -1),
              16: (0, -1),
              8: (1, -1),
              4: (1, 0),
              64: (-1, 0),
              128: (-1, 1),
              1: (0, 1),
              2: (1, 1)}
    # Initialize an array for the "shifted" mask
    shifted = np.zeros((nrows+2, ncols+2), dtype=bool)
    # Pad "sediment_transport" with zeros
    # Again, `np.pad` would be better and more flexible here, as it would
    # easily allow lots of different boundary conditions...
    tmp = np.zeros((nrows+2, ncols+2), sediment_transport.dtype)
    tmp[1:-1, 1:-1] = sediment_transport
    sediment_transport = tmp
    # .items() works on both Python 2 and 3 (iteritems is Python 2 only)
    for value, (row, col) in lookup.items():
        mask = flow_direction == value
        # Reset the "shifted" mask
        shifted.fill(False)
        shifted[1:-1, 1:-1] = mask
        # Shift the mask by the right amount for the given value
        shifted = np.roll(shifted, row, 0)
        shifted = np.roll(shifted, col, 1)
        # Set the values in elevation change to the offset value in sed_trans
        elevation_change[mask] = sediment_transport[shifted]
    return elevation_change
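Another way to get the "zero" boundary condition is to turn the lookup into per-cell row and column offsets and then index the padded array once with integer arrays. This is only a sketch of an alternative, not the approach used above; the names vectorized_offsets and LOOKUP are hypothetical, and it assumes numpy v1.7 or later for np.pad. Unlike the mask-based version, flow codes missing from the lookup keep an offset of (0, 0) and so return the cell's own value, matching the lookup.get(flow, (0, 0)) behavior of the loop-based code.

```python
import numpy as np

LOOKUP = {32: (-1, -1), 16: (0, -1), 8: (1, -1), 4: (1, 0),
          64: (-1, 0), 128: (-1, 1), 1: (0, 1), 2: (1, 1)}


def vectorized_offsets(flow_direction, sediment_transport):
    """Zero boundaries via one integer-array lookup (hypothetical helper)."""
    nrows, ncols = flow_direction.shape
    # Per-cell offsets; codes not in LOOKUP keep (0, 0), i.e. the cell itself.
    dr = np.zeros((nrows, ncols), dtype=int)
    dc = np.zeros((nrows, ncols), dtype=int)
    for value, (row, col) in LOOKUP.items():
        mask = flow_direction == value
        dr[mask] = row
        dc[mask] = col
    padded = np.pad(sediment_transport, 1, mode='constant')
    i, j = np.indices((nrows, ncols))
    # The +1 accounts for the one-cell border added by the padding.
    return padded[i + 1 + dr, j + 1 + dc]


sed = np.array([[10.0, 20.0],
                [30.0, 40.0]])
# Every neighbor lies inside the grid.
inside = vectorized_offsets(np.array([[2, 16], [64, 32]]), sed)
# Every neighbor lies outside the grid, so every cell gets 0.
outside = vectorized_offsets(np.array([[32, 1], [4, 2]]), sed)
print(inside)
print(outside)
```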
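Advantages to vectorization
The "vectorized" version is much faster (roughly 20 to 30 times in the timings below), but will use more RAM.
For a 1000 by 1000 grid: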
In [79]: %timeit vectorized(flow_direction, sediment_transport)
10 loops, best of 3: 170 ms per loop
In [80]: %timeit iterate(flow_direction, sediment_transport)
1 loops, best of 3: 5.36 s per loop
In [81]: %timeit lookup(flow_direction, sediment_transport)
1 loops, best of 3: 3.4 s per loop
These results are from comparing the following implementations with randomly generated data:
import numpy as np

def main():
    # Generate some random flow_direction and sediment_transport data...
    nrows, ncols = 1000, 1000
    flow_direction = 2 ** np.random.randint(0, 8, (nrows, ncols))
    sediment_transport = np.random.random((nrows, ncols))
    # Make sure all of the results return the same thing...
    test1 = vectorized(flow_direction, sediment_transport)
    test2 = iterate(flow_direction, sediment_transport)
    test3 = lookup(flow_direction, sediment_transport)
    assert np.allclose(test1, test2)
    assert np.allclose(test2, test3)

def vectorized(flow_direction, sediment_transport):
    elevation_change = np.zeros_like(sediment_transport)
    sediment_transport = np.pad(sediment_transport, 1, mode='constant')
    lookup = {32: (-1, -1),
              16: (0, -1),
              8: (1, -1),
              4: (1, 0),
              64: (-1, 0),
              128: (-1, 1),
              1: (0, 1),
              2: (1, 1)}
    # .items() works on both Python 2 and 3 (iteritems is Python 2 only)
    for value, (row, col) in lookup.items():
        mask = flow_direction == value
        shifted = np.pad(mask, 1, mode='constant')
        shifted = np.roll(shifted, row, 0)
        shifted = np.roll(shifted, col, 1)
        elevation_change[mask] = sediment_transport[shifted]
    return elevation_change

def iterate(flow_direction, sediment_transport):
    elevation_change = np.zeros_like(sediment_transport)
    padded_transport = np.pad(sediment_transport, 1, mode='constant')
    for [i, j], flow in np.ndenumerate(flow_direction):
        r, c = i + 1, j + 1
        if flow == 32:
            elevation_change[i, j] = padded_transport[r - 1, c - 1]
        elif flow == 16:
            elevation_change[i, j] = padded_transport[r, c - 1]
        elif flow == 8:
            elevation_change[i, j] = padded_transport[r + 1, c - 1]
        elif flow == 4:
            elevation_change[i, j] = padded_transport[r + 1, c]
        elif flow == 64:
            elevation_change[i, j] = padded_transport[r - 1, c]
        elif flow == 128:
            elevation_change[i, j] = padded_transport[r - 1, c + 1]
        elif flow == 1:
            elevation_change[i, j] = padded_transport[r, c + 1]
        elif flow == 2:
            elevation_change[i, j] = padded_transport[r + 1, c + 1]
    return elevation_change

def lookup(flow_direction, sediment_transport):
    elevation_change = np.zeros_like(sediment_transport)
    nrows, ncols = flow_direction.shape
    lookup = {32: (-1, -1),
              16: (0, -1),
              8: (1, -1),
              4: (1, 0),
              64: (-1, 0),
              128: (-1, 1),
              1: (0, 1),
              2: (1, 1)}
    for [i, j], flow in np.ndenumerate(flow_direction):
        dr, dc = lookup.get(flow, (0, 0))
        r, c = i + dr, j + dc
        if not (0 <= r < nrows and 0 <= c < ncols):
            elevation_change[i, j] = 0
        else:
            elevation_change[i, j] = sediment_transport[r, c]
    return elevation_change

if __name__ == '__main__':
    main()