python - Slice 2d array into smaller 2d arrays

Question

Is there a way to slice a 2d array in numpy into smaller 2d arrays?

Example

[[1,2,3,4],   ->    [[1,2] [3,4]   
 [5,6,7,8]]          [5,6] [7,8]]

So I basically want to cut down a 2x4 array into 2 2x2 arrays. Looking for a generic solution to be used on images.

score 97 · Accepted Answer

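There was another question a couple of months ago which clued me in to the idea of using reshape and swapaxes. The h//nrows makes sense since this keeps the first block's rows together. It also makes sense that you'll need nrows and ncols to be part of the shape. -1 tells reshape to fill in whatever number is necessary to make the reshape valid. Armed with the form of the solution, I just tried things until I found the formula that works.

You should be able to break your array into "blocks" using some combination of reshape and swapaxes: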

def blockshaped(arr, nrows, ncols):
    """
    Return an array of shape (n, nrows, ncols) where
    n * nrows * ncols = arr.size

    If arr is a 2D array, the returned array should look like n subblocks with
    each subblock preserving the "physical" layout of arr.
    """
    h, w = arr.shape
    assert h % nrows == 0, f"{h} rows is not evenly divisible by {nrows}"
    assert w % ncols == 0, f"{w} cols is not evenly divisible by {ncols}"
    return (arr.reshape(h//nrows, nrows, -1, ncols)
               .swapaxes(1,2)
               .reshape(-1, nrows, ncols))

turns c

np.random.seed(365)
c = np.arange(24).reshape((4, 6))
print(c)

[out]:
[[ 0  1  2  3  4  5]
 [ 6  7  8  9 10 11]
 [12 13 14 15 16 17]
 [18 19 20 21 22 23]]

into

print(blockshaped(c, 2, 3))

[out]:
[[[ 0  1  2]
  [ 6  7  8]]

 [[ 3  4  5]
  [ 9 10 11]]

 [[12 13 14]
  [18 19 20]]

 [[15 16 17]
  [21 22 23]]]

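I've posted an inverse function, unblockshaped, and an N-dimensional generalization in separate answers. The generalization gives a little more insight into the reasoning behind this algorithm.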

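Note that there is also superbatfish's blockwise_view. It arranges the blocks in a different format (using more axes), but it has the advantage of (1) always returning a view and (2) being capable of handling arrays of any dimension.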
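To give a feel for the "more axes" layout, here is a minimal sketch that stops before the final reshape and keeps a 4-D result indexed as [block_row, block_col, row, col]. This is not superbatfish's blockwise_view implementation; block_grid_view is a hypothetical name, the sketch only handles 2-D input, and it is a view only when reshape can avoid copying (for example, a C-contiguous array).

```python
import numpy as np

def block_grid_view(arr, nrows, ncols):
    # Hypothetical helper: 4-D result indexed as [block_row, block_col, row, col].
    h, w = arr.shape
    if h % nrows or w % ncols:
        raise ValueError("array shape must be a multiple of the block shape")
    return arr.reshape(h // nrows, nrows, w // ncols, ncols).swapaxes(1, 2)

c = np.arange(24).reshape(4, 6)
grid = block_grid_view(c, 2, 3)
print(grid.shape)                 # (2, 2, 2, 3)
print(np.shares_memory(grid, c))  # True for this contiguous input

grid[0, 1] = 0                    # writes through to c[0:2, 3:6]
print(c[0])                       # [0 1 2 0 0 0]
```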

score 8 · Accepted Answer

It seems to me that this is a task for numpy.split or some variant.

e.g.

a = np.arange(30).reshape([5,6])  #a.shape = (5,6)
a1 = np.split(a,3,axis=1) 
#'a1' is a list of 3 arrays of shape (5,2)
a2 = np.split(a, [2,4])
#'a2' is a list of three arrays of shape (2,6), (2,6), (1,6)

If you have an NxN image you can create, e.g., a list of 2 Nx(N/2) subimages, and then divide them along the other axis.

numpy.hsplit and numpy.vsplit are also available.
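Applied to the array from the question, this could look like the following sketch. The 4x4 img is just an illustrative stand-in for an NxN image.

```python
import numpy as np

a = np.array([[1, 2, 3, 4],
              [5, 6, 7, 8]])
left, right = np.hsplit(a, 2)   # two blocks of shape (2, 2)
print(left)    # [[1 2]
               #  [5 6]]
print(right)   # [[3 4]
               #  [7 8]]

# Split along one axis, then split each piece along the other axis.
img = np.arange(16).reshape(4, 4)
quadrants = [quad
             for half in np.vsplit(img, 2)
             for quad in np.hsplit(half, 2)]
print([quad.shape for quad in quadrants])  # [(2, 2), (2, 2), (2, 2), (2, 2)]
```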

score 7 · Accepted Answer

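There are some other answers that already seem well suited to your specific case, but your question piqued my interest in a memory-efficient solution usable up to the maximum number of dimensions that numpy supports, and I ended up spending most of the afternoon coming up with a possible method. (The method itself is relatively simple; it's just that I still haven't used most of the really fancy features numpy supports, so most of the time was spent researching what numpy had available and how much it could do so that I didn't have to do it myself.)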

def blockgen(array, bpa):
    """Creates a generator that yields multidimensional blocks from the given
array(_like); bpa is an array_like consisting of the number of blocks per axis
(minimum of 1, must be a divisor of the corresponding axis size of array). As
the blocks are selected using normal numpy slicing, they will be views rather
than copies; this is good for very large multidimensional arrays that are being
blocked, and for very large blocks, but it also means that the result must be
copied if it is to be modified (unless modifying the original data as well is
intended)."""
    bpa = np.asarray(bpa) # in case bpa wasn't already an ndarray

    # parameter checking
    if array.ndim != bpa.size:         # bpa doesn't match array dimensionality
        raise ValueError("Size of bpa must be equal to the array dimensionality.")
    if (not np.issubdtype(bpa.dtype, np.integer)  # bpa must be all integers
        or (bpa < 1).any()             # all values in bpa must be >= 1
        or (array.shape % bpa).any()): # % != 0 means not evenly divisible
        raise ValueError("bpa ({0}) must consist of nonzero positive integers "
                         "that evenly divide the corresponding array axis "
                         "size".format(bpa))


    # generate block edge indices
    rgen = (np.r_[:array.shape[i]+1:array.shape[i]//blk_n]
            for i, blk_n in enumerate(bpa))

    # build slice sequences for each axis (unfortunately broadcasting
    # can't be used to make the items easy to operate over)
    c = [[np.s_[i:j] for i, j in zip(r[:-1], r[1:])] for r in rgen]

    # Now to get the blocks; this is slightly less efficient than it could be
    # because numpy doesn't like jagged arrays and I didn't feel like writing
    # a ufunc for it.
    for idxs in np.ndindex(*bpa):
        blockbounds = tuple(c[j][idxs[j]] for j in range(bpa.size))

        yield array[blockbounds]
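For comparison, the same slicing idea can be written more compactly with itertools.product. This is only a sketch under the same assumption as above (every axis size is evenly divisible by its block count), and iter_blocks is a hypothetical name rather than part of the answer's code.

```python
import itertools
import numpy as np

def iter_blocks(array, blocks_per_axis):
    # Hypothetical compact variant: yields views, one per block, in C order.
    if len(blocks_per_axis) != array.ndim:
        raise ValueError("need one block count per array axis")
    if any(size % n for size, n in zip(array.shape, blocks_per_axis)):
        raise ValueError("block counts must evenly divide the array shape")
    steps = [size // n for size, n in zip(array.shape, blocks_per_axis)]
    starts_per_axis = [range(0, size, step)
                       for size, step in zip(array.shape, steps)]
    for starts in itertools.product(*starts_per_axis):
        yield array[tuple(slice(s, s + step) for s, step in zip(starts, steps))]

a = np.arange(24).reshape(4, 6)
blocks = list(iter_blocks(a, (2, 2)))
print(len(blocks))                    # 4
print(blocks[1])                      # [[ 3  4  5]
                                      #  [ 9 10 11]]
print(np.shares_memory(blocks[1], a)) # True: the blocks are views
```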

score 3 · Accepted Answer

Your question is practically the same as an earlier one. You can use this one-liner with np.ndindex() and reshape():

def cutter(a, r, c):
    lenr = a.shape[0] // r  # number of block rows (integer division)
    lenc = a.shape[1] // c  # number of block columns
    return np.array([a[i*r:(i+1)*r,j*c:(j+1)*c] for (i,j) in np.ndindex(lenr,lenc)]).reshape(lenr,lenc,r,c)

To create the result you want:

a = np.arange(1,9).reshape(2,4)
#array([[1, 2, 3, 4],
#       [5, 6, 7, 8]])

cutter( a, 1, 2 )
#array([[[[1, 2]],
#        [[3, 4]]],
#       [[[5, 6]],
#        [[7, 8]]]])
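For the 2x2 blocks asked about in the question, the call would be cutter(a, 2, 2). The sketch below repeats a working version of cutter so that it runs on its own, then flattens the two leading grid axes to get a plain stack of blocks; the names grid and flat are only illustrative.

```python
import numpy as np

def cutter(a, r, c):
    # Block size is r x c; the result is indexed as [block_row, block_col, row, col].
    lenr, lenc = a.shape[0] // r, a.shape[1] // c
    blocks = [a[i*r:(i+1)*r, j*c:(j+1)*c] for i, j in np.ndindex(lenr, lenc)]
    return np.array(blocks).reshape(lenr, lenc, r, c)

a = np.arange(1, 9).reshape(2, 4)
grid = cutter(a, 2, 2)
print(grid.shape)              # (1, 2, 2, 2)

flat = grid.reshape(-1, 2, 2)  # drop the grid axes: (n_blocks, r, c)
print(flat[0])                 # [[1 2]
                               #  [5 6]]
print(flat[1])                 # [[3 4]
                               #  [7 8]]
```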

score 3 · Accepted Answer

Some minor enhancements to TheMeaningfulEngineer's answer that handle the case when the big 2d array cannot be perfectly sliced into equally sized subarrays

def blockfy(a, p, q):
    '''
    Divides array a into subarrays of size p-by-q
    p: block row size
    q: block column size
    '''
    m = a.shape[0]  #image row size
    n = a.shape[1]  #image column size

    # pad array with NaNs so it can be divided by p row-wise and by q column-wise
    bpr = ((m-1)//p + 1) #number of block rows (ceil(m/p))
    bpc = ((n-1)//q + 1) #number of block columns (ceil(n/q))
    M = p * bpr
    N = q * bpc

    A = np.nan* np.ones([M,N])
    A[:a.shape[0],:a.shape[1]] = a

    block_list = []
    previous_row = 0
    for row_block in range(bpr):      # bpr counts blocks along the row axis
        previous_row = row_block * p   
        previous_column = 0
        for column_block in range(bpc):   # bpc counts blocks along the column axis
            previous_column = column_block * q
            block = A[previous_row:previous_row+p, previous_column:previous_column+q]

            # remove nan columns and nan rows
            nan_cols = np.all(np.isnan(block), axis=0)
            block = block[:, ~nan_cols]
            nan_rows = np.all(np.isnan(block), axis=1)
            block = block[~nan_rows, :]

            ## append
            if block.size:
                block_list.append(block)

    return block_list

Example:

a = np.arange(25)
a = a.reshape((5,5))
out = blockfy(a, 2, 3)

a->
array([[ 0,  1,  2,  3,  4],
       [ 5,  6,  7,  8,  9],
       [10, 11, 12, 13, 14],
       [15, 16, 17, 18, 19],
       [20, 21, 22, 23, 24]])

out[0] ->
array([[0., 1., 2.],
       [5., 6., 7.]])

out[1]->
array([[3., 4.],
       [8., 9.]])

out[-1]->
array([[23., 24.]])
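Note that the NaN padding converts the blocks to float, and it would also strip rows or columns that are genuinely all-NaN in the input. If that matters, one possible alternative is to rely on the fact that numpy slices are clipped at the array edge, as in the sketch below. blockfy_no_pad is a hypothetical name, and unlike blockfy it returns views of a rather than copies.

```python
import numpy as np

def blockfy_no_pad(a, p, q):
    # Hypothetical variant: slices past the edge are clipped, so no padding is needed.
    m, n = a.shape
    return [a[i:i + p, j:j + q]
            for i in range(0, m, p)
            for j in range(0, n, q)]

a = np.arange(25).reshape(5, 5)
out = blockfy_no_pad(a, 2, 3)
print(len(out))                   # 6
print(out[1])                     # [[3 4]
                                  #  [8 9]]
print(out[-1])                    # [[23 24]]
print(out[-1].dtype == a.dtype)   # True: the integer dtype is preserved
```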

score 2 · Accepted Answer

As of now it only works when the big 2d array can be perfectly sliced into equally sized subarrays.

The code below slices

a ->array([[ 0,  1,  2,  3,  4,  5],
           [ 6,  7,  8,  9, 10, 11],
           [12, 13, 14, 15, 16, 17],
           [18, 19, 20, 21, 22, 23]])

into this

block_array->
    array([[[ 0,  1,  2],
            [ 6,  7,  8]],

           [[ 3,  4,  5],
            [ 9, 10, 11]],

           [[12, 13, 14],
            [18, 19, 20]],

           [[15, 16, 17],
            [21, 22, 23]]])

p and q determine the block size

Code

import numpy as np

a = np.arange(24)
a = a.reshape((4,6))
m = a.shape[0]  #image row size
n = a.shape[1]  #image column size

p = 2     #block row size
q = 3     #block column size

blocks_per_row = m // p     #number of block rows
blocks_per_column = n // q  #number of block columns

block_array = []
for row_block in range(blocks_per_row):
    previous_row = row_block * p
    for column_block in range(blocks_per_column):
        previous_column = column_block * q
        block = a[previous_row:previous_row+p,previous_column:previous_column+q]
        block_array.append(block)

block_array = np.array(block_array)

score 2 · Accepted Answer

If you want a solution that also handles the cases where the matrix is not equally divided, you can use this:

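from functools import reduce
from operator import add

import numpy as np

arr = np.arange(25).reshape(5, 5)  # example input; any 2D array works
half_split = np.array_split(arr, 2)

res = map(lambda x: np.array_split(x, 2, axis=1), half_split)
res = reduce(add, res)  # flat list of the four blocks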
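To see what "not equally divided" means in practice, the sketch below does the same two-step split with a list comprehension on a 5x5 example array and prints the resulting block shapes. np.array_split gives the extra row or column to the leading pieces.

```python
import numpy as np

arr = np.arange(25).reshape(5, 5)   # example input, chosen so that 5 is not divisible by 2
blocks = [block
          for band in np.array_split(arr, 2, axis=0)     # row bands of 3 and 2 rows
          for block in np.array_split(band, 2, axis=1)]  # then 3 and 2 columns
print([b.shape for b in blocks])    # [(3, 3), (3, 2), (2, 3), (2, 2)]
```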

score 1 · Accepted Answer

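Here is a solution based on unutbu's answer that handles the case where the matrix cannot be equally divided. In that case, it first resizes the matrix using some interpolation. You need OpenCV for this. Note that I had to swap ncols and nrows to make it work, and I didn't figure out why.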

import numpy as np
import cv2
import math 

def blockshaped(arr, r_nbrs, c_nbrs, interp=cv2.INTER_LINEAR):
    """
    arr      a 2D array, typically an image
    r_nbrs   number of rows of blocks
    c_nbrs   number of columns of blocks
    """

    arr_h, arr_w = arr.shape

    size_w = int( math.floor(arr_w // c_nbrs) * c_nbrs )
    size_h = int( math.floor(arr_h // r_nbrs) * r_nbrs )

    if size_w != arr_w or size_h != arr_h:
        arr = cv2.resize(arr, (size_w, size_h), interpolation=interp)

    nrows = int(size_w // r_nbrs)
    ncols = int(size_h // c_nbrs)

    return (arr.reshape(r_nbrs, ncols, -1, nrows) 
               .swapaxes(1,2)
               .reshape(-1, ncols, nrows))
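If interpolation is not wanted and OpenCV is not available, one alternative is to trim the trailing rows and columns that do not fit before reshaping. The sketch below is a different technique from the resize above (it discards data instead of resampling it), and crop_and_block is a hypothetical name.

```python
import numpy as np

def crop_and_block(arr, r_nbrs, c_nbrs):
    # Hypothetical helper: trim the remainder, then split into an r_nbrs x c_nbrs grid.
    h, w = arr.shape
    block_h, block_w = h // r_nbrs, w // c_nbrs
    trimmed = arr[:block_h * r_nbrs, :block_w * c_nbrs]
    return (trimmed.reshape(r_nbrs, block_h, c_nbrs, block_w)
                   .swapaxes(1, 2)
                   .reshape(-1, block_h, block_w))

img = np.arange(35).reshape(5, 7)
blocks = crop_and_block(img, 2, 3)   # last row and last column are dropped
print(blocks.shape)                  # (6, 2, 2)
print(blocks[0])                     # [[0 1]
                                     #  [7 8]]
```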

score 1 · Accepted Answer

a = np.random.randint(1, 9, size=(9,9))
out = [np.hsplit(x, 3) for x in np.vsplit(a,3)]
print(a)
print(out)

yields

[[7 6 2 4 4 2 5 2 3]
 [2 3 7 6 8 8 2 6 2]
 [4 1 3 1 3 8 1 3 7]
 [6 1 1 5 7 2 1 5 8]
 [8 8 7 6 6 1 8 8 4]
 [6 1 8 2 1 4 5 1 8]
 [7 3 4 2 5 6 1 2 7]
 [4 6 7 5 8 2 8 2 8]
 [6 6 5 5 6 1 2 6 4]]
[[array([[7, 6, 2],
       [2, 3, 7],
       [4, 1, 3]]), array([[4, 4, 2],
       [6, 8, 8],
       [1, 3, 8]]), array([[5, 2, 3],
       [2, 6, 2],
       [1, 3, 7]])], [array([[6, 1, 1],
       [8, 8, 7],
       [6, 1, 8]]), array([[5, 7, 2],
       [6, 6, 1],
       [2, 1, 4]]), array([[1, 5, 8],
       [8, 8, 4],
       [5, 1, 8]])], [array([[7, 3, 4],
       [4, 6, 7],
       [6, 6, 5]]), array([[2, 5, 6],
       [5, 8, 2],
       [5, 6, 1]]), array([[1, 2, 7],
       [8, 2, 8],
       [2, 6, 4]])]]
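A nested list in this layout can be stitched back together with np.block, which is a handy sanity check. The sketch uses a seeded generator instead of np.random.randint so that it is reproducible; that is a choice made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)
a = rng.integers(1, 9, size=(9, 9))
out = [np.hsplit(x, 3) for x in np.vsplit(a, 3)]  # out[i][j] is block (i, j)

print(out[1][2].shape)               # (3, 3)
restored = np.block(out)             # inner lists join along columns, outer along rows
print(np.array_equal(restored, a))   # True
```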

score 0 · Accepted Answer

Adding to @Aenaon's answer and his blockfy function: if you are working with COLOR IMAGES/3D ARRAYS, here is my pipeline to create 224 x 224 crops for 3-channel input

def blockfy(a, p, q):
    '''
    Divides array a into subarrays of size p-by-q
    p: block row size
    q: block column size
    '''
    m = a.shape[0]  #image row size
    n = a.shape[1]  #image column size

    # pad array with NaNs so it can be divided by p row-wise and by q column-wise
    bpr = ((m-1)//p + 1) #number of block rows
    bpc = ((n-1)//q + 1) #number of block columns
    M = p * bpr
    N = q * bpc

    A = np.nan* np.ones([M,N])
    A[:a.shape[0],:a.shape[1]] = a

    block_list = []
    previous_row = 0
    for row_block in range(bpr):
        previous_row = row_block * p   
        previous_column = 0
        for column_block in range(bpc):
            previous_column = column_block * q
            block = A[previous_row:previous_row+p, previous_column:previous_column+q]

            # remove nan columns and nan rows
            nan_cols = np.all(np.isnan(block), axis=0)
            block = block[:, ~nan_cols]
            nan_rows = np.all(np.isnan(block), axis=1)
            block = block[~nan_rows, :]

            ## append
            if block.size:
                block_list.append(block)

    return block_list

Then I extended the above to

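import os
import numpy as np
from skimage import io   # assumed source of io.imread (as_gray is a scikit-image parameter)
from PIL import Image

for file in os.listdir(path_to_crop):   ### list files in your folder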
   img = io.imread(path_to_crop + file, as_gray=False) ### open image 

   r = blockfy(img[:,:,0],224,224)  ### crop blocks of 224 x 224 for red channel
   g = blockfy(img[:,:,1],224,224)  ### crop blocks of 224 x 224 for green channel
   b = blockfy(img[:,:,2],224,224)  ### crop blocks of 224 x 224 for blue channel

   for x in range(0,len(r)):
       img = np.array((r[x],g[x],b[x])) ### combine each channel into one patch by patch

       img = img.astype(np.uint8) ### cast back to proper integers

       img_swap = img.swapaxes(0, 2) ### need to swap axes due to the way things were processed
       
       img_swap_2 = img_swap.swapaxes(0, 1) ### do it again

       Image.fromarray(img_swap_2).save(path_save_crop+str(x)+"bounding" + file,
                                        format = 'jpeg',
                                        subsampling=0,
                                        quality=100) ### save patch with new name etc
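When the image height and width are exact multiples of the patch size, the per-channel split and the axis swapping can be avoided by carrying the channel axis through the reshape. The following is only a sketch under that divisibility assumption; image_patches is a hypothetical name, and the small synthetic array stands in for a real image.

```python
import numpy as np

def image_patches(img, p, q):
    # Hypothetical helper: (H, W, C) image -> (n_patches, p, q, C), channels kept last.
    h, w, c = img.shape
    if h % p or w % q:
        raise ValueError("image height/width must be multiples of the patch size")
    return (img.reshape(h // p, p, w // q, q, c)
               .swapaxes(1, 2)
               .reshape(-1, p, q, c))

img = np.arange(4 * 6 * 3, dtype=np.uint8).reshape(4, 6, 3)  # stand-in for an RGB image
patches = image_patches(img, 2, 3)
print(patches.shape)                              # (4, 2, 3, 3)
print(np.array_equal(patches[1], img[0:2, 3:6]))  # True
```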

score 0 · Accepted Answer

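Here is my solution. Note that this code doesn't actually create copies of the original array, so it works well with big data. Moreover, it doesn't crash if the array cannot be divided evenly (but you can easily add a condition for that by removing ceil and checking that v_slices and h_slices divide without a remainder).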

import numpy as np
from math import ceil

a = np.arange(9).reshape(3, 3)

p, q = 2, 2
height, width = a.shape  # a.shape is (rows, columns)

v_slices = ceil(height / p)  # number of block rows
h_slices = ceil(width / q)   # number of block columns

for v in range(v_slices):
    for h in range(h_slices):
        block = a[v * p : v * p + p, h * q : h * q + q]
        # do something with a block

This code turns this (or, more precisely, gives you direct access to parts of it):

[[0 1 2]
 [3 4 5]
 [6 7 8]]

into this:

[[0 1]
 [3 4]]
[[2]
 [5]]
[[6 7]]
[[8]]

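If you need actual copies, Aenaon's code is what you are looking for.

If you are sure that the big array can be divided evenly, you can use numpy's splitting tools.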
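A strict variant along the lines described above, which refuses uneven shapes instead of yielding smaller edge blocks, might look like this sketch. iter_blocks_strict is a hypothetical name; the in-place update is there only to show that the blocks are views into a.

```python
import numpy as np

def iter_blocks_strict(a, p, q):
    # Hypothetical strict variant: raise instead of yielding ragged edge blocks.
    rows, cols = a.shape
    if rows % p or cols % q:
        raise ValueError(f"shape {a.shape} is not divisible into {p}x{q} blocks")
    for i in range(0, rows, p):
        for j in range(0, cols, q):
            yield a[i:i + p, j:j + q]   # a view, not a copy

a = np.arange(16).reshape(4, 4)
for block in iter_blocks_strict(a, 2, 2):
    block += 100                        # in-place update writes through to a
print(a[0, 0], a[3, 3])                 # 100 115
```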
