python - Binned data and inclusive results

Question

Suppose I have binned some data into a structure like this:

data = {(1,1): [...],  # list of float
        (1,2): [...],
        (1,3): [...],
        (2,1): [...],
        ... }

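Here I only have two axes for binning, but suppose I have N of them. Now suppose, for example, that I have N=3 axes and I want the data whose second bin is 1, so I want a function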

(None, 1, None) -> [(1, 1, 1), (1, 1, 2), (1, 1, 3), ...
                    (2, 1, 1), (2, 1, 2), (2, 1, 3), ...]

so that I can use itertools.chain on the result.

The range of each axis is known:

axes_ranges = [(1, 10), (1, 8), (1, 3)]

Other examples:

(None, 1, 2) -> [(1, 1, 2), (2, 1, 2), (3, 1, 2), ...]
(None, None, None) -> all the combinations
(1,2,3) -> [(1,2,3)]

score 1 · Accepted Answer

It looks a lot like you are reinventing the wheel. What you probably want to use is numpy.ndarray:

    >>> import numpy as np
    >>> x = np.arange(0, 27)
    >>> x
    array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15, 16,
           17, 18, 19, 20, 21, 22, 23, 24, 25, 26])
    >>> x = x.reshape(3, 3, 3)  # reshape returns a new array, so rebind x
    >>> x
    array([[[ 0,  1,  2],
            [ 3,  4,  5],
            [ 6,  7,  8]],

           [[ 9, 10, 11],
            [12, 13, 14],
            [15, 16, 17]],

           [[18, 19, 20],
            [21, 22, 23],
            [24, 25, 26]]])

    >>> x[0]
    array([[0, 1, 2],
           [3, 4, 5],
           [6, 7, 8]])
    >>> x[:, 1, :]
    array([[ 3,  4,  5],
           [12, 13, 14],
           [21, 22, 23]])
    >>> x[:, 1, 1]
    array([ 4, 13, 22])

This works for N dimensions. In the example the index is three-dimensional; you can think of it as a cube with x[a,b,c] = x[layer,row,column]. Using ":" as an index simply means "all".
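To connect this to the `(None, 1, None)` patterns from the question, one possible sketch translates `None` into `slice(None)`, which is the object form of ":" in an index. The helper name `select` is hypothetical, and the pattern is assumed to use the same 0-based indices as the example above rather than the 1-based bin numbers in the question.

```python
import numpy as np

def select(arr, pattern):
    """Index arr with a pattern where None means 'all' along that axis."""
    # slice(None) is what ':' stands for inside an index expression.
    index = tuple(slice(None) if p is None else p for p in pattern)
    return arr[index]

x = np.arange(27).reshape(3, 3, 3)
print(select(x, (None, 1, None)))  # same as x[:, 1, :]
print(select(x, (None, 1, 1)))     # same as x[:, 1, 1]
```

If the bins are numbered from 1, as in the question, each fixed value would need to be shifted by one before indexing.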

score 1 · Accepted Answer

Hmm, how about:

import itertools

def combinations_with_fixpoint(iterables, *args):
    # Test against None explicitly so that a fixed value of 0 is not treated as "free".
    return itertools.product(*([x] if x is not None else y for x, y in zip(args, iterables)))


axes_ranges = [(1, 7), (1, 8), (77, 79)]

combs = combinations_with_fixpoint(
    itertools.starmap(range, axes_ranges),
    None, 5, None
)

for p in combs:
    print(p)

# (1, 5, 77)
# (1, 5, 78)
# (2, 5, 77)
# (2, 5, 78)
# (3, 5, 77)
# (3, 5, 78)
# (4, 5, 77)
# (4, 5, 78)
# (5, 5, 77)
# (5, 5, 78)
# (6, 5, 77)
# (6, 5, 78)

Or perhaps just pass a list to allow multiple "fixed points":

def combinations_with_fixpoint(iterables, *args):
    return itertools.product(*(x or y for x, y in zip(args, iterables)))

combs = combinations_with_fixpoint(
    itertools.starmap(range, axes_ranges),
    None, [5, 6], None
)
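The question's examples suggest that the upper bound of each axis range is inclusive (the third axis `(1, 3)` has bins 1, 2 and 3), whereas `range` excludes its end point. A minimal sketch under that assumption follows, also showing how the expanded keys might be fed to `itertools.chain` as the question intends. The name `expand_pattern` and the toy `data` dictionary are made up for illustration.

```python
import itertools

def expand_pattern(pattern, axes_ranges):
    """Yield every bin key matching pattern; None means 'any bin on that axis'."""
    choices = [
        # Assumption: hi is inclusive, hence the + 1.
        range(lo, hi + 1) if fixed is None else [fixed]
        for fixed, (lo, hi) in zip(pattern, axes_ranges)
    ]
    return itertools.product(*choices)

# Toy binned data on two axes, keyed by bin-index tuples.
axes_ranges = [(1, 2), (1, 3)]
data = {(1, 1): [0.1], (1, 2): [0.2, 0.3], (1, 3): [0.4],
        (2, 1): [0.5], (2, 2): [0.6], (2, 3): [0.7, 0.8]}

keys = list(expand_pattern((None, 2), axes_ranges))
# Flatten the per-bin lists into a single list of values.
values = list(itertools.chain.from_iterable(data[k] for k in keys))
print(keys)    # [(1, 2), (2, 2)]
print(values)  # [0.2, 0.3, 0.6]
```

If some bins can be empty and absent from the dictionary, `data.get(k, [])` could replace `data[k]`.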

score 0 · Accepted Answer

binning = [[0, 0.1, 0.2], [0, 10, 20], [-1, -2, -3]]
range_binning = [(1, len(x) + 1) for x in binning]

def expand_bin(thebin):
    def expand_bin_index(thebin, freeindex, rangebin):
        """
        thebin = [1, None, 3]
        freeindex = 1
        rangebin = [4,5]
        -> [[1, 4, 3], [1, 5, 3]]
        """
        result = []
        for r in rangebin:
            newbin = thebin[:]
            newbin[freeindex] = r
            result.append(newbin)
        return result

    tmp = [thebin]
    indexes_free = [i for i,aa in enumerate(thebin) if aa is None]
    for index_free in indexes_free:
        range_index = range(*(range_binning[index_free]))
        new_tmp = []
        for t in tmp:
            for expanded in expand_bin_index(t, index_free, range_index):
                new_tmp.append(expanded)
        tmp = new_tmp
    return tmp

inputs = ([None, 1, 2], [None, None, 3], [None, 1, None], [3, 2, 1], [None, None, None])
for i in inputs:
    print("%s-> %s" % (i, expand_bin(i)))
