python - How to retrieve topk values across multiple rows along with their respective indices in PyTorch?

Question

I have a tensor like this:

tensor([[[ 7.3478, -1.8058, -2.6140,  ..., -0.2719, -0.3171, -0.4737]],

    [[ 7.3606, -1.8269, -1.9825,  ..., -0.8680,  0.4894,  0.2708]]],
   grad_fn=<CatBackward>)

I would like to get the topk values across both rows. Currently, what I can do is the following:

ipdb>  stacked.topk(2)
torch.return_types.topk(
values=tensor([[[14.3902, 14.3039]],

        [[14.8927, 12.1973]]], grad_fn=<TopkBackward>),
indices=tensor([[[60, 12]],

        [[12, 23]]]))

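From the output, you can see that the top 2 values are retrieved from each of the two rows separately. I would like to get output like this instead: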

14.8927 that maps to index 12
14.3902 that maps to index 60

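Note that if the top 2 values are both in the first row, it should return values only from there and ignore the second row completely, and vice versa.

I need help with this.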

Below is a very hacky way of doing what I want; it is shown for a BEAM_WIDTH of 2:

BEAM_WIDTH = 2
top_k = stacked.data.topk(BEAM_WIDTH, dim=2)

v1, i1 = top_k[0][0][0], top_k[1][0][0]
v2, i2 = top_k[0][1][0], top_k[1][1][0]

i = j = 0
final = []
for _ in range(BEAM_WIDTH):
    if v1[i] >= v2[j]:
        final.append((v1[i], i1[i]))
        i += 1
    else:
        final.append((v2[j], i2[j]))
        j += 1

score 1 · Accepted Answer

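Repeated indices

I believe this is what you want. First you find the topk elements in the flattened tensor, then convert those linear indices back to indices relative to each row.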

topk_values, linear_indices = stacked.flatten().topk(2)
topk_indices = linear_indices % stacked.shape[-1]
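To make the index conversion concrete, here is a minimal runnable sketch of the flatten-then-topk idea. The small `stacked` tensor is a made-up stand-in with the same `(rows, 1, columns)` layout as in the question, and `row_indices` is extra bookkeeping that is not part of the answer above; it assumes you also want to know which row each value came from.

```python
import torch

# Hypothetical stand-in for `stacked`: shape (2, 1, 4), i.e. two rows of scores.
stacked = torch.tensor([[[1.0, 7.0, 3.0, 5.0]],
                        [[6.0, 2.0, 8.0, 4.0]]])

k = 2
num_cols = stacked.shape[-1]

# Top-k over all elements at once; the indices refer to the flattened tensor.
topk_values, linear_indices = stacked.flatten().topk(k)

# Column (within-row) index, as in the answer above.
topk_indices = linear_indices % num_cols
# Row each value came from (assumed extra, not in the original answer).
row_indices = torch.div(linear_indices, num_cols, rounding_mode="floor")

print(topk_values.tolist())   # [8.0, 7.0]
print(topk_indices.tolist())  # [2, 1]
print(row_indices.tolist())   # [1, 0]
```

The modulo recovers the position within a row because flattening lays the rows out back to back, each `num_cols` elements long; the floor division by the same number recovers the row.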

Unique indices

The previous approach doesn't enforce unique indices. If unique indices are needed, you could take the element-wise max across rows and then find the topk of that.

topk_values, topk_indices = stacked.max(dim=0)[0].flatten().topk(2)
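As a sketch of how this variant behaves, the snippet below runs it on a made-up tensor in which the two largest values share a column. The tensor and the `row_indices` bookkeeping are assumptions added for illustration; the second output of `Tensor.max(dim=...)` is used to recover which row supplied each maximum.

```python
import torch

# Hypothetical stand-in for `stacked`, shape (2, 1, 4).
# The two largest values overall (8 and 7) sit in the same column.
stacked = torch.tensor([[[1.0, 7.0, 3.0, 5.0]],
                        [[6.0, 8.0, 2.0, 4.0]]])

k = 2
# Element-wise max across rows; `argmax_rows` records which row supplied each max.
col_max, argmax_rows = stacked.max(dim=0)

# Top-k over the per-column maxima, so each column index appears at most once.
topk_values, topk_indices = col_max.flatten().topk(k)
# Row that supplied each selected value (assumed extra, not in the original answer).
row_indices = argmax_rows.flatten()[topk_indices]

print(topk_values.tolist())   # [8.0, 6.0]
print(topk_indices.tolist())  # [1, 0]
print(row_indices.tolist())   # [1, 1]
```

Here 7 is skipped even though it is the second-largest value overall, because column 1 is already represented by 8.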

Example

To demonstrate the difference between these two approaches, suppose you have

stacked = torch.tensor([[[11,8,0]],
                        [[10,9,0]]])

In the repeated indices case you would end up with

topk_values=[11, 10]
topk_indices=[0, 0]

In the unique indices case you would get

topk_values=[11, 9]
topk_indices=[0, 1]
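The results above can be reproduced with a short script. The helper names `topk_repeated` and `topk_unique` are hypothetical wrappers introduced here for illustration; they only package the two one-liners from this answer.

```python
import torch

def topk_repeated(stacked, k):
    """Top-k over all elements; the same column index may appear more than once."""
    values, linear = stacked.flatten().topk(k)
    return values, linear % stacked.shape[-1]

def topk_unique(stacked, k):
    """Top-k over the per-column maxima; each column index appears at most once."""
    return stacked.max(dim=0)[0].flatten().topk(k)

stacked = torch.tensor([[[11, 8, 0]],
                        [[10, 9, 0]]])

rep_values, rep_indices = topk_repeated(stacked, 2)
uniq_values, uniq_indices = topk_unique(stacked, 2)

print(rep_values.tolist(), rep_indices.tolist())    # [11, 10] [0, 0]
print(uniq_values.tolist(), uniq_indices.tolist())  # [11, 9] [0, 1]
```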
