format - Converting rank-per-candidate format to OpenSTV BLT format

Question

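I recently used a questionnaire to collect a set of opinions on the importance of various software components. Figuring that some form of Condorcet voting method would be the best way to obtain an overall ranking, I chose to analyze the results with OpenSTV.

My data is in a tabular, space-delimited format that looks more or less like this: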

A B C D E F G    # Candidates
5 2 4 3 7 6 1    # First ballot. G is ranked first, and E is ranked 7th
4 2 6 5 1 7 3    # Second ballot
etc

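In this format, the number gives the rank and the position in the sequence identifies the candidate. Every "candidate" has a (required) rank from 1 to 7, where 1 means most important and 7 means least important. Duplicates are not allowed.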
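
The constraints above (every candidate ranked, ranks from 1 to 7, no duplicates) amount to saying that each ballot is a permutation of 1..N. A minimal sketch of such a check, assuming a hypothetical helper name `is_valid_ballot` that is not part of OpenSTV:

```python
def is_valid_ballot(line, n_candidates):
    """Return True if `line` holds each rank 1..n_candidates exactly once.

    Hypothetical helper for illustration; not part of OpenSTV.
    """
    try:
        ranks = [int(tok) for tok in line.split()]
    except ValueError:
        return False  # a non-numeric token appeared on the line
    return sorted(ranks) == list(range(1, n_candidates + 1))


print(is_valid_ballot("5 2 4 3 7 6 1", 7))  # True
print(is_valid_ballot("5 2 4 3 7 6 6", 7))  # False: rank 6 repeated, rank 1 missing
```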

This format struck me as the most natural way to represent the output, since it is a direct representation of the ballot format.

The OpenSTV/BLT format represents the same information differently, conceptually like this:

G B D C A F E    # Again, G is ranked first and E is ranked 7th
E B G A D C F    # 
etc

The actual numeric file format uses the (1-based) index of each candidate rather than its label, so it looks more like this:

7 2 4 3 1 6 5    # Same ballots as before.
5 2 7 1 4 3 6    # A -> 1, G -> 7

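In this format, the number identifies the candidate and the position in the sequence gives the rank. The actual, real BLT format also includes a leading weight and a trailing zero that marks the end of each ballot, which I am not too concerned about.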
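
For completeness, the weight and terminator mentioned above can be added once the candidate order is known. A minimal sketch, assuming a weight of 1 per ballot and a hypothetical helper name `blt_line`; it formats a single ballot line only, not a complete file:

```python
def blt_line(candidate_order, weight=1):
    """Format one ballot as 'weight c1 c2 ... 0'.

    candidate_order: one-based candidate indices, most preferred first.
    weight=1 is an assumption (each ballot is counted once).
    """
    fields = [weight, *candidate_order, 0]  # trailing 0 marks the end of the ballot
    return " ".join(str(f) for f in fields)


print(blt_line([7, 2, 4, 3, 1, 6, 5]))  # 1 7 2 4 3 1 6 5 0
```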

My question is: what is the most elegant way to convert from the first format to the (numeric) second format?

score 0 · Accepted Answer

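Here is my solution in Python. It works, but it feels a bit clumsy. I'm sure there is a cleaner way (perhaps in another language?).

This took longer than it should have yesterday afternoon, so perhaps someone else can make use of it too.

Given: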

ballot = '5 2 4 3 7 6 1'

A Python one(ish)-liner to convert it:

rank = [str(i) for r, i in sorted((int(r), i + 1) for i, r in enumerate(ballot.split()))]
rank = " ".join(rank)

Or, in a slightly more readable form:

# Split into a list and convert to integers
int_ballot = [int(x) for x in ballot.split()]

# This is the important bit.
# enumerate(int_ballot) yields pairs of (zero-based-candidate-index, rank)
# Use a list comprehension to swap to (rank, one-based-candidate-index)
ranked_ballot = [(rank,index+1) for index,rank in enumerate(int_ballot)]

# Sort by the ranking. Python sorts tuples in lexicographic order
# (i.e. it compares the first element first)
# Use a comprehension to extract the candidate from each pair;
# str() is needed because join() only accepts strings
rank = " ".join([str(candidate) for rank, candidate in sorted(ranked_ballot)])
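
Since each ballot is a permutation of 1..N, the conversion is simply the inverse permutation, so it can also be written without sorting. A sketch of that alternative (not the method above), using a hypothetical function name `ranks_to_order` and the two sample ballots from the question:

```python
def ranks_to_order(ballot):
    """Convert 'rank per candidate' to 'candidate per rank' (one-based).

    Assumes the ballot is a complete ranking with no duplicate ranks.
    """
    ranks = [int(tok) for tok in ballot.split()]
    order = [0] * len(ranks)
    for candidate, rank in enumerate(ranks, start=1):
        order[rank - 1] = candidate  # the candidate ranked r goes into slot r-1
    return order


ballots = ["5 2 4 3 7 6 1", "4 2 6 5 1 7 3"]
for b in ballots:
    print(" ".join(str(c) for c in ranks_to_order(b)))
# 7 2 4 3 1 6 5
# 5 2 7 1 4 3 6
```

Applying the conversion to its own output gives back the original ballot, which makes for a quick sanity check.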
