Python beginner here.

I'm currently working with diffs. I'm generating them with the Google Python library.

Here is an example of a diff result:

[(0, 'Ok.  I just '),
 (-1, 'need to write '),
 (0, 'out a random bunch of text\nand then keep going.  I'),
 (-1, ' just'),
 (0,
  " did an enter to see how that goes and all\nthe rest of it.  I know I need.  Here's a skipped line.\n\nThen there is more and "),
 (-1, 't'),
 (0, 'hen there was the thing.')]

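This is a list of tuples. The first element of each tuple is the operator (0 = unchanged, -1 = deleted, 1 = added). The second element is the text that was added to or removed from the block.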
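To make the tuple format concrete, here is a minimal sketch that tallies how many characters fall under each operator. The names `OP_NAMES` and `count_chars_by_op` are made up for illustration and are not part of any library; only the `(op, text)` layout comes from the diff output above.

```python
from collections import Counter

# Shortened sample in the same (op, text) format as the diff above.
diffs = [(0, 'Ok.  I just '), (-1, 'need to write '), (0, 'out a random bunch of text')]

# Hypothetical labels for the three operator codes.
OP_NAMES = {0: 'unchanged', -1: 'deleted', 1: 'added'}

def count_chars_by_op(diffs):
    """Return a Counter mapping each operator label to its total character count."""
    totals = Counter()
    for op, text in diffs:
        totals[OP_NAMES[op]] += len(text)
    return totals

totals = count_chars_by_op(diffs)
```

A tally like this gives a rough sense of how much of the text actually changed before deciding what to show in a summary.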

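I want to summarize these diff results so that a reader can get the gist of the changes from a few lines, instead of reading the entire text when perhaps only 30 characters or so have changed.

My first step is to rank the tuples by character length and then display the 3 largest changes (in their original order, with some unchanged text on either side).

How do you think I should sort the tuples by character length, grab the three longest, and then rearrange them so that they are in the same order as in the original?

Ideally the result would look something like this (using the example above):

...just need to write out a... ...I just did an enter...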

1 Answer

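# Sample diff from the question: a list of (op, text) tuples.
# Named `diffs` rather than `input` to avoid shadowing the built-in input().
diffs = [(0, 'Ok.  I just '), (-1, 'need to write '), (0, 'out a random bunch of text\nand then keep going.  I'), (-1, ' just'), (0, " did an enter to see how that goes and all\nthe rest of it.  I know I need.  Here's a skipped line.\n\nThen there is more and "), (-1, 't'), (0, 'hen there was the thing.')]

# Tag each tuple with its index, keep the 3 longest, then sort by index to restore the original order.
# Note: this ranks every tuple by text length, including unchanged (0) ones.
top_3 = [filtered_change[1] for filtered_change in sorted(sorted(enumerate(diffs), key=lambda change: len(change[1][1]), reverse=True)[:3])]

Or, step by step:

# Pair each tuple with its original position: (index, (op, text)).
indexed_changes = enumerate(diffs)
# Sort by the length of the text, longest first.
indexed_and_sorted_by_length = sorted(indexed_changes, key=lambda change: len(change[1][1]), reverse=True)
# Keep the three longest.
largest_3_indexed_changes = indexed_and_sorted_by_length[:3]
# Sorting the (index, tuple) pairs puts them back in original order.
largest_3_indexed_sorted_by_index = sorted(largest_3_indexed_changes)
# Drop the index again, leaving only the (op, text) tuples.
largest_3_changes_in_original_order = [indexed_change[1] for indexed_change in largest_3_indexed_sorted_by_index]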
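
If only real changes should be ranked (operator other than 0), with a little unchanged text shown on either side as the question describes, the same index-then-sort idea can be extended. The following is only a sketch: `summarize`, `top_n`, and `context` are hypothetical names chosen for illustration, and the exact output format is an assumption.

```python
# Sample diff in the (op, text) format from the question.
diffs = [(0, 'Ok.  I just '), (-1, 'need to write '),
         (0, 'out a random bunch of text\nand then keep going.  I'),
         (-1, ' just'),
         (0, " did an enter to see how that goes and all\nthe rest of it."),
         (-1, 't'), (0, 'hen there was the thing.')]

def summarize(diffs, top_n=3, context=10):
    """Return snippets for the top_n longest changes, in original order."""
    # Keep only real changes (op != 0), remembering each one's position.
    changes = [(i, op, text) for i, (op, text) in enumerate(diffs) if op != 0]
    # Longest first, cut to top_n, then sort by index to restore the order.
    largest = sorted(sorted(changes, key=lambda c: len(c[2]), reverse=True)[:top_n])
    snippets = []
    for i, op, text in largest:
        # Borrow up to `context` characters from unchanged neighbours, if any.
        before = diffs[i - 1][1][-context:] if i > 0 and diffs[i - 1][0] == 0 else ''
        after = diffs[i + 1][1][:context] if i + 1 < len(diffs) and diffs[i + 1][0] == 0 else ''
        snippets.append('...' + before + text + after + '...')
    return snippets

summary = summarize(diffs)
```

Joining the returned snippets with spaces gives a one-line summary. Trimming the context at word boundaries rather than at a fixed character count would read better, but is left out to keep the sketch short.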
answered 2012-07-18T15:50:39.113