python - Is this an efficient way to find the longest word(s) in a string?

Question

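I'm new to Python and found some suggestions for finding the longest WORD in a string, but none of them accounted for a string containing several words that tie for the longest length.

After playing around, I settled on this: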

inputsentence = raw_input("Write a sentence: ").split()
longestwords = []
for word in inputsentence:
    if len(word) == len(max(inputsentence, key=len)):
        longestwords.append(word)

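This gives me a list of the longest words that I can then do something with. Is there a better way to do this?

Note: assume inputsentence contains no integers or punctuation, just a series of words.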

score 3 · Accepted Answer

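If you are only doing this with a small amount of text, there is no need to worry about runtime efficiency: programming efficiency, in terms of coding, reviewing and debugging, matters far more. So the solution you have is fine, since it is clear and efficient enough even for thousands of words. (However, you should compute len(max(inputsentence, key=len)) just once, before the for loop.)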
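A minimal sketch of that suggestion, written for Python 3 (where `input()` replaces `raw_input()`). The function name `longest_words` is made up for illustration and does not come from the question:

```python
def longest_words(words):
    """Return every word whose length equals the maximum length."""
    if not words:
        return []
    # Compute the maximum length once, outside the loop.
    maxlen = max(len(word) for word in words)
    return [word for word in words if len(word) == maxlen]


print(longest_words("the quick brown fox jumps".split()))
# ['quick', 'brown', 'jumps']
```

Hoisting the length computation means the word list is scanned twice in total, rather than once per word.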

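But suppose you do want to do this with a large corpus, possibly several gigabytes long? Here is how to do it in a single pass, without storing every word in memory (note that inputcorpus may be an iterator or a function that reads the corpus in stages): keep only the longest words seen so far. If you see a word longer than the current maximum, it is obviously the first word of that length, so you can start a fresh list.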

maxlength = 0
maxwords = [ ]  # unnecessary: will be re-initialized below
for word in inputcorpus:
    if len(word) > maxlength:
        maxlength = len(word)
        maxwords = [ word ]
    elif len(word) == maxlength:
        maxwords.append(word)

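If a word of maximal length is repeated, you will end up with several copies. To avoid that, just use a set() instead of a list (and adjust the initialization and the appending accordingly).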
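One way the set-based variant might look, in Python 3. The helper names `iter_words` and `longest_unique_words` are hypothetical, and the `io.StringIO` object merely stands in for a large file read line by line:

```python
import io


def iter_words(lines):
    """Yield words one at a time from an iterable of lines (e.g. an open file)."""
    for line in lines:
        for word in line.split():
            yield word


def longest_unique_words(corpus):
    """Single pass over corpus, keeping only the distinct longest words."""
    maxlength = 0
    maxwords = set()
    for word in corpus:
        if len(word) > maxlength:
            maxlength = len(word)
            maxwords = {word}      # first word of this new length
        elif len(word) == maxlength:
            maxwords.add(word)     # a repeated word is stored only once
    return maxwords


sample = io.StringIO("apple pear\nmango apple kiwi\n")
print(sorted(longest_unique_words(iter_words(sample))))
# ['apple', 'mango']
```

Because `iter_words` is a generator, only the current line and the current set of longest words are held in memory at any time. Note that a set does not preserve the order in which the words appeared.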

score 1 · Accepted Answer

How about this:

from itertools import groupby as gb

inputsentence = raw_input("Write a sentence: ").split() 

lwords = list(next(gb(sorted(inputsentence, key=len, reverse=True), key=len))[1])
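Unpacked into steps, the one-liner sorts the words by length (longest first) and takes the first run of equal-length words that `groupby` produces. A rough Python 3 sketch, with the made-up function name `longest_words_by_groupby`:

```python
from itertools import groupby


def longest_words_by_groupby(words):
    """Sort by length (longest first) and return the first run of equal lengths."""
    ordered = sorted(words, key=len, reverse=True)
    for _length, group in groupby(ordered, key=len):
        return list(group)   # only the first (longest) group is needed
    return []                # empty input: groupby yields no groups


print(longest_words_by_groupby("the quick brown fox jumps".split()))
# ['quick', 'brown', 'jumps']
```

Unlike the original expression, which raises StopIteration on an empty list, this version returns an empty list. Sorting costs O(n log n) and keeps every word in memory, so this approach suits short inputs rather than a large corpus.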

score 0 · Accepted Answer

Use a defaultdict with the length as the key, adapted as follows:

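from collections import defaultdict

# inputsentence is assumed here to be the raw sentence string, not a list
words = inputsentence.split()
dd = defaultdict(list)
for word in words:
    dd[len(word)].append(word)

key_by_len = sorted(dd)
# lengths are sorted ascending, so the longest words sit under the last key
print(dd[key_by_len[-1]])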

score 0 · Accepted Answer

Hope this helps:

print max(raw_input().split(), key=len)

Answered 2014-06-12T09:49:05.013
