1

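Hi, I'm using Python and I'm trying to write a function that generates every word made up of 2 letters. I also want to count how many of the generated words are actually in a dictionary.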

This is what I have so far:

alphabet = ('a','b','c','d','e','f','g','h','i','j','k','l','m','n','o',
            'p','q','r','s','t','u','v','w','x','y','z')
count1 = 0
text = " "

def find2LetterWords():
    for letter in alphabet:
        text += letter
        for letter in alphabet:
            text +=letter
    print text

This is the code I've written so far, and I know it isn't right. I'm just experimenting. So yes, if you can help me, that would be great. Thanks.

5 Answers

8

The product function in the itertools module is exactly what you need to generate a list of all possible 2-letter words.

from itertools import product

alphabet = ('a','b','c','d','e','f','g','h','i','j','k','l','m','n','o','p','q','r','s','t','u','v','w','x','y','z')

two_letter_words = product(alphabet, alphabet)

for word in two_letter_words:
    # product yields tuples such as ('a', 'b'); join them into strings
    print(''.join(word))

To check which of them are in the dictionary, you'll need to get a word list from somewhere else.
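
As a rough sketch of that last step, assuming the dictionary is available as a plain collection of words: the dictionary_words set below is a made-up sample, not a real word list, and count_real_words is a name chosen here for illustration. On many Unix systems a file such as /usr/share/dict/words could be read into a set instead, if it exists on your machine.

```python
from itertools import product
from string import ascii_lowercase

def count_real_words(dictionary_words):
    """Count the 2-letter combinations that appear in dictionary_words."""
    # product yields tuples such as ('a', 'b'); join them into strings.
    candidates = (''.join(pair) for pair in product(ascii_lowercase, repeat=2))
    return sum(1 for word in candidates if word in dictionary_words)

# Hypothetical sample standing in for a real word list.
dictionary_words = {"an", "at", "be", "cat", "dog", "it", "zebra"}
print(count_real_words(dictionary_words))  # 4: "an", "at", "be", "it"
```

Using a set for the dictionary keeps each membership test cheap, which matters more once the real word list has many thousands of entries.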

answered 2012-06-11T00:22:10.260
5

Another way, using a list comprehension:

words = [x+y for x in alphabet for y in alphabet]

Or, without typing out the alphabet yourself:

from string import ascii_lowercase as a
words = [x+y for x in a for y in a]

Let's compare xvatar's answer, Toote's answer, and mine:

from itertools import product
from string import ascii_lowercase as a
import timeit

def nestedFor():
    w = []
    for l1 in a:
        for l2 in a:
            word = l1+l2
            w.append(word)
    return w

def nestedForIter():
    # generator function: no list is needed, words are yielded one by one
    for l1 in a:
        for l2 in a:
            yield l1+l2

def withProduct():
    return product(a,a)

def listComp():
    return [x+y for x in a for y in a]

def generatorComp():
    return (x+y for x in a for y in a)

# return list
t1 =  timeit.Timer(stmt="nestedFor()",
                   setup = "from __main__ import nestedFor")
t2 = timeit.Timer(stmt="list(withProduct())",
                   setup = "from __main__ import withProduct")
t3 = timeit.Timer(stmt="listComp()",
                   setup = "from __main__ import listComp")

# return iterator
t4 = timeit.Timer(stmt="nestedForIter()",
                   setup = "from __main__ import nestedForIter")
t5 = timeit.Timer(stmt="withProduct()",
                   setup = "from __main__ import withProduct")
t6 = timeit.Timer(stmt="generatorComp()",
                   setup = "from __main__ import generatorComp")

n = 100000

print('Methods returning lists:')
print("Nested for loops:   %.3f" % t1.timeit(n))
print("list(product):      %.3f" % t2.timeit(n))
print("List comprehension: %.3f\n" % t3.timeit(n))

print('Methods returning iterators:')
print("Nested for iterator:     %.3f" % t4.timeit(n))
print("itertools.product:       %.3f" % t5.timeit(n))
print("Generator comprehension: %.3f\n" % t6.timeit(n))

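Results:

Methods returning lists:
Nested for loops:   13.362
list(product):      4.578
List comprehension: 7.231

Methods returning iterators:
Nested for iterator:     0.045
itertools.product:       0.212
Generator comprehension: 0.066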

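In other words, if you really need the complete list, definitely use itertools.product. However, generators are faster to set up, need less memory, and may well be sufficient.

The relative slowness of itertools.product as an iterator is unexpected, given that the documentation says it is equivalent to nested for loops in a generator expression. There seems to be some overhead.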
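
One possible explanation, offered as a sketch rather than a definitive analysis: the three iterator timings above measure only the cost of creating the iterator, not of producing any words. Calling a generator function or evaluating a generator expression does very little work up front, whereas itertools.product copies each input into a tuple when it is constructed. The snippet below times each variant while actually consuming it. The helper names (consume, nested_for_iter, with_product, generator_comp) and the repeat count are illustrative choices, the code assumes Python 3 (where map is lazy), and the numbers will vary by machine.

```python
from collections import deque
from itertools import product
from string import ascii_lowercase as a
import timeit

def consume(iterator):
    """Exhaust an iterator without storing its items."""
    deque(iterator, maxlen=0)

def nested_for_iter():
    for l1 in a:
        for l2 in a:
            yield l1 + l2

def with_product():
    # Join each tuple so that all three variants yield strings.
    return map(''.join, product(a, a))

def generator_comp():
    return (x + y for x in a for y in a)

n = 1000  # far fewer repetitions, since every run now does real work
for func in (nested_for_iter, with_product, generator_comp):
    seconds = timeit.timeit(lambda: consume(func()), number=n)
    print("%-16s %.3f" % (func.__name__ + ":", seconds))
```

Timing the consumed iterators compares the work of generating all 676 words, which is the cost that matters when the words are actually used.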

answered 2012-06-11T00:26:35.163
1
def find2LetterWords():
    words = []
    for first in alphabet:
        for second in alphabet:
            new_word = first + second
            words.append(new_word)
    print(words)
    return words
answered 2012-06-11T00:24:13.307
1

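The first part of the question has already been answered well, but here is the second part.

I also want to count how many of the generated words are actually in a dictionary.

Actually, this is easy. You know that your word list contains every possible combination. And you know that dictionary keys are unique; therefore, any key that is two characters long must be in the word list (this assumes the keys contain only lowercase letters). All you need to do is count the number of keys of length 2.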

# iterating over a dict yields its keys (works in Python 2 and 3)
counts = sum(len(k) == 2 for k in my_dict)
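
A small check of this shortcut, under the assumption that it only holds when every key consists of lowercase ASCII letters. The sample my_dict below is made up for illustration; a key such as "OK" is counted by the length test even though it was never generated from the lowercase alphabet.

```python
from string import ascii_lowercase

# Hypothetical dictionary; the values are irrelevant here.
my_dict = {"at": 1, "be": 2, "cat": 3, "it": 4, "OK": 5}

# Every possible lowercase 2-letter combination (26 * 26 = 676).
two_letter_words = {x + y for x in ascii_lowercase for y in ascii_lowercase}

by_length = sum(len(k) == 2 for k in my_dict)
by_membership = sum(k in two_letter_words for k in my_dict)

print(by_length)      # 4: "at", "be", "it", "OK"
print(by_membership)  # 3: "OK" is not a lowercase combination
```

If the dictionary may contain capitalised or punctuated entries, the membership count is the safer of the two.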
answered 2012-06-11T00:36:44.130
0

Answer edited based on the comments:

def find2LetterWords():
    # this generates all possible 2-letter combos with a list comprehension
    words = [first + second for second in alphabet for first in alphabet]
    # create a new list with only those words that are in your_dictionary (a list)
    real_words = [word for word in words if word in your_dictionary]
    return real_words

If you want a nice one-liner, without a function:

[word for word in [first + second for second in alphabet for first in alphabet] if word in your_dictionary]

Obviously, replace your_dictionary with the name of your dictionary.
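
If your_dictionary is a long list, each `word in your_dictionary` test scans the list from the start. A possible variation, assuming the words are ordinary strings, is to convert the list to a set once so that each lookup takes roughly constant time. The sample your_dictionary and the function name find_real_2_letter_words are placeholders for illustration:

```python
from string import ascii_lowercase as alphabet

def find_real_2_letter_words(your_dictionary):
    # Build the set once; set membership tests do not scan every entry.
    known_words = set(your_dictionary)
    return [first + second
            for first in alphabet
            for second in alphabet
            if first + second in known_words]

# Placeholder word list for illustration only.
your_dictionary = ["to", "be", "or", "not", "be", "it"]
print(find_real_2_letter_words(your_dictionary))  # ['be', 'it', 'or', 'to']
```

len(find_real_2_letter_words(your_dictionary)) then gives the count the question asks for.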

answered 2012-06-11T00:19:44.213