3

I'm writing an evil hangman game in Python, but I'm stuck. I'm trying to figure out how to sort words into families. For example, suppose I have the list

ALLY BETA COOL DEAL ELSE FLEW GOOD HOPE IBEX 

Each word falls into one of a few families, depending on where its E's are:

- - - -, containing ALLY, COOL, GOOD
- E - -, containing BETA and DEAL
- - E -, containing FLEW and IBEX
E - - E, containing ELSE
- - - E, containing HOPE.

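Is there a way to use a dictionary to help determine which words belong to which family? Our class hasn't started talking about dictionaries yet, but I read ahead and believe it is possible. The file I'm using has about 170,000 words; the list above is just a simple example.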

4

4 Answers

3
from itertools import groupby

words = ['ALLY', 'BETA', 'COOL', 'DEAL', 'ELSE', 'FLEW', 'GOOD', 'HOPE', 'IBEX']
e_locs = sorted(([c == 'E' for c in w], i) for i, w in enumerate(words))
result = [[words[i] for x, i in g] for k, g in groupby(e_locs, lambda x: x[0])]

Result:

>>> result
[['ALLY', 'COOL', 'GOOD'], ['HOPE'], ['FLEW', 'IBEX'], ['BETA', 'DEAL'], ['ELSE']]

Here is a version that also keeps track of the positions of the E's:

words = ['ALLY', 'BETA', 'COOL', 'DEAL', 'ELSE', 'FLEW', 'GOOD', 'HOPE', 'IBEX']
result = {}
for word in words:
    key = ' '.join('E' if c == 'E' else '-' for c in word)
    if key not in result:
        result[key] = []
    result[key].append(word)

Result:

>>> import pprint
>>> pprint.pprint(result)
{'- - - -': ['ALLY', 'COOL', 'GOOD'],
 '- - - E': ['HOPE'],
 '- - E -': ['FLEW', 'IBEX'],
 '- E - -': ['BETA', 'DEAL'],
 'E - - E': ['ELSE']}

Choosing the largest family (using the first version, where result is a list of lists):

>>> max(result, key=len)
['ALLY', 'COOL', 'GOOD']

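To choose the largest family with the second version, you can use result.values() instead of result. To get a tuple containing both the E positions and the family, you can use the following: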

>>> max(result.items(), key=lambda k_v: len(k_v[1]))
('- - - -', ['ALLY', 'COOL', 'GOOD'])
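
For an evil hangman round, the same grouping is needed for whichever letter the player guesses, not just E. A minimal sketch of that idea follows; the function names split_into_families and largest_family are made up for illustration and are not part of the code above:

```python
def split_into_families(words, letter):
    """Group words by the positions at which `letter` occurs."""
    families = {}
    for word in words:
        # Pattern such as '- E - -': the letter where it occurs, '-' elsewhere.
        key = ' '.join(letter if c == letter else '-' for c in word)
        families.setdefault(key, []).append(word)
    return families


def largest_family(families):
    """Return the (pattern, words) pair that contains the most words."""
    return max(families.items(), key=lambda item: len(item[1]))


words = ['ALLY', 'BETA', 'COOL', 'DEAL', 'ELSE', 'FLEW', 'GOOD', 'HOPE', 'IBEX']
families = split_into_families(words, 'E')
pattern, remaining = largest_family(families)
print(pattern, remaining)  # - - - - ['ALLY', 'COOL', 'GOOD']
```

If two families are the same size, max returns whichever it encounters first, so a real game would probably want an explicit tie-breaking rule.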
answered 2013-04-10T21:52:00.047
1
In [1]: from itertools import groupby

In [2]: import string

In [3]: words = "ALLY BETA COOL DEAL ELSE FLEW GOOD HOPE IBEX".split()

In [4]: table = str.maketrans(string.ascii_uppercase,
   ...:                       '????E?????????????????????')

In [5]: f = lambda w: w.translate(table)

In [6]: for k,g in groupby(sorted(words, key=f), f):
   ...:     print(k, list(g))
   ...:     
???? ['ALLY', 'COOL', 'GOOD']
???E ['HOPE']
??E? ['FLEW', 'IBEX']
?E?? ['BETA', 'DEAL']
E??E ['ELSE']

# to get the biggest group
In [7]: max((list(g) for _,g in groupby(sorted(words, key=f), f)), key=len)
Out[7]: ['ALLY', 'COOL', 'GOOD']
answered 2013-04-10T22:10:43.673
0

Using regular expressions, you could do the following:

import re

def into_families(words):
    # here you could add as many families as you want
    families = {
        '....': re.compile('[^E]{4}'),
        '...E': re.compile('[^E]{3}E'),
        '..E.': re.compile('[^E]{2}E[^E]'),
        '.E..': re.compile('[^E]E[^E]{2}'),
        'E..E': re.compile('E[^E]{2}E'),
    }
    # fullmatch, so a longer word cannot match on its first four letters alone
    return {k: [w for w in words if r.fullmatch(w)] for k, r in families.items()}

Or, if you want to create the regular expressions dynamically:

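def into_families(words):
    # one name per distinct E pattern, e.g. '.E..' for BETA and DEAL
    family_names = {''.join('E' if x == 'E' else '.' for x in w) for w in words}
    # each '.' in the name becomes "any letter except E"
    families = {x: re.compile(x.replace('.', '[^E]')) for x in family_names}
    # fullmatch, so a word only joins a family whose pattern has the same length
    return {k: [w for w in words if r.fullmatch(w)] for k, r in families.items()}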
answered 2013-04-10T22:06:54.573
0
from collections import defaultdict
import re

words = 'ALLY BETA COOL DEAL ELSE FLEW GOOD HOPE IBEX'.split()

groups = defaultdict(list)

for word in words:
    indices = tuple(m.start() for m in re.finditer('E', word))
    groups[indices].append(word)

for k, v in sorted(groups.items()):
    tpl = ['E' if i in k else '-' for i in range(4)]
    print(' '.join(tpl), ' '.join(v))
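
The loop above assumes four-letter words (range(4)). For a full word list with mixed lengths, one possibility is to include the word length in the key as well. This is only a sketch: family_key is a made-up helper name, and EEL and BEE were added to the sample list purely to show a second length:

```python
from collections import defaultdict


def family_key(word, letter):
    """Return (length, positions of letter), e.g. (4, (0, 3)) for 'ELSE' and 'E'."""
    return len(word), tuple(i for i, c in enumerate(word) if c == letter)


# EEL and BEE are extra words, added here only for illustration.
words = 'ALLY BETA COOL DEAL ELSE FLEW GOOD HOPE IBEX EEL BEE'.split()

groups = defaultdict(list)
for word in words:
    groups[family_key(word, 'E')].append(word)

# Rebuild a readable pattern from each key; its width follows the word length.
for (length, positions), members in sorted(groups.items()):
    pattern = ' '.join('E' if i in positions else '-' for i in range(length))
    print(pattern, ' '.join(members))
```

Because tuples are hashable, they work directly as dictionary keys, and words of different lengths can never land in the same family.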
answered 2013-04-10T22:19:43.213