My file looks like this:

aaien 12 13 39
aan 10
aanbad 12 13 14 57 58 38
aanbaden 12 13 14 57 58 38
aanbeden 12 13 14 57 58 38
aanbid  12 13 14 57 58 39
aanbidden 12 13 14 57 58 39
aanbidt 12 13 14 57 58 39
aanblik 27 28
aanbreken 39
...

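I want to build a dictionary in which the key is the word (for example 'aaien') and the value is the list of numbers next to it. So it should look like this: {'aaien': ['12, 13, 39'], 'aan': ['10']}

This code doesn't seem to work.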

document = open('LIWC_words.txt', 'r')
liwcwords = document.read()
dictliwc = {}
for line in liwcwords:
    k, v = line.strip().split(' ')
    answer[k.strip()] = v.strip()

liwcwords.close()

Python gives this error:

ValueError: need more than 1 value to unpack

2 Answers

You are splitting each line into a list of words, but unpacking it into only a key and a value.

This will work:

with open('LIWC_words.txt', 'r') as document:
    answer = {}
    for line in document:
        line = line.split()
        if not line:  # empty line?
            continue
        answer[line[0]] = line[1:]

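Note that you don't need to pass an argument to .split(); without arguments, it splits on whitespace and strips the result for you. This saves you from having to call .strip() explicitly.

An alternative is to split only on the first run of whitespace: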
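To illustrate the difference, here is a small sketch that uses the one line from the sample data containing a double space. The variable names are made up for the illustration.

```python
# A line from the sample file; note the two spaces after the word.
line = "aanbid  12 13 14 57 58 39\n"

# Splitting on a single space keeps an empty string and the trailing newline.
by_space = line.split(' ')

# Splitting with no argument collapses runs of whitespace and drops the newline.
by_whitespace = line.split()

print(by_space)
print(by_whitespace)
```

With an explicit ' ' separator, the empty string and the '\n' would end up in the value list, which is why the argument-free form is the safer choice here.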

with open('LIWC_words.txt', 'r') as document:
    answer = {}
    for line in document:
        if line.strip():  # non-empty line?
            key, value = line.split(None, 1)  # None means 'all whitespace', the default
            answer[key] = value.split()

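The second argument to .split() limits the number of splits, guaranteeing that at most 2 elements are returned, which makes it possible to unpack the result of the assignment into key and value.

Either approach results in: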
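As a rough sketch of how the limited split behaves, the snippet below applies it to one sample line and to a line holding only a word. The word-only line is an assumed edge case that does not appear in the sample data, and the names parts, word and rest are illustrative.

```python
line = "aanbad 12 13 14 57 58 38\n"

# Split at most once, on any run of whitespace.
key, value = line.split(None, 1)
numbers = value.split()

# A line that holds only a word yields a single element, so unpacking it
# into two names would raise ValueError; guard for that case.
parts = "aan\n".split(None, 1)
word = parts[0]
rest = parts[1].split() if len(parts) > 1 else []

print(key, numbers)
print(word, rest)
```

"At most 2" is the important part: if the file may contain words without any numbers, checking the length before unpacking avoids the same ValueError as in the question.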

{'aaien': ['12', '13', '39'],
 'aan': ['10'],
 'aanbad': ['12', '13', '14', '57', '58', '38'],
 'aanbaden': ['12', '13', '14', '57', '58', '38'],
 'aanbeden': ['12', '13', '14', '57', '58', '38'],
 'aanbid': ['12', '13', '14', '57', '58', '39'],
 'aanbidden': ['12', '13', '14', '57', '58', '39'],
 'aanbidt': ['12', '13', '14', '57', '58', '39'],
 'aanblik': ['27', '28'],
 'aanbreken': ['39']}

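If you still see only one key, with the rest of the file as its (split) value, your input file probably uses non-standard line separators. On Python 2, open the file with universal newline support by adding the U character to the mode (Python 3 text mode does this by default, and the U flag was removed in Python 3.11):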

with open('LIWC_words.txt', 'rU') as document:
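
A minimal sketch of the line-ending problem, assuming Python 3. The temporary file and its '\r' line endings are invented for the demonstration; the actual line endings of LIWC_words.txt are not known.

```python
import os
import tempfile

# Write a small sample that uses bare '\r' line endings (an assumed scenario).
with tempfile.NamedTemporaryFile('wb', suffix='.txt', delete=False) as tmp:
    tmp.write(b"aaien 12 13 39\raan 10\r")
    path = tmp.name

answer = {}
# In Python 3, text mode translates '\r', '\r\n' and '\n' to '\n' by default,
# so iterating over the file still yields one line per entry.
with open(path, 'r') as document:
    for line in document:
        parts = line.split()
        if parts:
            answer[parts[0]] = parts[1:]

os.remove(path)
print(answer)
```

If the translation did not happen, the whole file would arrive as a single line and the dictionary would contain just one key.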
answered 2013-01-24T16:19:40.827
>liwcwords = document.read()  
>dictliwc = {}    
>for line in liwcwords:

You are iterating over a string here, which is not what you want. Try document.readlines() instead. Here is another solution.

from pprint import pprint
with open('LIWC_words.txt') as fd:
    d = {}
    for i in fd:
        entry = i.split()
        if entry:  # skip blank lines
            d[entry[0]] = entry[1:]

pprint(d)

This is what the output looks like:

{'aaien': ['12', '13', '39'],
 'aan': ['10'],
 'aanbad': ['12', '13', '14', '57', '58', '38'],
 'aanbaden': ['12', '13', '14', '57', '58', '38'],
 'aanbeden': ['12', '13', '14', '57', '58', '38'],
 'aanbid': ['12', '13', '14', '57', '58', '39'],
 'aanbidden': ['12', '13', '14', '57', '58', '39'],
 'aanbidt': ['12', '13', '14', '57', '58', '39'],
 'aanblik': ['27', '28'],
 'aanbreken': ['39']}
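
To make the point about iterating over a string concrete, here is a short sketch. The string literal stands in for what document.read() would return, and the names first_items, pieces and lines are illustrative.

```python
# Stand-in for the result of document.read(): the whole file as one string.
liwcwords = "aaien 12 13 39\naan 10\n"

# Iterating over a string yields single characters, not lines.
first_items = [item for item in liwcwords][:3]

# Each character splits into a one-element list, which cannot be unpacked
# into two names; this is what triggers the ValueError in the question.
pieces = first_items[0].strip().split(' ')

# splitlines() (or iterating over the file object itself) yields whole lines.
lines = liwcwords.splitlines()

print(first_items)
print(pieces)
print(lines)
```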
answered 2013-01-24T17:32:43.307