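I have run into a strange problem with shelve in Python 3.x. The idea is that the morphological forms of a lemma are used to look up a suitable paradigm for those forms, with the data stored in a shelve.

guessexpl is the shelve and ntrf is the key. The problem is that I have the key '\tk1gMnSc2[^:]*::a k1gMnPc1[^:]*:k:ci k1gMnPc6[^:]*:k:cích' (it defines the morphological forms), and ntrf in guessexpl returns True, but print(guessexpl[ntrf]) raises an exception.

Code: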

print('ntrf in expl',ntrf in guessexpl, 'ntrf in guesser',ntrf in guesser)
if ntrf in guessexpl:
    print('print guessexpl[ntrf]')
    print(guessexpl[ntrf])
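
For context, a minimal sketch of how the membership test and the lookup can disagree. As far as I can tell from the standard library source, `key in shelf` only checks that the encoded key exists in the underlying database, while `shelf[key]` also has to unpickle the stored bytes. The plain dict used as a backend, the key `'example key'` and the stored value are assumptions for illustration only; they are not my real data, and damaging the record by hand only imitates a broken entry.

```python
import shelve

# A plain dict stands in for the dbm file (assumption, for illustration only).
backing = {}
shelf = shelve.Shelf(backing)

key = 'example key'  # hypothetical key, not the real ntrf
shelf[key] = [{'soustružník': 2260}]

# Imitate a damaged record by dropping the last byte of the pickled value.
raw_key = key.encode('utf-8')
backing[raw_key] = backing[raw_key][:-1]

# The membership test only looks up the encoded key.
key_present = key in shelf

# The lookup has to unpickle the stored bytes, which fails here.
try:
    shelf[key]
    lookup_error = None
except Exception as exc:
    lookup_error = exc

print('key in shelf:', key_present)
print('lookup failed with:', type(lookup_error).__name__)
```

If this reading is right, a True membership test says nothing about whether the stored value is still readable.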

After patching shelve to use the pure-Python unpickler instead of the C version:

import shelve
import pickle
# Use the Python implementation, not the C extension, to debug.
shelve.Unpickler = pickle._Unpickler

The error I get is:

ntrf in expl True ntrf in guesser True
print guessexpl[ntrf]
Traceback (most recent call last):
  File "/usr/lib/python3.2/shelve.py", line 111, in __getitem__
    value = self.cache[key]
KeyError: '\tk1gMnSc2[^:]*::a k1gMnPc1[^:]*:k:ci k1gMnPc6[^:]*:k:cích'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "./scripts/lntrf2lpn_editing.py", line 363, in <module>
    sys.exit(main(sys.argv))
  File "./scripts/lntrf2lpn_editing.py", line 311, in main
    print(guessexpl[ntrf])
  File "/usr/lib/python3.2/shelve.py", line 114, in __getitem__
    value = Unpickler(f).load()
  File "/usr/lib/python3.2/pickle.py", line 834, in load
    dispatch[key[0]](self)
  File "/usr/lib/python3.2/pickle.py", line 1158, in load_long_binget
    self.append(self.memo[i])
KeyError: -1493160213

Reproducible sample:

>>> a['\tk1gMnSc2[^:]*::a k1gMnPc1[^:]*:k:ci k1gMnPc6[^:]*:k:cích'][0]
{'Soustružník': [((184, (('Azték', 'nM'),)), {'Soustružník'})], 'soustružník': [((2260, (('vlk', ''),)), {'soustružník', 'kovosoustružník'})]}

>>> a['\tk1gMnSc2[^:]*::a k1gMnPc1[^:]*:k:ci k1gMnPc6[^:]*:k:cích'][1]
{'oustružník': [((2260, (('vlk', ''),)), {'soustružník', 'kovosoustružník'}), ((184, (('Azték', 'nM'),)), {'Soustružník'})], 'kolomazník': [((2260, (('vlk', ''),)), {'kolomazník'})], 'Kolomazník': [((184, (('Azték', 'nM'),)), {'Kolomazník'})]}

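Explanation: the value stored under ntrf is a list of dicts. The position in the list relates to the length of the lemma. Each dict is keyed by the ending of a lemma and maps it to a list of entries of the form ((frequency of the paradigm group, (paradigm group)), set of related lemmas). A paradigm group consists of pairs of a paradigm and a note group.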
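
To make the nesting concrete, here is a small sketch that rebuilds one entry in the shape described above, using values copied from the sample output, and round-trips it through pickle. The variable names are made up for illustration. That a freshly built value of this shape pickles and unpickles cleanly only shows the structure itself can be stored; it does not explain the failure.

```python
import pickle

# One paradigm group: a tuple of (paradigm, note group) pairs.
paradigm_group = (('vlk', ''),)

# One entry: ((frequency, paradigm group), set of related lemmas).
entry = ((2260, paradigm_group), {'soustružník', 'kovosoustružník'})

# The stored value: a list of dicts keyed by lemma ending.
value = [
    {'soustružník': [entry]},
]

# Round-trip through pickle, which is what shelve does with each value.
restored = pickle.loads(pickle.dumps(value))

# Unpack the first entry for the ending 'soustružník'.
(frequency, group), lemmas = restored[0]['soustružník'][0]
print(frequency, group, sorted(lemmas))
```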

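Honestly, I have no idea what is going on. The input is UTF-8. If I delete the shelve file and let it be recreated, everything works, but that is not a real solution.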
