My Python code for POS tagging:

>>> import nltk, csv, itertools
>>> sentence = "Unigram taggers are based on a simple statistical algorithm: for each token, assign the tag that is most likely for that particular token."
>>> tokens = nltk.word_tokenize(sentence)
>>> tags = nltk.pos_tag(tokens)
The error shown is:
>>> tags = nltk.pos_tag(tokens)
Traceback (most recent call last):
  File "<pyshell#7>", line 1, in <module>
    tags = nltk.pos_tag(tokens)
  File "/usr/local/lib/python2.7/dist-packages/nltk/tag/__init__.py", line 99, in pos_tag
    tagger = load(_POS_TAGGER)
  File "/usr/local/lib/python2.7/dist-packages/nltk/data.py", line 605, in load
    resource_val = pickle.load(_open(resource_url))
  File "/usr/local/lib/python2.7/dist-packages/nltk/data.py", line 686, in _open
    return find(path).open()
  File "/usr/local/lib/python2.7/dist-packages/nltk/data.py", line 455, in find
    try: return find(modified_name)
  File "/usr/local/lib/python2.7/dist-packages/nltk/data.py", line 445, in find
    try: return ZipFilePathPointer(p, zipentry)
  File "/usr/local/lib/python2.7/dist-packages/nltk/data.py", line 311, in __init__
    zipfile = OpenOnDemandZipFile(os.path.abspath(zipfile))
  File "/usr/local/lib/python2.7/dist-packages/nltk/data.py", line 738, in __init__
    zipfile.ZipFile.__init__(self, filename)
  File "/usr/lib/python2.7/zipfile.py", line 714, in __init__
    self._GetContents()
  File "/usr/lib/python2.7/zipfile.py", line 748, in _GetContents
    self._RealGetContents()
  File "/usr/lib/python2.7/zipfile.py", line 763, in _RealGetContents
    raise BadZipfile, "File is not a zip file"
BadZipfile: File is not a zip file

Do I need to include some other Python module?

What is the solution?

1 Answer

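The BadZipfile error usually means a resource archive in your nltk_data directory is incomplete or corrupt. Before calling pos_tag, download the resources again: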

import nltk

nltk.download("maxent_treebank_pos_tagger")  # tagger model loaded by pos_tag
nltk.download("maxent_ne_chunker")
nltk.download("punkt")

The first is the tagger model that pos_tag loads, the second is for ne_chunk, and the last is for sent_tokenize.
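If downloading again does not help, it may be worth checking which archives on disk are actually broken. The following is a minimal sketch using only the standard library; the helper name `find_bad_zips` is hypothetical (not part of NLTK), and the data directory you pass in is an assumption, since NLTK searches several locations (listed in `nltk.data.path`).

```python
import os
import zipfile


def find_bad_zips(data_dir):
    """Return sorted paths of .zip files under data_dir that are not valid zip archives.

    Hypothetical helper for illustration; it is not an NLTK function.
    """
    bad = []
    for root, _dirs, files in os.walk(data_dir):
        for name in files:
            if name.endswith(".zip"):
                path = os.path.join(root, name)
                # is_zipfile returns False for truncated or non-zip files,
                # which is the condition that makes ZipFile raise BadZipfile.
                if not zipfile.is_zipfile(path):
                    bad.append(path)
    return sorted(bad)
```

For example, `find_bad_zips(os.path.expanduser("~/nltk_data"))` would list suspect archives if your data lives in that directory (an assumption). Deleting the listed files and then running `nltk.download(...)` for the matching resources should give you clean copies.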

Answered 2020-11-27T15:31:26.523