
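Does anyone know how to resolve this file-reading error in TreeTagger? It is a common natural language processing tool used for POS tagging, lemmatization, and chunking sentences.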

alvas@ikoma:~/treetagger$ echo 'Hello world!' | cmd/tree-tagger-english 
        reading parameters ...

ERROR: Can't open for reading: /home/alvas/treetagger/lib/english.par
aborted.

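I have not run into any of the possible installation problems listed in http://www.ims.uni-stuttgart.de/projekte/corplex/TreeTagger/installation-hints.txt. I followed the instructions on the web page (http://www.ims.uni-stuttgart.de/projekte/corplex/TreeTagger/#Linux) and the installation reported success: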

alvas@ikoma:~$ mkdir treetagger
alvas@ikoma:~$ cd treetagger
alvas@ikoma:~/treetagger$ wget ftp://ftp.ims.uni-stuttgart.de/pub/corpora/tree-tagger-linux-3.2.tar.gz
alvas@ikoma:~/treetagger$ wget ftp://ftp.ims.uni-stuttgart.de/pub/corpora/tagger-scripts.tar.gz
alvas@ikoma:~/treetagger$ wget ftp://ftp.ims.uni-stuttgart.de/pub/corpora/install-tagger.sh
alvas@ikoma:~/treetagger$ wget ftp://ftp.ims.uni-stuttgart.de/pub/corpora/dutch-par-linux-3.2-utf8.bin.gz
alvas@ikoma:~/treetagger$ wget ftp://ftp.ims.uni-stuttgart.de/pub/corpora/german-par-linux-3.2-utf8.bin.gz
alvas@ikoma:~/treetagger$ wget ftp://ftp.ims.uni-stuttgart.de/pub/corpora/italian-par-linux-3.2-utf8.bin.gz
alvas@ikoma:~/treetagger$ wget ftp://ftp.ims.uni-stuttgart.de/pub/corpora/spanish-par-linux-3.2-utf8.bin.gz
alvas@ikoma:~/treetagger$ wget ftp://ftp.ims.uni-stuttgart.de/pub/corpora/french-par-linux-3.2-utf8.bin.gz

alvas@ikoma:~/treetagger$ sh install-tagger.sh 

Linux version of TreeTagger installed.
Tagging scripts installed.
German parameter file (Linux, UTF8) installed.
German chunker parameter file (Linux) installed.
French parameter file (Linux, UTF8) installed.
French chunker parameter file (Linux, UTF8) installed.
Italian parameter file (Linux, UTF8) installed.
Spanish parameter file (Linux, UTF8) installed.
Dutch parameter file (Linux, UTF8) installed.
Path variables modified in tagging scripts.

You might want to add /home/alvas/treetagger/cmd and /home/alvas/treetagger/bin to the PATH variable so that you do not need to specify the full path to run the tagging scripts.

But when I try to test the software, I get these errors:

alvas@ikoma:~/treetagger$ echo 'Hello world!' | cmd/tree-tagger-english 
    reading parameters ...

ERROR: Can't open for reading: /home/alvas/treetagger/lib/english.par
aborted.
alvas@ikoma:~/treetagger$ echo 'Das ist ein Test.' | cmd/tagger-chunker-german

ERROR: Can't open for reading: /home/alvas/treetagger/lib/german-chunker.par
aborted.

ERROR: Can't open for reading: /home/alvas/treetagger/lib/german.par
aborted.
    reading parameters ...

ERROR: Can't open for reading: /home/alvas/treetagger/lib/german.par
aborted.

3 Answers


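I think there are two problems. First, the script names should contain "-utf8", e.g. cmd/tagger-chunker-german-utf8, because you downloaded the UTF-8 data. Second, tagging and chunking each need their own data file. See the home page, which has the sections "Parameter files for PC" and "Chunker parameter files for PC": download the files you need from both sections, then re-run install-tagger.sh. Note also that your download list and the installer output include no English parameter file, which is why english.par cannot be opened.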
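To see which parameter files are actually present before re-running the installer, a quick check along these lines may help. This is only a sketch: `missing_par_files` is a hypothetical helper, not part of TreeTagger, and the file names in the usage comment are simply the ones that appear in the error messages in the question.

```shell
#!/bin/sh
# Hypothetical helper (not part of TreeTagger): print the name of every
# listed parameter file that is missing or unreadable in a lib directory.
# Usage: missing_par_files LIB_DIR FILE...
missing_par_files() {
    lib_dir=$1
    shift
    for name in "$@"; do
        # -r is true only if the file exists and is readable by this user
        if [ ! -r "$lib_dir/$name" ]; then
            printf '%s\n' "$name"
        fi
    done
}

# Example call (the path is an assumption; adjust it to your install directory):
# missing_par_files "$HOME/treetagger/lib" english.par german.par german-chunker.par
```

Any name it prints is a file the corresponding tagging script will fail to open, so that is the file to download (or the script variant to switch away from).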

Answered 2013-03-19T19:06:40.203

You wrote cmd/tree-tagger-english, but I think the correct path (where the parameter files are) is:

lib/tree-tagger-english

Answered 2015-11-03T10:38:30.380

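I had the same problem. I realized that the .par files I had downloaded for the languages I needed had not been extracted (they were still inside the .gz archives).

Make sure you decompress them into the directory first, then try again.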
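If you want to do that decompression by hand, something like the following might work. It is a minimal sketch: `gunzip_all` is a hypothetical helper name, and the directory you pass is an assumption about where your compressed files ended up. It does not rename anything, so if the tagging scripts expect a different file name than the one left after decompression, you would still need to rename the file or re-run install-tagger.sh.

```shell
#!/bin/sh
# Hypothetical helper: decompress every *.gz file found directly in a directory.
# gunzip replaces each NAME.gz with NAME in the same directory.
# Usage: gunzip_all DIR
gunzip_all() {
    dir=$1
    for f in "$dir"/*.gz; do
        # If the glob matched nothing, the pattern stays unexpanded; skip it.
        [ -e "$f" ] || continue
        gunzip "$f"
    done
}

# Example call (the path is an assumption; adjust it to your install directory):
# gunzip_all "$HOME/treetagger/lib"
```

gunzip refuses to overwrite an existing decompressed file unless you pass `-f`, so running this a second time on an already-extracted directory does nothing.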

Answered 2019-03-22T10:21:19.133