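I would like to use the Link Grammar Python3 bindings for a simple grammar checker. While the linkage API is relatively well documented, there seems to be no way to access all the tokens that prevent a linkage.

This is what I have so far: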

#!/usr/bin/env python
# -*- coding: utf-8 -*-
from linkgrammar import Sentence, ParseOptions, Dictionary, __version__
print('Link Grammar Version:', __version__)

for sentence in ['This is a valid sample sentence.', 'I Can Has Cheezburger?']:
    sent = Sentence(sentence, Dictionary(), ParseOptions())
    linkages = sent.parse()
    if len(linkages) > 0:
        print('Valid:', sentence)
    else:
        print('Invalid:', sentence)

(I am using link-grammar-5.4.3 for testing.)

When I analyze the invalid example sentence with the Link Parser command-line tool, I get the following output:

linkparser> I Can Has Cheezburger?
No complete linkages found.
Found 1 linkage (1 had no P.P. violations) at null count 1
    Unique linkage, cost vector = (UNUSED=1 DIS= 0.10 LEN=7)

    +------------------Xp------------------+
    +------------->Wa--------------+       |
    |            +---G--+-----G----+       |
    |            |      |          |       |
LEFT-WALL [I] Can[!] Has[!] Cheezburger[!] ?

How can I get all the potentially invalid tokens, i.e. those marked with [!] or [?], using Python3?

1 Answer

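Have a look at how it is done in bindings/python-examples/sentence-check.py. It is best to look at the latest version in the repo, because the copy of this demo program shipped with 5.4.3 has a bug.

Specifically, the following extracts the list of words: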

words = list(linkage.words())

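Unlinked (null-linked) words are wrapped in []. Words with [] appended to them are guessed words. For example, [!] means that the word has been classified by a regex (found in the file 4.0.regex) and that this classification has then been looked up in the dictionary. If you set the parse option display_morphology to True, the name of the classifying regex appears after the !.

Here is the full legend of the word output format: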

 [word]            Null-linked word
 word[!]           word classified by a regex
 word[!REGEX_NAME] word classified by REGEX_NAME (turn on by morphology=1)
 word[~]           word generated by a spell guess (unknown original word)
 word[&]           word run-on separated by a spell guess
 word[?]           word is unknown (looked up in the dict as UNKNOWN-WORD)
 word.POS          word found in the dictionary as word.POS
 word.#CORRECTION  word is probably a typo - got linked as CORRECTION

For dictionaries that support morphology (turn on by morphology=1):
 word=             A prefix morpheme
 =word             A suffix morpheme
 word.=            A stem
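
To make the legend concrete, here is a minimal sketch that sorts the strings from linkage.words() into the categories above. It is not taken from sentence-check.py: the helper names classify_token and suspicious_tokens are made up for illustration, and the sample word list is copied by hand from the linkparser output in the question rather than produced by running the bindings. It only looks at the null-link brackets and the appended [!], [~], [&] and [?] markers, and leaves .POS subscripts and morphology marks untouched.

```python
import re

# A guess marker appended to a word: word[!], word[!REGEX_NAME], word[~], word[&], word[?]
_GUESS = re.compile(r'^(?P<word>.+)\[(?P<mark>[!~&?])(?P<name>[^\]]*)\]$')

# Short labels for each marker, following the legend above (labels are my own).
_MARK_MEANING = {
    '!': 'regex',        # classified by a regex
    '~': 'spell-guess',  # generated by a spell guess
    '&': 'run-on',       # run-on separated by a spell guess
    '?': 'unknown',      # looked up as UNKNOWN-WORD
}


def classify_token(token):
    """Return (bare_word, kind) for one string from linkage.words() (hypothetical helper)."""
    # [word] means the word is null-linked, i.e. it did not take part in the linkage.
    if len(token) > 2 and token.startswith('[') and token.endswith(']'):
        return token[1:-1], 'null-linked'
    match = _GUESS.match(token)
    if match:
        return match.group('word'), _MARK_MEANING[match.group('mark')]
    # Anything else (plain words, word.POS, punctuation, LEFT-WALL) is left alone.
    return token, 'ok'


def suspicious_tokens(words):
    """Keep only the tokens that are null-linked or guessed (hypothetical helper)."""
    classified = (classify_token(w) for w in words)
    return [(word, kind) for word, kind in classified if kind != 'ok']


# Hand-copied from the question's linkparser output; with the bindings this
# would be list(linkage.words()).
words = ['LEFT-WALL', '[I]', 'Can[!]', 'Has[!]', 'Cheezburger[!]', '?']
for word, kind in suspicious_tokens(words):
    print(word, kind)
```

For the sample list this reports I as null-linked and Can, Has and Cheezburger as regex guesses. Whether a regex guess counts as an error is up to your checker; capitalized proper nouns, for instance, are classified this way as well.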

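It may be useful to match the output words to the words of the original sentence, especially when spell correction or morphology is turned on. The demo program sentence-check.py does this when you invoke it with -p; see the if arg.position: branch.

In your demo sentence I Can Has Cheezburger?, only the word I is unlinked. The other words have been classified as capitalized words and linked as proper nouns (the G link type).

You can find more information about link types in summarise-links.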

Answered 2018-04-07T15:03:15.943