nlg - How to combine two sentences using simplenlg

Question

Given a set of sentences such as "John has a cat" and "John has a dog", I want to create a single sentence like "John has a cat and dog".

Can I do this with simplenlg?

score 0 · Accepted Answer

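The task you are asking about is called aggregation in natural language generation (NLG). SimpleNLG does support aggregation through its realisation engine, but it will not directly aggregate two strings such as the ones in your example.

However, the task can be done by combining a syntactic parser with SimpleNLG. I will first explain how to generate the target sentence using SimpleNLG's grammar: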

import simplenlg.framework.*;
import simplenlg.lexicon.*;
import simplenlg.realiser.english.*;
import simplenlg.phrasespec.*;
import simplenlg.features.*;

public class TestMain {

  public static void main(String[] args) throws Exception {
    Lexicon lexicon = Lexicon.getDefaultLexicon();
    NLGFactory nlgFactory = new NLGFactory(lexicon);
    Realiser realiser = new Realiser(lexicon);

    // Create the SPhraseSpec object (sentence phrase).
    SPhraseSpec p = nlgFactory.createClause();

    // Create a noun phrase and set it as the subject of your sentence
    NPPhraseSpec john = nlgFactory.createNounPhrase("John");
    p.setSubject(john);

    // Create a verb phrase and set it as the verb of your sentence
    VPPhraseSpec have = nlgFactory.createVerbPhrase("have");
    // Note that the verb is "have" not "has".  Have is the base lemma.
    // The morphology of this will be handled based on the tense you set (see below)
    p.setVerb(have);

    // Create two more noun phrases, each with the determiner "a".
    // setDeterminer accepts the determiner as a plain string.

    // One for cat
    NPPhraseSpec cat = nlgFactory.createNounPhrase("cat");
    // set the determiner
    cat.setDeterminer("a");

    // And one for dog.
    NPPhraseSpec dog = nlgFactory.createNounPhrase("dog");
    // set the determiner
    dog.setDeterminer("a");

    // Create a coordinated phrase
    // This tells SimpleNLG that these objects are a collection which should be aggregated
    CoordinatedPhraseElement coord = nlgFactory.createCoordinatedPhrase(cat, dog);

    // Set the coordinated phrase as the object of your sentence
    p.setObject(coord);

    // Realise the sentence and print it
    String output = realiser.realiseSentence(p);
    System.out.println(output);
    // => John has a cat and a dog.

    // Now let's see what SimpleNLG can do!

    // Change the tense to past (present was the default)
    p.setFeature(Feature.TENSE, Tense.PAST);
    output = realiser.realiseSentence(p);
    System.out.println(output);
    // => John had a cat and a dog.

    // Change the tense to future
    p.setFeature(Feature.TENSE, Tense.FUTURE);
    output = realiser.realiseSentence(p);
    System.out.println(output);
    // => John will have a cat and a dog.
  }
}

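That is how you work with language in the SimpleNLG realiser. However, it does not answer your question about aggregating two strings directly. There may be other ways, but my first thought is to use syntactic parsing, for example with StanfordNLP or spaCy.

I use spaCy (a Python library) in my own work. Here is a short example to show what I mean.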

import spacy

nlp = spacy.load('en_core_web_sm')
doc = nlp(u'John has a cat')

for token in doc:
    print(token.text, token.lemma_, token.pos_, token.tag_, token.dep_,
          token.shape_, token.is_alpha, token.is_stop)

This outputs:

John john PROPN NNP nsubj Xxxx True False
has have VERB VBZ ROOT xxx True True
a a DET DT det x True True
cat cat NOUN NN dobj xxx True False

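You can see from the output that each token in the sentence is tagged with a part of speech (noun, verb, determiner, and so on) and a dependency role (nsubj, ROOT, det, dobj). You can use this information to format the input for SimpleNLG and then aggregate your sentences. I would suggest using the XMLRealiser available in SimpleNLG rather than writing the grammar purely in Java. It takes XML as input.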
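As a rough sketch of that intermediate step, the snippet below turns parsed tokens into a small aggregation plan that could then be mapped onto a SimpleNLG clause with a coordinated object. The function names `extract_clause` and `aggregate` are hypothetical, the token rows are hardcoded from the spaCy output above so the snippet runs without spaCy, and the rows for the second sentence are an assumption that mirrors the first. It only handles simple subject-verb-object sentences.

```python
# Token rows in the form (text, lemma, dependency label), as printed by spaCy above.
# SENT_2 is an assumed parse of "John has a dog", mirroring the first sentence.
SENT_1 = [("John", "john", "nsubj"), ("has", "have", "ROOT"),
          ("a", "a", "det"), ("cat", "cat", "dobj")]
SENT_2 = [("John", "john", "nsubj"), ("has", "have", "ROOT"),
          ("a", "a", "det"), ("dog", "dog", "dobj")]


def extract_clause(tokens):
    """Pull subject, verb lemma, determiner and object out of one parsed sentence."""
    clause = {"subject": None, "verb": None, "determiner": None, "object": None}
    for text, lemma, dep in tokens:
        if dep == "nsubj":
            clause["subject"] = text      # surface form, so "John" keeps its capital
        elif dep == "ROOT":
            clause["verb"] = lemma        # base form, e.g. "have", as SimpleNLG expects
        elif dep == "det":
            clause["determiner"] = lemma  # assumes the only determiner belongs to the object
        elif dep == "dobj":
            clause["object"] = lemma
    return clause


def aggregate(clauses):
    """Group clauses sharing subject and verb, collecting their objects in input order."""
    groups = {}
    for c in clauses:
        key = (c["subject"], c["verb"])
        groups.setdefault(key, []).append((c["determiner"], c["object"]))
    return [{"subject": s, "verb": v, "objects": objs}
            for (s, v), objs in groups.items()]


if __name__ == "__main__":
    plan = aggregate([extract_clause(SENT_1), extract_clause(SENT_2)])
    print(plan)
    # [{'subject': 'John', 'verb': 'have', 'objects': [('a', 'cat'), ('a', 'dog')]}]
```

Each entry in the plan corresponds to one clause: the subject and verb go to `setSubject` and `setVerb`, and the list of objects becomes the noun phrases passed to the coordinated phrase.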

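NLP/NLG work is not trivial. Language is very complex. The above is just one way to approach this kind of task. There may be tools that aggregate based on strings alone, but SimpleNLG is only a surface realiser, so you have to present the input data to it in a suitable format, as shown above.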
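To illustrate what purely string-based aggregation might look like, and why it is fragile, here is a minimal sketch that merges two sentences on their shared leading words. The function `merge_on_shared_prefix` is hypothetical and not part of SimpleNLG or spaCy. It knows nothing about grammar.

```python
def merge_on_shared_prefix(first, second, conjunction="and"):
    """Join two sentences that begin with the same words.

    Returns None when the sentences share no leading words, or when one
    sentence is entirely a prefix of the other.
    """
    a = first.rstrip(".").split()
    b = second.rstrip(".").split()

    # Count how many leading words the two sentences have in common.
    shared = 0
    while shared < min(len(a), len(b)) and a[shared] == b[shared]:
        shared += 1

    if shared == 0 or shared == len(a) or shared == len(b):
        return None

    # Keep the first sentence whole and append only the differing tail of the second.
    return " ".join(a + [conjunction] + b[shared:])


if __name__ == "__main__":
    print(merge_on_shared_prefix("John has a cat", "John has a dog"))
    # John has a cat and dog
    print(merge_on_shared_prefix("The cat sleeps", "The dog sleeps"))
    # The cat sleeps and dog sleeps
```

The first call happens to produce the sentence from the question, but the second shows the limitation: without syntactic roles the function cannot tell that the subjects, not the predicates, should be coordinated.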
