java - Adding a new annotator in Stanford CoreNLP

Question

I am trying to add a new annotator to Stanford CoreNLP, following the instructions at http://nlp.stanford.edu/downloads/corenlp.shtml.

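“Adding a new annotator: StanfordCoreNLP also has the capacity to add a new annotator by reflection without altering the code in StanfordCoreNLP.java. To create a new annotator, extend the class edu.stanford.nlp.pipeline.Annotator and define a constructor with the signature (String, Properties). Then, add the property customAnnotatorClass.FOO=BAR to the properties used to create the pipeline. If FOO is then added to the list of annotators, the class BAR will be created, with the name used to create it and the properties file passed in.”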

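I have created a new class for my annotator, but I can't work out how to supply the properties file that is supposed to be passed in. So far I have only added the new annotator to the pipeline: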

props.put("annotators", "tokenize, ssplit, pos, lemma, ner, parse, dcoref, regexner, color");
props.setProperty("customAnnotatorClass.color", "myPackage.myPipeline");

Is there any sample code that could help me?

score 2 · Accepted Answer

You can have mine if you like. The interesting part starts at // adding our own annotator property:

/** Annotates a document with our customized pipeline.
 * @param text A text to process
 * @return The annotated text
 */
private Annotation annotateText(String text) {
    Annotation doc = new Annotation(text);

    StanfordCoreNLP pipeline;

    // configures a StanfordCoreNLP pipeline with POS tagging, lemmatization,
    // NER, parsing, and our custom annotator
    Properties props = new Properties();
    // use the bidirectional WSJ tagger model instead of the default one
    props.setProperty(
            "pos.model",
            "edu/stanford/nlp/models/pos-tagger/wsj-bidirectional/wsj-0-18-bidirectional-distsim.tagger");
    // adding our own annotator property
    props.put("customAnnotatorClass.sdclassifier",
            "edu.kit.ipd.alicenlp.ivan.analyzers.StaticDynamicClassifier");

    // configure pipeline
    props.put(
            "annotators",
            "tokenize, ssplit, pos, lemma, ner, parse, sdclassifier");
    pipeline = new StanfordCoreNLP(props);

    pipeline.annotate(doc);
    return doc;
}
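
As for the properties file: going by the documentation quoted in the question, the same Properties object you hand to the pipeline is what your annotator's (String, Properties) constructor receives, so any extra keys you put in there should be readable from inside the annotator. Here is a minimal sketch of that mechanism in plain Java, with no CoreNLP on the classpath. It is not CoreNLP's actual code; ColorAnnotator, create, and the color.palette key are names I made up for illustration.

```java
import java.lang.reflect.Constructor;
import java.util.Properties;

class CustomAnnotatorSketch {

    /** Hypothetical stand-in for a custom annotator; not a CoreNLP class. */
    public static class ColorAnnotator {
        public final String name;
        public final String palette;

        // Constructor with the (String, Properties) signature the documentation asks for.
        public ColorAnnotator(String name, Properties props) {
            this.name = name;
            // "<name>.palette" is an invented key; falls back to "default" when absent.
            this.palette = props.getProperty(name + ".palette", "default");
        }
    }

    /** Looks up customAnnotatorClass.<name> and instantiates that class by reflection. */
    public static Object create(String name, Properties props)
            throws ReflectiveOperationException {
        String className = props.getProperty("customAnnotatorClass." + name);
        if (className == null) {
            throw new IllegalArgumentException("No custom annotator registered for " + name);
        }
        Constructor<?> ctor = Class.forName(className)
                .getConstructor(String.class, Properties.class);
        // The annotator gets its own name plus the full set of properties.
        return ctor.newInstance(name, props);
    }

    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.setProperty("customAnnotatorClass.color", ColorAnnotator.class.getName());
        props.setProperty("color.palette", "warm");

        ColorAnnotator annotator = (ColorAnnotator) create("color", props);
        System.out.println(annotator.name + " -> " + annotator.palette);
    }
}
```

Running main prints color -> warm. In a real pipeline you would set your own keys on props next to customAnnotatorClass.color before calling new StanfordCoreNLP(props), and read them in your annotator's constructor in the same way.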
