stanford-nlp - How do I get the score distribution values from CoreNLP Sentiment?

Question

I have set up a CoreNLP server on my Ubuntu instance and it works fine. I am mainly interested in the Sentiment module, and at the moment this is what I get:

{
sentimentValue: "2",
sentiment: "Neutral"
}

What I need are the score distribution values, as you can see here: http://nlp.stanford.edu:8080/sentiment/rntnDemo.html

 "scoreDistr": [0.1685, 0.7187, 0.0903, 0.0157, 0.0068]

What am I missing, or how can I get this data?

Thanks

score 3 · Accepted Answer

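You need to get the tree object from the annotated sentence via SentimentCoreAnnotations.SentimentAnnotatedTree.class. You can then obtain the predictions through the RNNCoreAnnotations class. I wrote the self-contained demo code below, which shows how to get the score for each label of a CoreNLP sentiment prediction.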

import java.util.Arrays;
import java.util.List;
import java.util.Properties;

import org.ejml.simple.SimpleMatrix;

import edu.stanford.nlp.ling.CoreAnnotations;
import edu.stanford.nlp.neural.rnn.RNNCoreAnnotations;
import edu.stanford.nlp.pipeline.Annotation;
import edu.stanford.nlp.pipeline.StanfordCoreNLP;
import edu.stanford.nlp.sentiment.SentimentCoreAnnotations;
import edu.stanford.nlp.trees.Tree;
import edu.stanford.nlp.util.CoreMap;

public class DemoSentiment {
    public static void main(String[] args) {
        final List<String> texts = Arrays.asList("I am happy.", "This is a neutral sentence.", "I am very angry.");
        final Properties props = new Properties();
        props.setProperty("annotators", "tokenize, ssplit, parse, sentiment");
        final StanfordCoreNLP pipeline = new StanfordCoreNLP(props);
        for (String text : texts) {
            final Annotation doc = new Annotation(text);
            pipeline.annotate(doc);
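            for (CoreMap sentence : doc.get(CoreAnnotations.SentencesAnnotation.class)) {
                // The sentiment annotator attaches a tree whose nodes carry the model's predictions.
                final Tree tree = sentence.get(SentimentCoreAnnotations.SentimentAnnotatedTree.class);
                // Predictions at the root node: a 5x1 matrix with one score per sentiment class.
                final SimpleMatrix sm = RNNCoreAnnotations.getPredictions(tree);
                final String sentiment = sentence.get(SentimentCoreAnnotations.SentimentClass.class);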
                System.out.println("sentence:  "+sentence);
                System.out.println("sentiment: "+sentiment);
                System.out.println("matrix:    "+sm);
            }
        }
    }
}

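The output will look similar to the following (floating-point rounding or an updated model may change the scores).

For the first sentence, I am happy., you can see that the sentiment is Positive and that, reading the matrix as an ordered list, the largest value in the returned matrix is 0.618, in the fourth position.

The second sentence, This is a neutral sentence., has its highest score in the middle, at 0.952, hence the Neutral sentiment.

The last sentence has the corresponding Negative sentiment, with its highest score, 0.652, in the second position.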

sentence:  I am happy.
sentiment: Positive
matrix:    Type = dense , numRows = 5 , numCols = 1
0.016  
0.037  
0.132  
0.618  
0.196  

sentence:  This is a neutral sentence.
sentiment: Neutral
matrix:    Type = dense , numRows = 5 , numCols = 1
0.001  
0.007  
0.952  
0.039  
0.001  

sentence:  I am very angry.
sentiment: Negative
matrix:    Type = dense , numRows = 5 , numCols = 1
0.166  
0.652  
0.142  
0.028  
0.012
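
Reading the label off the matrix as described above comes down to taking the index of the largest score. The following is a minimal sketch of that step in plain Java, not part of the original answer. The class name SentimentScores and its helper methods are hypothetical, the label order is assumed to follow CoreNLP's five sentiment classes from most negative to most positive, and the scores are hard-coded from the first sample output. In the demo above the values would presumably be copied out of the SimpleMatrix first, for example with sm.get(i, 0).

```java
import java.util.Locale;

final class SentimentScores {
    // Assumed label order, matching the five rows of the prediction matrix.
    static final String[] LABELS = {
        "Very negative", "Negative", "Neutral", "Positive", "Very positive"
    };

    /** Returns the index of the largest score (the first one wins on ties). */
    static int argMax(double[] scores) {
        if (scores.length == 0) {
            throw new IllegalArgumentException("scores must not be empty");
        }
        int best = 0;
        for (int i = 1; i < scores.length; i++) {
            if (scores[i] > scores[best]) {
                best = i;
            }
        }
        return best;
    }

    /** Formats the scores as a fragment shaped like the web demo's "scoreDistr" field. */
    static String toScoreDistr(double[] scores) {
        StringBuilder sb = new StringBuilder("\"scoreDistr\": [");
        for (int i = 0; i < scores.length; i++) {
            if (i > 0) {
                sb.append(", ");
            }
            sb.append(String.format(Locale.ROOT, "%.4f", scores[i]));
        }
        return sb.append("]").toString();
    }

    public static void main(String[] args) {
        // Scores copied from the sample output for "I am happy."
        double[] happy = {0.016, 0.037, 0.132, 0.618, 0.196};
        int idx = argMax(happy);
        System.out.println(LABELS[idx] + " (index " + idx + ")");
        System.out.println(toScoreDistr(happy));
    }
}
```

The index is zero-based, so the "fourth position" mentioned above corresponds to index 3.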
