nlp - Utility for generating performance reports for NLP-based text annotators

Question

I am trying to build a quality-testing framework for my text annotators. I wrote the annotators using GATE.

I have gold-standard (human-annotated) data for every input document.

Here is the list of GATE resources for quality assurance: GATE Embedded API for the measures

So far, I am able to get performance metrics (FP, TP, FN, precision, recall and F-scores) using the methods in AnnotationDiffer.
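For reference, this is how those scores relate to the raw counts. It is a minimal sketch that does not use GATE. The class name `PrfScores` and its methods are hypothetical, and returning 0.0 for an empty denominator is an assumption of this sketch, not necessarily what AnnotationDiffer does. Partial matches, which GATE distinguishes in its strict and lenient variants, are not modelled here.

```java
class PrfScores {
    /** Precision: share of system annotations that are correct, TP / (TP + FP). */
    public static double precision(int tp, int fp) {
        // Assumption: report 0.0 when the system produced no annotations.
        return (tp + fp) == 0 ? 0.0 : (double) tp / (tp + fp);
    }

    /** Recall: share of gold annotations that were found, TP / (TP + FN). */
    public static double recall(int tp, int fn) {
        // Assumption: report 0.0 when there are no gold annotations.
        return (tp + fn) == 0 ? 0.0 : (double) tp / (tp + fn);
    }

    /** F1: harmonic mean of precision and recall. */
    public static double f1(double p, double r) {
        return (p + r) == 0.0 ? 0.0 : 2 * p * r / (p + r);
    }

    public static void main(String[] args) {
        int tp = 8, fp = 2, fn = 4; // made-up counts for illustration
        double p = precision(tp, fp);
        double r = recall(tp, fn);
        System.out.printf("P=%.3f R=%.3f F1=%.3f%n", p, r, f1(p, r));
    }
}
```

These aggregate numbers say how often the annotator is wrong, but not where, which is what motivates the rest of the question.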

Now I want to dig deeper. I want to see the individual FPs and FNs for each document, i.e. analyze each FP and FN so that I can fix my annotators accordingly.

I did not see a function in any GATE class, such as AnnotationDiffer, that returns a List<Annotation> of FPs or FNs. They only return the FP and FN counts:

    int fp = annotationDiffer.getFalsePositivesStrict();
    int fn = annotationDiffer.getMissing();

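Before I go ahead and write my own utility that collects the List<Annotation> of FPs and FNs together with a few surrounding sentences and produces an HTML report per input document for analysis, I want to check whether something similar already exists.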

score 2 · Accepted Answer

I figured out how to get the FP and FN annotations:

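    // Run the diff first; it populates the missing/spurious sets used below.
    List<AnnotationDiffer.Pairing> differ = annotationDiffer.calculateDiff(goldAnnotSet, systemAnnotSet);

    // False negatives: gold annotations that the system missed.
    // Note: if this field is not accessible from your package, the public
    // getAnnotationsOfType(AnnotationDiffer.MISSING_TYPE) should give the same set.
    for (Annotation fnAnnotation : annotationDiffer.missingAnnotations)
    {
       System.out.println("FN=>" + fnAnnotation);
    }

    // False positives: system annotations with no gold counterpart
    // (AnnotationDiffer.SPURIOUS_TYPE for the accessor mentioned above).
    for (Annotation fpAnnotation : annotationDiffer.spuriousAnnotations)
    {
       System.out.println("FP=>" + fpAnnotation);
    }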

Based on the offsets of fnAnnotation or fpAnnotation, I can easily get the surrounding sentences and create a nice HTML report.
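The offset-to-report step could look roughly like the sketch below. It does not use GATE and works on a plain String plus start/end offsets. In GATE these would presumably come from the document content and from each annotation's start and end nodes. The class `ErrorReport`, its methods, the sample text and the offsets are all hypothetical, and a fixed character window stands in for "surrounding sentences" as a simplification.

```java
class ErrorReport {
    /** Escapes the characters that are special in HTML text. */
    public static String escapeHtml(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    /**
     * Returns the span [start, end) wrapped in <mark>, with up to `window`
     * characters of context on each side, clamped to the document bounds.
     */
    public static String contextSnippet(String text, int start, int end, int window) {
        int from = Math.max(0, start - window);
        int to = Math.min(text.length(), end + window);
        return escapeHtml(text.substring(from, start))
                + "<mark>" + escapeHtml(text.substring(start, end)) + "</mark>"
                + escapeHtml(text.substring(end, to));
    }

    /** Builds one table row per error: the label (FP or FN) and its snippet. */
    public static String row(String label, String snippet) {
        return "<tr><td>" + label + "</td><td>" + snippet + "</td></tr>";
    }

    public static void main(String[] args) {
        String text = "Alice met Bob in Paris last week.";
        // Hypothetical offsets: "Bob" is at [10, 13), "Paris" at [17, 22).
        StringBuilder html = new StringBuilder("<table>\n");
        html.append(row("FN", contextSnippet(text, 10, 13, 10))).append('\n');
        html.append(row("FP", contextSnippet(text, 17, 22, 10))).append('\n');
        html.append("</table>");
        System.out.println(html);
    }
}
```

Swapping the character window for sentence boundaries would only change how `from` and `to` are computed. The escaping and highlighting stay the same.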
