uima - A simple Ruta annotator

Question

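I have just started with Ruta, and I would like to write a rule that works like this:

It tries to match a word, e.g. XYZ, and when it finds it, it assigns the text that comes before it to the annotator CompanyDetails.

For example:

This is a paragraph that contains the phrase we are interested in, which follows the sentence. LL, Inc. a Delaware limited liability company (XYZ).

After running the script, the annotator CompanyDetails will contain the following string: LL, Inc. a Delaware limited liability company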

score 0 · Accepted Answer

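When you talk about the annotator "CompanyDetails", I assume you mean an annotation of the type "CompanyDetails".

There are many (really many) different ways to solve this task. Here is one example that applies some helper rules: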

DECLARE Annotation CompanyDetails (STRING context);
DECLARE Sentence, XYZ;

// just to get a running example with simple sentences
PERIOD #{-> Sentence} PERIOD;
#{-> Sentence} PERIOD;
"XYZ" -> XYZ; // should be done in a dictionary

// the actual rule
STRING s;
Sentence{-> MATCHEDTEXT(s)}->{XYZ{-> CREATE(CompanyDetails, "context" = s)};};

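This example stores the string of the complete sentence in the feature. The rule matches all sentences and stores the covered text in the variable "s". Then the content of the sentence is inspected: an inlined rule tries to match on XYZ, creates an annotation of the type CompanyDetails, and assigns the value of the variable to the feature named context.

I would rather store an annotation than a string, since you can still get the string with getCoveredText(). If you only need the tokens before XYZ in the sentence, you can do something like this (this time with an annotation instead of a string):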

DECLARE Annotation CompanyDetails (Annotation context);
DECLARE Sentence, XYZ, Context;

// just to get a running example with simple sentences
PERIOD #{-> Sentence} PERIOD;
#{-> Sentence} PERIOD;
"XYZ" -> XYZ;

// the actual rule
Sentence->{ #{-> Context} SPECIAL? @XYZ{-> GATHER(CompanyDetails, "context" = 1)};};
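
Both scripts mark XYZ with a literal string rule and note that this should be done in a dictionary. A minimal sketch of that variant is shown here, assuming a hypothetical word list file named CompanyMarkers.txt (one entry per line, e.g. XYZ) that can be resolved from the project's resource location. The file name and the list name are placeholders, not part of the original answer:

```ruta
DECLARE XYZ;

// hypothetical word list file; the name and location are assumptions
WORDLIST CompanyMarkers = 'CompanyMarkers.txt';

// annotate every occurrence of a list entry in the document with XYZ
Document{-> MARKFAST(XYZ, CompanyMarkers)};
```

This would replace the "XYZ" -> XYZ; line, and the rest of the script stays the same. It also lets you add further marker words without touching the rules.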
