uima - UIMA Ruta - abbreviations

Question

Can I split a word into its individual letters using UIMA Ruta?

E.g.

1.(WHO)
2.(APIAs)

Script:

DECLARE NEW;
BLOCK (foreach)CAP{}
{
W{REGEXP(".")->MARK(NEW)};

}

score 1 · Accepted Answer

Yes, this can be done with a simple regex rule in UIMA Ruta:

DECLARE Char;
CAP->{"."->Char;};

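You cannot use normal rules for this because you need to match on something smaller than RutaBasic. The only option is to use regex rules, which operate directly on the text instead of on annotations. You should of course be very careful, since this can produce a lot of annotations.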

Some explanation of the somewhat compact rule: CAP->{"."->Char;};

CAP // the only rule element of the rule: match on each CAP annotation
->{// indicates that inlined rules follow that are applied in the context of the matched annotation.
"." // a regular expression matching on each character
-> Char // the "action" of the regex rule: create an annotation of the type Char for each match of the regex 
;}; // end of regex rule, end of inlined rules, end of actual rule

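In summary, the rule iterates over all CAP annotations, applies the regular expression to the covered text of each one, and creates an annotation for every match.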

Of course, you can also use a BLOCK instead of an inlined rule.
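
A minimal sketch of that BLOCK variant, assuming that a regex rule placed inside a BLOCK is applied to the text covered by the current CAP match, just as in the inlined form. The block name forEachCap is an arbitrary label chosen for illustration:

```
DECLARE Char;
// open a window on each CAP annotation
BLOCK(forEachCap) CAP{} {
    // regex rule: create one Char annotation per character in the window
    "." -> Char;
}
```

As with the inlined rule, this creates one annotation per character, so the number of annotations can grow quickly on larger documents.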

Disclaimer: I am a developer of UIMA Ruta.
