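I'm writing a Porter stemmer in XQuery, and as a first step I need to match consonant and vowel patterns. In the Perl example I'm using as a basis, the consonant-matching sequence is
(?:[^aiueoy]|(?:(?<=[aiueo])y)|\by)
and the vowel sequence is
(?:[aiueo]|(?:(?<![aiueo])y))
I need to extend these to include the letter aesc (æ), so this is my XQuery regex so far: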
let $v := element {"vowels"} {matches($f,"(?:([^aiueoy])|(?:(?:[aiueo]\1)y))")}
let $c := element {"consonants"} {matches($f,"(?:([aiueo])|(?:(?<![aiueo]\1)y))")}
An example of the kind of XML I'm working with:
<entry ref="173">
<headword>abǒve</headword>
<headword>abǒven</headword>
<variant>abufe</variant>
<variant>abufen</variant>
<variant>abuue</variant>
<variant>abuuen</variant>
<variant>abowve</variant>
<variant>obove</variant>
<variant>oboven</variant>
<variant>obufe</variant>
<variant>obufen</variant>
<variant>abof</variant>
<variant>obof</variant>
<variant>aboyf</variant>
<variant>aboun</variant>
<variant>aboune</variant>
<variant>abown</variant>
<variant>abowne</variant>
<variant>aboon</variant>
<variant>oboun</variant>
<variant>oboune</variant>
<variant>abow</variant>
<variant>aboʒe</variant>
<part_of_speech> adv. </part_of_speech>
</entry>
However, when I run it in Saxon, I get the following error: Query failed with dynamic error: Syntax error at char 17 in regular expression: No expression before quantifier
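I'm fairly sure the problem is that I haven't built the positive lookbehind correctly, having changed it from <= to \1, but I'm not sure how to construct that part in a way that works in XQuery. Any advice would be greatly appreciated.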
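
For reference, XQuery's regex dialect (based on XML Schema regular expressions) has no lookbehind or lookahead, and \1 is a back-reference to a captured group rather than a substitute for one. A minimal sketch of one lookbehind-free direction is below: instead of asserting what precedes a y, it captures the preceding vowel with replace() and rewrites the consonantal y to a marker. The function name local:cv-pattern, the uppercase Y marker, and the C/V output format are all hypothetical choices for illustration, not part of the Perl original, and the vowel set is assumed to be a, e, i, o, u plus æ.

```xquery
xquery version "1.0";

(: Hypothetical helper: returns a C/V skeleton for a word, e.g. "VCVCC".
   Assumption: a "y" is consonantal at the start of the word or directly
   after a vowel; every other "y" counts as a vowel. :)
declare function local:cv-pattern($word as xs:string) as xs:string {
  let $w := lower-case($word)
  (: word-initial y is a consonant: mark it with uppercase Y :)
  let $w := replace($w, "^y", "Y")
  (: y preceded by a vowel is a consonant: capture the vowel, keep it, mark the y :)
  let $w := replace($w, "([aeiouæ])y", "$1Y")
  (: every remaining lowercase y is a vowel, along with a, e, i, o, u, æ :)
  let $w := replace($w, "[aeiouæy]", "V")
  (: whatever is left, including the Y marker, is a consonant :)
  return replace($w, "[^V]", "C")
};

local:cv-pattern("aboyf")  (: returns "VCVCC" :)
```

Because the input is lower-cased first, the uppercase Y and V markers cannot collide with letters from the word. Accented letters such as ǒ in the headwords are not in the assumed vowel set and would be classed as consonants unless the character classes are extended or the diacritics are normalized beforehand.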