regex - How to search for the singular or plural version of a word

Question

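I'm trying to write a simple regular expression. All I want is to match the singular part of a word, whether or not it ends in an s. So if I have the following words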

test
tests

Edit: further examples. I need this to work for many words, not just these two

movie
movies
page
pages
time
times

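For all of them I need to get the word without the trailing s, but I can't find a regex that always captures the part without the s and works in both cases.

I have tried the following: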

([a-zA-Z]+)([s\b]{0,}) - This returns the full word as the first match in both cases
([a-zA-Z]+?)([s\b]{0,}) - This returns 3 different matching groups for both words
([a-zA-Z]+)([s]?) - This returns the full word as the first match in both cases
([a-zA-Z]+)(s\b) - This works for tests but doesn't match test at all
([a-zA-Z]+)(s\b)? - This returns the full word as the first match in both cases
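
To reproduce the results listed above outside a regex tester, here is a small sketch using Python's `re` module. Python is an assumption here (the attempts were not written for any particular language), and the helper name `first_groups` is made up for illustration. Note that inside a character class such as `[s\b]`, Python reads `\b` as a backspace character, not a word boundary.

```python
import re

# The five attempts from the list above, in the same order.
ATTEMPTS = [
    r"([a-zA-Z]+)([s\b]{0,})",
    r"([a-zA-Z]+?)([s\b]{0,})",
    r"([a-zA-Z]+)([s]?)",
    r"([a-zA-Z]+)(s\b)",
    r"([a-zA-Z]+)(s\b)?",
]


def first_groups(pattern, word):
    """Return capture group 1 of every match of `pattern` found in `word`."""
    return [m.group(1) for m in re.finditer(pattern, word)]


if __name__ == "__main__":
    for pattern in ATTEMPTS:
        for word in ("test", "tests"):
            print(pattern, word, first_groups(pattern, word))
```

The greedy `[a-zA-Z]+` swallows the trailing s before the optional second group gets a chance, which is why three of the attempts hand back the full word.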

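I have been using http://gskinner.com/RegExr/ to try out the different regexes.

Edit: this is for a Sublime Text snippet. For those who don't know snippets in Sublime Text, a snippet is a shortcut: I can type the name of my database, hit "run snippet", and it turns it into something like: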

$movies = $this->ci->db->get_where("movies", "");
if ($movies->num_rows()) {
    foreach ($movies->result() as $movie) {

    }
}

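All I need is to turn "movies" into "movie" and insert it into the foreach loop automatically.

This means I can't just do a find and replace on the text, and I only need to account for 60 - 70 words (it is only for my own tables, not every word in the English language).

Thanks! - Tim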

score 10 · Accepted Answer

Ok I've found a solution:

([a-zA-Z]+?)(s\b|\b)

Works as desired, then you can simply use the first match as the unpluralized version of the word.

Thanks @Jahroy for helping me find it. I added this as answer for future surfers who just want a solution but please check out Jahroy's comment for more in depth information.
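
The pattern above can be checked against the words from the question with a short Python sketch. Python's `re` module is an assumption (a Sublime Text snippet uses its own regex engine), and the names `SINGULAR` and `singular` are hypothetical.

```python
import re

# Lazy letters first, then either a final "s" at a word boundary or just the boundary.
SINGULAR = re.compile(r"([a-zA-Z]+?)(s\b|\b)")


def singular(word):
    """Return the first capture group, i.e. the word without a trailing s."""
    m = SINGULAR.match(word)
    return m.group(1) if m else None


if __name__ == "__main__":
    for w in ("test", "tests", "movies", "pages", "times"):
        print(w, "->", singular(w))
```

One thing to watch for: a word whose singular form already ends in s is trimmed too, so "class" comes back as "clas". For a fixed list of table names that may be acceptable.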

score 7 · Accepted Answer

For simple plurals, use this:

test(?=s| |$)

For more complex plurals, you are in trouble if you rely on a regex. For example, this regex

part(y|i)(?=es | )

will return "party" or "parti", but I'm not sure what you would do with that.
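
As a rough illustration of how the two lookahead patterns in this answer behave, here is a Python sketch. Python's `re` module is an assumption, and `SIMPLE`, `STEM` and `find` are names invented for the example.

```python
import re

# The lookahead (?=...) checks what follows without including it in the match.
SIMPLE = re.compile(r"test(?=s| |$)")
STEM = re.compile(r"part(y|i)(?=es | )")


def find(pattern, text):
    """Return the matched text, or None when the pattern does not match."""
    m = pattern.search(text)
    return m.group(0) if m else None


if __name__ == "__main__":
    for text in ("test", "tests", "testing"):
        print(repr(text), "->", find(SIMPLE, text))
    for text in ("party ", "parties ", "parties"):
        print(repr(text), "->", find(STEM, text))
```

Because the second pattern's lookahead requires a space, "parties" at the very end of the input does not match at all, and "parties " yields the stem "parti" instead of a usable singular.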

score 2 · Accepted Answer

Here is how to do it with vi or sed:

s/\([A-Za-z]*\)[sS]$/\1/

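This replaces a run of letters ending in S with everything except the last letter.

Notes:

The escape characters (the backslashes before the parentheses) may differ between contexts.

Also:

The \1 (which means the first captured pattern) may also vary by context.

Also:

This only works if your word is the only word on the line.

If your table name is one of several words on the line, you can replace the $ (which stands for end of line) with a wildcard that represents whitespace or a word boundary (these vary by context).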
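
For comparison, the same substitution can be sketched with Python's `re.sub`. This is an approximation of the vi/sed command, not the command itself: Python needs no backslashes before the grouping parentheses, `\b` is Python's word-boundary token, and the function names are hypothetical.

```python
import re


def strip_s_line(line):
    """Drop a final s/S from the word at the end of the line (the `$` form)."""
    return re.sub(r"([A-Za-z]+)[sS]$", r"\1", line)


def strip_s_words(text):
    """Drop a final s/S from every word, using a word boundary instead of `$`."""
    return re.sub(r"([A-Za-z]+)[sS]\b", r"\1", text)


if __name__ == "__main__":
    print(strip_s_line("movies"))
    print(strip_s_line("movies pages"))
    print(strip_s_words("movies pages times"))
```

With `$`, only the last word on the line is changed; swapping in `\b` applies the substitution to each word.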
