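In cygwin, this does not return a match:
$ echo "aaab" | grep '^[ab]+$'
But this does return a match:
$ echo "aaab" | grep '^[ab][ab]*$'
aaab
Aren't the two expressions equivalent? Is there a way to express "one or more characters from a character class" without typing the character class twice (as in the second example)?
According to this link, the two expressions should be the same, but perhaps Regular-Expressions.info does not cover bash in cygwin.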
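grep has several matching "modes". By default it uses only the basic set, which does not recognize many metacharacters unless they are escaped. You can put grep into extended or Perl mode so that + is evaluated as a quantifier.
From man grep: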
Matcher Selection
-E, --extended-regexp
Interpret PATTERN as an extended regular expression (ERE, see below). (-E is specified by POSIX.)
-P, --perl-regexp
Interpret PATTERN as a Perl regular expression. This is highly experimental and grep -P may warn of unimplemented features.
Basic vs Extended Regular Expressions
In basic regular expressions the meta-characters ?, +, {, |, (, and ) lose their special meaning; instead use the backslashed versions \?, \+, \{, \|, \(, and \).
Traditional egrep did not support the { meta-character, and some egrep implementations support \{ instead, so portable scripts should avoid { in grep -E patterns and should use [{] to match a literal {.
GNU grep -E attempts to support traditional usage by assuming that { is not special if it would be the start of an invalid interval specification. For example, the command grep -E '{1' searches for the two-character string {1 instead of reporting a syntax error in the regular expression. POSIX.2 allows this behavior as an extension, but portable scripts should avoid it.
Alternatively, you can use egrep instead of grep -E.
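As a rough side-by-side sketch of the modes described above (assuming GNU grep, which is what cygwin normally ships; the sample inputs are made up for illustration), the same pattern behaves differently depending on the matcher. In basic mode the unescaped + is just a literal plus sign, so the pattern matches the string a+ rather than aaab:

```shell
# Basic mode (default): + is an ordinary character, so "aaab" does not match.
printf 'aaab\n' | grep '^[ab]+$' || echo "no match in basic mode"

# The same basic-mode pattern does match one class character followed by a literal "+".
printf 'a+\n' | grep '^[ab]+$'

# Extended mode: + means "one or more", so "aaab" matches.
printf 'aaab\n' | grep -E '^[ab]+$'
```

Note that recent GNU grep releases print a warning that egrep is obsolescent, so grep -E is probably the safer spelling in new scripts.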
In basic regular expressions, the meta-characters ?, +, {, |, (, and ) lose their special meaning; instead use the backslashed versions \?, \+, \{, \|, \(, and \).
So use the backslashed version:
$ echo aaab | grep '^[ab]\+$'
aaab
Or enable the extended syntax:
$ echo aaab | egrep '^[ab]+$'
aaab
Escape it with a backslash, or use egrep, which is extended grep and equivalent to grep -E:
echo "aaab" | egrep '^[ab]+$'
aaab
echo "aaab" | grep '^[ab]\+$'
aaab
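One caveat worth sketching: the backslashed \+ in basic mode is a GNU extension, so it may not work with every grep. If portability matters, the interval form \{1,\} is the POSIX basic-regex way to say "one or more" and still avoids typing the character class twice. A minimal example, using the same sample input as above:

```shell
# POSIX basic regular expression: \{1,\} means "at least one repetition".
echo "aaab" | grep '^[ab]\{1,\}$'

# An empty line has zero class characters, so it is rejected.
echo "" | grep '^[ab]\{1,\}$' || echo "empty line rejected"
```

The first command prints aaab; the second prints only the fallback message.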