regex - Files with at most two "a" characters in their full path

Question

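I am trying to figure out how to use AWK to find files that contain at most two "a" characters in their full path.

Below is what I have come up with so far, but it does not do the job.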

BEGIN{}

{
if( match( $1, ".*[a].*[a].*[^a]+" ) )
print $1
}

END{}

It reads the file names, with their full paths, from a file named "data" that was created separately with the following command.

find / -name '*'

What should I change?

score 6 · Accepted Answer

The following was deemed too short to stand as an answer on its own, but it is all I was going to write:

^[^a]*(a[^a]*(a[^a]*)?)?$

By the way, you don't need awk. grep -E works fine.

But now that I think about it, if you do want to use awk, the following is simpler:

awk '!/a.*a.*a/'
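As a quick sanity check, here is a small sketch that runs both forms on a few made-up paths. The sample paths and the function names are illustrative assumptions, not part of the question.

```shell
# Hypothetical sample paths containing 0, 1, 2, 3 and 3 "a" characters.
sample_paths() {
  printf '%s\n' /etc/hosts /usr/share /var/mail /var/data /banana
}

# Keep lines that match the anchored pattern as a whole (at most two "a"s).
with_grep() { grep -E '^[^a]*(a[^a]*(a[^a]*)?)?$'; }

# Keep lines that do NOT contain three "a"s anywhere.
with_awk() { awk '!/a.*a.*a/'; }

sample_paths | with_grep
sample_paths | with_awk
```

Both pipelines should list the same three paths; the negated awk test is shorter because it only has to describe the lines to reject.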

score 2 · Accepted Answer

You have three errors.

You need to include the start-of-line and end-of-line anchors ^ and $; otherwise an arbitrary prefix or suffix may contain some "a"s.
You need to make the occurrences of "a" optional, by using parentheses and ?.
.* can match "a", so you need to use [^a] to match the non-"a" characters.

The result would be a regular expression like:

^([^a]*a)?([^a]*a)?[^a]*$
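To see the difference in practice, the sketch below feeds two made-up paths to the pattern from the question and to the corrected one. The function names and sample paths are assumptions chosen for illustration.

```shell
# The pattern from the question: unanchored, and ".*" may itself contain "a".
broken() { awk 'match($1, ".*[a].*[a].*[^a]+") { print $1 }'; }

# The corrected pattern: anchored, each "a" optional, only "[^a]" in between.
fixed() { awk '/^([^a]*a)?([^a]*a)?[^a]*$/'; }

printf '%s\n' /etc/hosts /banana | broken   # prints /banana (three "a"s)
printf '%s\n' /etc/hosts /banana | fixed    # prints /etc/hosts (no "a")
```

The original pattern in effect asks for at least two "a"s followed by some non-"a" character, which is close to the opposite of what was wanted.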

Edit:

As Ed points out in the comments below his answer, if you pass the --re-interval flag to GNU Awk (gawk), you can use interval expressions.

The expression would then be:

^([^a]*a){0,2}[^a]*$

This lets us say that we want between 0 and 2 occurrences of "a".
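A minimal sketch to check that the interval form selects the same lines as the expanded form. It uses grep -E instead of awk, which is my substitution: grep -E accepts interval expressions without any extra flag. The sample paths and function names are made up.

```shell
# Hypothetical sample paths containing 0, 1, 2 and 3 "a" characters.
sample_paths() {
  printf '%s\n' /etc/hosts /usr/share /var/mail /var/data
}

# Expanded form: two optional "([^a]*a)" groups.
expanded() { grep -E '^([^a]*a)?([^a]*a)?[^a]*$'; }

# Interval form: the same group repeated between 0 and 2 times.
interval() { grep -E '^([^a]*a){0,2}[^a]*$'; }

# Compare the two selections and report whether they agree.
if [ "$(sample_paths | expanded)" = "$(sample_paths | interval)" ]; then
  echo "same result"
else
  echo "different result"
fi
```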

score 2 · Accepted Answer

The right solution is this:

awk '!/(.*a){3}/' file

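Or, if your awk does not support RE intervals, use either of these (split returns the number of pieces, which is one more than the number of "a"s, hence the <= in the second):

awk 'gsub(/a/,"&") < 3' file
awk 'split($0,x,/a/) <= 3' file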

So in either case, if you wanted to test for fewer than 17 "a"s, you would just change the 3 to 17, for example:

awk '!/(.*a){17}/' file

instead of writing:

awk '/^[^a]*(a[^a]*(a[^a]*(a[^a]*(a[^a]*(a[^a]*(a[^a]*(a[^a]*(a[^a]*(a[^a]*(a[^a]*(a[^a]*(a[^a]*(a[^a]*(a[^a]*(a[^a]*(a[^a]*)?)?)?)?)?)?)?)?)?)?)?)?)?)?)?)?$/' file

or similar.
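If the limit changes often, the count-based test can take it as a parameter. The following is only a sketch: fewer_than and limit are hypothetical names, and the sample paths are made up.

```shell
# fewer_than N: keep input lines containing fewer than N "a" characters.
# gsub replaces each "a" with itself and returns how many it found;
# N is passed to awk as the (hypothetical) variable "limit".
fewer_than() {
  awk -v limit="$1" 'gsub(/a/, "&") < limit'
}

printf '%s\n' /etc/hosts /var/mail /banana | fewer_than 3
printf '%s\n' /etc/hosts /var/mail /banana | fewer_than 17
```

The first call should drop /banana, which has three "a"s; the second keeps all three paths.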
