css - Using sed and regex to underline links in hundreds of CSS files

Question

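I have hundreds (more than 700) sets of web folders, each containing its own discrete CSS stylesheets. (They are online courses, in case you are curious.)

A decision was recently made that links should be underlined. I know the W3C decided this a long time ago, but this is a university, and they like to re-decide things.

I have been trying to update all the CSS files with a regex search-and-replace.

The main obstacles so far have been:

Windows. I don't like it, and I'm not using it. Command-line utilities like FART are great for one-line stuff, but writing more custom and more powerful searches proved to be too much for it.
Multi-line. The CSS files are generally structured like this: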
```
a, .surveypopup{
text-decoration:none;
    cursor:pointer;
}
```
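This means the selector (the part before the "{") is always on a different line from the good stuff. I want to match every selector that modifies "a" without an event (such as :hover) and make sure that any "text-decoration:none" becomes "text-decoration:underline", without messing up any other style code that may be sandwiched in between.
Case-insensitivity. With regex this should not be a problem. The authors of this CSS may or may not have been creative with their capitalization.

The command line that is currently failing for me is: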

find . -iname "*.css" | xargs sed -i "" "s|\(\ba\(,\|\.\|\s\|\b\)\[^\{\]\*\{\[^\}\]\*\)text-decoration\:none|a.\1text-decoration:underline;|g"

which produces:

sed: 1: "s|\(\ba\(,\|\.\|\s\|\b\ ...": RE error: invalid repetition count(s)

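I wonder whether my needs justify writing a bash script. It would be nice to create a backup of each file that needs modification, and multiple operations like that would be easier in a script...

Either way, I think I'm running into trouble because I don't know what should and should not be escaped for sed.

Please help!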

score 6 · Accepted Answer

Operating on the whole file at once, you can use:

s/(\ba(?=(?:\.|,|\s|{|#)))([^}{]*?{[^}]*?text-decoration:\s*)none(\s?!important)?;/$1$2underline;/g

Formatted more readably, that is:

s/                          # find and replace
    (                       # group 1
        \b                  # a word boundary
        a                   # followed by 'a'
        (?=                 # where the next character (positive lookahead)
            (?:             # (inside a non-capturing group)
              \.|,|\s|{|#   # is one of '.', ',', '{', '#' or whitespace
            ) 
        )
    )
    (                       # group 2
        [^}{]*?             # then non-greedily match anything up to a '{' or '}'
                            # if '}' is found, the next character will not match
                            # and therefore the whole regex will not match
        {                   # and find the '{'
        [^}]*?              # and then non-greedily match anything until we 
                            # find 'text-decoration', but don't keep matching
                            # when a '}' is found
        text-decoration:    # then find 'text-decoration'
        \s*                 # and optional whitespace
    )
    none                    # and 'none'
    (\s?!important)?        # and optional '!important'
    ;                       # and a ';'
/
    $1                      # replace by group 1
    $2                      # then group 2
    underline;              # then 'underline;'
/g

Sample file:

$ cat test.css
a { text-decoration: none; }
b, a { text-decoration: none; }
b, a, u { text-decoration: none; }
b, a.cat, u { text-decoration: none; }
b, a.cat, u { text-decoration: none !important; }
b, a, u {
    text-decoration: none;
}
b, a, u {
    color: red;
    text-decoration: none;
}
b, a, u {
    color: red;
    text-decoration: none;
    padding: 10px;
}

Result:

$ perl -0777 -p -e 's/(\ba(?=(?:\.|,|\s|{|#)))([^}{]*?{[^}]*?text-decoration:\s*)none(\s?!important)?;/$1$2underline;/g' test.css
a { text-decoration: underline; }
b, a { text-decoration: underline; }
b, a, u { text-decoration: underline; }
b, a.cat, u { text-decoration: underline; }
b, a.cat, u { text-decoration: underline; }
b, a, u {
    text-decoration: underline;
}
b, a, u {
    color: red;
    text-decoration: underline;
}
b, a, u {
    color: red;
    text-decoration: underline;
    padding: 10px;
}

You can use perl's -i flag (don't forget to set a backup extension!) to operate on the files in place.

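Obviously, plenty of other CSS rules can include a, for example html>a or div a b. This regex treats any standalone a in the selector as a match, so it rewrites both rules; that happens to be right for the first, which styles links, but wrong for the second, which styles b elements. Basically, you can only use regexes for this kind of task when you can make strong assumptions about the text you are operating on.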

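Update: I added } to the characters excluded from the selector part of the pattern, so that a match can no longer run across rules, as it otherwise would in: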

b { background-image: url('http://domain.com/this is a picture.jpg'); }
u { text-decoration: none; }

score 4 · Accepted Answer

You shouldn't use regex to parse CSS. Use a CSS parser instead and you will save yourself a lot of trouble.
