regex - ColdFusion: remove blank lines from a text file

Question

I'm using the following code to update robots.txt, depending on whether a particular page is marked as allowed or disallowed.

<cflock type="exclusive" timeout="5">
    <cfset vRemoveLine = ListContainsNoCase(robots,"Disallow: #sURL#", "#chr(13)##chr(10)#")>
    <cfif vRemoveLine>
        <cfset robots = ListDeleteAt(robots, vRemoveLine, "#chr(13)##chr(10)#")>
    </cfif>
    <cffile action="write"
        file="#sitePath#robots.txt"
        output="#robots#"
        nameconflict="overwrite">
</cflock>

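However, it isn't finished and/or could be written better. Specifically, when a line is removed, its associated carriage return is not removed along with it, especially when the line is anywhere other than at the bottom of the file.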

Screenshots:

1) Before removing the line

(image not available)

2) After removing the line

(image not available)

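Note also the additional blank lines at the bottom. Besides removing the Disallow line and its line break, I also need to get rid of all of these blank lines.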

score 2 · Accepted Answer

Actually, looking more closely at your code, you could simply do...

<cfset robots = robots.replaceAll( "(?m)^Disallow: #ReEscape(sURL)#(?:\r?\n|\z)" , "" ) />

...instead of those List functions.

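This removes the line break belonging to the line you just removed, but leaves alone any other line breaks elsewhere in the file (which may be there to separate sections and improve readability).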

If you also want to make sure there is no whitespace at the end of the file, you can of course use trim as well.
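As a rough sketch of how the targeted removal and trim might be combined, the snippet below runs both on a small made-up robots.txt string. The sample URL and file contents are assumptions for illustration only; in the real code `robots` and `sURL` would come from the surrounding page.

```cfml
<!--- Hypothetical sample data standing in for the real file contents. --->
<cfset crlf = chr(13) & chr(10) />
<cfset sURL = "/private/page.cfm" />
<cfset robots = "User-agent: *" & crlf
              & "Disallow: /private/page.cfm" & crlf
              & "Disallow: /admin/" & crlf />

<!--- Remove the matching Disallow line together with its own line break,
      then trim whitespace left at the start or end of the content. --->
<cfset robots = robots.replaceAll( "(?m)^Disallow: #ReEscape(sURL)#(?:\r?\n|\z)" , "" ).trim() />

<!--- robots is now: "User-agent: *" & crlf & "Disallow: /admin/" --->
```

The blank lines that separate sections elsewhere in the file would be untouched by this; only the removed line's own break and the leading/trailing whitespace go away.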

By way of explanation, here is the above regex again, in extended/commented form:

(?x)    ## enable extended/comment mode
        ## (literal whitespace is ignored, hashes start comments, also ignored)
(?m)    ## enable multiline mode
        ## (meaning  ^ and $ match start/end of each line, as well as of entire input)

^Disallow:\  ## Match literal text "Disallow: " at start of a line.
             ## (In comment mode, a \ is needed before the space;
             ##  in standard use this is not required.)

#ReEscape(sURL)#   ## use ReEscape to avoid issues since the URL might
                   ## contain characters that are non-literal in a regex.

(?:     ## non-capturing group to contain alternation between...

    \r?\n   ## match optional carriage return followed by a newline.
|       ## or
    \z      ## match end of input (whether there is a newline there or not)
)

(To use this in CFML, wrap it in both cfsavecontent and cfoutput, then put the resulting variable inside robots.replaceAll(here,'').)
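A minimal sketch of that wrapping, assuming `robots` and `sURL` are already defined as in the question. The variable name `vPattern` is made up for illustration. Inside cfoutput each doubled hash is emitted as a single hash, which is what the regex then sees as its comment marker. The pattern starts immediately after the opening cfoutput tag so that no whitespace precedes the `(?x)` flag.

```cfml
<!--- Build the commented regex as a string. cfoutput evaluates the ReEscape call
      and reduces each doubled hash to the single hash that starts a regex comment. --->
<cfsavecontent variable="vPattern"><cfoutput>(?x)(?m)
^Disallow:\   ## literal "Disallow: " at the start of a line
#ReEscape(sURL)#   ## the URL, escaped for use in a regex
(?: \r?\n | \z )   ## the line's own line break, or end of input
</cfoutput></cfsavecontent>

<!--- vPattern is a hypothetical name; any variable name works here. --->
<cfset robots = robots.replaceAll( vPattern , "" ) />
```

Note that in extended mode unescaped whitespace and hashes in the pattern are not matched literally, so this form assumes the escaped URL contains neither.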

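If you really want to ensure there are no runs of multiple line breaks anywhere in the file (regardless of any changes related to removing Disallow lines), the simplest way is: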

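<!--- chr(10) rather than '\n' as the replacement: in a replaceAll replacement string, \n would be inserted as a literal "n". --->
<cfset robots = robots.trim().replaceAll('\r','').replaceAll('\n{2,}',chr(10)) />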

It trims both ends, then removes all carriage returns, then replaces every run of two or more newlines with a single newline.
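To make the effect concrete, here is a small sketch of the same three-step cleanup applied to a made-up string with CRLF line endings and stray blank lines. The sample content is an assumption. The replacement is passed as `chr(10)` because CFML string literals do not interpret backslash escapes, and in the underlying Java `replaceAll` a backslash in the replacement string simply makes the following character literal.

```cfml
<!--- Hypothetical sample content: CRLF endings, extra blank lines in the middle and at the end. --->
<cfset crlf = chr(13) & chr(10) />
<cfset robots = "User-agent: *" & crlf & crlf & crlf
              & "Disallow: /private/" & crlf & crlf />

<!--- 1) trim both ends, 2) drop every carriage return,
      3) collapse each run of two or more newlines into one. --->
<cfset robots = robots.trim().replaceAll('\r','').replaceAll('\n{2,}',chr(10)) />

<!--- robots is now: "User-agent: *" & chr(10) & "Disallow: /private/" --->
```

One side effect worth being aware of is that the result uses bare line feeds throughout, so a file that previously had CRLF endings will be written back with LF endings.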

(In general, though, I would probably recommend the initial, more specific expression over a blanket removal of multiple line breaks.)
