python - How do I replace a variable number of spaces with a regular expression?

Question

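I am using Calibre to convert a PDF to MOBI, but it cannot interpret code blocks that are indented with spaces. These blocks contain many spaces, in varying amounts. Some lines are indented by as many as 31 spaces.

Calibre allows three regular expressions to be used for search and replace in the book before conversion.

Here is what I tried.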

\n( *) ( *)([a-zA-Z{};\*\/\(\)&#0-9])

Replace with:

\n\1&nbsp;\2\3

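The problem is that it only replaces one of the spaces. I want all of them replaced with the same number of &nbsp; entities.

I have also tried a lazy version of the first group, and so on.

Is this one of those cases where regular expressions are not enough? I believe this regex engine is the standard Python one.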
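
The behaviour described in the question can be reproduced outside Calibre with Python's re module. This is only a minimal sketch: the sample string is made up for illustration, and Calibre may apply the expression slightly differently. The greedy first group swallows all but one of the leading spaces, so only the last space is replaced.

```python
import re

# Pattern and replacement copied from the question.
pattern = r"\n( *) ( *)([a-zA-Z{};\*\/\(\)&#0-9])"
replacement = r"\n\1&nbsp;\2\3"

# Hypothetical sample: one line indented by four spaces.
sample = "x\n    foo();"

result = re.sub(pattern, replacement, sample)
print(repr(result))  # 'x\n   &nbsp;foo();'
```

Three of the four spaces survive untouched, because a single match can only contribute a single &nbsp; to the output.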

score 2 · Accepted Answer

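If this were Perl, you could replace (\G|\n) followed by a single space with $1&nbsp;. As it is, the only way I can think of is to replace a space preceded by (?<=\n {0,30}) with &nbsp;, and because Python's re module only accepts fixed-width lookbehind, that has to be spelled out as ((?<=\n)|(?<=\n )|(?<=\n {2})|(?<=\n {3})|(?<=\n {4})|(?<=\n {5})|...|(?<=\n {30})) followed by a single space. I suspect that by then you would hit the limit on how long a regex Calibre lets you enter. :-/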
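
The spelled-out lookbehind alternation is tedious to type but easy to generate. The following is a sketch in plain Python rather than in Calibre itself; MAX_INDENT and the sample text are assumptions chosen for illustration. Each leading space is matched on its own, provided it is preceded by a newline and nothing but spaces.

```python
import re

# Assumption: no line is indented by more than 31 spaces, so a leading
# space has at most 30 spaces between it and the preceding newline.
MAX_INDENT = 30

# One fixed-width lookbehind per possible number of preceding spaces.
lookbehinds = "|".join("(?<=\\n" + " " * n + ")" for n in range(MAX_INDENT + 1))

# The trailing space is the single character that actually gets replaced.
leading_space = re.compile("(?:" + lookbehinds + ") ")

# Hypothetical sample text.
text = "x\n    code\n  y z\n"
result = leading_space.sub("&nbsp;", text)
print(repr(result))  # 'x\n&nbsp;&nbsp;&nbsp;&nbsp;code\n&nbsp;&nbsp;y z\n'
```

The space between y and z is left alone because no lookbehind matches there. The generated pattern runs to several hundred characters, which is the practical concern raised above.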

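Another option is to take a completely different approach and replace every pair of spaces with "&nbsp; " (a non-breaking space followed by a regular space), without restricting it to the start of the line. I'm guessing that would meet your needs?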
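
A quick sketch of the two-space replacement, using plain string replacement in Python; the helper name soften_spaces and the samples are hypothetical. It needs no anchoring at all, which is what makes it fit in a single Calibre search-and-replace rule.

```python
def soften_spaces(text):
    # Replace each non-overlapping pair of spaces, scanning left to right.
    return text.replace("  ", "&nbsp; ")

even = soften_spaces("    x")  # four leading spaces
odd = soften_spaces("   x")    # three leading spaces

print(repr(even))  # '&nbsp; &nbsp; x'
print(repr(odd))   # '&nbsp;  x'
```

One caveat, as far as I can tell: a run with an odd number of spaces ends with two ordinary spaces side by side, which an HTML renderer would normally collapse into one, so such lines may come out one column short.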

score 1 · Accepted Answer


\s{31} will match exactly 31 whitespace characters, and \s{14,31} will match 14 to 31.
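
A short illustration of the counted quantifiers in Python; the sample lines are invented. Note that \s matches any whitespace character (tabs and newlines included), not only the space character.

```python
import re

# Hypothetical lines with 31 and 14 leading spaces.
line_31 = " " * 31 + "deep();"
line_14 = " " * 14 + "mid();"

exact = re.match(r"\s{31}", line_31)
ranged = re.match(r"\s{14,31}", line_14)
too_short = re.match(r"\s{14,31}", " " * 5 + "x")

print(len(exact.group()))   # 31
print(len(ranged.group()))  # 14
print(too_short)            # None
```

Matching the run is only half of the problem, though: a fixed replacement string cannot repeat &nbsp; once per matched space.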

answered 2012-03-01T02:31:37.147

score 1 · Accepted Answer

Is there any reason not to simply replace all spaces with non-breaking spaces? ( r/ /&nbsp;/.)

It won't change the appearance of normal English text (except where the source had extraneous double spaces) and your code blocks will render correctly.

For fun, my attempt in Python:

>>> import re
>>> eight_spaces = "        hello world!"
>>> re.sub(r"^(|(?:&nbsp;)*)\s", r"\1&nbsp;", eight_spaces)
'&nbsp;       hello world!'

The idea is to replace one space at a time. It doesn't work because the re engine doesn't go back to the start of the line after a match - it consumes the string working left to right.

Note the alternation of (?:&nbsp;)* with the empty string, (|(?:&nbsp;)*), so that the capture group \1 always captures something (even the empty string).
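
For what it's worth, the one-space-at-a-time idea does work if the substitution is simply re-run until nothing changes. The sketch below assumes you can run Python yourself, which Calibre's search-and-replace box does not offer; the function name nbsp_indent is hypothetical.

```python
import re

# One step: at the start of a line, skip any &nbsp; already written and
# turn the next space into &nbsp;. The group can match the empty string,
# so \1 is always defined.
STEP = re.compile(r"^((?:&nbsp;)*) ", re.MULTILINE)

def nbsp_indent(text):
    # Each pass converts one more leading space per line; stop when a
    # pass makes no substitutions.
    while True:
        new_text, count = STEP.subn(r"\1&nbsp;", text)
        if count == 0:
            return new_text
        text = new_text

print(nbsp_indent("        hello world!"))
```

Each pass restarts from the beginning of the string, which is exactly the step a single re.sub call cannot take. The space inside "hello world!" is never touched because it is not reachable from the start of the line through &nbsp; entities alone.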
