python - How to remove extra indentation from Python triple-quoted multi-line strings?

Question

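I have a Python editor where the user enters a script or code, which is then put into a main method behind the scenes, with every line indented. The problem is that if the user has a multi-line string, the indentation applied to the whole script affects the string by inserting a tab at the start of each of its lines. A problem script would be something as simple as: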

"""foo
bar
foo2"""

So when it is placed in the main method it looks like:

def main():
    """foo
    bar
    foo2"""

and the string now has an extra tab at the beginning of every line.
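
To make the problem concrete, here is a minimal sketch that reproduces it. It assumes the editor does something equivalent to `textwrap.indent` before wrapping the script; the names `user_script`, `wrapped`, and `namespace`, and the added `return s` line, are made up for illustration.

```python
import textwrap

# The user's script: a multi-line string assigned to a name.
user_script = 's = """foo\nbar\nfoo2"""\n'

# Naive wrapping: indent every line with a tab and put it inside main().
wrapped = "def main():\n" + textwrap.indent(user_script, "\t") + "\treturn s\n"

namespace = {}
exec(wrapped, namespace)

# The continuation lines of the string picked up the tab as content.
print(repr(namespace["main"]()))  # 'foo\n\tbar\n\tfoo2'
```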

score 145 · Accepted Answer

textwrap.dedent from the standard library is there to automatically undo the wacky indentation.
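
A short sketch of how this is typically used. The backslash after the opening quotes is a common idiom rather than something this answer prescribes, and `main`, `text`, and `ragged` are illustrative names.

```python
import textwrap

def main():
    # The backslash after the opening quotes suppresses the first newline,
    # so every line of the literal carries the same indentation.
    text = textwrap.dedent("""\
        foo
        bar
        foo2""")
    return text

print(repr(main()))  # 'foo\nbar\nfoo2'

# Caveat: if the first line starts right after the quotes it has no
# indentation, the common margin is empty, and nothing is removed.
ragged = "foo\n    bar\n    foo2"
print(repr(textwrap.dedent(ragged)))  # 'foo\n    bar\n    foo2'
```

The second case is exactly the shape of the string in the question, so textwrap.dedent alone only helps if the first line is indented like the rest.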

answered 2009-09-11T19:02:18.523

score 62 · Accepted Answer

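From what I can see, a better answer here might be inspect.cleandoc, which does much of what textwrap.dedent does but also fixes the problem that textwrap.dedent has with the leading line.

The example below shows the differences: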

>>> import textwrap
>>> import inspect
>>> x = """foo bar
...     baz
...     foobar
...     foobaz
...     """
>>> inspect.cleandoc(x)
'foo bar\nbaz\nfoobar\nfoobaz'
>>> textwrap.dedent(x)
'foo bar\n    baz\n    foobar\n    foobaz\n'
>>> y = """
...     foo
...     bar
... """
>>> inspect.cleandoc(y)
'foo\nbar'
>>> textwrap.dedent(y)
'\nfoo\nbar\n'
>>> z = """\tfoo
... bar\tbaz
... """
>>> inspect.cleandoc(z)
'foo\nbar     baz'
>>> textwrap.dedent(z)
'\tfoo\nbar\tbaz\n'

Note that inspect.cleandoc also expands internal tabs to spaces. This may be inappropriate for one's use case, but it works fine for me.

score 20 · Accepted Answer

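What follows the first line of a multi-line string is part of the string, and the parser does not treat it as indentation. You may freely write: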

def main():
    """foo
bar
foo2"""
    pass

and it will do the right thing.

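On the other hand, that's not readable, and Python knows it. So if a docstring has whitespace at the start of its second line, that amount of whitespace is stripped off when you use help() to view the docstring. Thus help(main) and help(main2) below produce the same help text.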

def main2():
    """foo
    bar
    foo2"""
    pass
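
The claim can be checked without reading help() output by hand. This sketch uses inspect.getdoc, which passes the docstring through inspect.cleandoc; as far as I know that is the same cleanup help() ends up showing, but treat that link as an assumption.

```python
import inspect

def main():
    """foo
bar
foo2"""

def main2():
    """foo
    bar
    foo2"""

# Both docstrings come out identical once the indentation is cleaned up.
print(repr(inspect.getdoc(main)))   # 'foo\nbar\nfoo2'
print(repr(inspect.getdoc(main2)))  # 'foo\nbar\nfoo2'
```

Note that this cleanup applies to docstrings only. An ordinary string assigned to a variable keeps its indentation unless you strip it yourself.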

score 2 · Accepted Answer

To show the difference between textwrap.dedent and inspect.cleandoc a little more clearly:

Behaviour with the leading part not indented

import textwrap
import inspect

string1="""String
with
no indentation
       """
string2="""String
        with
        indentation
       """
print('string1 plain=' + repr(string1))
print('string1 inspect.cleandoc=' + repr(inspect.cleandoc(string1)))
print('string1 textwrap.dedent=' + repr(textwrap.dedent(string1)))
print('string2 plain=' + repr(string2))
print('string2 inspect.cleandoc=' + repr(inspect.cleandoc(string2)))
print('string2 textwrap.dedent=' + repr(textwrap.dedent(string2)))

Output

string1 plain='String\nwith\nno indentation\n       '
string1 inspect.cleandoc='String\nwith\nno indentation\n       '
string1 textwrap.dedent='String\nwith\nno indentation\n'
string2 plain='String\n        with\n        indentation\n       '
string2 inspect.cleandoc='String\nwith\nindentation'
string2 textwrap.dedent='String\n        with\n        indentation\n'

Behaviour with the leading part indented

string1="""
String
with
no indentation
       """
string2="""
        String
        with
        indentation
       """

print('string1 plain=' + repr(string1))
print('string1 inspect.cleandoc=' + repr(inspect.cleandoc(string1)))
print('string1 textwrap.dedent=' + repr(textwrap.dedent(string1)))
print('string2 plain=' + repr(string2))
print('string2 inspect.cleandoc=' + repr(inspect.cleandoc(string2)))
print('string2 textwrap.dedent=' + repr(textwrap.dedent(string2)))

Output

string1 plain='\nString\nwith\nno indentation\n       '
string1 inspect.cleandoc='String\nwith\nno indentation\n       '
string1 textwrap.dedent='\nString\nwith\nno indentation\n'
string2 plain='\n        String\n        with\n        indentation\n       '
string2 inspect.cleandoc='String\nwith\nindentation'
string2 textwrap.dedent='\nString\nwith\nindentation\n'

score 1 · Accepted Answer

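The only way I see is to strip the first n tabs from each line starting with the second, where n is the known indentation of the main method.

If that indentation is not known beforehand, you can add a trailing newline before inserting the string and strip the number of tabs found on the last line...

The third solution is to parse the data, find the beginning of a multi-line quote, and not add your indentation to any line after it until the quote is closed.

I think there is a better solution..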
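
The third option could be sketched with the standard tokenize module, which already knows where string literals start and end. This is only an illustration under stated assumptions: `indent_script` and its `prefix` parameter are hypothetical names, the source must tokenize cleanly, and multi-line f-strings on Python 3.12+ (which are no longer reported as a single STRING token) are not handled.

```python
import io
import tokenize

def indent_script(source, prefix="    "):
    """Indent every line except continuation lines of multi-line strings."""
    protected = set()
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        # A STRING token that ends on a later row than it starts is a
        # multi-line literal; every row after the first is string content.
        if tok.type == tokenize.STRING and tok.end[0] > tok.start[0]:
            protected.update(range(tok.start[0] + 1, tok.end[0] + 1))
    lines = source.splitlines(keepends=True)
    return "".join(
        line if lineno in protected or not line.strip() else prefix + line
        for lineno, line in enumerate(lines, start=1)
    )

user_script = 's = """foo\nbar\nfoo2"""\n'
wrapped = "def main():\n" + indent_script(user_script) + "    return s\n"
print(wrapped)
```

Because the string's continuation lines are never touched, the wrapped script evaluates to the same string the user typed.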

score 1 · Accepted Answer

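I wanted to preserve exactly what is between the triple-quote lines, removing only the common leading indent. I found that textwrap.dedent and inspect.cleandoc didn't do it quite right, so I wrote this one. It uses os.path.commonprefix.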

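import re
from os.path import commonprefix

def ql(s, eol=True):
    lines = s.splitlines()
    # Set the first line aside: it follows the opening quotes directly,
    # so it must not take part in the indentation calculation.
    l0 = None
    if lines:
        l0 = lines.pop(0) or None
    # commonprefix compares character by character on any list of strings.
    common = commonprefix(lines)
    # Keep only the whitespace at the start of the shared prefix.
    indent = re.match(r'\s*', common)[0]
    n = len(indent)
    lines2 = [line[n:] for line in lines]
    # Optionally drop the empty last line, i.e. the trailing newline.
    if not eol and lines2 and not lines2[-1]:
        lines2.pop()
    if l0 is not None:
        lines2.insert(0, l0)
    s2 = "\n".join(lines2)
    return s2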

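This can quote any string with any indent. I wanted it to include the trailing newline by default, but with an option to remove it so that it can quote any string neatly.

Example: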

print(ql(r"""
     Hello
    |\---/|
    | o_o |
     \_^_/
    """))

print(ql(r"""
         World
        |\---/|
        | o_o |
         \_^_/
    """))

The second string keeps 4 spaces of common indentation because the final """ is indented less than the quoted text:

 Hello
|\---/|
| o_o |
 \_^_/

     World
    |\---/|
    | o_o |
     \_^_/

I thought this was going to be simpler, otherwise I wouldn't have bothered with it!

score -15 · Accepted Answer

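So if I understand correctly, you take whatever the user inputs, indent it properly, and add it to the rest of your program (and then run the whole program).

So after you have put the user input into your program, you could run a regex that basically takes the forced indentation back out. Something like: within triple quotes, replace every "newline marker" followed by four spaces (or a tab) with just a "newline marker".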
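
A rough sketch of that regex idea follows. The names `TRIPLE_QUOTED` and `undo_indent_in_strings` are hypothetical, and the indentation is assumed to be exactly four spaces. The approach is fragile: it does not understand comments, escaped quotes, or triple quotes that appear inside other string literals, and it will also strip indentation the user typed on purpose.

```python
import re

# Match a triple-quoted literal (either quote style), body captured lazily.
TRIPLE_QUOTED = re.compile(r"(\"{3}|'{3})(.*?)\1", re.DOTALL)

def undo_indent_in_strings(code, indent="    "):
    """Remove one level of `indent` after each newline inside triple quotes."""
    def fix(match):
        quote, body = match.group(1), match.group(2)
        return quote + body.replace("\n" + indent, "\n") + quote
    return TRIPLE_QUOTED.sub(fix, code)

wrapped = 'def main():\n    s = """foo\n    bar\n    foo2"""\n    return s\n'
fixed = undo_indent_in_strings(wrapped)
print(fixed)
```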
