1

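What is the simplest way to "interpret" the formatting control characters in a string, so that the result looks the way it would if the string had been printed? For simplicity, I will assume the string contains no newlines.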

For example,

>>> sys.stdout.write('foo\br')

displays for, so

interpret('foo\br') should return 'for'

>>> sys.stdout.write('foo\rbar')

displays bar, so

interpret('foo\rbar') should return 'bar'


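I could write a regular-expression substitution here, but in the '\b' case it would have to be applied recursively until no occurrences remain. Doing it without recursion would be fairly complicated.

Is there a simpler way?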


3 Answers

1

Python does not have a built-in or standard library module for doing this. However, if you only care about simple control characters like \r, \b and \n, you can write a simple function to handle them:

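def interpret(text):
    lines = []
    current_line = []
    for char in text:
        if char == '\n':
            # finish the current line and start a new one
            lines.append(''.join(current_line))
            current_line = []
        elif char == '\r':
            # carriage return: discard what was written on this line
            current_line.clear()
            # del current_line[:]  # on Python < 3.3, which lacks list.clear()
        elif char == '\b':
            # backspace: drop the last character, if there is one
            del current_line[-1:]
        else:
            current_line.append(char)
    if current_line:
        # join the characters first; '\n'.join() needs strings, not lists
        lines.append(''.join(current_line))
    return '\n'.join(lines)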
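
One caveat about the function above: it treats \r as erasing the line and \b as deleting a character, which matches the two examples in the question. A terminal usually only moves the cursor, and later characters overwrite what is already there, so 'foo\rba' is displayed as 'bao'. Below is a minimal sketch of that cursor model, assuming a single line with no newlines or tabs. The name interpret_overwrite is made up for illustration.

```python
def interpret_overwrite(text):
    # Hypothetical helper: models one terminal line where CR and BS only
    # move the cursor and printable characters overwrite in place.
    cells = []   # characters currently on the line
    cursor = 0   # position where the next character will be written
    for char in text:
        if char == '\r':
            cursor = 0                      # back to the start of the line
        elif char == '\b':
            cursor = max(cursor - 1, 0)     # one step left, never below 0
        elif cursor < len(cells):
            cells[cursor] = char            # overwrite an existing cell
            cursor += 1
        else:
            cells.append(char)              # write past the end of the line
            cursor += 1
    return ''.join(cells)
```

With this model interpret_overwrite('foo\br') still gives 'for' and interpret_overwrite('foo\rbar') still gives 'bar', but a trailing backspace no longer removes anything.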

You can extend the function to handle any control character you want. For example, you might want to ignore control characters that are not actually displayed in a terminal (e.g. the bell, \a).
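
As a rough sketch of that kind of extension, the variant below skips a configurable set of characters before applying the same \n, \r and \b rules. The names IGNORED and interpret_ignoring are invented here, and the choice of which characters to ignore (bell and NUL) is only an assumption; adjust it for your terminal.

```python
# Assumption: these characters produce no visible output.
IGNORED = frozenset('\a\0')

def interpret_ignoring(text, ignored=IGNORED):
    lines = []
    current_line = []
    for char in text:
        if char in ignored:
            continue                        # not displayed, so skip it
        elif char == '\n':
            lines.append(''.join(current_line))
            current_line = []
        elif char == '\r':
            del current_line[:]             # works on old and new Pythons
        elif char == '\b':
            del current_line[-1:]           # no-op on an empty line
        else:
            current_line.append(char)
    # Always keep the last line, so a trailing '\n' is preserved.
    lines.append(''.join(current_line))
    return '\n'.join(lines)
```

For example, interpret_ignoring('fo\ao\br') returns 'for'.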

Answered 2014-09-17T05:24:12.033
1

If efficiency doesn't matter, a simple stack will do:

string = "foo\rbar\rbash\rboo\b\bba\br"

res = []
for char in string:
    if char == "\r":
        res.clear()
    elif char == "\b":
        if res: del res[-1]
    else:
        res.append(char)

"".join(res)
#>>> 'bbr'

Otherwise, I think this is about as fast as you can hope for in the complex case:

string = "foo\rbar\rbash\rboo\b\bba\br"

try:
    string = string[string.rindex("\r")+1:]
except ValueError:
    pass

split_iter = iter(string.split("\b"))
res = list(next(split_iter, ''))
for part in split_iter:
    if res: del res[-1]
    res.extend(part)

"".join(res)
#>>> 'bbr'

Note that I have not timed this.
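
If you want to check the speed claim yourself, something like the following should work. It wraps the two snippets above in functions (stack_version, split_version and time_both are names made up for this sketch) and times them with the standard timeit module. No measurements are given here; results will depend on your input and machine.

```python
import timeit

def stack_version(string):
    res = []
    for char in string:
        if char == "\r":
            res.clear()
        elif char == "\b":
            if res:
                del res[-1]
        else:
            res.append(char)
    return "".join(res)

def split_version(string):
    # Only the text after the last "\r" can survive.
    try:
        string = string[string.rindex("\r") + 1:]
    except ValueError:
        pass
    split_iter = iter(string.split("\b"))
    res = list(next(split_iter, ""))
    for part in split_iter:
        if res:
            del res[-1]          # each "\b" removes one earlier character
        res.extend(part)
    return "".join(res)

def time_both(data, number=100):
    # Returns {function name: total seconds for `number` calls}.
    return {
        func.__name__: timeit.timeit(lambda: func(data), number=number)
        for func in (stack_version, split_version)
    }

sample = "foo\rbar\rbash\rboo\b\bba\br"

if __name__ == "__main__":
    print(time_both(sample * 1000))
```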

Answered 2014-09-16T21:49:28.037
0

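Update: after 30 minutes of asking for clarification and example strings, it turned out that the question is actually quite different: "How do I repeatedly apply formatting control characters (backspace) to a Python string?" In that case, yes, you clearly need to apply the regex/function repeatedly until you stop getting matches. The solution: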

import re

def repeated_re_sub(pattern, sub, s, flags=re.U):
    """Match-and-replace repeatedly until we run out of matches..."""
    patc = re.compile(pattern, flags)

    sold = ''
    while sold != s:
        sold = s
        # debug trace; %r shows the control characters instead of printing them
        print("patc=>%s<    sold=>%r<   s=>%r<" % (patc, sold, s))
        s = patc.sub(sub, sold)
        #print(help(patc.sub))

    return s

# '\b' in a non-raw string is a literal backspace character, not a word boundary
print(repeated_re_sub('[^\b]\b', '', 'abc\b\x08de\b\bfg'))
#print(repeated_re_sub('.\b', '', 'abcd\b\x08e\b\bfg'))
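
A slightly tidier variant of the same idea, as a sketch only: re.subn returns the number of replacements, so the loop can stop when that count reaches zero instead of comparing strings. It also drops any backspaces left at the start of the string, which have nothing to erase; that is an assumption about the desired behaviour, and the name apply_backspaces is made up. In the worst case (a long run of characters followed by a long run of backspaces) each pass removes only one pair, so this can take quadratic time.

```python
import re

# One non-backspace character followed by one backspace.
# '\x08' is the backspace character itself, so no raw string is needed.
_ERASE = re.compile('[^\x08]\x08')

def apply_backspaces(s):
    while True:
        s, count = _ERASE.subn('', s)
        if count == 0:
            break
    # Any backspaces still present are at the very start of the string.
    return s.replace('\x08', '')
```

For the example string above, apply_backspaces('abc\b\x08de\b\bfg') returns 'afg'.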

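[Several earlier answers, which asked for clarification and pointed out that both re.sub(...) and string.replace(...) can be used to solve the problem non-recursively.]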

Answered 2014-09-16T20:57:35.923