69

I'm trying to get to grips with regular expressions in Python, and I want to wrap part of a URL in HTML so that it is highlighted. My input is

images/:id/size

my output should be

images/<span>:id</span>/size

If I do this in JavaScript

method = 'images/:id/size';
method = method.replace(/\:([a-z]+)/, '<span>$1</span>')
alert(method)

I get the desired result, but if I do this in Python

>>> method = 'images/:id/huge'
>>> re.sub('\:([a-z]+)', '<span>$1</span>', method)
'images/<span>$1</span>/huge'

I don't. How do I get Python to return the correct result rather than $1? Is re.sub even the right function for this?

4 Answers

117

Simply use \1 instead of $1:

In [1]: import re

In [2]: method = 'images/:id/huge'

In [3]: re.sub(r'(:[a-z]+)', r'<span>\1</span>', method)
Out[3]: 'images/<span>:id</span>/huge'

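Also note the use of raw strings (r'...') for the regular expression and the replacement. It is not mandatory, but it removes the need to escape backslashes, which arguably makes the code more readable.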
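
To illustrate the point about raw strings, here is a minimal sketch showing that the raw form and the hand-escaped form are the same string. The variable names are made up for this example.

```python
import re

method = 'images/:id/huge'

# The same replacement template written two ways.
raw_template = r'<span>\1</span>'       # raw string: backslash kept as typed
escaped_template = '<span>\\1</span>'   # regular string: backslash doubled by hand

# Both literals produce the identical string, so re.sub treats them alike.
assert raw_template == escaped_template

result = re.sub(r'(:[a-z]+)', raw_template, method)
print(result)  # images/<span>:id</span>/huge
```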

answered 2011-08-25T13:32:01.453
16

Use \1 instead of $1.

\number Matches the contents of the group of the same number.

http://docs.python.org/library/re.html#regular-expression-syntax
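
As a rough sketch of how \number behaves with more than one group (the two-group pattern below is only an illustration, not part of the question), each backreference picks out its group by position. The \g<number> form is equivalent and avoids ambiguity when a digit follows the reference.

```python
import re

path = 'images/:id/size'

# Two capture groups: group 1 is the parameter name, group 2 the last segment.
pattern = r':([a-z]+)/([a-z]+)'

# \1 and \2 refer to the groups by position.
swapped = re.sub(pattern, r'\2/:\1', path)
print(swapped)  # images/size/:id

# \g<1> is equivalent to \1. It matters when a digit follows: the template
# \10 would mean group 10, whereas \g<1>0 means group 1 then a literal 0.
numbered = re.sub(pattern, r':\g<1>0/\2', path)
print(numbered)  # images/:id0/size
```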

answered 2011-08-25T13:31:55.433
13

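The backreference to the whole match is \g<0>; see the re.sub documentation:

The backreference \g<0> substitutes in the entire substring matched by the RE.

See the Python demo: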

import re
method = 'images/:id/huge'
print(re.sub(r':[a-z]+', r'<span>\g<0></span>', method))
# => images/<span>:id</span>/huge

If you need to perform a case-insensitive search, add flags=re.I:

re.sub(r':[a-z]+', r'<span>\g<0></span>', method, flags=re.I)
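
A small sketch of what the flag changes, using an uppercase variant of the input (an assumption made only for this example):

```python
import re

method = 'images/:ID/huge'

# Without the flag, [a-z] does not match the uppercase "ID", so nothing changes.
unchanged = re.sub(r':[a-z]+', r'<span>\g<0></span>', method)
print(unchanged)  # images/:ID/huge

# With flags=re.I the same pattern matches regardless of case.
highlighted = re.sub(r':[a-z]+', r'<span>\g<0></span>', method, flags=re.I)
print(highlighted)  # images/<span>:ID</span>/huge
```

Passing flags by keyword is the safer habit here, because the fourth positional parameter of re.sub is count, not flags.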
answered 2019-01-17T11:47:39.657
5

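For the replacement portion, Python uses \1 the way sed and vi do, not $1 the way Perl, Java, and JavaScript (amongst others) do. Furthermore, because \1 in a regular string literal is interpolated as the character U+0001, you need to use a raw string or escape the backslash (\\1).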

Python 3.2 (r32:88445, Jul 27 2011, 13:41:33) 
[GCC 4.0.1 (Apple Inc. build 5465)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> method = 'images/:id/huge'
>>> import re
>>> re.sub('(:[a-z]+)', r'<span>\1</span>', method)
'images/<span>:id</span>/huge'
>>> 
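
To make the U+0001 point concrete, here is a minimal sketch of what happens when the raw-string prefix is forgotten. The variable names are illustrative only.

```python
import re

method = 'images/:id/huge'

# In a regular string literal, '\1' is an octal escape for the character U+0001.
assert '\1' == '\x01'
assert len('\1') == 1 and len(r'\1') == 2

# So the non-raw template carries a control character, not a backreference.
broken = re.sub(r'(:[a-z]+)', '<span>\1</span>', method)
print(repr(broken))  # 'images/<span>\x01</span>/huge'

# The raw string keeps the backslash, so re.sub sees a reference to group 1.
fixed = re.sub(r'(:[a-z]+)', r'<span>\1</span>', method)
print(fixed)  # images/<span>:id</span>/huge
```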
answered 2011-08-25T13:35:52.330