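I want to move the first occurrence of a date, or of a regular expression match in general, to the beginning of my text:

Example: "I went out on 1 sep 2012 and it was better than 15 jan 2012" should become "1 sep 2012, I went out on and it was better than 15 jan 2012"

I was thinking of replacing "1 sep 2012" with ",1 sep 2012," and then cutting the string at the ",", but I don't know what to write in place of replace_with: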

line = re.sub(r'\d+\s(?:jan|feb|mar|apr|may|jun|jul|aug|sep|oct|nov|dec)\s\d{4}', 'replace_with', line, 1)

Any help?

1 Answer

Use capture groups:

>>> import re
>>> s = "I went out on 1 sep 2012 and it was better than 15 jan 2012"
>>> r = re.compile('(^.*)(1 sep 2012 )(.*$)')
>>> r.sub(r'\2\1\3',s)
'1 sep 2012 I went out on and it was better than 15 jan 2012'

The parentheses capture parts of the string:

(^.*)          # Capture everything from the start of the string
(1 sep 2012 )  # up to the part we are interested in (also captured)
(.*$)          # Capture everything else

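Then just reorder the capture groups in the substitution: `\2\1\3`. Note that referencing the capture groups requires a raw string, r'\2\1\3'. The second group in my example is just the literal string (1 sep 2012 ), but of course it can be any regular expression, such as the one you created (with an extra \s at the end):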

(\d+\s(?:jan|feb|mar|apr|may|jun|jul|aug|sep|oct|nov|dec)\s\d{4}\s)

>>> r = re.compile(r'(^.*)(\d+\s(?:aug|sep|oct|nov)\s\d{4}\s)(.*$)')
>>> r.sub(r'\2\1\3',s)
'1 sep 2012 I went out on and it was better than 15 jan 2012'
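One caveat worth noting: `(^.*)` is greedy, so if several dates in the text satisfy the second group, it is the last one that gets moved, and `\d+` can be left with only the final digit of a two-digit day. The following is a minimal sketch of a lazy variant that targets the first occurrence instead. The function name `move_first_date_to_front` is hypothetical, the `", "` separator and the trailing `\s*` are assumptions chosen to reproduce the output asked for in the question, and the pattern is assumed to be applied to single-line text (add `re.DOTALL` otherwise).

```python
import re

# Date pattern taken from the question (without the extra trailing \s).
DATE = r'\d+\s(?:jan|feb|mar|apr|may|jun|jul|aug|sep|oct|nov|dec)\s\d{4}'

def move_first_date_to_front(text):
    """Move the first date in `text` to the front, followed by ', '.

    Hypothetical helper for illustration only.
    """
    # (.*?) is lazy, so it stops at the FIRST place a date can start.
    # \s* swallows the whitespace after the date so no double space remains.
    # If no date is found, re.sub returns the text unchanged.
    return re.sub(r'^(.*?)(' + DATE + r')\s*', r'\2, \1', text, count=1)

s = "I went out on 1 sep 2012 and it was better than 15 jan 2012"
print(move_first_date_to_front(s))
```

Because the pattern is anchored with `^`, it can match at most once, so `count=1` is only there to make the intent explicit.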

From docs.python.org:

When an 'r' or 'R' prefix is present, a character following a backslash is included in the string without change.
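To make the effect of the prefix concrete, here is a small sketch comparing the two spellings of the replacement string. The variable names `plain`, `raw`, `wrong` and `right` are illustrative only; the pattern and input are the ones used earlier in this answer.

```python
import re

s = "I went out on 1 sep 2012 and it was better than 15 jan 2012"
pattern = re.compile('(^.*)(1 sep 2012 )(.*$)')

# Without the r prefix, Python itself turns \2, \1 and \3 into octal
# escapes, i.e. three control characters, before re ever sees them.
plain = '\2\1\3'
# With the r prefix the backslashes survive, so re can read them as
# references to groups 2, 1 and 3.
raw = r'\2\1\3'

wrong = pattern.sub(plain, s)   # the whole match is replaced by control characters
right = pattern.sub(raw, s)     # the groups are reordered as intended

print(repr(wrong))
print(right)
```

An equivalent spelling without the prefix would be to double each backslash, as in '\\2\\1\\3'.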

Answered 2013-01-04T08:08:04.817