python - split() does not produce the expected result

Question

I have a problem with Python's split and cannot figure out what I am missing that keeps it from working. I have used similar splits before and they worked fine.

content = open(file).read()
Sep = content.split(r'Document [a-zA-Z0-9]{25}\n')

The file I am reading is very simple:

"I like coffee.

Document CLASSAR020181030eeat0000l

I like tea as well.

Document CLASSAR020181030eeat0000l

I like both coffee and tea."

score 3 · Accepted Answer

str.split() splits on a fixed delimiter string, not on a regular expression. You need to use re.split() instead.

import re
sep = re.split(r'Document [a-zA-Z0-9]{25}\n', content)
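To make the difference concrete, here is a small sketch that runs both calls on a string mirroring the sample file from the question. The inline `content` string is an assumption standing in for the file contents, with a blank line between each paragraph as shown in the question.

```python
import re

# Stand-in for the file contents shown in the question (assumed layout).
content = (
    "I like coffee.\n\n"
    "Document CLASSAR020181030eeat0000l\n\n"
    "I like tea as well.\n\n"
    "Document CLASSAR020181030eeat0000l\n\n"
    "I like both coffee and tea."
)

pattern = r'Document [a-zA-Z0-9]{25}\n'

# str.split searches for the pattern text literally, brackets and braces
# included, and that literal text never occurs in the content.
literal_parts = content.split(pattern)

# re.split interprets the same text as a regular expression.
regex_parts = re.split(pattern, content)

print(len(literal_parts))  # 1
print(len(regex_parts))    # 3
```

Since the literal separator is never found, str.split() returns a one-element list holding the whole string, which is why the original code appeared to do nothing.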

score 0 · Accepted Answer

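The error: regular expression syntax used with a string method

content is a string. Calling split on this variable invokes the str method, which expects a separator. That separator must be a fixed string; it is matched literally and is never interpreted as a regular expression.

Since you are using regular expression syntax, use the corresponding method from the re module instead: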

import re

with open(file, 'r') as fp:
    content = fp.read()

pattern = re.compile(r'Document \w{25}\n')
separated = pattern.split(content)
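One thing to keep in mind is that the pieces returned by the split still carry the blank lines that surrounded each "Document ..." line. If that matters, a possible follow-up is to strip each piece, as in the sketch below. The helper name `split_documents` and the inline `sample` string are hypothetical and only for illustration. Note also that `\w` matches underscores in addition to letters and digits, so it is slightly more permissive than `[a-zA-Z0-9]`.

```python
import re

# Hypothetical helper: split on the "Document <25 chars>" marker lines and
# drop the whitespace left around each piece.
def split_documents(text):
    pattern = re.compile(r'Document \w{25}\n')
    pieces = (part.strip() for part in pattern.split(text))
    # Skip pieces that are empty after stripping (e.g. a marker at the very end).
    return [piece for piece in pieces if piece]

# Stand-in for the file contents shown in the question (assumed layout).
sample = (
    "I like coffee.\n\n"
    "Document CLASSAR020181030eeat0000l\n\n"
    "I like tea as well.\n\n"
    "Document CLASSAR020181030eeat0000l\n\n"
    "I like both coffee and tea."
)

documents = split_documents(sample)
print(documents)
# ['I like coffee.', 'I like tea as well.', 'I like both coffee and tea.']
```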