python - Find a string between 2 tags in Python

Question

I am trying to read the content between 2 tags stored in a file; the content may span multiple lines. The tags can appear 0 or 1 times in the file.

For example, the file content could be:

title:Corruption Today: Corruption today in
content:Corruption Today: 
Corruption today in 
score:0.91750675

So when reading "Content:", my query should return "Corruption Today: Corruption today in". After some googling, I was able to write the following code:

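import re

# `files` holds the path of the file to read
with open(files, 'r') as myfile:
    filecontent = myfile.read()

# Offsets just past each 'content:' tag (8 == len('content:'))
startPtrs = [m.start() + 8 for m in re.finditer('content:', filecontent)]
startPtr = startPtrs[0]
# Offsets one character before each 'score:' tag (drops the preceding newline)
endPtrs = [m.start() - 1 for m in re.finditer('score:', filecontent)]
endPtr = endPtrs[0]

content = filecontent[startPtr:endPtr]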

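I am not sure how efficient the above code is, since we iterate over the file content twice to retrieve the content. Can something more efficient be done?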

score 0 · Accepted Answer

If you want to find a string between 2 substrings, you can use the re module:

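import re

# `files` is the path of the file to read, as in the question
with open(files, 'r') as myfile:
    filecontent = myfile.read()

# Non-greedy match of everything between 'content' and 'score'
results = re.compile('content(.*?)score', re.DOTALL | re.IGNORECASE).findall(filecontent)
print(results)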
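Applied to the sample file from the question, the same idea can be sketched as a small helper. The name `extract_between` and the whitespace normalization are assumptions for illustration, not part of the original answer. The pattern here includes the colons so they stay out of the captured group, and `re.search` is used because the tags appear at most once; the helper returns `None` when they are absent.

```python
import re

# Hypothetical helper, not from the original answer.
_PATTERN = re.compile(r'content:(.*?)score:', re.DOTALL | re.IGNORECASE)

def extract_between(text):
    """Return the text between 'content:' and 'score:', or None if absent."""
    match = _PATTERN.search(text)
    if match is None:
        return None
    # Collapse line breaks and repeated spaces into single spaces.
    return ' '.join(match.group(1).split())

sample = (
    "title:Corruption Today: Corruption today in\n"
    "content:Corruption Today: \n"
    "Corruption today in \n"
    "score:0.91750675\n"
)

print(extract_between(sample))                  # Corruption Today: Corruption today in
print(extract_between("title:only a title\n"))  # None
```

The pattern is compiled once and reused, and the text is scanned in a single pass instead of running two separate `finditer` loops.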

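Some explanation:

IGNORECASE, from the docs:

Perform case-insensitive matching; expressions like [A-Z] will match lowercase letters, too. This is not affected by the current locale.

DOTALL, from the docs: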

(Dot.) In the default mode, this matches any character except a newline. If the DOTALL flag has been specified, this matches any character including a newline.
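The effect of the two flags can be seen in a minimal sketch. The input string below is made up for illustration; it uses capitalized tags and a value that spans two lines.

```python
import re

# Made-up input: capitalized tags, content spanning two lines.
text = "Content:line one\nline two\nScore:1"

# No flags: 'content' does not match 'Content', so nothing is found.
no_flags = re.findall(r'content:(.*?)score:', text)

# IGNORECASE only: the tags match, but '.' cannot cross the newline.
ignorecase_only = re.findall(r'content:(.*?)score:', text, re.IGNORECASE)

# Both flags: '.' also matches newlines, so the match can span lines.
both = re.findall(r'content:(.*?)score:', text, re.DOTALL | re.IGNORECASE)

print(no_flags)         # []
print(ignorecase_only)  # []
print(both)             # ['line one\nline two\n']
```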

You can read about compile here.

You can also see some other solutions here.
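One regex-free possibility, offered as an illustration and not taken from the linked page, uses `str.find`. The function name `between` and its default tags are assumptions. The second search starts where the first one ended, so the text is not scanned from the beginning twice. Unlike the regex version, this sketch is case-sensitive.

```python
def between(text, start_tag="content:", end_tag="score:"):
    """Return the raw text between start_tag and end_tag, or None if either is missing."""
    start = text.find(start_tag)
    if start == -1:
        return None
    start += len(start_tag)
    # Resume searching after the start tag instead of from position 0.
    end = text.find(end_tag, start)
    if end == -1:
        return None
    return text[start:end]

sample = (
    "title:Corruption Today: Corruption today in\n"
    "content:Corruption Today: \n"
    "Corruption today in \n"
    "score:0.91750675\n"
)

print(repr(between(sample)))
print(between("no tags here"))  # None
```

The returned slice keeps the original line breaks; call `' '.join(result.split())` on it if a single-line string is wanted.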
