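Is there a way to scan an entire text document, find every occurrence of "lol", and replace it with the id value of the nearest preceding chapter tag? Maybe something like this.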

Python

x=open('source.txt')
lines = x.readlines()
for line in lines:
  if line.startswith('<text'):
    line.replace('lol', first previous chapter id value)
x.write(lines)
x.close()

Source text

<chapter id="1">
<text class="lol">
<text class="lol">
<chapter id="2">
<text class="lol">
<text class="lol">
<chapter id="3">
<text class="lol">
<text class="lol">
<chapter id="4">
<text class="lol">
<text class="lol">

Desired result

<chapter id="1">
<text class="1">
<text class="1">
<chapter id="2">
<text class="2">
<text class="2">
<chapter id="3">
<text class="3">
<text class="3">
<chapter id="4">
<text class="4">
<text class="4">

1 Answer

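Try this. Basically, the only extra work you need to do is keep track of the chapter id. I'm also assuming you know how to write to a file, so I just print each line.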

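import re

chapter_id = None  # id of the most recent <chapter> tag seen so far
with open('source.txt') as x:
    for line in x:
        if line.startswith('<chapter'):
            # Grab the string between the first pair of double quotes
            chapter_id = re.findall('"([^"]*)"', line)[0]
        elif line.startswith('<text') and chapter_id is not None:
            line = line.replace('lol', chapter_id)
        print(line.rstrip('\n'))  # drop the trailing newline; print adds its own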

Output:

<chapter id="1">
<text class="1">
<text class="1">
<chapter id="2">
<text class="2">
<text class="2">
<chapter id="3">
<text class="3">
<text class="3">
<chapter id="4">
<text class="4">
<text class="4">
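
If you would rather write the result to a file than print it, here is a minimal sketch of one way to do that. The names `replace_with_chapter_id` and `rewrite_file`, and the separate output path, are my own assumptions and not part of the question. It writes to a second file instead of overwriting `source.txt` in place, which seems safer while you are still testing.

```python
import re


def replace_with_chapter_id(lines, placeholder="lol"):
    """Yield each line, swapping `placeholder` in <text> lines for the
    id of the nearest preceding <chapter> tag (hypothetical helper)."""
    chapter_id = None  # no chapter seen yet
    for line in lines:
        if line.startswith("<chapter"):
            match = re.search(r'id="([^"]*)"', line)
            if match:
                chapter_id = match.group(1)
        elif line.startswith("<text") and chapter_id is not None:
            line = line.replace(placeholder, chapter_id)
        yield line


def rewrite_file(src_path, dst_path):
    """Read src_path line by line and write the replaced lines to dst_path."""
    with open(src_path) as src, open(dst_path, "w") as dst:
        dst.writelines(replace_with_chapter_id(src))
```

Calling `rewrite_file('source.txt', 'result.txt')` would then produce the file; `result.txt` is just a placeholder name. Lines that appear before the first chapter tag are passed through unchanged.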
answered 2012-08-10T17:08:52.803