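I want to search a document for a given string and get the context of each occurrence. For example, search the document for "Figure" and return the X characters that follow it (from "Figure-1 Super awesome figure. next sentence", return "-1 Super awesome figure").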

I know how to print: A) every instance of the string

mystring = "Figure"
with open('./mytext.txt', 'r') as searchfile:
    for line in searchfile:
        if mystring in line:
            print(mystring)

but that doesn't help; or B) every line that contains the string

for line in open('./mytext.txt', "r"):
    if "Figure" in line:
        print(line) 

This returns all of the text on the line, both before and after the string, which is cumbersome for my purposes.

Can I split a line at "mystring" and return the X characters after the split? Or is there a better way?

3 Answers

I would do it like this:

WANTED = 20 #or however many characters you want after 'Figure'

with open('mytext.txt') as searchfile:
    for line in searchfile:
        left,sep,right = line.partition('Figure')
        if sep: # True iff 'Figure' in line
            print(right[:WANTED])

See: str.partition
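
Note that str.partition splits only at the first occurrence, so a line that contains 'Figure' twice yields a single result. A minimal sketch for reporting every occurrence, assuming a hypothetical helper named contexts_after (not part of the code above) built on a str.find loop:

```python
def contexts_after(text, needle, wanted):
    """Return up to `wanted` characters following each occurrence of `needle`."""
    if not needle:
        # An empty needle would match at every position and never advance.
        raise ValueError("needle must be a non-empty string")
    results = []
    start = text.find(needle)
    while start != -1:
        end = start + len(needle)
        results.append(text[end:end + wanted])
        # Resume the search just past this occurrence.
        start = text.find(needle, end)
    return results


line = "Figure-1 Super awesome figure. See also Figure-2 for details."
print(contexts_after(line, "Figure", 10))
```

The search is case-sensitive, so the lowercase "figure" in the sample line is not reported. To use it on a file, call the helper on each line inside the same with/for loop as above.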

answered 2013-11-13T00:40:21.013

You can do it like this:

line = "Figure-1 Super awesome figure. next sentence."

search_line = line.split("Figure")

print(search_line)

# prints ['', '-1 Super awesome figure. next sentence.']

count = 0
for elem in search_line:
    count += len(elem)

print(count)  # how many chars after "Figure" (nothing precedes it on this line)

answered 2013-11-13T00:34:25.087

import re

X = len("-1 Super awesome figure")
regex = re.compile("Figure.{%d}" % X)
with open("mytext.txt") as searchfile:
    for line in searchfile:
        # Each match is "Figure" plus the X characters that follow it;
        # lines with fewer than X characters after "Figure" do not match.
        for m in regex.findall(line):
            print(m)

You may need to clarify what you mean by "return the X characters after that string".
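
If the goal is only the text after the search string, and shorter remainders should still be reported, one possible variant is sketched below. It is an assumption about what is wanted, not a definitive answer: the function name following_chars is hypothetical, re.escape is used in case the search string contains regex metacharacters, and a capturing lookahead with a {0,X} bound picks up the trailing characters without consuming them.

```python
import re


def following_chars(text, needle, x):
    """Return up to `x` characters after each occurrence of `needle` in `text`."""
    # re.escape keeps characters such as "." or "(" in `needle` literal.
    # The lookahead captures the following characters without consuming them,
    # so an occurrence that falls inside another one's context is still found.
    # "." does not match a newline, so each result stops at the end of its line.
    pattern = re.compile(re.escape(needle) + "(?=(.{0,%d}))" % x)
    return pattern.findall(text)


sample = "Figure-1 Super awesome figure. next sentence"
print(following_chars(sample, "Figure", 23))
```

Because "." stops at a newline, this can be applied either line by line or to the whole file contents read at once; in both cases the context never runs past the end of the line.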

answered 2013-11-13T00:38:26.797