How to store the text between a start word and an end word from a large text file (using Python and regular expressions).

This is part of the text file:

X_FUNCTION = linear
TITLE =
netlist_run
Vnet04  YUNITS = volts
+  0.000000000000000E+00 -4.000000000000000E-01  3.636363636363636E-02 -4.000000000000000E-01
+  7.272727272727272E-02 -4.000000000000000E-01  1.090909090909091E-01 -4.000000000000000E-01
+  1.454545454545454E-01 -4.000000000000000E-01  1.818181818181818E-01 -4.000000000000000E-01
+  2.181818181818182E-01 -4.000000000000000E-01  2.545454545454546E-01 -4.000000000000000E-01
+  2.909090909090910E-01 -4.000000000000000E-01  3.272727272727273E-01 -4.000000000000000E-01
Vnet05  YUNITS = volts
+  0.000000000000000E+00  3.000000000000000E+00  3.636363636363636E-02  3.000000000000000E+00
+  7.272727272727272E-02  3.000000000000000E+00  1.090909090909091E-01  3.000000000000000E+00
+  1.454545454545454E-01  3.000000000000000E+00  1.818181818181818E-01  3.000000000000000E+00
+  2.181818181818182E-01  3.000000000000000E+00  2.545454545454546E-01  3.000000000000000E+00
+  2.909090909090910E-01  3.000000000000000E+00  3.272727272727273E-01  3.000000000000000E+00
vbs_i  YUNITS = amps
+  0.000000000000000E+00  3.881535006369462E-12  3.636363636363636E-02  3.958355883215995E-12
+  7.272727272727272E-02  4.155732392087960E-12  1.090909090909091E-01  4.661608907762973E-12
+  1.454545454545454E-01  5.953136322408749E-12  1.818181818181818E-01  9.230381781895836E-12
+  2.181818181818182E-01  1.746801289794467E-11  2.545454545454546E-01  3.787865538450135E-11
+  2.909090909090910E-01  8.739483655864867E-11  3.272727272727273E-01  2.040272699537106E-10

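I want to save the data from the line starting with Vnet04 YUNITS = volts up to, but not including, the line starting with Vnet05 YUNITS = volts in an object, say a. Then I want to save the data from the line starting with Vnet05 YUNITS = volts up to, but not including, the line starting with vbs_i YUNITS = amps in an object b.

My text file has more than 1000k lines, so I want to parse it only once.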

2 Answers

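1) Write a regular expression that matches your "start" and "stop" lines.

See http://docs.python.org/2/library/re.html#examples. As a commenter mentioned, you may not need regular expressions for this.

2) Read the file line by line and compare each line with the start and stop lines, using the result to set a state variable to true or false.

3) If the state variable is true, use append to add the line to an array.

When the end of the file is reached, the array should contain the lines of interest.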
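
The three steps above can be sketched as follows. This is only a minimal illustration, not a definitive implementation: the function name extract_block, the variable collecting, and the shortened sample lines are assumptions made for the example. In practice an open file object would be passed in place of the list, so the file is still read only once.

```python
import re

def extract_block(lines, start_pattern, stop_pattern):
    """Return the lines from the first start match up to, but not including, the stop match."""
    start_re = re.compile(start_pattern)
    stop_re = re.compile(stop_pattern)
    collecting = False  # the state variable from step 2
    block = []
    for line in lines:
        if not collecting and start_re.match(line):
            collecting = True   # start line found: begin storing
        elif collecting and stop_re.match(line):
            break               # stop line found: do not store it, stop reading
        if collecting:
            block.append(line.rstrip("\n"))  # step 3
    return block

# Shortened sample in the same layout as the file in the question (assumed data).
sample = [
    "TITLE =\n",
    "netlist_run\n",
    "Vnet04  YUNITS = volts\n",
    "+  0.000000000000000E+00 -4.000000000000000E-01\n",
    "+  7.272727272727272E-02 -4.000000000000000E-01\n",
    "Vnet05  YUNITS = volts\n",
    "+  0.000000000000000E+00  3.000000000000000E+00\n",
    "vbs_i  YUNITS = amps\n",
    "+  0.000000000000000E+00  3.881535006369462E-12\n",
]

# \s+ is used because the header in the file has two spaces after the name.
a = extract_block(sample, r"Vnet04\s+YUNITS", r"Vnet05\s+YUNITS")
```

Because of the break, this version returns only the first block. To fill several objects in one pass, the flag could be reset at each stop line instead of leaving the loop.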

answered 2013-04-15T11:40:32.417

Here is the code:

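#!/usr/bin/env python3

import re

# Header lines such as "Vnet04  YUNITS = volts" start a new object.
# \s+ allows the two spaces that follow the name in the file.
header = re.compile(r'(Vnet04|Vnet05|vbs_i)\s+YUNITS\s*=')

array = []   # one list of lines per object
array2 = []  # lines of the object currently being read
with open("test.txt", "r") as ins:
    for line in ins:
        if header.match(line):
            # A new object starts: store the previous one, if there is one.
            if array2:
                array.append(array2)
            array2 = [line]
        elif line.startswith('+') and array2:
            # Data line belonging to the current object.
            array2.append(line)

# The last object has no header after it, so store it here.
if array2:
    array.append(array2)
for obj in array:
    print("Object :")
    print(obj)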
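
If the objects need to be looked up by name afterwards (a for Vnet04, b for Vnet05), the same single pass can fill a dictionary instead of a list of lists. The sketch below is a variation on the approach in this answer, not part of it: the name group_by_header, the generic header pattern, and the shortened sample lines are assumptions for illustration.

```python
import re

# Assumed header layout: a name, whitespace, then "YUNITS =", as in the question's file.
HEADER_RE = re.compile(r"(\S+)\s+YUNITS\s*=")

def group_by_header(lines):
    """Map each header name to its header line plus the '+' lines that follow it."""
    groups = {}
    current = None  # name of the object currently being read
    for line in lines:
        m = HEADER_RE.match(line)
        if m:
            current = m.group(1)
            groups[current] = [line.rstrip("\n")]
        elif line.startswith("+") and current is not None:
            groups[current].append(line.rstrip("\n"))
    return groups

# Shortened sample (assumed data); an open file object can be passed the same way.
sample = [
    "X_FUNCTION = linear\n",
    "Vnet04  YUNITS = volts\n",
    "+  0.000000000000000E+00 -4.000000000000000E-01\n",
    "Vnet05  YUNITS = volts\n",
    "+  0.000000000000000E+00  3.000000000000000E+00\n",
    "+  7.272727272727272E-02  3.000000000000000E+00\n",
    "vbs_i  YUNITS = amps\n",
    "+  0.000000000000000E+00  3.881535006369462E-12\n",
]

groups = group_by_header(sample)
a = groups["Vnet04"]
b = groups["Vnet05"]
```

Lines before the first header, such as X_FUNCTION = linear, are skipped because no object is open yet.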
answered 2013-04-15T12:09:02.283