I am reading text from a text file and then reformatting it to write to a different text file.

The text I am reading, from testFile.txt, looks like this:

                  *******************************
                  *  Void Fractions in the Bed  *
                  *******************************

     Z(m)    MIN.FLUIDIZ.  EMULSION    TOTAL

0.0000E+00  0.4151E+00  0.8233E+00  0.8233E+00
0.1000E-09  0.4151E+00  0.8233E+00  0.8233E+00
0.1000E-05  0.4151E+00  0.8233E+00  0.8233E+00
0.2000E-05  0.4151E+00  0.8233E+00  0.8233E+00
0.1251E+01  0.4151E+00  0.9152E+00  0.9152E+00
0.1301E+01  0.4151E+00  0.9152E+00  0.9152E+00
0.1333E+01  0.4151E+00  0.9152E+00  0.9152E+00


               *************************************
               *  Void Fractions in the Freeboard  *
               *************************************

     Z(m)    VOID FRACTION

0.1333E+01  0.9992E+00
0.1333E+01  0.9992E+00
0.1333E+01  0.9992E+00
0.1333E+01  0.9992E+00
0.3533E+01  0.9992E+00
0.3633E+01  0.9992E+00
0.3733E+01  0.9992E+00
0.3833E+01  0.9992E+00
0.3933E+01  0.9992E+00
0.4000E+01  0.9992E+00


           *********************************************
           *  Superficial Velocities in the Bed (m/s)  *
           *********************************************

     Z(m)    MIN.FLUIDIZ.  ACTUAL

0.0000E+00  0.1235E+00  0.4911E+01
0.1000E-09  0.1235E+00  0.4911E+01
0.1000E-05  0.1235E+00  0.4911E+01
0.2000E-05  0.1235E+00  0.4911E+01
0.3000E-05  0.1235E+00  0.4911E+01
0.1151E+01  0.1235E+00  0.4915E+01
0.1201E+01  0.1235E+00  0.4915E+01
0.1251E+01  0.1235E+00  0.4915E+01
0.1301E+01  0.1235E+00  0.4915E+01
0.1333E+01  0.1235E+00  0.4915E+01

Below is my Python code for parsing the text file:

openFile = open('testFile.txt','r')

groupOneFile = open('groupOneFile.csv','w')
groupTwoFile = open('groupTwoFile.csv','w')
groupThreeFile = open('groupThreeFile.csv','w')

idx = 0;
firstIdx = 0;
secondIdx = 0;
thirdIdx = 0;

for line in openFile:

    # first group
    if '*  Void Fractions in the Bed  *' in line:
        print line
        firstIdx = idx

    if idx in range(firstIdx+5,firstIdx+43):
        line = line.lstrip()
        line = line.replace('  ',',')
        groupOneFile.write(line)

    # second group
    if '*  Void Fractions in the Freeboard  *' in line:
        print line
        secondIdx = idx

    if idx in range(secondIdx+5,secondIdx+43):
        line = line.lstrip()
        line = line.replace('  ',',')
        groupTwoFile.write(line)        

    # third group
    if '*  Superficial Velocities in the Bed (m/s)  *' in line:
        print line
        thirdIdx = idx

    if idx in range(thirdIdx+5,thirdIdx+43):
        line = line.lstrip()
        line = line.replace('  ',',')
        groupThreeFile.write(line)

    idx += 1

openFile.close()

groupOneFile.close()
groupTwoFile.close()
groupThreeFile.close()

groupOneFile should contain the following data:

0.0000E+00,0.4151E+00,0.8233E+00,0.8233E+00
0.1000E-09,0.4151E+00,0.8233E+00,0.8233E+00
0.1000E-05,0.4151E+00,0.8233E+00,0.8233E+00
0.2000E-05,0.4151E+00,0.8233E+00,0.8233E+00
0.1251E+01,0.4151E+00,0.9152E+00,0.9152E+00
0.1301E+01,0.4151E+00,0.9152E+00,0.9152E+00
0.1333E+01,0.4151E+00,0.9152E+00,0.9152E+00

groupTwoFile should contain the following:

0.1333E+01,0.9992E+00
0.1333E+01,0.9992E+00
0.1333E+01,0.9992E+00
0.1333E+01,0.9992E+00
0.3533E+01,0.9992E+00
0.3633E+01,0.9992E+00
0.3733E+01,0.9992E+00
0.3833E+01,0.9992E+00
0.3933E+01,0.9992E+00
0.4000E+01,0.9992E+00

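And so on for groupThreeFile.

Reading the main text file and writing the data to the other files works fine. The problem is that the data written to groupOneFile is also written at the beginning of the other files, groupTwoFile and groupThreeFile. How can I prevent this?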

4 Answers

To make it work, you can initialize the indices to a large value:

firstIdx = 1000000
secondIdx = 1000000
thirdIdx = 1000000

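The problem is that if you set them to 0, the lines near the top of the file fall within the range of every group whose header has not been read yet.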
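As a rough illustration of that point, the sketch below replays the range test from the question with both starting values. The helper name in_window and the constant SENTINEL are made up for this example and do not appear in the original code.

```python
def in_window(idx, header_idx):
    """Same test as `idx in range(header_idx + 5, header_idx + 43)`."""
    return header_idx + 5 <= idx < header_idx + 43

# Header indices initialised to 0: line index 5 already falls inside the
# window of every group, even though no header has been matched yet.
zero_init = [in_window(5, start) for start in (0, 0, 0)]

# Header indices initialised to a large sentinel: nothing near the top of
# the file matches until a real header position replaces the sentinel.
SENTINEL = 1000000
sentinel_init = [in_window(5, start) for start in (SENTINEL,) * 3]

print(zero_init)      # [True, True, True]
print(sentinel_init)  # [False, False, False]
```

The sentinel only has to be larger than the number of lines in the file, so 1000000 is an arbitrary choice here.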

Note, however, that this code is quite inefficient. A better approach could be:

idx = 0
outputFile = None

for line in openFile:
    # A section header switches the output file and restarts the line count
    if '*  Void Fractions in the Bed  *' in line:
        idx = 0
        outputFile = groupOneFile
    elif '*  Void Fractions in the Freeboard  *' in line:
        idx = 0
        outputFile = groupTwoFile
    elif '*  Superficial Velocities in the Bed (m/s)  *' in line:
        idx = 0
        outputFile = groupThreeFile

    # Data rows sit 5 to 42 lines below the header line
    if outputFile and 5 <= idx < 43:
        line = line.lstrip()
        line = line.replace('  ',',')
        outputFile.write(line)

    idx = idx + 1

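In Python 2.x, writing if x in range(a, b): builds an actual list of all the integers from a to b-1 and scans it element by element every time the test is executed. (Python 3 works out integer membership in a range arithmetically, but the comparison form is still the more direct one.) It is better to write the test as if a <= x < b:.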

Also note that 2.5 in range(0, 10) returns False (while 0 <= 2.5 < 10 is of course True).
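A small sketch of the difference between the two tests, using arbitrary bounds a and b chosen only for this example:

```python
a, b = 0, 10

# For integers, both tests agree.
int_range = 3 in range(a, b)
int_chain = a <= 3 < b

# A non-integer inside the interval is not a member of the range, because
# range(a, b) only contains the integers a, a+1, ..., b-1.
float_range = 2.5 in range(a, b)
float_chain = a <= 2.5 < b

print(int_range, int_chain)      # True True
print(float_range, float_chain)  # False True
```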

Python has no switch statement (the match statement only arrived in Python 3.10), but you can build a dispatch table instead:

filemap = [('*  Void Fractions in the Bed  *', groupOneFile),
           ('*  Void Fractions in the Freeboard  *', groupTwoFile),
           ('*  Superficial Velocities in the Bed (m/s)  *', groupThreeFile)]

idx = 0
outputFile = None
for line in openFile:
    for tag, target in filemap:
        if tag in line:
            # Found a new group header, switch output file
            idx = 0
            outputFile = target
            break
    if outputFile and 5 <= idx < 43:
        outputFile.write(line.lstrip().replace('  ', ','))
    idx += 1

If an exact match is possible (rather than an in test), this can be done even better with a dictionary:

filemap = {'*  Void Fractions in the Bed  *': groupOneFile,
           '*  Void Fractions in the Freeboard  *': groupTwoFile,
           '*  Superficial Velocities in the Bed (m/s)  *': groupThreeFile}

idx = 0
outputFile = None
for line in openFile:
    f = filemap.get(line.strip())
    if f:
        # Found a new group header, switch output file
        idx = 0
        outputFile = f
    if outputFile and 5 <= idx < 43:
        outputFile.write(line.lstrip().replace('  ', ','))
    idx += 1
answered 2013-11-13T00:02:21.933

You asked for my suggestion, so here it is:

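from itertools import groupby, product

groups = {'*  Void Fractions in the Bed  *': 'groupOneFile.csv',
          '*  Void Fractions in the Freeboard  *': 'groupTwoFile.csv',
          '*  Superficial Velocities in the Bed (m/s)  *': 'groupThreeFile.csv'}

fname = None

with open('testFile.txt','r') as fin:
    # Banner, title and blank lines all start with whitespace; data rows do not,
    # so groupby splits the file into alternating non-data and data runs
    for k, group in groupby(fin, lambda x:x[0].isspace()):
        if k:
            # Non-data run: look for a known section title to pick the output file
            for i, g in product(group, groups):
                if g in i:
                    fname = groups[g]
                    break
        else:
            # Data run: join the whitespace-separated fields with commas
            with open(fname, 'w') as fout:
                fout.writelines(','.join(s.split())+'\n' for s in group)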
answered 2013-11-13T00:23:46.127

secondIdx and thirdIdx start at 0, which means that if idx in range(secondIdx+5,secondIdx+43): fires on lines near the top of the file.

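To fix this, you can either rewrite the code as a more stateful setup (once you read Void Fractions in the Bed, you write to the first file until you find a new header, and so on) or simply initialize your Idx variables to -100 or so.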
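One possible shape for the stateful version is sketched below. It is only an illustration under some assumptions that the question does not confirm: each table ends at the first blank line after its data rows, and every data row starts with a digit. The names HEADERS, split_tables and write_tables are invented for this example.

```python
HEADERS = {
    'Void Fractions in the Bed': 'groupOneFile.csv',
    'Void Fractions in the Freeboard': 'groupTwoFile.csv',
    'Superficial Velocities in the Bed (m/s)': 'groupThreeFile.csv',
}

def split_tables(lines, headers=HEADERS):
    """Return {filename: [csv_row, ...]} for every known table in lines."""
    tables = {}
    current = None  # rows of the table being read, or None between tables
    for line in lines:
        text = line.strip()
        title = text.strip('*').strip()
        if text.startswith('*') and title in headers:
            # Known section title: start collecting rows for its file.
            current = tables.setdefault(headers[title], [])
        elif current is not None and text[:1].isdigit():
            # Data row (assumed to start with a digit): fields become CSV.
            current.append(','.join(text.split()))
        elif current is not None and current and not text:
            # Blank line after at least one data row: the table is over.
            current = None
    return tables

def write_tables(tables):
    """Write each collected table to its own file."""
    for filename, rows in tables.items():
        with open(filename, 'w') as out:
            out.write('\n'.join(rows) + '\n')

sample = [
    '      *******************************\n',
    '      *  Void Fractions in the Bed  *\n',
    '      *******************************\n',
    '\n',
    '     Z(m)    MIN.FLUIDIZ.  EMULSION    TOTAL\n',
    '\n',
    '0.0000E+00  0.4151E+00  0.8233E+00  0.8233E+00\n',
    '0.1000E-09  0.4151E+00  0.8233E+00  0.8233E+00\n',
    '\n',
    '\n',
    '   *************************************\n',
    '   *  Void Fractions in the Freeboard  *\n',
    '   *************************************\n',
    '\n',
    '     Z(m)    VOID FRACTION\n',
    '\n',
    '0.1333E+01  0.9992E+00\n',
]
result = split_tables(sample)
print(result)
```

For the real file, something like write_tables(split_tables(open('testFile.txt'))) would produce the three CSV files. Because the state is a single "current table" variable rather than three line indices, rows from one section cannot leak into another file, and the hard-coded 5 and 43 offsets are no longer needed.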

answered 2013-11-13T00:02:50.137
with open("testFile.txt") as f:
  lines = list(f)

firstIdx = secondIdx = thirdIdx = None
for x, line in enumerate(lines):
  if "*  Void Fractions in the Bed  *" in line:
    firstIdx = x
  elif "*  Void Fractions in the Freeboard  *" in line:
    secondIdx = x
  elif "*  Superficial Velocities in the Bed (m/s)  *" in line:
    thirdIdx = x

def write_lines(start, end, filename):
  with open(filename, "w") as f:
    for line in lines[start:end]:
      f.write(line.replace("  ", ","))

if firstIdx is not None:
  write_lines(firstIdx + 5, firstIdx + 43, "groupOneFile.csv")
if secondIdx is not None:
  write_lines(secondIdx + 5, secondIdx + 43, "groupTwoFile.csv")
if thirdIdx is not None:
  write_lines(thirdIdx + 5, thirdIdx + 43, "groupThreeFile.csv")
answered 2013-11-13T00:08:21.567