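Hi, I have a quick question about finding the line number of a particular line in my text file and then using that line number to compute something.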

(This is not a homework question; I have just started learning Python.)

For example, if my text file looks like

 100 200 300
 400 500 600 
 700 800 900
 120 130 140
 150 160 170

f1 = open('sample4.txt','r')

line_num = 0
search_phrase = "100"

for line in f1.readlines():
    line_num += 1
    if line.find(search_phrase) >= 0:
        x = line_num
        print(x)

import numpy
data = numpy.loadtxt('sample4.txt')
print(data[x:x+3,1].sum())

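I get

1430.0, which is 500 + 800 + 130 (x is a 1-based line number but is used as a 0-based array index, so the row starting with 100 is skipped)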

However, if my text file looks like this:

apple is good
i dont like apple
100 200 300 
400 500 600 
700 800 900
120 130 140 
150 160 170
i love orange

an error pops up saying

 Traceback (most recent call last):
 File "C:/Python33/sample4.py", line 13, in <module>
 data = numpy.loadtxt('sample4.txt')
 File "C:\Python33\lib\site-packages\numpy\lib\npyio.py", line 827, in loadtxt
 items = [conv(val) for (conv, val) in zip(converters, vals)]
 File "C:\Python33\lib\site-packages\numpy\lib\npyio.py", line 827, in <listcomp>
 items = [conv(val) for (conv, val) in zip(converters, vals)]
 ValueError: could not convert string to float: b'apple'

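I think this error comes from NumPy, since numpy.loadtxt tries to convert every line, including the text ones, to floats.

Is there any way to make this work without using something like skip_header or skip_footer?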

1 Answer

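It seems loadtxt can take a file handle as input, so one (possibly ugly) trick would be to first find the line of the text you are interested in, then reopen the file, read past the first few uninteresting lines, and then pass the file handle to loadtxt (untested):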

import numpy as np

fname = 'sample4.txt'
search_phrase = '100'

with open(fname) as fid:
    for linenum, line in enumerate(fid):
        if search_phrase in line:
            break #if the n-th line is interesting, linenum = n-1

#reopen file
with open(fname) as fid:
    for i in range(linenum): #range, since the question uses Python 3 (xrange is Python 2 only)
        fid.readline() #throw away uninteresting lines
    data = np.loadtxt(fid) #pass file handle

print(data[:3,1].sum()) #interesting stuff is now in first row
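One caveat with the file-handle trick: loadtxt reads to the end of the file, so with the second sample the trailing "i love orange" line would still raise a ValueError. Here is a minimal sketch of one way around that, assuming the numeric block is contiguous and ends at the first line that does not parse as numbers. The helper names `is_numeric_row` and `load_block` are made up for illustration, and the sample file is written to a temporary directory only so the snippet runs on its own:

```python
import itertools
import os
import tempfile

import numpy as np

SAMPLE = """apple is good
i dont like apple
100 200 300
400 500 600
700 800 900
120 130 140
150 160 170
i love orange
"""


def is_numeric_row(line):
    """Return True if the line is non-empty and every token parses as a float."""
    tokens = line.split()
    if not tokens:
        return False
    try:
        for token in tokens:
            float(token)
    except ValueError:
        return False
    return True


def load_block(fname, search_phrase):
    """Load the numeric rows starting at the first line containing search_phrase."""
    with open(fname) as fid:
        # Drop everything before the first line that contains the phrase.
        rest = itertools.dropwhile(lambda line: search_phrase not in line, fid)
        # Keep lines only while they still look like rows of numbers.
        block = list(itertools.takewhile(is_numeric_row, rest))
    # loadtxt also accepts a list of strings instead of a file name.
    return np.loadtxt(block)


# Hypothetical demo file, recreating the second sample from the question.
tmpdir = tempfile.mkdtemp()
path = os.path.join(tmpdir, "sample4.txt")
with open(path, "w") as f:
    f.write(SAMPLE)

data = load_block(path, "100")
print(data[:3, 1].sum())  # 200 + 500 + 800 -> prints 1500.0
```

This reads the file only once, and because the matching line becomes row 0 of `data`, the slice `data[:3, 1]` starts at the row containing the search phrase.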

But what is wrong with using skiprows? The second part could then be changed to

#get linenum as before
data = np.loadtxt(fname, skiprows = linenum)
print(data[:3,1].sum()) #interesting stuff is now in first row
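With the second sample, skiprows alone still runs into the trailing text line. If the number of rows you want is known in advance, one option is to also pass max_rows, which is available in NumPy 1.16 and later. A rough sketch, where `find_line` is a hypothetical helper and the sample file is written to a temporary directory just so the example is self-contained:

```python
import os
import tempfile

import numpy as np

SAMPLE = """apple is good
i dont like apple
100 200 300
400 500 600
700 800 900
120 130 140
150 160 170
i love orange
"""


def find_line(fname, phrase):
    """Return the 0-based index of the first line containing phrase."""
    with open(fname) as fid:
        for linenum, line in enumerate(fid):
            if phrase in line:
                return linenum
    raise ValueError("phrase not found: %r" % phrase)


# Hypothetical demo file, recreating the second sample from the question.
tmpdir = tempfile.mkdtemp()
path = os.path.join(tmpdir, "sample4.txt")
with open(path, "w") as f:
    f.write(SAMPLE)

start = find_line(path, "100")  # 2: two text lines come before the numbers
# Skip the leading text lines and read only the three rows of interest,
# so the trailing text line is never parsed (max_rows needs NumPy >= 1.16).
data = np.loadtxt(path, skiprows=start, max_rows=3)
print(data[:, 1].sum())  # 200 + 500 + 800 -> prints 1500.0
```

Returning the index from a function, rather than relying on the loop variable after a break, also makes the "phrase not found" case explicit instead of silently using the last line.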
answered 2013-08-06T19:00:23.880