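When working with text files in Python, the .tell() method does not seem to be very reliable. I am trying to use it as a substitute for the EOF condition found in other programming languages.
For various reasons, I do not want to iterate over the text file with a FOR loop; I want to use a WHILE loop instead.
Below is some code that reproduces the problem. I have included code that generates the test.txt text file in a random fashion: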
import re
from random import randint


def file_len_lines(f_name):
    with open(f_name) as f:
        for i, l in enumerate(f):
            pass
    return i + 1


def file_len_chars(f_name, with_nls):
    char_count = 0
    with open(f_name) as f:
        for line in f:
            char_count += len(line)
            if with_nls:
                char_count += 1
            else:
                pass
    return char_count


def trim(sut):
    return re.sub(' +', ' ', sut).strip()


# Create test file
with open("test.txt", "w") as f:
    word_list = ("Betty Eats Cakes And Uncle Sells Eggs "*20).split()
    word_list[3] = ""
    # for num in range(len(word_list)):
    #     if randint(1, 2) == 1:
    #         word_list[num] = ""
    for word in word_list:
        print(word, file=f)

file_to_read = 'test.txt'
# file_to_read = 'Fibonacci Tree 01.log'

with open(file_to_read, "r") as f:
    count = 0
    file_length = file_len_chars(file_to_read, True)
    file_length_lines = file_len_lines(file_to_read)
    print(f"Lines in file = {file_length_lines}, Characters in file = {file_length}")
    f.seek(0)
    while f.tell() < file_length:
        count += 1
        text_line = f.readline()
        print(f"Line = {count}, ", end="")
        print(f"Tell = {f.tell()}, ", end="")
        print(f"Length {len(text_line)} ", end="")
        if text_line in ['', '\n']:
            print(count)
        elif trim(text_line).upper()[0] in "A E I O U".split():
            print(text_line, end='')
        else:
            print(count)
This code should always output something like the following:
Lines in file = 140, Characters in file = 897
Line = 1, Tell = 7, Length 6 1
Line = 2, Tell = 13, Length 5 Eats
Line = 3, Tell = 20, Length 6 3
...
Line = 138, Tell = 884, Length 6 Uncle
Line = 139, Tell = 891, Length 6 139
Line = 140, Tell = 897, Length 5 Eggs
Process finished with exit code 0
But instead, most of the time it outputs something like:
Lines in file = 140, Characters in file = 605
Line = 1, Tell = 7, Length 6 1
Line = 2, Tell = 18446744073709551630, Length 5 Eats
Process finished with exit code 0
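As you can see in the last line of the output above, the value returned by .tell() becomes garbage, and the loop does not run through all 140 lines.
I am looking for a way to make the .tell() method behave, or for another way to detect the EOF condition so that I can break out of the WHILE loop.
Again, most of the advice I have found online says "iterate with a FOR loop". For various reasons that are hard to explain, I do not want to do that. (In short, I intend to follow a nested flowchart, and a FOR loop would make my original code very clumsy.)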
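For reference, here is a minimal sketch of two WHILE-loop conditions that do not depend on .tell() in text mode. It relies on two documented behaviours: readline() returns an empty string only at end of file (a blank line comes back as '\n'), and in text mode tell() returns an opaque number rather than a byte offset, whereas in binary mode it is a plain byte position that can be compared with the file size. The function names read_lines_until_eof and read_lines_with_tell and the file name sample.txt are hypothetical, chosen only for illustration, and the binary variant assumes the file decodes with the default UTF-8 codec.

```python
import os
import tempfile


def read_lines_until_eof(path):
    """Collect lines with a while loop; readline() returns '' only at EOF."""
    lines = []
    with open(path, "r") as f:
        while True:
            text_line = f.readline()
            if text_line == "":  # empty string marks end of file, not a blank line
                break
            lines.append(text_line)
    return lines


def read_lines_with_tell(path):
    """Collect lines using tell() as the loop condition, in binary mode."""
    size = os.path.getsize(path)  # file size in bytes
    lines = []
    with open(path, "rb") as f:
        # In binary mode tell() is a byte offset, so this comparison is meaningful.
        while f.tell() < size:
            lines.append(f.readline().decode())  # assumes UTF-8 content
    return lines


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        sample = os.path.join(tmp, "sample.txt")  # hypothetical file name
        with open(sample, "w") as f:
            f.write("Betty\nEats\n\nUncle\n")
        print(len(read_lines_until_eof(sample)))
        print(len(read_lines_with_tell(sample)))
```

Note that the binary-mode variant keeps the raw line endings, so on Windows each decoded line would end in '\r\n' rather than '\n'.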