python - Delete the last line of a file with Python

Question

How can I delete the last line of a file with Python?

Example input file:

hello
world
foo
bar

Example output file:

hello
world
foo

I wrote the following code to count the lines in the file, but I don't know how to delete a specific line number.

    try:
        file = open("file")
    except IOError:
        print("Failed to read file.")
    else:
        # Only count lines if the file was opened successfully
        countLines = len(file.readlines())
        file.close()

score 80 · Accepted Answer

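Because I routinely work with files of many gigabytes, looping through the file as suggested in the other answers did not work for me. The solution I use: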

import os
import sys

with open(sys.argv[1], "r+", encoding = "utf-8") as file:

    # Move the pointer (similar to a cursor in a text editor) to the end of the file
    file.seek(0, os.SEEK_END)

    # Start one character before the end so that a trailing newline is skipped
    # and not mistaken for the line break we are searching for, i.e. if the file
    # ends with an empty line, that empty line and the line before it are removed
    pos = file.tell() - 1

    # Read each character in the file one at a time from the penultimate
    # character going backwards, searching for a newline character
    # If we find a new line, exit the search
    while pos > 0 and file.read(1) != "\n":
        pos -= 1
        file.seek(pos, os.SEEK_SET)

    # So long as we're not at the start of the file, delete all the characters ahead
    # of this position
    if pos > 0:
        file.seek(pos, os.SEEK_SET)
        file.truncate()

score 21 · Accepted Answer

You can use the code above and then:

lines = file.readlines()
lines = lines[:-1]

This gives you a list containing every line except the last one.
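To change the file on disk, the shortened list still has to be written back. A minimal sketch, assuming the file fits in memory and is UTF-8 encoded; the helper name `drop_last_line` is made up for illustration:

```python
def drop_last_line(path):
    """Rewrite the file at `path` without its last line (hypothetical helper)."""
    # Read every line into memory; only suitable for files that fit in RAM.
    with open(path, "r", encoding="utf-8") as f:
        lines = f.readlines()

    # Write back everything except the last line.
    with open(path, "w", encoding="utf-8") as f:
        f.writelines(lines[:-1])
```

Calling `drop_last_line("file")` would overwrite the file in place, so keep a copy if the original matters.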

score 9 · Accepted Answer

This doesn't use Python, but Python is the wrong tool if this is the only task you need done. You can use the standard *nix utility head and run

head -n-1 filename > newfile

This copies everything except the last line of filename to newfile.

score 7 · Accepted Answer

Assuming you have to do this in Python and the file is large enough that list slicing isn't sufficient, you can do it in a single pass over the file:

last_line = None
with open("file") as file:
    for line in file:
        if last_line is not None:
            print(last_line, end="")  # or write to a file, call a function, etc.
        last_line = line

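Not the most elegant code in the world, but it gets the job done.

Basically, it buffers each line of the file in the last_line variable, and each iteration outputs the line from the previous iteration.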
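The same one-line buffer can write to a temporary file that then replaces the original. This is only a sketch under assumptions: the helper name `drop_last_line_streaming` is invented here, and the temporary file is created next to the original so the final rename stays on one filesystem.

```python
import os
import tempfile

def drop_last_line_streaming(path):
    """Copy all but the last line to a temp file, then swap it in (hypothetical helper)."""
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temp file in the same directory as the original.
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as dst, open(path, "rb") as src:
            previous = None
            for line in src:
                if previous is not None:
                    dst.write(previous)  # emit the line read one iteration earlier
                previous = line
        os.replace(tmp_path, path)  # replace the original with the shortened copy
    except BaseException:
        os.remove(tmp_path)  # do not leave the temp file behind on failure
        raise
```

Only one line is held in memory at a time, at the cost of writing a second copy of the file. Note that the replacement file gets the temp file's permissions, not the original's.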

score 4 · Accepted Answer

Here is my solution for Linux users:

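import subprocess
file_path = 'test.txt'
# Delete the last line in place with sed; passing a list avoids shell quoting issues
subprocess.run(['sed', '-i', '$ d', file_path], check=True)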

No need to read and iterate over the file in Python.

score 3 · Accepted Answer

On systems where file.truncate() works, you can do something like this:

pos = next_pos = 0
with open('file.txt', 'rb') as f:
    for line in f:
        pos = next_pos  # position of beginning of this line
        next_pos += len(line)  # compute position of beginning of next line
with open('file.txt', 'ab') as f:
    f.truncate(pos)  # cut the file where the last line begins

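In my tests, file.tell() did not work when reading line by line, probably because buffering confuses it. That is why this code adds up the line lengths to work out the position. Note that this only works on systems where the line separator ends in '\n'.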
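In Python 3, a possible variation is to open the file in binary mode and call readline() explicitly, in which case tell() reports byte offsets that can be passed to truncate(). A rough sketch, with the helper name `truncate_last_line_by_offset` chosen only for this example:

```python
def truncate_last_line_by_offset(path):
    """Cut the file at the byte offset where its last line begins (hypothetical helper)."""
    last_start = 0
    with open(path, "rb+") as f:
        while True:
            start = f.tell()      # byte offset before reading the next line
            line = f.readline()
            if not line:          # empty bytes object means end of file
                break
            last_start = start
        f.truncate(last_start)
```

This still reads the whole file once, so it saves memory but not time on very large files.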

score 2 · Accepted Answer

Inspired by the previous posts, I propose this:

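import os

with open('file_name', 'rb+') as f:  # binary mode: text mode rejects relative seeks
  f.seek(0, os.SEEK_END)
  while f.tell() and f.read(1) != b'\n':
    if f.tell() < 2:  # reached the start: the file has a single line
      f.seek(0)
      break
    f.seek(-2, os.SEEK_CUR)
  f.truncate()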

score 0 · Accepted Answer

Here is another way that does not load the whole file into memory:

p = None
with open("file") as f:
    for line in f:
        if p is not None:  # skip the first pass so no blank line is printed
            print(p)
        p = line.rstrip("\n")

score 0 · Accepted Answer

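Although I haven't tested it (please don't hate on me for that), I believe there is a faster way. It is more of a C solution, but quite possible in Python. It isn't Pythonic either. It's a theory, I'd say.

First, you need to know the file's encoding. Set a variable, say CHARsize (why not), to the number of bytes a character uses in that encoding (1 byte in ASCII). It is probably 1 byte for an ASCII file.

Then get the size of the file and store it in FILEsize.

Assume you have the address of the file (in memory) in FILEadd.

Add FILEsize to FILEadd.

Move backwards (incrementing by -1 * CHARsize), testing each CHARsize bytes for a \n (or whatever newline your system uses). When you reach the first \n, you have the position where the last line of the file begins. Replace the \n with \x1a (26, the ASCII code for EOF, or whatever that is on your system or in your encoding).

Clean up as needed (change the file size, touch the file).

If this works the way I suspect it does, you will save a lot of time, because you don't need to read the whole file from the beginning; you read from the end instead.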
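The idea of scanning backwards from the end can be sketched in Python roughly as follows. This is an illustration, not the answer's exact method: it truncates the file instead of writing \x1a, reads in chunks instead of one character at a time, and assumes an ASCII or UTF-8 file where the byte \n only ever means a newline. The name `truncate_last_line` and the `chunk_size` parameter are assumptions made for this example.

```python
import os

def truncate_last_line(path, chunk_size=4096):
    """Remove the last line of `path` by scanning backwards in chunks (hypothetical helper)."""
    with open(path, "rb+") as f:
        f.seek(0, os.SEEK_END)
        end = f.tell()
        if end == 0:
            return  # empty file: nothing to remove
        # Ignore a trailing newline so it is not mistaken for the previous line break.
        f.seek(end - 1)
        if f.read(1) == b"\n":
            end -= 1
        pos = end
        while pos > 0:
            start = max(0, pos - chunk_size)
            f.seek(start)
            chunk = f.read(pos - start)
            idx = chunk.rfind(b"\n")
            if idx != -1:
                # Keep everything up to and including the newline that ends the previous line.
                f.truncate(start + idx + 1)
                return
            pos = start
        f.truncate(0)  # no earlier newline: the file had a single line
```

Only the tail of the file is read, so the cost depends on the length of the last line and not on the size of the file.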

score 0 · Accepted Answer

Here is a more general, memory-efficient solution that allows skipping the last 'n' lines (like the head command):

import collections, fileinput
def head(filename, lines_to_delete=1):
    queue = collections.deque()
    lines_to_delete = max(0, lines_to_delete)
    # With inplace=True, fileinput redirects print() output back into the file
    for line in fileinput.input(filename, inplace=True, backup='.bak'):
        queue.append(line)
        if lines_to_delete == 0:
            print(queue.popleft(), end='')
        else:
            lines_to_delete -= 1
    queue.clear()
