python - How do I create an incrementing filename in Python?

Question

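I'm writing a program that creates a file and saves it to a directory under the filename sample.xml. Once the file is saved, running the program again overwrites the old file with the new one, because both have the same filename. How do I increment the filename so that each run of the code produces a new name and does not overwrite the existing file? I'm thinking of first checking the filenames in the directory and, if one matches, generating a new filename: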

fh = open("sample.xml", "w")
rs = [blockresult]
fh.writelines(rs)
fh.close()

score 61 · Accepted Answer

I would iterate through sample[int].xml, for example, and take the next available name that is not used by a file or directory.

import os

i = 0
while os.path.exists("sample%s.xml" % i):
    i += 1

fh = open("sample%s.xml" % i, "w")
....

That should give you sample0.xml initially, then sample1.xml, and so on.

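Note that relative file notation is, by default, relative to the directory/folder you run the code from. Use absolute paths if necessary. Use os.getcwd() to read the current directory and os.chdir(path_to_dir) to set a new current directory.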
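The loop above can be wrapped in a small helper that takes the target directory explicitly, so the result does not depend on the current working directory. This is only a minimal sketch: the name next_sample_path and its parameters are illustrative assumptions, not part of the original answer.

```python
import os
import tempfile

def next_sample_path(directory, prefix="sample", ext=".xml"):
    """Return the first path <directory>/<prefix><i><ext> that does not exist yet."""
    i = 0
    while os.path.exists(os.path.join(directory, "%s%d%s" % (prefix, i, ext))):
        i += 1
    return os.path.join(directory, "%s%d%s" % (prefix, i, ext))

# Demo in a throwaway directory so that no real files are touched
with tempfile.TemporaryDirectory() as tmp:
    created = []
    for _ in range(3):
        path = next_sample_path(tmp)
        with open(path, "w") as fh:
            fh.write("<root/>")
        created.append(os.path.basename(path))

print(created)
```

Passing the directory as an argument avoids calling os.chdir(), which changes state for the whole process.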

score 20 · Accepted Answer

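Sequentially checking each filename to find the next available one works fine with a small number of files, but quickly becomes slower as the number of files grows.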

Here is a version that finds the next available filename in log(n) time:

import os

def next_path(path_pattern):
    """
    Finds the next free path in a sequentially named list of files

    e.g. path_pattern = 'file-%s.txt':

    file-1.txt
    file-2.txt
    file-3.txt

    Runs in log(n) time where n is the number of existing files in sequence
    """
    i = 1

    # First do an exponential search
    while os.path.exists(path_pattern % i):
        i = i * 2

    # Result lies somewhere in the interval (i/2..i]
    # We call this interval (a..b] and narrow it down until a + 1 = b
    a, b = (i // 2, i)
    while a + 1 < b:
        c = (a + b) // 2 # interval midpoint
        a, b = (c, b) if os.path.exists(path_pattern % c) else (a, c)

    return path_pattern % b

To measure the speed improvement I wrote a small test function that creates 10,000 files:

for i in range(1,10000):
    with open(next_path('file-%s.foo'), 'w'):
        pass

And implemented the naive approach:

def next_path_naive(path_pattern):
    """
    Naive (slow) version of next_path
    """
    i = 1
    while os.path.exists(path_pattern % i):
        i += 1
    return path_pattern % i

Here are the results:

Fast version:

real    0m2.132s
user    0m0.773s
sys     0m1.312s

Naive version:

real    2m36.480s
user    1m12.671s
sys     1m22.425s
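The gap in the timings comes from the number of existence checks each version makes. The sketch below counts those checks without touching the disk. It replaces os.path.exists with an in-memory predicate and assumes indices 1..n are all taken with no gaps; the helper names are illustrative only.

```python
def next_index_naive(exists):
    """Linear scan: probe 1, 2, 3, ... until a free index is found."""
    i = 1
    while exists(i):
        i += 1
    return i

def next_index_fast(exists):
    """Exponential search followed by bisection, mirroring next_path."""
    i = 1
    while exists(i):
        i *= 2
    a, b = i // 2, i
    while a + 1 < b:
        c = (a + b) // 2  # interval midpoint
        a, b = (c, b) if exists(c) else (a, c)
    return b

def count_probes(find_next, n_existing):
    """Count the existence checks made when indices 1..n_existing are taken."""
    calls = 0

    def exists(i):
        nonlocal calls
        calls += 1
        return 1 <= i <= n_existing

    result = find_next(exists)
    return result, calls

naive_result, naive_calls = count_probes(next_index_naive, 10000)
fast_result, fast_calls = count_probes(next_index_fast, 10000)
print(naive_result, naive_calls)
print(fast_result, fast_calls)
```

Both functions return the same index, but the naive scan needs one check per existing file while the fast version needs only a few dozen. The bisection relies on the sequence being contiguous: if a lower-numbered file is missing, the fast version may return a different free index than the naive scan.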

Finally, note that either approach is susceptible to race conditions if multiple actors try to create files in the sequence at the same time.
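One common way to narrow that race window is to let the operating system perform the check and the creation in one step, using the "x" (exclusive creation) mode of the built-in open(). This is a sketch under that assumption, not part of the original answer, and create_next_file is a hypothetical name.

```python
import os
import tempfile

def create_next_file(path_pattern):
    """Create and open the first unused path matching path_pattern.

    Mode "x" makes open() raise FileExistsError if the path already exists,
    so there is no gap between checking for the file and creating it.
    """
    i = 1
    while True:
        try:
            return open(path_pattern % i, "x")
        except FileExistsError:
            i += 1

# Demo in a throwaway directory
with tempfile.TemporaryDirectory() as tmp:
    pattern = os.path.join(tmp, "file-%s.txt")
    names = []
    for _ in range(3):
        with create_next_file(pattern) as fh:
            names.append(os.path.basename(fh.name))

print(names)
```

This version scans linearly, so it trades the log(n) lookup for safety. Whether exclusive creation is reliable also depends on the filesystem, for example on some network mounts.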

score 14 · Accepted Answer

import os

def get_nonexistant_path(fname_path):
    """
    Get the path to a filename which does not exist by incrementing path.

    Examples
    --------
    >>> get_nonexistant_path('/etc/issue')
    '/etc/issue-1'
    >>> get_nonexistant_path('whatever/1337bla.py')
    'whatever/1337bla.py'
    """
    if not os.path.exists(fname_path):
        return fname_path
    filename, file_extension = os.path.splitext(fname_path)
    i = 1
    new_fname = "{}-{}{}".format(filename, i, file_extension)
    while os.path.exists(new_fname):
        i += 1
        new_fname = "{}-{}{}".format(filename, i, file_extension)
    return new_fname

Before you open the file, call

fname = get_nonexistant_path("sample.xml")

This will give you either 'sample.xml' or, if that already exists, 'sample-i.xml', where i is the lowest positive integer for which the file does not yet exist.

I recommend using os.path.abspath("sample.xml"). If the path uses ~ for the home directory, you may need to expand it first.
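A short illustration of that advice, using os.path.expanduser before os.path.abspath. The path "~/sample.xml" is only an example value.

```python
import os

raw = "~/sample.xml"                  # example path, not from the question
expanded = os.path.expanduser(raw)    # replaces a leading ~ with the home directory
absolute = os.path.abspath(expanded)  # resolves the path against the current directory
print(absolute)
```

The order matters: os.path.abspath does not interpret ~ by itself, so calling it first would treat ~ as an ordinary directory name.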

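Please note that this simple code can run into race conditions if you have multiple instances running at the same time. If that might be a problem, please check this question.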

score 5 · Accepted Answer

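Try setting a count variable and incrementing it inside the same loop in which you write your file. Put the count into the file name with a format specifier, so that every pass through the loop adds 1 to the counter and therefore to the number in the file name.

Some code from a project I just finished: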

numberLoops = 5  # placeholder: some limit determined by the user
currentLoop = 1
while currentLoop < numberLoops:
    currentLoop = currentLoop + 1

    # now() returns a float; %d formats it as an integer timestamp
    fileName = ("log%d_%d.txt" % (currentLoop, now()))

For reference:

from time import mktime, gmtime

def now(): 
   return mktime(gmtime())

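This might be irrelevant to your situation, but I was running multiple instances of this program and creating a large number of files. Hope this helps!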

score 2 · Accepted Answer

Without storing state data in an extra file, a quicker solution to the issue presented here would be to do the following:

from glob import glob
import os

files = glob("somedir/sample*.xml")
files = sorted(files, key=lambda f: int(os.path.basename(f)[6:-4]))  # numeric sort, so sample10 follows sample9
cur_num = int(os.path.basename(files[-1])[6:-4])
cur_num += 1
fh = open("somedir/sample%s.xml" % cur_num, 'w')
rs = [blockresult]
fh.writelines(rs)
fh.close()

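This will also keep incrementing, even if some of the lower-numbered files disappear.

The other solution here that I like (pointed out by Eiyrioü) is the idea of keeping a temporary file that contains your most recent number: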

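with open('somedir/curr_num.txt', 'r') as temp_fh:
    curr_num = int(temp_fh.readline().strip())
curr_num += 1
# store the new number so that the next run continues from it
with open('somedir/curr_num.txt', 'w') as temp_fh:
    temp_fh.write(str(curr_num))
fh = open("somedir/sample%s.xml" % curr_num, 'w')
rs = [blockresult]
fh.writelines(rs)
fh.close()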
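For the counter-file idea to work across runs, the number has to be written back after it is incremented, and the first run needs a starting value because the counter file does not exist yet. A minimal sketch of both points follows. The function name next_number and the choice to start at 0 are assumptions for illustration.

```python
import os
import tempfile

def next_number(counter_path):
    """Read the last used number, increment it, store it back, and return it."""
    try:
        with open(counter_path) as fh:
            current = int(fh.read().strip())
    except FileNotFoundError:
        current = -1  # assumption: numbering starts at 0 on the first run
    current += 1
    with open(counter_path, "w") as fh:
        fh.write(str(current))
    return current

# Demo in a throwaway directory
with tempfile.TemporaryDirectory() as tmp:
    counter_file = os.path.join(tmp, "curr_num.txt")
    numbers = [next_number(counter_file) for _ in range(3)]

print(numbers)
```

Like the other approaches in this thread, this is not safe when several processes update the counter at the same time.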

score 2 · Accepted Answer

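Another solution that avoids the while loop is to use the os.listdir() function, which returns a list of all the files and directories contained in the directory whose path is passed as an argument.

To answer the example in the question, supposing that the directory you are working in contains only "sample_i.xml" files indexed starting at 0, you can easily obtain the next index for the new file with the following code.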

import os

new_index = len(os.listdir('path_to_file_containing_only_sample_i_files'))
new_file = open('path_to_file_containing_only_sample_i_files/sample_%s.xml' % new_index, 'w')
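The file count equals the next index only while the directory holds nothing but a gap-free run of sample files. If that assumption may not hold, one possible variation is to filter the listing with a regular expression and take the highest index plus one. The names SAMPLE_RE and next_sample_index are hypothetical.

```python
import os
import re
import tempfile

SAMPLE_RE = re.compile(r"^sample_(\d+)\.xml$")

def next_sample_index(directory):
    """Return one more than the highest sample_<i>.xml index, or 0 if there is none."""
    indices = [
        int(m.group(1))
        for m in (SAMPLE_RE.match(name) for name in os.listdir(directory))
        if m
    ]
    return max(indices) + 1 if indices else 0

# Demo: unrelated files and gaps in the numbering are tolerated
with tempfile.TemporaryDirectory() as tmp:
    empty_index = next_sample_index(tmp)
    for name in ("sample_0.xml", "sample_1.xml", "sample_5.xml", "notes.txt"):
        open(os.path.join(tmp, name), "w").close()
    index_after = next_sample_index(tmp)

print(empty_index, index_after)
```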

score 1 · Accepted Answer

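The two ways to do it are:

Check for the existence of the old file and, if it exists, try the next file name +1
Save state data somewhere

An easy way to do it off the bat would be: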

import os.path as pth
filename = "myfile"
filenum = 1
while pth.exists(pth.abspath(filename+str(filenum)+".py")):
    filenum+=1
my_next_file = open(filename+str(filenum)+".py",'w')

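As a design matter, while True slows things down and is not great for code readability.

Edit: @EOL contributions/thoughts

I think the version without .format is more readable at first glance, but using .format is better for generality and convention.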

import os.path as pth
filename = "myfile"
filenum = 1
while pth.exists(pth.abspath(filename+str(filenum)+".py")):
    filenum+=1
my_next_file = open("{}{}.py".format(filename, filenum),'w')
# or 
my_next_file = open(filename + "{}.py".format(filenum),'w')

And you don't have to use abspath; you can use relative paths if you wish. I sometimes prefer absolute paths because they help to normalize the paths passed in :).

import os.path as pth
filename = "myfile"
filenum = 1
while pth.exists(filename+str(filenum)+".py"):
    filenum+=1
##removed for conciseness

score 1 · Accepted Answer

Another example using recursion

import os
def checkFilePath(testString, extension, currentCount):
    if os.path.exists(testString + str(currentCount) +extension):
        return checkFilePath(testString, extension, currentCount+1)
    else:
        return testString + str(currentCount) +extension

Usage:

checkFilePath("myfile", ".txt" , 0)
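Each existing file adds one level of recursion, so with around a thousand files the recursive version would run into Python's default recursion limit (see sys.getrecursionlimit()). An equivalent loop avoids that. The sketch below uses itertools.count; the name check_file_path_iterative is an assumption.

```python
import itertools
import os
import tempfile

def check_file_path_iterative(test_string, extension, start=0):
    """Return the first unused name, like checkFilePath, but with a loop."""
    for count in itertools.count(start):
        candidate = test_string + str(count) + extension
        if not os.path.exists(candidate):
            return candidate

# Demo: myfile0.txt and myfile1.txt already exist
with tempfile.TemporaryDirectory() as tmp:
    base = os.path.join(tmp, "myfile")
    for n in (0, 1):
        open(base + str(n) + ".txt", "w").close()
    free_name = os.path.basename(check_file_path_iterative(base, ".txt"))

print(free_name)
```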

score 1 · Accepted Answer

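I needed to do something similar, but for output directories in a data processing pipeline. I was inspired by Vorticity's answer, but added a regular expression to grab the trailing number. This method keeps incrementing past the last directory, even if intermediate numbered output directories are deleted. It also adds leading zeros so that the names sort alphabetically (i.e. width 3 gives 001 etc.)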

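import logging
import os
import re
from glob import glob

log = logging.getLogger(__name__)  # assumed: the original does not show how log is defined

def get_unique_dir(path, width=3):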
    # if it doesn't exist, create
    if not os.path.isdir(path):
        log.debug("Creating new directory - {}".format(path))
        os.makedirs(path)
        return path

    # if it's empty, use
    if not os.listdir(path):
        log.debug("Using empty directory - {}".format(path))
        return path

    # otherwise, increment the highest number folder in the series

    def get_trailing_number(search_text):
        serch_obj = re.search(r"([0-9]+)$", search_text)
        if not serch_obj:
            return 0
        else:
            return int(serch_obj.group(1))

    dirs = glob(path + "*")
    num_list = sorted([get_trailing_number(d) for d in dirs])
    highest_num = num_list[-1]
    next_num = highest_num + 1
    new_path = "{0}_{1:0>{2}}".format(path, next_num, width)

    log.debug("Creating new incremented directory - {}".format(new_path))
    os.makedirs(new_path)
    return new_path

get_unique_dir("output")
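The zero padding comes from the nested format specification "{1:0>{2}}": fill character 0, right alignment, and a width taken from the third argument. A small standalone illustration with example values only:

```python
path, width = "output", 3

# Fill with "0", right-align, width supplied by the third argument
names = ["{0}_{1:0>{2}}".format(path, n, width) for n in (1, 12, 123)]
print(names)  # ['output_001', 'output_012', 'output_123']

# Without padding, plain string sorting puts 10 before 2
unpadded = sorted(["output_2", "output_10"])
print(unpadded)  # ['output_10', 'output_2']
```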

score 1 · Accepted Answer

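You can use a while loop with a counter that checks whether a file with the given name and counter value exists. If it does, move on; otherwise break out and create the file.

I have done it this way for one of my projects: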

from os import path
import os

i = 0
flnm = "Directory\\Filename" + str(i) + ".txt"
while path.exists(flnm):
    i += 1  # increment first, so that each pass checks a new name
    flnm = "Directory\\Filename" + str(i) + ".txt"
f = open(flnm, "w") #do what you want to with that file...
f.write(str(var))
f.close() # make sure to close it.

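Here the counter i starts from 0, and the while loop checks each time whether the file exists. If it does, the loop moves on; otherwise it breaks out and creates the file, which you can then customize. Also make sure to close the file, or it will stay open, which can cause problems when you try to delete it. I used path.exists() to check whether a file exists. Do not use from os import * together with the open() method, because there is also an os.open() function and it will raise an error: TypeError: Integer expected. (got str). Otherwise, I wish you and everyone a Happy New Year.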
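The warning about from os import * can be checked directly. The sketch below runs the star-import in a scratch namespace, so that the surrounding module keeps the built-in open(), and then inspects which open the namespace ends up with.

```python
import builtins
import os

namespace = {}
exec("from os import *", namespace)  # star-import into a scratch namespace

# The name "open" in that namespace now refers to the low-level os.open,
# which expects integer flags rather than a mode string such as "w".
shadowed = namespace["open"] is os.open
builtin_intact = builtins.open is not os.open
print(shadowed, builtin_intact)
```

Importing only what is needed, for example from os import path as in the code above, avoids the clash.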

score 0 · Accepted Answer

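Here is one more example. The code tests whether a file exists in the directory; if it does, it increments the last index in the file name and saves. The typical file name is the three-letter month, the date, and the last index, in the form month_date_lastindex.dat, e.g. May10_1.dat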

import time
import datetime
import shutil
import os
import os.path


da=datetime.datetime.now()

data_id =1
ts = time.time()
st = datetime.datetime.fromtimestamp(ts).strftime("%b%d")
data_id=str(data_id)
filename = st+'_'+data_id+'.dat'
while (os.path.isfile(str(filename))):
    data_id=int(data_id)
    data_id=data_id+1
    print(data_id)
    filename = st+'_'+str(data_id)+'.dat'
    print(filename)


shutil.copyfile('Autonamingscript1.py',filename)

f = open(filename,'a+')
f.write("\n\n\n")
f.write("Data comments: \n")


f.close()

score 0 · Accepted Answer

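Continue sequence numbering from the given filename, with or without an appended sequence number.

The given filename is used if it does not exist; otherwise a sequence number is applied, and gaps between numbers are candidates.

This version is quick if the given filename is not already sequenced or is the highest-numbered pre-existing file in the sequence.

For example, the provided filename can be

sample.xml
sample-1.xml
sample-23.xml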

import os
import re

def get_incremented_filename(filename):
    name, ext = os.path.splitext(filename)
    seq = 0
    # continue from existing sequence number if any
    rex = re.search(r"^(.*)-(\d+)$", name)
    if rex:
        name = rex[1]
        seq = int(rex[2])
    
    while os.path.exists(filename):
        seq += 1
        filename = f"{name}-{seq}{ext}"
    return filename

score -1 · Accepted Answer

My 2 cents: an always-increasing, macOS-style incremental naming procedure

get_increased_path("./some_new_dir").mkdir() creates ./some_new_dir; then
get_increased_path("./some_new_dir").mkdir() creates ./some_new_dir (1); then
get_increased_path("./some_new_dir").mkdir() creates ./some_new_dir (2); etc.

If ./some_new_dir (2) exists but ./some_new_dir (1) does not, then get_increased_path("./some_new_dir").mkdir() creates ./some_new_dir (3) anyway, so that indexes always increase and you always know which one is the latest

from pathlib import Path
import re

def get_increased_path(file_path):
    fp = Path(file_path).resolve()
    f = str(fp)

    vals = []
    for n in fp.parent.glob("{}*".format(fp.name)):
        ms = list(re.finditer(r"^{} \(\d+\)$".format(re.escape(f)), str(n)))
        if ms:
            m = list(re.finditer(r"\(\d+\)$", str(n)))[0].group()
            vals.append(int(m.replace("(", "").replace(")", "")))
    if vals:
        ext = " ({})".format(max(vals) + 1)
    elif fp.exists():
        ext = " (1)"
    else:
        ext = ""

    return fp.parent / (fp.name + ext)  # fp.name already includes the suffix
