1

I have 349 text files. I use the following code to read and tokenize all of them.

import glob
path = "C:\\texts\\*.txt"
files = glob.glob(path)  # collect every .txt file matching the pattern
for file in files:
    with open(file) as in_file, open("C:\\texts\\file_tokens.txt", 'w') as out_file:
        for line in in_file:
            words = line.split()
            for word in words:
                out_file.write(word)
                out_file.write("\n")

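This code saves the results (all tokens) in a single file (file_tokens.txt). How can I save the tokens of each file in a new .txt file? That is, I want 349 output files, each containing the tokens of the corresponding input file.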

4

2 Answers

1
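import glob
from os import path

base_path = "C:\\texts"  # RENAMED so it does not clash with os.path; holds the directory only
for file in glob.glob(path.join(base_path, "*.txt")):
    # glob returns full paths, so keep only the base name without its extension
    stem = path.splitext(path.basename(file))[0]
    with open(file) as in_file:
        # ATTENTION: one output file per input file, opened in 'w' mode for writing
        with open(path.join(base_path, "%s_tokenized.txt" % stem), 'w') as out_file:
            for line in in_file:
                words = line.split()
                for word in words:
                    out_file.write(word)
                    out_file.write("\n")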

You create a new file whose name is specific to the file currently being processed. In this example it is ($file_name)_tokenized.txt.

path.join is used to write the output file to the correct directory. For example:

>>> path.join("~/Documents","out.txt")
'~/Documents/out.txt'
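As a minimal sketch of how the per-file name could be derived, the helper below combines os.path.split, os.path.splitext and os.path.join. The function name tokenized_path, the "_tokenized" suffix default and the sample file name report.txt are assumptions made for illustration, not part of the original answer.

```python
import os


def tokenized_path(input_path, suffix="_tokenized"):
    """Return an output path next to input_path, e.g. a.txt -> a_tokenized.txt."""
    # Separate the directory from the file name.
    directory, filename = os.path.split(input_path)
    # Separate the base name from the extension.
    stem, ext = os.path.splitext(filename)
    # Reassemble: same directory, same extension, suffix added to the base name.
    return os.path.join(directory, stem + suffix + ext)


# Hypothetical input path, built with os.path.join so it works on any OS.
print(tokenized_path(os.path.join("texts", "report.txt")))
```

Because the name is built from the input file's own base name, every input file maps to a distinct output file.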
answered 2013-11-05T05:33:16.353
0

Give each output file a different name.
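One way this could look in practice is sketched below with pathlib. The function name tokenize_folder, the "_tokens.txt" naming scheme and the use of a separate output directory are assumptions for illustration. Writing to a separate directory is a design choice made here so that a second run does not pick up the previous run's output files as input.

```python
from pathlib import Path


def tokenize_folder(src_dir, out_dir):
    """Write one token-per-line file in out_dir for every .txt file in src_dir."""
    src, out = Path(src_dir), Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)  # create the output directory if needed
    written = []
    # sorted() materializes the list of inputs before any output is written.
    for in_path in sorted(src.glob("*.txt")):
        # A distinct output name per input: <input base name>_tokens.txt
        out_path = out / (in_path.stem + "_tokens.txt")
        with in_path.open() as in_file, out_path.open("w") as out_file:
            for line in in_file:
                for word in line.split():  # split on any whitespace
                    out_file.write(word + "\n")
        written.append(out_path)
    return written
```

With the question's layout, a call such as tokenize_folder("C:\\texts", "C:\\texts\\tokens") would produce one output file per input file; the tokens subdirectory is a hypothetical choice.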

answered 2013-11-05T05:32:02.390