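As the title says, I have several folders, each containing several .ppm.bz2 files, and I want to extract them in place, exactly where they are, using Python.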

(Image: directory structure)

This is how I traverse the folders:

 import tarfile
 import os
 path = '/Users/ankitkumar/Downloads/colorferet/dvd1/data/images/'
 folders = os.listdir(path)
 for folder in folders:  # folders are named like 00001
     if not folder.startswith("0"):
         continue  # skip anything that is not an image folder
     path2 = path + folder
     zips = os.listdir(path2)
     for zip in zips:
         if not zip.startswith("0"):
             continue  # skip anything that is not an image archive
         path3 = path2+"/"+zip

         fh = tarfile.open(path3, 'r:bz2')  # this call raises the ReadError below
         outpath = path2+"/"
         fh.extractall(outpath)
         fh.close()

Then I get this error:

Traceback (most recent call last):
  File "ZIP.py", line 16, in <module>
    fh = tarfile.open(path3, 'r:bz2')
  File "/anaconda2/lib/python2.7/tarfile.py", line 1693, in open
    return func(name, filemode, fileobj, **kwargs)
  File "/anaconda2/lib/python2.7/tarfile.py", line 1778, in bz2open
    t = cls.taropen(name, mode, fileobj, **kwargs)
  File "/anaconda2/lib/python2.7/tarfile.py", line 1723, in taropen
    return cls(name, mode, fileobj, **kwargs)
  File "/anaconda2/lib/python2.7/tarfile.py", line 1587, in __init__
    self.firstmember = self.next()
  File "/anaconda2/lib/python2.7/tarfile.py", line 2370, in next
    raise ReadError(str(e))
tarfile.ReadError: invalid header

1 Answer

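The tarfile module is for tar archives, including tar.bz2. A .ppm.bz2 file is a single bzip2-compressed image, not a tar archive, which is why tarfile.open fails with "invalid header". If your files are not tar archives, use the bz2 module directly.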
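If you are not sure which kind of file you have, here is a minimal sketch of a check built on tarfile.is_tarfile, which returns False instead of raising ReadError. The helper name pick_module and the throwaway file names are made up for illustration; they are not part of your data set.

```python
import bz2
import os
import tarfile
import tempfile

def pick_module(filename):
    """Return 'tarfile' if filename is a (possibly compressed) tar archive, else 'bz2'."""
    # is_tarfile() tries to open the file as a tar archive, compressed or not,
    # and reports failure as False rather than raising ReadError.
    return 'tarfile' if tarfile.is_tarfile(filename) else 'bz2'

# Demo on throwaway files in a temporary directory.
tmpdir = tempfile.mkdtemp()

# A single bzip2-compressed file, like the .ppm.bz2 images in the question.
plain = os.path.join(tmpdir, 'image.ppm.bz2')
with bz2.BZ2File(plain, 'w') as fh:
    fh.write(b'P6\n1 1\n255\n\x00\x00\x00')  # a tiny 1x1 PPM image

# A real tar archive compressed with bzip2.
member = os.path.join(tmpdir, 'member.txt')
with open(member, 'wb') as fh:
    fh.write(b'hello')
archive = os.path.join(tmpdir, 'archive.tar.bz2')
with tarfile.open(archive, 'w:bz2') as tar:
    tar.add(member, arcname='member.txt')

print(pick_module(plain))    # bz2
print(pick_module(archive))  # tarfile
```

Files reported as 'bz2' are only known not to be tar archives; the check does not prove that they are valid bzip2 data.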

Also, try os.walk instead of nested listdir calls, since it traverses the whole directory tree for you.

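import os
import bz2
import shutil

# Top-level folder to scan; set this to your own images directory.
root = '/Users/ankitkumar/Downloads/colorferet/dvd1/data/images/'

for path, dirs, files in os.walk(root):
    for filename in files:
        basename, ext = os.path.splitext(filename)
        if ext.lower() != '.bz2':
            continue  # leave every non-.bz2 file alone
        fullname = os.path.join(path, filename)
        newname = os.path.join(path, basename)  # e.g. x.ppm.bz2 -> x.ppm
        # bz2.BZ2File works on Python 2.7 and 3; bz2.open() needs Python 3.3+
        with bz2.BZ2File(fullname) as fh, open(newname, 'wb') as fw:
            shutil.copyfileobj(fh, fw)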

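This decompresses every .bz2 file in all subfolders, in place, next to the compressed original. All other files are left untouched. If the uncompressed file already exists, it is overwritten.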

Please back up your data before running destructive code.
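If overwriting worries you, one possible variant skips any target that already exists. This is only a sketch: the function name decompress_tree, the overwrite flag, and the demo file names are my own illustrative choices, and it uses bz2.BZ2File so that it should run on both Python 2.7 and Python 3.

```python
import bz2
import os
import shutil
import tempfile

def decompress_tree(root, overwrite=False):
    """Decompress every .bz2 file under root in place; return the paths written."""
    written = []
    for dirpath, dirnames, filenames in os.walk(root):
        for filename in filenames:
            basename, ext = os.path.splitext(filename)
            if ext.lower() != '.bz2':
                continue  # not a bzip2 file, leave it alone
            source = os.path.join(dirpath, filename)
            target = os.path.join(dirpath, basename)
            if os.path.exists(target) and not overwrite:
                continue  # keep the existing uncompressed file untouched
            with bz2.BZ2File(source) as fh, open(target, 'wb') as fw:
                shutil.copyfileobj(fh, fw)
            written.append(target)
    return written

# Demo on a throwaway tree that mimics the 00001/... layout from the question.
demo_root = tempfile.mkdtemp()
os.mkdir(os.path.join(demo_root, '00001'))
packed = os.path.join(demo_root, '00001', 'face.ppm.bz2')
with bz2.BZ2File(packed, 'w') as fh:
    fh.write(b'P6\n1 1\n255\n\x00\x00\x00')  # a tiny 1x1 PPM image

first = decompress_tree(demo_root)   # writes 00001/face.ppm
second = decompress_tree(demo_root)  # target now exists, so nothing is written
print(first)
print(second)
```

The compressed originals are never deleted, so you can compare the results before removing anything yourself.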

Answered 2018-07-24T17:45:41.353