The question might sound strange because I know I am forcing a strange situation. It came up by accident (a bug, one might say) and I even know how to avoid it, so please skip that part.

I would really like to understand the behaviour I see.

The point of the function is to add all files with a given prefix in a directory to an archive. I noticed that despite a "bug", the program works correctly (sic!), and I want to understand why.

The code is fairly simple, so I will post the whole function:

import os
import tarfile

def pack(prefix, custom_meta_files=()):
  # append the extension, adding a dot only if the prefix lacks a trailing one
  postfix = 'tgz'
  if prefix[-1] != '.':
    postfix = '.tgz'

  archive = tarfile.open(prefix+postfix, "w:gz")
  # note: the freshly created archive also starts with the prefix
  files = filter(lambda path: path.startswith(prefix), os.listdir())
  #print('files: {0}'.format(list(files)))

  for file in files:
    print('packing `{0}`'.format(file))
    archive_name = file[len(prefix):]   # strip the prefix from the member name
    archive.add(file, archive_name)

  # add the extra meta files unless a member with the same name already exists
  not_doubled_metas = set(custom_meta_files) - set(archive.getnames())
  print('metas to add: {0}'.format(not_doubled_metas))
  for meta in not_doubled_metas:
    print('packing `{0}`'.format(meta))
    archive.add(meta)
  print('contents:{0}'.format(archive.getnames()))
  archive.close()

As one can notice, I create the archive with the prefix, and then I build the list of files to pack by listing everything in the cwd and filtering it via the lambda. Naturally the archive itself passes the filter. There is also a snippet that adds fixed files if their names do not overlap, although I don't think it is important.

The output from such a run is, e.g.:

packing `ga_run.seq_niche.N30.1.bt0_5K.params`
packing `ga_run.seq_niche.N30.1.bt0_5K.stats`
packing `ga_run.seq_niche.N30.1.bt0_5K.tgz`
metas to add: {'stats.meta'}
packing `stats.meta`
contents:['params', 'stats', 'stats.meta']

So the script tried adding the archive to itself, yet it does not appear in the final contents. I do not know what the expected behaviour is, but there is no warning at all and the documentation does not mention anything. I read the parts about the methods for adding members and searched for "itself" and "same name".

I would assume it is automatically skipped, but I don't know how to actually check it. Personally I would have expected a zero-length file to be added as a member, but I understand skipping, as it actually makes more sense.
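
One way to check this empirically, as a minimal sketch: build a throwaway archive in a temporary directory, try to add the archive to itself, and inspect `getnames()`. The file names `data.txt` and `out.tgz` and the helper `self_add_is_skipped` are made up for illustration. The `debug=2` argument is assumed to be forwarded by `tarfile.open()` to the `TarFile` constructor, in which case any debug messages go to stderr.

```python
import os
import tarfile
import tempfile

def self_add_is_skipped():
    """Try to add an archive to itself and return the resulting member names."""
    with tempfile.TemporaryDirectory() as tmp:
        # An ordinary file that should end up in the archive.
        data_path = os.path.join(tmp, "data.txt")
        with open(data_path, "w") as fh:
            fh.write("hello\n")

        archive_path = os.path.join(tmp, "out.tgz")
        # debug=2 asks tarfile to report what it does on stderr.
        with tarfile.open(archive_path, "w:gz", debug=2) as archive:
            archive.add(data_path, arcname="data.txt")
            # Same path as the archive that is currently open for writing.
            archive.add(archive_path, arcname="out.tgz")
            return archive.getnames()

names = self_add_is_skipped()
print(names)
```

If the archive is skipped, only `data.txt` shows up in the printed list, which matches the `contents:` line in the output above.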

Question: Is it intended behaviour for tarfile.add() to skip adding the archive to itself? Where is this stated?

1 Answer

Scanning the tarfile.py code from 3.2 back to 2.4, every version contains code similar to:

# Skip if somebody tries to archive the archive...
if self.name is not None and os.path.abspath(name) == self.name:
    self._dbg(2, "tarfile: Skipped %r" % name)
    return
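
The check quoted above is a plain comparison of absolute path strings, so a relative name that resolves to the archive's own path should be caught as well. Here is a rough sketch that mirrors the comparison from outside the library. The helper `matches_archive_path` and the file names are hypothetical, and it assumes `archive.name` holds the absolute path of the archive, as the quoted snippet implies.

```python
import os
import tarfile
import tempfile

def matches_archive_path(archive, name):
    """Mirror the quoted comparison: absolute path of `name` vs. archive.name."""
    return archive.name is not None and os.path.abspath(name) == archive.name

def demo():
    old_cwd = os.getcwd()
    with tempfile.TemporaryDirectory() as tmp:
        os.chdir(tmp)
        try:
            # Open the archive by a relative name, as in the question.
            with tarfile.open("out.tgz", "w:gz") as archive:
                return (
                    os.path.isabs(archive.name),
                    matches_archive_path(archive, "out.tgz"),
                    matches_archive_path(archive, os.path.join(".", "out.tgz")),
                    matches_archive_path(archive, "other.txt"),
                )
        finally:
            # Leave the temporary directory before it is removed.
            os.chdir(old_cwd)

result = demo()
print(result)
```

Because only path strings are compared, this says nothing about other names (for example links) that might point to the same file.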
answered 2013-07-25T01:32:08.157