I need to create an arbitrarily large tarfile for testing but don't want it to hit the disk.
What's the easiest way to do this?
You can easily use Python to generate such a tarfile.
mktar.py:
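#!/usr/bin/env python3
import sys
import tarfile
import time
# "w|" writes an uncompressed tar as a non-seekable stream, so stdout works.
tar = tarfile.open(fileobj=sys.stdout.buffer, mode="w|")
info = tarfile.TarInfo(name="fizzbuzz.data")
info.mode = 0o644
info.size = 1048576 * 16  # 16 MiB; raise this for a larger archive
info.mtime = int(time.time())
# addfile() copies exactly info.size bytes from the file object.
rand = open('/dev/urandom', 'rb')
tar.addfile(info, rand)
rand.close()
tar.close()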
michael@challenger:~$ ./mktar.py | tar tvf -
-rw-r--r-- 0/0 16777216 2012-08-02 13:39 fizzbuzz.data
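If you want the size to be a parameter and don't need random content, the same idea can be wrapped in a function. This is only a sketch: write_tar and ZeroReader are names made up here, not part of tarfile, and the payload is all zero bytes rather than /dev/urandom output, on the assumption that the content doesn't matter for your test.

```python
import tarfile
import time


class ZeroReader:
    """Minimal file-like object: read(n) returns n zero bytes (hypothetical helper)."""

    def read(self, n):
        return bytes(n)


def write_tar(out, size, name="fizzbuzz.data"):
    """Stream a tar archive holding one member of `size` bytes to `out`.

    `out` is any binary file object with a write() method, for example
    sys.stdout.buffer. Nothing is written to disk.
    """
    with tarfile.open(fileobj=out, mode="w|") as tar:
        info = tarfile.TarInfo(name=name)
        info.mode = 0o644
        info.size = size
        info.mtime = int(time.time())
        # addfile() pulls exactly `size` bytes from the reader in chunks,
        # so memory use stays small however large `size` is.
        tar.addfile(info, ZeroReader())
```

Calling write_tar(sys.stdout.buffer, 16 * 1048576) from a script should behave like mktar.py above, and can be piped into tar tvf - the same way.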
You can use tar with the -O option (tar -O), like this:
tar -xOzf foo.tgz bigfile | process
https://www.gnu.org/software/tar/manual/html_node/Writing-to-Standard-Output.html
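If the consumer is a Python program rather than a shell pipeline, roughly the same thing can be done with the tarfile module. A minimal sketch, assuming a gzip-compressed archive; extract_member_to is a name invented for this example, and foo.tgz and bigfile are just the names from the command above.

```python
import shutil
import tarfile


def extract_member_to(archive, member_name, out):
    """Copy one regular-file member of a gzip'd tar to `out`, without
    extracting it to disk. Returns True if the member was found.

    `archive` and `out` are binary file objects (hypothetical interface).
    """
    # "r|gz" reads the archive sequentially as a stream, so `archive`
    # does not have to be seekable (a pipe or socket works too).
    with tarfile.open(fileobj=archive, mode="r|gz") as tar:
        for member in tar:
            if member.name == member_name and member.isfile():
                src = tar.extractfile(member)
                shutil.copyfileobj(src, out)
                return True
    return False
```

For example, opening foo.tgz in "rb" mode and calling extract_member_to(f, "bigfile", sys.stdout.buffer) should write the member's bytes to stdout, much like the tar -xOzf command.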
PS: However, you may not get the benefit you are hoping for, because tar starts writing to stdout only after it has read through the entire compressed file. You can demonstrate this behavior by starting a large file extraction and watching the output file size over time; it should stay at zero for most of the processing time and start growing only at a very late stage. On the other hand, I haven't researched this extensively: there might be a workaround, or I might be just plain wrong about my first-hand out-of-memory experience.