
I'm currently trying to write a script that does some post-processing after rsync --max-size=4000000000 has done its job, to allow for full backups to FAT32 (which is the only filesystem that is r/w on all of Windows/Mac/*nix).

I am writing in bash for Mac OS X and Linux, and am currently testing on OS X. The code is here:

https://github.com/taikedz/fullsync32/blob/master/fullsync32

The script recurses through directories finding

  • files that have a resource fork (HFS property)
  • files that are larger than 4 GB

and, upon finding such files, processes them with tar -cz or split as appropriate before copying them over.

I use recursion instead of the find utility because of the test for the presence of a resource fork on a file: it involves checking the size of a special file. Say you have a file foo.txt; its resource fork can be found by looking at ls -l foo.txt/..namedfork/rsrc and checking that the length is non-zero.

The basic structure is

recurse() {
  pushd "$1"
    for NODE in *; do
      if [ -d "$NODE" ]; then
        recurse "$NODE"
        continue
      fi
      # (process files here, with calls to split, tar and md5)
    done
  popd
}

recurse ./target/directory

Problem

I ran this against my backups the other day and left it running for a few hours. When I came back I found that my spare 11 GB of RAM had been used up, and it was ~248 MB into swap...

I looked around on Google for issues around bash memory leakage in recursion, and apart from a few tenuously answered forum posts, didn't find much...

The other odd result (which is Mac-specific) is that the "Inactive memory" stays inactive and the system runs slowly... Restart required.

Questions

  • Is such potentially deep recursion with bash a bad idea in itself?
  • Is there an ingenious way to iterate rather than recurse in this situation?
  • Or am I going about this completely wrong anyway?

Your input is much appreciated!


1 Answer


Is such potentially deep recursion with bash a bad idea in itself?

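Bash is not meant for recursion, but it has no problem recursing to a few thousand levels, which is more than enough for recursing through a file system.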

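However, like all languages, Bash cannot do infinitely deep non-tail recursion, which is the risk you take by doing away with find's proven loop detection.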

Is there an ingenious way to iterate rather than recurse in this situation?

You can iterate over the output of find:

find "$1" -print0 | while IFS= read -d '' -r filename
do
  echo "Operating on $filename"
done
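
If the loop has to update variables that are still needed after it ends, one option is to feed it from process substitution instead of a pipe, because each side of a pipeline normally runs in a subshell and its variables are lost. A minimal sketch, assuming bash; count_files and its counter are hypothetical names used only for illustration:

```shell
#!/usr/bin/env bash
# Hypothetical helper: count regular files under a directory, iteratively.
count_files() {
  local root=$1 filename count=0
  # Process substitution keeps the loop in the current shell,
  # so 'count' is still set after the loop finishes.
  while IFS= read -r -d '' filename; do
    count=$((count + 1))
  done < <(find "$root" -type f -print0)
  printf '%s\n' "$count"
}
```

It could be called as count_files ./target/directory. The NUL-delimited read handles file names containing spaces or newlines, just like the piped version above.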

How to perform the test with find

You can run arbitrary external tests with -exec, here invoking bash:

find / -exec bash -c '[[ -s "$1/..namedfork/rsrc" ]]' _ {} \; -print
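
Both conditions from the question can probably be expressed as find tests, so no hand-written recursion is needed. The following is only a sketch: list_oversized and list_with_rsrc are hypothetical names, the default limit simply mirrors the rsync --max-size value in the question, and the resource fork check is only meaningful on OS X.

```shell
#!/usr/bin/env bash
# Hypothetical helpers; the default limit mirrors rsync --max-size=4000000000.

# Print regular files larger than the given number of bytes.
# The 'c' suffix tells find to count in bytes rather than 512-byte blocks.
list_oversized() {
  find "$1" -type f -size +"${2:-4000000000}"c -print
}

# Print regular files whose resource fork is non-empty.
# On systems without named forks the test fails and nothing is printed.
list_with_rsrc() {
  find "$1" -type f -exec bash -c '[[ -s "$1/..namedfork/rsrc" ]]' _ {} \; -print
}
```

Starting one bash per file is slow on large trees, so this trades speed for simplicity.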
answered 2014-07-08T17:32:32.583