bash - Efficiently moving half a million files based on extension in bash

Question

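Scenario:

With the Locky virus on the rampage, the computer center I work at found that the only way to recover files is with tools like Recuva. The problem is that this dumps all of the recovered files into a single directory. I would like to sort those files into categories based on file extension: all JPGs in one directory, all BMPs in another, and so on. I looked around Stack Overflow and, based on various other questions and answers, managed to build a small bash script (sample provided) that more or less does this. However, it takes forever to finish, and I think I have messed up the extension handling.

Code: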

#!/bin/bash
path=$2   # Starting path to the directory of the junk files
var=0     # How many records were processed
SECONDS=0 # reset the clock so we can time the event

clear

echo "Searching $2 for file types and then moving all files into grouped folders."

# Only want to move files from the first level, as directories are OK where they are
for FILE in `find $2 -maxdepth 1 -type f`
do
  # Split the EXT off for the directory name using AWK
  DIR=$(awk -F. '{print $NF}' <<<"$FILE")
  # DEBUG ONLY
  # echo "Moving file: $FILE into directory $DIR"
  # Make a directory in our path then Move that file into the directory
  mkdir -p "$DIR"
  mv "$FILE" "$DIR"
  ((var++))
done

echo "$var Files found and organized in:"
echo "$((SECONDS / 3600)) hours, $(((SECONDS / 60) % 60)) minutes and $((SECONDS % 60)) seconds."

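Question:

How can I make this more efficient when dealing with 500,000+ files? The find takes forever to grab the list of files, and the loop attempts to create the directory even if that path already exists. I would like to deal with those two particular aspects of the loop more efficiently, if at all possible.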

score 2 · Accepted Answer

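The bottleneck of any bash script is usually the number of external processes you start. In this case, you can vastly reduce the number of calls to mv by recognizing that a large share of the files you want to move will have a common suffix (jpg, for example). Start with those.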

for ext in jpg mp3; do
    mkdir -p "$ext"
    # For simplicity, I'll assume your mv command supports the -t option
    find "$2" -maxdepth 1 -name "*.$ext" -exec mv -t "$ext" {} +
done

With -exec mv -t "$ext" {} +, find passes as many files as possible to each invocation of mv. For each extension, this means one call to find and a minimal number of calls to mv.
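To make the difference between the two -exec terminators concrete, here is a minimal sketch that counts how many child processes find starts in each mode. The scratch directory and the photo*.jpg file names are made up for the demonstration, and a trivial sh -c 'echo run' stands in for mv so that nothing is actually moved.

```bash
#!/bin/bash
# Sketch: count the processes find starts with "\;" versus "+".
# The scratch directory and file names are invented for illustration.
demo=$(mktemp -d)
touch "$demo"/photo{1..20}.jpg

# "\;" runs the command once per matching file: 20 runs here.
per_file=$(find "$demo" -maxdepth 1 -name '*.jpg' \
    -exec sh -c 'echo run' sh {} \; | wc -l | tr -d ' ')

# "+" appends as many paths as fit on one command line: a single run here.
batched=$(find "$demo" -maxdepth 1 -name '*.jpg' \
    -exec sh -c 'echo run' sh {} + | wc -l | tr -d ' ')

echo "per-file invocations: $per_file"
echo "batched invocations:  $batched"
rm -rf "$demo"
```

With hundreds of thousands of files, find will likely split the "+" form into several invocations once the argument list limit is reached, but the count should stay far below one process per file.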

Once those files are moved, you can start analyzing the remaining files one at a time.

for f in "$2"/*; do
    # Skip directories, including any created by the first loop
    [[ -f $f ]] || continue
    ext=${f##*.}
    # Probably more efficient to check in-shell if the directory
    # already exists than to start a new process to make the check
    # for you.
    [[ -d $ext ]] || mkdir "$ext"
    mv "$f" "$ext"
done
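The ${f##*.} expansion replaces the per-file awk call from the question with work done inside the shell. One edge case is worth a sketch: if a file name has no dot, the expansion returns the whole path (or a fragment of it, when a parent directory contains a dot). The helper below is one possible way to handle that; the name get_ext and the noext bucket are illustrative choices, not part of the original answer.

```bash
#!/bin/bash
# Sketch: derive the target directory name without starting awk.
# "get_ext" and the "noext" bucket are hypothetical names.
# The result is stored in the global variable "ext" so that the
# caller does not need a $(...) subshell for every file.
get_ext() {
    local name=${1##*/}        # drop the directory part
    if [[ $name == *.* ]]; then
        ext=${name##*.}        # text after the last dot
    else
        ext=noext              # file name contains no dot
    fi
}

get_ext "/recovered/holiday.photo.JPG"; echo "$ext"   # prints JPG
get_ext "/recovered.d/README";          echo "$ext"   # prints noext
```

Note that the result is case-sensitive, so JPG and jpg would land in different directories unless the value is normalized first.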

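The trade-off is in deciding how much work you want to do up front to identify the common extensions, in order to minimize the number of iterations of the second for loop.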
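One way to inform that decision is to tally the extensions before moving anything. The following is a rough sketch, assuming GNU find (for -printf), in the same spirit as the mv -t assumption above; the function name count_extensions and the src variable are hypothetical. File names containing newlines would be miscounted.

```bash
#!/bin/bash
# Sketch: tally extensions in one directory to see which are worth
# handling in the batched loop. Assumes GNU find for -printf.
# "src" is a hypothetical variable; it defaults to the current directory.
src=${1:-.}

count_extensions() {
    # Print "count extension" pairs, most frequent first.
    find "$1" -maxdepth 1 -type f -name '*.*' -printf '%f\n' |
        awk -F. '{ n[$NF]++ } END { for (e in n) print n[e], e }' |
        sort -rn
}

# Show the ten most common extensions.
count_extensions "$src" | head -n 10
```

This costs one find and one awk process in total, so it should be cheap compared with the moves themselves; the extensions at the top of the list are the natural candidates for the first loop.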
