bash - Bash shell script to find the closest common parent directory of several files

Question

Suppose the input arguments are the full paths of several files. Say,

/abc/def/file1
/abc/def/ghi/file2
/abc/def/ghi/file3

How can I get the directory name /abc/def in a bash shell script?
How can I get only file1, /ghi/file2, and /ghi/file3?

score 5 · Accepted Answer

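Given the answer to part 1 (the common prefix), the answer to part 2 is straightforward; you slice the prefix off each name, which can be done with sed, among other options.

The interesting part, then, is finding the common prefix. The minimal common prefix is / (for /etc/passwd and /bin/sh, for example). The maximal common prefix is, by definition, present in all the strings, so we only need to split one of the strings into segments and compare the possible prefixes against the other strings. In outline: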

split name A into components
known_prefix="/"
for each extra component from A
do
    possible_prefix="$known_prefix/$extra/"
    for each name
    do
        if $possible_prefix is not a prefix of $name
        then ...all done...break outer loop...
        fi
    done
    ...got here...possible prefix is a prefix!
    known_prefix=$possible_prefix
done
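
The outline above can be turned into a small runnable sketch. The function name common_dir_top_down is hypothetical (it does not appear in the answer), and the sketch assumes every argument is an absolute path without repeated slashes. It grows the prefix one component of the first name at a time and stops at the first component that some name does not share.

```shell
# Hypothetical helper, not from the answer: top-down version of the outline.
# Assumes absolute paths with no repeated slashes.
common_dir_top_down() {
    local rest known component candidate name
    rest=$(dirname "$1")        # only directories of the first name are candidates
    rest=${rest#/}              # drop the leading "/" before splitting
    known=""
    while [ -n "$rest" ]; do
        component=${rest%%/*}   # next path component
        case $rest in
            */*) rest=${rest#*/} ;;
            *)   rest="" ;;
        esac
        candidate="$known/$component"
        for name in "$@"; do
            case $name in
                "$candidate"/*) ;;   # candidate is a directory prefix of this name
                *) break 2 ;;        # mismatch: keep the last known prefix
            esac
        done
        known=$candidate
    done
    printf '%s\n' "${known:-/}"   # an empty prefix means only "/" is shared
}

common_dir_top_down /abc/def/file1 /abc/def/ghi/file2 /abc/def/ghi/file3
```

Matching against "$candidate"/* rather than "$candidate"* keeps the comparison on component boundaries, so /abc/def is never reported as a prefix of /abc/defg.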

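There are some administrative details to deal with, such as spaces in names. Also, which tools are allowed? The question is tagged bash, but which external commands may be used (Perl, for example)?

One undefined issue: suppose the list of names is: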

/abc/def/ghi
/abc/def/ghi/jkl
/abc/def/ghi/mno

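Is the longest common prefix /abc/def or /abc/def/ghi? I'm going to assume that the longest common prefix here is /abc/def. (If you really want it to be /abc/def/ghi, use /abc/def/ghi/. as the first of the names.)

Also, there are invocation details:

How is the function or command invoked?
How are the values returned?
Is this one or two functions or commands (longest_common_prefix and path_without_prefix)?

Two commands are easier: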

prefix=$(longest_common_prefix name1 [name2 ...])
suffix=$(path_without_prefix /pre/fix /pre/fix/to/file [...])

The path_without_prefix command removes the prefix if it is present, and leaves the argument unchanged if the name does not start with the prefix.

longest_common_prefix

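longest_common_prefix()
{
    declare -a names
    declare -a parts
    declare i=0
    local name x prefix

    names=("$@")
    name="$1"
    # Collect each ancestor directory of the first name, deepest first.
    # Stop at "/" (or at "." for a relative name, which would otherwise loop forever).
    while x=$(dirname "$name"); [ "$x" != "/" ] && [ "$x" != "." ]
    do
        parts[$i]="$x"
        i=$((i + 1))
        name="$x"
    done

    # Try the candidates from longest to shortest, ending with "/".
    for prefix in "${parts[@]}" /
    do
        for name in "${names[@]}"
        do
            # "${prefix%/}/" yields "/" rather than "//" for the root directory,
            # so "/" can match; the inner quotes keep the prefix literal.
            if [ "${name#"${prefix%/}"/}" = "${name}" ]
            then continue 2
            fi
        done
        echo "$prefix"
        break
    done
}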

Test:

set -- "/abc/def/file 0" /abc/def/file1 /abc/def/ghi/file2 /abc/def/ghi/file3 "/abc/def/ghi/file 4"
echo "Test: $@"
longest_common_prefix "$@"
echo "Test: $@" abc/def
longest_common_prefix "$@" abc/def
set --  /abc/def/ghi/jkl /abc/def/ghi /abc/def/ghi/mno
echo "Test: $@"
longest_common_prefix "$@"
set -- /abc/def/file1 /abc/def/ghi/file2 /abc/def/ghi/file3
echo "Test: $@"
longest_common_prefix "$@"
set -- "/a c/d f/file1" "/a c/d f/ghi/file2" "/a c/d f/ghi/file3"
echo "Test: $@"
longest_common_prefix "$@"

Output:

Test: /abc/def/file 0 /abc/def/file1 /abc/def/ghi/file2 /abc/def/ghi/file3 /abc/def/ghi/file 4
/abc/def
Test: /abc/def/file 0 /abc/def/file1 /abc/def/ghi/file2 /abc/def/ghi/file3 /abc/def/ghi/file 4 abc/def
Test: /abc/def/ghi/jkl /abc/def/ghi /abc/def/ghi/mno
/abc/def
Test: /abc/def/file1 /abc/def/ghi/file2 /abc/def/ghi/file3
/abc/def
Test: /a c/d f/file1 /a c/d f/ghi/file2 /a c/d f/ghi/file3
/a c/d f

path_without_prefix

path_without_prefix()
{
    local prefix="$1/"
    shift
    local arg
    for arg in "$@"
    do
        printf '%s\n' "${arg#"$prefix"}"
    done
}

Test:

for name in /pre/fix/abc /pre/fix/def/ghi /usr/bin/sh
do
    path_without_prefix /pre/fix "$name"
done

Output:

abc
def/ghi
/usr/bin/sh

score 2 · Accepted Answer

Here is one that has been shown to work with arbitrarily complex file names (containing newlines, backspaces, and so on):

path_common() {
    if [ $# -ne 2 ]
    then
        return 2
    fi

    # Remove repeated slashes
    for param
    do
        param="$(printf %s. "$1" | tr -s "/")"
        set -- "$@" "${param%.}"
        shift
    done

    common_path="$1"
    shift

    for param
    do
        while case "${param%/}/" in "${common_path%/}/"*) false;; esac; do
            new_common_path="${common_path%/*}"
            if [ "$new_common_path" = "$common_path" ]
            then
                return 1 # Dead end
            fi
            common_path="$new_common_path"
        done
    done
    printf %s "$common_path"
}
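
The key step in path_common is the case test inside the while loop: both paths get exactly one trailing slash before they are compared, so the match can only succeed on a component boundary. The sketch below isolates that test in a hypothetical helper named is_under (not part of the answer) to show the effect.

```shell
# Hypothetical helper, not part of the answer: succeed when path $2 is equal
# to, or located under, directory $1. Uses the same trailing-slash comparison
# as the case statement in path_common.
is_under() {
    case "${2%/}/" in
        "${1%/}/"*) return 0 ;;
        *)          return 1 ;;
    esac
}

is_under /abc/def /abc/def/ghi && echo "under"        # shares whole components
is_under /abc/def /abc/defg    || echo "not under"    # string prefix only, different directory
is_under /        /etc/passwd  && echo "under root"   # "${1%/}/" is "/" for the root
```

In path_common the branch runs false on a match, which ends the while loop; without a match the case statement exits with status 0, so the loop body trims one more component from common_path.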

score 2 · Accepted Answer

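A more "portable" solution, in the sense that it does not rely on bash-specific features: first define a function that computes the longest common prefix of two paths:

common_path()
{
  lhs=$1
  rhs=$2
  path=
  OLD_IFS=$IFS; IFS=/
  for w in $rhs; do          # split rhs into components on "/"
    test "$path" = / && try="/$w" || try="$path/$w"
    # Accept try only on a component boundary, so that /abc/def
    # is not treated as a prefix of /abc/defg.
    case $lhs in
      "$try"|"${try%/}"/*) ;;
      *) break ;;
    esac
    path=$try
  done
  IFS=$OLD_IFS
  printf '%s\n' "$path"
}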

Then use it with a longer list of words:

common_path_all()
{
  local sofar=$1
  shift
  for arg
  do
    sofar=$(common_path "$sofar" "$arg")
  done
  printf '%s\n' "${sofar:-/}"
}

With your input, it gives

$ common_path_all /abc/def/file1 /abc/def/ghi/file2 /abc/def/ghi/file3
/abc/def

As Jonathan Leffler pointed out, once you have this, the second question becomes trivial.
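
As a rough illustration of that second step, the sketch below strips a known common directory from each path using parameter expansion only. The name strip_common_prefix is hypothetical, and the sketch assumes the prefix was already computed, for example by common_path_all.

```shell
# Hypothetical helper: print each path relative to a common directory.
# $1 is the common prefix; the remaining arguments are the paths.
strip_common_prefix() {
    local prefix path
    prefix=${1%/}              # so that a prefix of "/" also works
    shift
    for path in "$@"; do
        # The inner quotes make the prefix match literally, not as a glob.
        printf '%s\n' "${path#"$prefix"/}"
    done
}

strip_common_prefix /abc/def /abc/def/file1 /abc/def/ghi/file2 /abc/def/ghi/file3
```

Paths that do not start with the prefix are printed unchanged.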

score 1 · Accepted Answer

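In my opinion, the solution below is much simpler.

As mentioned previously, only part 1 is tricky. Part 2 is straightforward with sed.

Part 1 can be split into two subparts:

Find the longest common prefix of all the strings
Make sure this prefix is a directory; if it is not, trim it to get the corresponding directory

This can be done with the following code. For clarity, this example uses only 2 strings, but a while loop gives you what is needed for n strings.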

LONGEST_PREFIX=$(printf "%s\n%s\n" "$file_1" "$file_2" | sed -e 'N;s/^\(.*\).*\n\1.*$/\1/')
CLOSEST_PARENT=$(echo "$LONGEST_PREFIX" | sed 's/\(.*\)\/.*/\1/')

Of course, it can be rewritten as a single line:

CLOSEST_PARENT=$(printf "%s\n%s\n" "$file_1" "$file_2" | sed -e 'N;s/^\(.*\).*\n\1.*$/\1/'  | sed 's/\(.*\)\/.*/\1/')
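
The answer mentions a loop for n strings without showing it. One possible sketch folds the same two-string sed expression over all arguments. The function name closest_parent is hypothetical, and the sketch assumes that no path contains a newline, since sed works line by line.

```shell
# Hypothetical helper, not from the answer: fold the two-string sed idiom
# over any number of paths. Assumes no path contains a newline.
closest_parent() {
    local prefix name
    prefix=$1
    shift
    for name in "$@"; do
        # Longest common *string* prefix of the running prefix and the next name.
        prefix=$(printf '%s\n%s\n' "$prefix" "$name" |
                 sed -e 'N;s/^\(.*\).*\n\1.*$/\1/')
    done
    # Trim back to the last "/" so that the result names a directory.
    prefix=$(printf '%s\n' "$prefix" | sed 's/\(.*\)\/.*/\1/')
    printf '%s\n' "${prefix:-/}"   # an empty result means only "/" is shared
}

closest_parent /abc/def/file1 /abc/def/ghi/file2 /abc/def/ghi/file3
```

Because the final trim always cuts at the last slash, a string prefix that ends in the middle of a component (for example /abc/def shared by /abc/defg/x and /abc/def/y) is reduced to the enclosing directory /abc.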

score -1 · Accepted Answer

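To get the parent directory:

  dirname /abc/def/file1

will give /abc/def

And to get the file name:

   basename /abc/def/file1

will give file1

And, as per your question, to get only the name of the closest parent directory:

basename "$(dirname /abc/def/file1)"

will give def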
