bash - Automating file processing with bash

Question

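I need to split a large sorted file into smaller chunks, where each file contains a sorted list of people's names. I now need to guarantee that people with the same name never end up in two different files. For example: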

File1:
.
.
James
James
Kobe

File2:
Kobe
Nash
Nash
.
.

I need to turn this into:

File1:
.
.
James
James
Kobe
Kobe

File2:
Nash
Nash
.
.

Until now I have done this by hand with sed. I would like to write a bash script to automate it, but I am not familiar with bash. Any help?

score 1 · Accepted Answer

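You need to compare the last line of the "current" file with the first line of the "next" file. I'm assuming your files are named "File1, File2, ... FileN". This is untested.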

n=1
while true; do
    current=File$n
    next=File$((++n)) 
    if [[ ! -f $next ]]; then
        break
    fi
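    last=$(tail -n 1 "$current")
    first=$(head -n 1 "$next")
    # stop once the next file is empty; "$first" is quoted so it is compared literally, not as a pattern
    while [[ -s $next && $last == "$first" ]]; do
        echo "$last" >> "$current"    # append the name to the end of the current
        sed -i 1d "$next"             # remove the first line of the next file (GNU sed syntax)
        first=$(head -n 1 "$next")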
    done
done

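This may be a bit slow, because you may end up removing one line at a time from the next file, rewriting it on every pass. The following counts the matching lines first and removes them in one go, so it might be a bit faster. Again, untested.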

n=1
while true; do
    current=File$n
    next=File$((++n)) 
    if [[ ! -f $next ]]; then
        break
    fi
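    last=$(tail -n 1 "$current")
    # count the leading lines of the next file that equal the last name;
    # the END block prints the count even when every line matches
    num=$(awk -v line="$last" '$0 != line {exit} {n++} END {print n+0}' "$next")
    if (( num > 0 )); then
        for (( i=1; i<=num; i++ )); do
            echo "$last" >> "$current"
        done
        sed -i "1,${num}d" "$next"    # ${num}, not $Nd: N exists only inside awk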
    fi
done
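
To try the idea on the sample data from the question, here is a minimal sketch that wraps the counting approach in a function and runs it in a throwaway directory. The function name `fix_boundaries`, the `prefix` argument and the `demo_dir` variable are my own illustrative names, not part of the original scripts. It swaps `sed -i` for `head`/`tail` plus a temporary file so it does not depend on GNU sed, and it assumes every chunk ends with a newline.

```shell
#!/usr/bin/env bash
# Sketch: move names that straddle a chunk boundary back into the earlier chunk.
# Assumes chunks are named <prefix>1, <prefix>2, ... with no gaps (hypothetical layout).
fix_boundaries() {
    local prefix=$1 n=1 current next last num tmp
    while true; do
        current=$prefix$n
        next=$prefix$((++n))
        [[ -f $next ]] || break
        last=$(tail -n 1 "$current")
        # Count the leading lines of the next chunk that equal the last name.
        num=$(awk -v line="$last" '$0 != line {exit} {n++} END {print n+0}' "$next")
        (( num > 0 )) || continue
        # Append the matching lines to the current chunk ...
        head -n "$num" "$next" >> "$current"
        # ... then rewrite the next chunk without them, via a temporary file.
        tmp=$(mktemp)
        tail -n +"$((num + 1))" "$next" > "$tmp" && cat "$tmp" > "$next"
        rm -f "$tmp"
    done
}

# Demo on the sample data from the question.
demo_dir=$(mktemp -d)
printf '%s\n' James James Kobe > "$demo_dir/File1"
printf '%s\n' Kobe Nash Nash  > "$demo_dir/File2"
fix_boundaries "$demo_dir/File"
cat "$demo_dir/File1"   # James, James, Kobe, Kobe
cat "$demo_dir/File2"   # Nash, Nash
```

One limitation carries over from the scripts above: if a single name fills an entire chunk, that chunk is emptied and the same name at the top of the following chunk is not pulled back, so very long runs may still need a second look.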
