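I have a folder containing large text files. Each file is a collection of 1000 files separated by [[filename]] markers. I want to split each file into its 1000 component files and put them in a new folder. Is there a way to do this in bash? Any other fast method will do as well.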
for f in $(find . -name '*.txt')
do mkdir $f
  mv
  cd $f
  awk '/[[.*]]/{g++} { print $0 > g".txt"}' $f
  cd ..
done
It's not awk, and it was written by a drunk guy, so there is no guarantee that it works.
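import re
import sys


def main():
    # A header line looks like [[filename]]; group 1 is the output file name.
    pattern = re.compile(r'\[\[(.+)\]\]')
    fname = None
    lines = []
    with open(sys.argv[1]) as f:
        for line in f:
            m = pattern.search(line)
            if m:
                # New header: flush the lines collected for the previous file.
                if fname is not None:
                    with open(fname, 'w') as g:
                        g.writelines(lines)
                fname = m.group(1)
                lines = []
            else:
                # Lines before the first header are collected but never written.
                lines.append(line)
    # Flush the last file, if any header was seen at all.
    if fname is not None:
        with open(fname, 'w') as g:
            g.writelines(lines)


if __name__ == '__main__':
    main()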
You are trying to create a folder with the same name as an existing file.
for f in $(find . -name '*.txt')
do mkdir $f
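Here, find lists the files under the current path, and you then try to create a directory with exactly the same name as each file, which fails. One way around this is to create a temporary folder first: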
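# assumes the .txt files sit directly in the current directory
# and that their names contain no whitespace
for f in $(find . -name '*.txt')
do
    mkdir temporary       # create a temporary folder
    mv "$f" temporary     # move the file into the folder
    mv temporary "$f"     # rename the temporary folder to the name of the file
    cd "$f"               # enter the folder and go on....
    # the brackets must be escaped to match a literal [[...]] header
    awk '/\[\[.*\]\]/{g++} { print $0 > g".txt"}' "$f"
    cd ..
done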
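The awk step can also name each output file after the text inside the [[...]] marker instead of using a counter. The following is only a sketch under a few assumptions: split_by_header is a made-up helper name, every header sits on a line of its own, and the names contain no slashes. It skips any text before the first header and closes each output file, so awk does not run out of open file handles on a 1000-part collection.

```shell
#!/bin/bash
# split_by_header is a hypothetical helper name.
# Usage: split_by_header COLLECTION OUT_DIR
split_by_header() {
    collection=$1
    out_dir=$2
    mkdir -p "$out_dir" || return
    awk -v dir="$out_dir" '
        # Header line such as [[name]]: switch to a new output file.
        /^\[\[.*\]\][ \t]*$/ {
            if (out != "") close(out)     # release the previous file handle
            name = $0
            sub(/^\[\[/, "", name)        # strip the leading [[
            sub(/\]\][ \t]*$/, "", name)  # strip the trailing ]]
            out = dir "/" name
            printf "" > out               # create or truncate the file
            next
        }
        # Body line: write to the current file; text before the first header is skipped.
        out != "" { print > out }
    ' "$collection"
}
```

It would be called once per collection, for example split_by_header big1.txt big1, where both names are placeholders.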
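Note that all of your folders will have a ".txt" extension. If you don't want that, you can cut the extension off before creating the folder. That way you won't need the temporary folder at all, because the folder you are creating has a different name from the .txt file. Example: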
for f in $(find . -name '*.txt' | rev | cut -b 5- | rev)
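The same trimming can be done without rev and cut by using the shell's suffix-removal parameter expansion, ${f%.txt}. This is a small sketch rather than a drop-in replacement: strip_txt and demo_dir are names invented for the illustration, and the loop runs on a scratch directory so nothing real is touched.

```shell
#!/bin/bash
# strip_txt is a hypothetical helper: it removes one trailing ".txt"
# from its argument using parameter expansion.
strip_txt() {
    printf '%s\n' "${1%.txt}"
}

# Demonstration on a scratch directory (demo_dir is an assumed name).
demo_dir=$(mktemp -d)
touch "$demo_dir/alpha.txt" "$demo_dir/beta.txt"

# Globbing keeps names with spaces intact, unlike for f in $(find ...).
for f in "$demo_dir"/*.txt; do
    mkdir "$(strip_txt "$f")"    # creates alpha/ and beta/ next to the files
done
```

Because the expansion only removes a literal .txt suffix, a name without that suffix is passed through unchanged.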
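Write a bash script. Here, I've done it for you.
Notice the structure and features of this script:
- The purpose is explained in a usage() function, which is used for the -h option.
- It provides a set of standard options: -h, -n, -v.
- getopts does the option processing.
- In $verbose mode, it explains to the user what is happening.
- It has a dry-run capability (the -n option, for $norun mode).
- It uses a run function, which pays attention to the $norun, $verbose, and $quiet variables.
I'm not just catching a fish for you, I'm teaching you how to fish.
Good luck with your next bash script.
Alan S.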
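The option handling and dry-run pattern from the list above can be looked at in isolation. What follows is a stripped-down sketch, not an excerpt from the answer's script: demo_main is a hypothetical name, only -n and -v are handled, and this run executes its arguments directly instead of passing them through eval.

```shell
#!/bin/bash
# Minimal sketch of getopts plus a dry-run aware run function.
norun=
verbose=

# run COMMAND ARGS...
# Prints the command instead of executing it when $norun is set.
run() {
    if [ -n "$norun" ]; then
        echo "(norun) $*"
        return 0
    fi
    [ -n "$verbose" ] && echo ">> $*"
    "$@"
}

# demo_main [-n] [-v] DIR...  (hypothetical entry point)
demo_main() {
    OPTIND=1          # reset so the function can be called more than once
    norun=
    verbose=
    while getopts 'nv' opt; do
        case "$opt" in
            n) norun=1 ;;
            v) verbose=1 ;;
            *) return 2 ;;
        esac
    done
    shift $(( OPTIND - 1 ))
    run mkdir -p "$@"
}
```

Calling demo_main -n some-dir only prints the mkdir command, while demo_main some-dir actually creates the directory.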
#!/bin/bash
# split-collections IN-FOLDER OUT-FOLDER

PROG="${0##*/}"

usage() {
  cat 1>&2 <<EOF
usage: $PROG [OPTIONS] IN-FOLDER OUT-FOLDER

  This script splits a collection of files within IN-FOLDER into
  separate, named files in the given OUT-FOLDER.  The created file
  names are obtained from formatted text headers within the input
  files.

  The format of each input file is a set of HEADER and BODY pairs,
  where each HEADER is a text line formatted as:

    [[input-filename1]]
    text line 1
    text line 2
    ...
    [[input-filename2]]
    text line 1
    text line 2
    ...

  Normal processing will show the filenames being read, and file
  names being created.  Use the -v (verbose) option to show the
  number of text lines being written to each created file.  Use
  -v twice to show the actual lines of text being written.

  Use the -n option to show what would be done, without actually
  doing it.

Options
  -h   Show this help
  -n   Dry run -- do NOT create any files or make any changes
  -o   Overwrite existing output files.
  -q   Be quiet (do not echo commands in dry-run mode)
  -v   Be verbose
EOF
  exit
}
# message helpers; all of them write to stderr
talk()    { echo 1>&2 "$@" ; }
chat()    { [[ -n "$norun$verbose" ]] && talk "$@" ; }
nvtalk()  { [[ -n "$verbose" ]] || talk "$@" ; }
qtalk()   { [[ -n "$quiet" ]] || talk "$@" ; }
nrtalk()  { talk "${norun:+(norun) }$@" ; }

# error [CODE] MESSAGE ...   (exit code defaults to 2)
error() {
  local code=2
  case "$1" in [0-9]*) code=$1 ; shift ;; esac
  echo 1>&2 "$@"
  exit $code
}

# printf-style variants of the helpers above
talkf()   { printf 1>&2 "$@" ; }
chatf()   { [[ -n "$norun$verbose" ]] && talkf "$@" ; }
nvtalkf() { [[ -n "$verbose" ]] || talkf "$@" ; }
qtalkf()  { [[ -n "$quiet" ]] || talkf "$@" ; }
nrtalkf() { talkf "${norun:+(norun) }$@" ; }

errorf() {
  local code=2
  case "$1" in [0-9]*) code=$1 ; shift ;; esac
  printf 1>&2 "$@"
  exit $code
}
# run COMMAND ARGS ...

qrun() {
  ( quiet=1 run "$@" )
}

run() {
  if [[ -n "$norun" ]]; then
    # dry run: show the command instead of executing it
    if [[ -z "$quiet" ]]; then
      nrtalk "$@"
    fi
  else
    if [[ -n "$verbose" ]]; then
      talk ">> $@"
    fi
    # testing with "if ! eval ..." would leave $? at 0,
    # so return the failing status directly
    eval "$@" || return $?
  fi
  return 0
}
show_line() {
  talkf "%s:%d: %s\n" "$in_file" "$lines_in" "$line"
}

# given an input filename, read it and create
# the output files as indicated by the contents
# of the text in the file

split_collection() {
  in_file="$1"
  out_file=
  lines_in=0
  lines_out=0
  skipping=
  # header of the form [[ name ]]; kept in a variable so the spaces
  # in the pattern need no special quoting inside [[ =~ ]]
  local header_re='^\[\[[[:blank:]]*([^ ]+.*[^ ]|[^ ])[[:blank:]]*\]\][[:blank:]]*$'
  # IFS= and -r keep leading whitespace and backslashes intact;
  # the extra test picks up a last line that has no trailing newline
  while IFS= read -r line || [[ -n "$line" ]] ; do
    : $(( lines_in++ ))
    (( verbose_count > 1 )) && show_line
    # if a line with the format of "[[foo]]" occurs,
    # close the current output file, and open a new
    # output file called "foo"
    if [[ "$line" =~ $header_re ]] ; then
      new_file="${BASH_REMATCH[1]}"
      # close out the current file, if any
      if [[ "$out_file" ]]; then
        nrtalkf "%d lines written to %s\n" $lines_out "$out_file"
      fi
      # check the filename for bogosities; a single quote would
      # break the quoting of the eval'd commands below
      case "$new_file" in
        *..*|*/*|*\'*)
          (( verbose_count < 2 )) && show_line
          error "Badly formatted filename"
          ;;
      esac
      out_file="$out_folder/$new_file"
      if [[ -e "$out_file" ]]; then
        if [[ -n "$overwrite" ]]; then
          nrtalk "Overwriting existing '$out_file'"
          qrun "cat /dev/null >'$out_file'"
        else
          error "$out_file already exists."
        fi
      else
        nrtalk "Creating new output file: '$out_file' ..."
        qrun "touch '$out_file'"
      fi
      lines_out=0
    elif [[ -z "$out_file" ]]; then
      # apparently, there are text lines before the filename
      # header; ignore them (out loud)
      if [[ ! "$skipping" ]]; then
        talk "Text preceding first filename ignored.."
        skipping=1
      fi
    else        # next line of input for the file
      # written directly rather than through eval, so quotes, $ and
      # backticks in the text are copied verbatim
      [[ -n "$norun" ]] || printf '%s\n' "$line" >>"$out_file"
      : $(( lines_out++ ))
    fi
  done
  # report the final output file, if any
  if [[ "$out_file" ]]; then
    nrtalkf "%d lines written to %s\n" $lines_out "$out_file"
  fi
}
norun=
verbose=
verbose_count=0
overwrite=
quiet=

while getopts 'hnoqv' opt ; do
  case "$opt" in
    h) usage ;;
    n) norun=1 ;;
    o) overwrite=1 ;;
    q) quiet=1 ;;
    v) verbose=1 ; : $(( verbose_count++ )) ;;
  esac
done
shift $(( OPTIND - 1 ))

in_folder="${1:?Missing IN-FOLDER; see $PROG -h for details}"
out_folder="${2:?Missing OUT-FOLDER; see $PROG -h for details}"

# validate the input and output folders
#
# It might be reasonable to create the output folder for the
# user, but that's left as an exercise for the user.

in_folder="${in_folder%/}"      # remove trailing slash, if any
out_folder="${out_folder%/}"

[[ -e "$in_folder"  ]] || error "$in_folder does not exist."
[[ -d "$in_folder"  ]] || error "$in_folder is not a directory."
[[ -e "$out_folder" ]] || error "$out_folder does not exist."
[[ -d "$out_folder" ]] || error "$out_folder is not a directory."

# quoting keeps folder and file names with spaces in one piece
for collection in "$in_folder"/* ; do
  talk "Reading $collection .."
  split_collection "$collection" <"$collection"
done
exit
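The script leaves creating OUT-FOLDER as an exercise. One possible way to approach it, offered only as a sketch: ensure_folder is a hypothetical helper that is not part of the script above. It takes the folder name and an optional dry-run flag, so it could be called with the script's $norun value to respect the -n option.

```shell
#!/bin/bash
# ensure_folder is a hypothetical helper, not part of the script above.
# Usage: ensure_folder DIR [DRY_RUN]
# Creates DIR (and any missing parents) unless DRY_RUN is non-empty.
ensure_folder() {
    local dir=$1
    local dry_run=${2:-}
    # refuse to continue if the name is taken by something else
    if [ -e "$dir" ] && [ ! -d "$dir" ]; then
        echo "$dir exists but is not a directory." 1>&2
        return 2
    fi
    # nothing to do if the folder is already there
    if [ -d "$dir" ]; then
        return 0
    fi
    if [ -n "$dry_run" ]; then
        echo "(norun) mkdir -p $dir" 1>&2
    else
        mkdir -p "$dir"
    fi
}
```

In the script, a call such as ensure_folder "$out_folder" "$norun" || exit could then stand in for the two checks on $out_folder.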