2

I need help extracting certain parts of each line in a file.

This is what my file looks like:

testfile.txt
This is a test line 1 $#%#
This is a test line 2 $#%#
This is a test line 3 $#%#
This is a test line 4 $#%#
This is a test line 5 $#%#
This is a test line 6 $#%#
This is a test line 7 $#%#

This is my bash script:

#!/bin/bash

while read line
do
#echo $line
FilterString=${line:22:26}
echo $FilterString>>testfile2.txt
done<testfile.txt

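The script above extracts the string $#%# from each line and writes it to a temporary file.

My question:

Instead of writing the string $#%#, I want everything except the string $#%# written to the file. So I want my final output file to look like: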

testfile.txt
This is a test line 1 
This is a test line 2 
This is a test line 3 
This is a test line 4 
This is a test line 5 
This is a test line 6 
This is a test line 7 

Please also suggest the best tool for doing this.

Thanks in advance.

4

6 Answers

5

If it is only the last field that you want to remove, you can use awk:

$ awk 'NF=NF-1' file
This is a test line 1
This is a test line 2
This is a test line 3
This is a test line 4
This is a test line 5
This is a test line 6
This is a test line 7

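It decrements the number of fields by one, so the last field is dropped.

Because the assignment is used as the pattern, awk then performs its default action, {print $0}, whenever the result is non-zero.

To redirect the output to a file, use awk 'NF=NF-1' file > new_file.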
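One caveat can be illustrated with a small sketch: since `NF=NF-1` is evaluated as a pattern, a line with a single field yields 0 and is not printed at all. The variant below decrements NF inside an action and prints every record. It assumes an awk that rebuilds the record when NF is assigned (gawk and mawk do); the sample data is made up for illustration.

```shell
#!/bin/sh
# Sample input inlined for illustration; the middle line has only one field.
sample='This is a test line 1 $#%#
single
This is a test line 2 $#%#'

# Drop the last field inside an action (guarded so NF never goes negative),
# then let the bare pattern "1" print every record. A one-field line is
# printed as an empty line rather than being silently dropped.
result=$(printf '%s\n' "$sample" | awk 'NF > 0 { NF = NF - 1 } 1')
printf '%s\n' "$result"
```

Whether an empty line or no line is the better outcome for one-field input depends on the data; the point is only that the two forms differ there.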


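Update

Based on your comment:

In my case it is not always the last field; it may also sit between other fields, but at a predefined position (always a fixed position).

Then you can use the following awk syntax: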

awk -v c=col_num '{$(c)=""}1' file

where col_num can be set manually, for example:

$ awk -v c=3 '{$(c)=""}1' file
This is  test line 1 $#%#
This is  test line 2 $#%#
This is  test line 3 $#%#
This is  test line 4 $#%#
This is  test line 5 $#%#
This is  test line 6 $#%#
This is  test line 7 $#%#
$ awk -v c=5 '{$(c)=""}1' file
This is a test  1 $#%#
This is a test  2 $#%#
This is a test  3 $#%#
This is a test  4 $#%#
This is a test  5 $#%#
This is a test  6 $#%#
This is a test  7 $#%#

You can also use cut like this, listing every field except the ones to skip:

$ cut -d' ' -f1,2,3,4,5,6 file
This is a test line 1
This is a test line 2
This is a test line 3
This is a test line 4
This is a test line 5
This is a test line 6
This is a test line 7

$ cut -d' ' -f1,2,3,5,6,7 file
This is a line 1 $#%#
This is a line 2 $#%#
This is a line 3 $#%#
This is a line 4 $#%#
This is a line 5 $#%#
This is a line 6 $#%#
This is a line 7 $#%#
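Blanking a field with `$(c)=""` leaves a doubled space behind, and the cut form needs every kept field listed by hand. As a rough sketch of a middle ground, the hypothetical helper `remove_field` below (not part of the answer above) rebuilds each line without the chosen field. It assumes whitespace-separated fields, and as a side effect it collapses runs of whitespace into single spaces.

```shell
#!/bin/sh
# remove_field is a hypothetical helper: $1 = 1-based field number, $2 = input file.
remove_field() {
    awk -v c="$1" '{
        out = ""; sep = ""
        for (i = 1; i <= NF; i++) {
            if (i == c) continue   # skip the unwanted field entirely
            out = out sep $i
            sep = OFS
        }
        print out
    }' "$2"
}

# Throwaway sample file, created only for this illustration.
sample=$(mktemp)
printf '%s\n' 'This is a test line 1 $#%#' > "$sample"
dropped_third=$(remove_field 3 "$sample")
dropped_last=$(remove_field 7 "$sample")
printf '%s\n' "$dropped_third" "$dropped_last"
rm -f "$sample"
```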
answered 2013-10-28T09:15:11.913
2

By saying:

FilterString=${line:22:26}

you selected the $#%# part of each line for printing.

You could instead say:

FilterString=${line:0:21}

to print the desired part of the line. Alternatively, you could say:

FilterString=${line//\$#%#/}

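(Note that the $ sign needs to be escaped; otherwise the shell expands $# to the number of positional parameters.)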
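To make the difference between these expansions concrete, here is a small bash sketch on one sample line. The variable names `marker`, `prefix` and `stripped` are made up for illustration. Note that the second number in `${line:offset:length}` is a length, not an end position.

```shell
#!/bin/bash
line='This is a test line 1 $#%#'

# ${var:offset:length}: offsets start at 0 and the second number is a length.
marker=${line:22:4}      # the four characters starting at offset 22
prefix=${line:0:21}      # the first 21 characters, without the trailing space

# ${var//pattern/}: delete every match of the pattern. The backslash keeps
# $# from being expanded as the positional-parameter count.
stripped=${line//\$#%#/}

# Brackets make any trailing space visible.
printf '[%s]\n' "$marker" "$prefix" "$stripped"
```

The pattern-deletion form keeps the space that preceded the marker, which matches the trailing space in the output the question asks for.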


Using sed, you could say:

sed 's/ $#.*//g' testfile.txt

Supplying the -i option to sed makes the change in place:

sed -i 's/ $#.*//g' testfile.txt

Based on your comment, if you want to remove text at a fixed position in each line, using cut might simplify things. Saying:

cut -b1-21,27- testfile.txt

would remove bytes 22-26 (inclusive) from every line of testfile.txt.

answered 2013-10-28T09:19:57.170
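The byte positions are easy to get wrong by one, because cut counts from 1 while bash's `${line:offset:length}` counts from 0. The following sketch, using a made-up sample line with text after the marker, shows which bytes are kept and which are removed.

```shell
#!/bin/sh
line='This is a test line 1 $#%# more text'

# Bytes 22-26 are the space before the marker plus the four marker characters.
kept=$(printf '%s\n' "$line" | cut -b1-21,27-)
removed=$(printf '%s\n' "$line" | cut -b22-26)

# Brackets make the leading space in the removed part visible.
printf '[%s]\n' "$kept" "$removed"
```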
1

Instead of writing the string "$#%#" i want everything except string "$#%#" written to file.

This can be done in place with sed:

sed -i.bak 's/ *\$#%#//g' testfile.txt
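As a quick way to try this safely, the sketch below runs the same command on a throwaway copy in a temporary directory (the directory and sample lines are assumptions for illustration). The `.bak` suffix makes sed keep the original content next to the edited file.

```shell
#!/bin/sh
# Work on a throwaway copy so the sketch does not touch real data.
tmpdir=$(mktemp -d)
printf '%s\n' 'This is a test line 1 $#%#' 'This is a $#%# test line 2' > "$tmpdir/testfile.txt"

# \$ matches a literal dollar sign; " *" also removes any spaces before the marker.
sed -i.bak 's/ *\$#%#//g' "$tmpdir/testfile.txt"

edited=$(cat "$tmpdir/testfile.txt")
backup=$(cat "$tmpdir/testfile.txt.bak")
printf '%s\n' "$edited"
rm -rf "$tmpdir"
```

Because the pattern does not depend on position, the marker is removed wherever it appears in the line.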
answered 2013-10-28T09:30:59.340
1

You were very close:

FilterString=${line:0:22}

Or just filter out the junk:

FilterString=${line% \$#%#}
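Put together with the loop from the question, the suffix-removal form might look like the sketch below. The temporary directory and sample lines are assumptions for illustration; `IFS=` and `-r` are added so that `read` leaves whitespace and backslashes alone.

```shell
#!/bin/bash
# Self-contained sketch: build a small sample input in a temporary directory.
workdir=$(mktemp -d)
printf '%s\n' 'This is a test line 1 $#%#' 'This is a test line 1024 $#%#' > "$workdir/testfile.txt"

# IFS= and -r keep leading whitespace and backslashes intact.
while IFS= read -r line
do
    # ${line% pattern} removes the shortest matching suffix,
    # so the length of the line does not matter.
    printf '%s\n' "${line% \$#%#}"
done < "$workdir/testfile.txt" > "$workdir/testfile2.txt"

output=$(cat "$workdir/testfile2.txt")
printf '%s\n' "$output"
rm -rf "$workdir"
```

Unlike the fixed-offset form `${line:0:22}`, this version only works when the marker is at the end of the line, but it does not care how long the text before it is.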
answered 2013-10-28T09:40:09.703
1

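Try this:

#!/bin/sh

while read -r line
do
#echo $line
# Pass the line as an argument so quotes inside it cannot break the Python code.
FilterString=$(python -c "import sys; s = sys.argv[1]; print(s[:s.find('\$')])" "$line")
echo "$FilterString" >>testfile2.txt
done <testfile.txt

This example works with lines of varying length, as long as each line contains a $. For example, given these file contents: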

...
This is a test line 6 $#%#
This is a test line 1024 $#%#
...

you will get the following result:

This is a test line 6
This is a test line 1024
answered 2013-10-28T09:44:39.207
0

Thanks for all your answers, guys.

I will use a script based on @devnull's answer:

#!/bin/bash
while read line
do
#echo $line
#FilterString=${line:22:26}
echo $line | cut -b1-20,27- >>testfile2.txt
done<testfile.txt

So if the file looks like

testfile.txt
This is a test line 1 $#%# more text
This is a test line 2 $#%# more text
This is a test line 3 $#%# more text
This is a test line 4 $#%# more text
This is a test line 5 $#%# more text
This is a test line 6 $#%# more text
This is a test line 7 $#%# more text

then the output will be:

testfile2.txt
This is a test line  more text
This is a test line  more text
This is a test line  more text
This is a test line  more text
This is a test line  more text
This is a test line  more text
This is a test line  more text

which is exactly what I want.

answered 2013-10-28T10:19:20.527
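Since cut reads files directly, the same result can probably be had without the while loop. A minimal sketch, assuming the same byte positions and a throwaway sample file created just for illustration:

```shell
#!/bin/sh
workdir=$(mktemp -d)
printf '%s\n' 'This is a test line 1 $#%# more text' 'This is a test line 2 $#%# more text' > "$workdir/testfile.txt"

# cut processes every line of the file itself, so no while/read loop is needed.
# Bytes 21-26 (the line number, a space and the marker) are dropped.
cut -b1-20,27- "$workdir/testfile.txt" > "$workdir/testfile2.txt"

result=$(cat "$workdir/testfile2.txt")
printf '%s\n' "$result"
rm -rf "$workdir"
```

This also avoids the unquoted `echo $line`, which would collapse repeated spaces in the input.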