1

I have a file that looks like this:

@FCD17BKACXX:8:1101:2703:2197#0/1
CAGCTTTACTCGTCATTTCCCCCAAGGGTAAAATGCGTCCGTCCATTAAGTTCACAGTCATCGTCT
+FCD17BKACXX:8:1101:2703:2197#0/1
^`^\eggcghheJ`dffhhhffhe`ecd^a^_ceacecfhf\beZegfhh_fghhgfZbdg]c^a`
@FCD17BKACXX:8:1101:4434:2244#0/1
CTGCGTTCATCGCGTTGTTGGGAGGAATCTCTACCCCAGGTTCTCGCTGTGAA
+FCD17BKACXX:8:1101:4434:2244#0/1
eeecgeceeffhhihi_fhhiicdgfghiiihiiihiiihVbcdgfhge`cee
@FCD17BKACXX:8:1101:6394:2107#0/1
CAGCAGGACTAGGGCCTGCAGACGTACTG
+FCD17BKACXX:8:1101:6394:2107#0/1
eeeccggeghhiihiihihihhhhcfghf

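I would like to go through every second line and count its characters. If the line contains fewer than, say, 66 characters, pad it with 'A' up to 66 and print it to a new file. If it already contains 66 characters, print the line as is.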

The output file should look like this:

@FCD17BKACXX:8:1101:2703:2197#0/1
CAGCTTTACTCGTCATTTCCCCCAAGGGTAAAATGCGTCCGTCCATTAAGTTCACAGTCATCGTCT
+FCD17BKACXX:8:1101:2703:2197#0/1
^`^\eggcghheJ`dffhhhffhe`ecd^a^_ceacecfhf\beZegfhh_fghhgfZbdg]c^a`
@FCD17BKACXX:8:1101:4434:2244#0/1
CTGCGTTCATCGCGTTGTTGGGAGGAATCTCTACCCCAGGTTCTCGCTGTGAAAAAAAAAAAAAAA
+FCD17BKACXX:8:1101:4434:2244#0/1
eeecgeceeffhhihi_fhhiicdgfghiiihiiihiiihVbcdgfhge`ceeAAAAAAAAAAAAA
@FCD17BKACXX:8:1101:6394:2107#0/1
CAGCAGGACTAGGGCCTGCAGACGTACTGAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
+FCD17BKACXX:8:1101:6394:2107#0/1
eeeccggeghhiihiihihihhhhcfghfAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA

I have only a very basic knowledge of awk, so for the sake of learning I would like to solve this with awk.

5 Answers

4

One way:

awk '!(NR%2) && length<66{for(i=length;i<66;i++)$0=$0 "A"}1' file
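
For readers who are learning awk, here is a spelled-out sketch of the same idea. `!(NR%2)` selects every second line, and the trailing `1` is an always-true pattern whose default action prints the line. The function name `pad_even_lines` and the optional width argument are assumptions added for illustration, not part of the one-liner above.

```shell
# pad_even_lines FILE [WIDTH]
# Hypothetical helper: pads every second line of FILE with "A" up to WIDTH
# characters (66 if WIDTH is not given) and prints the result to stdout.
pad_even_lines() {
    awk -v width="${2:-66}" '
        # NR % 2 == 0 is true for lines 2, 4, 6, ...
        NR % 2 == 0 {
            # append one "A" at a time until the line is wide enough
            while (length($0) < width)
                $0 = $0 "A"
        }
        # print every line, padded or not
        { print }
    ' "$1"
}
```

It could be called as `pad_even_lines file > newfile`.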
Answered 2013-02-05T12:09:28.370
1
awk 'NR%2 == 0{
    printf("%s", $0)
    for(i=length($0); i<66; i++)printf("A")
    print "";next }
    {print}'
Answered 2013-02-05T12:07:19.577
1

I'll post another (perhaps odd) one-liner:

 awk 'BEGIN{while(++i<66)t=t"A"}!(NR%2){$0=$0substr(t,length)}1' file
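
The trick here is to build the padding string once in `BEGIN` and then cut the needed tail off it with `substr`, instead of looping per line. A commented sketch of that idea follows. It is a slight variant rather than a transcription: it builds a full-width pad and slices from `length($0) + 1`, so an empty line would also receive a full pad. The name `pad_with_substr` and the width argument are assumptions for illustration.

```shell
# pad_with_substr FILE [WIDTH]
# Hypothetical helper: pads every second line of FILE with "A" up to WIDTH
# characters (66 if WIDTH is not given) using one prebuilt pad string.
pad_with_substr() {
    awk -v width="${2:-66}" '
        BEGIN {
            # build a string of exactly width "A" characters, once
            for (i = 1; i <= width; i++)
                pad = pad "A"
        }
        NR % 2 == 0 {
            # substr(pad, n + 1) keeps the last width - n characters of pad;
            # it is empty when the line already has width or more characters
            $0 = $0 substr(pad, length($0) + 1)
        }
        { print }
    ' "$1"
}
```

It could be called as `pad_with_substr file > newfile`.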
Answered 2013-02-05T12:23:48.930
1

This should be faster than the accepted answer:

awk 'NR%2==0 { x = sprintf("%-66s", $0); gsub(/ /,"A",x); $0 = x }1' file

Results:

@FCD17BKACXX:8:1101:2703:2197#0/1
CAGCTTTACTCGTCATTTCCCCCAAGGGTAAAATGCGTCCGTCCATTAAGTTCACAGTCATCGTCT
+FCD17BKACXX:8:1101:2703:2197#0/1
^`^\eggcghheJ`dffhhhffhe`ecd^a^_ceacecfhf\beZegfhh_fghhgfZbdg]c^a`
@FCD17BKACXX:8:1101:4434:2244#0/1
CTGCGTTCATCGCGTTGTTGGGAGGAATCTCTACCCCAGGTTCTCGCTGTGAAAAAAAAAAAAAAA
+FCD17BKACXX:8:1101:4434:2244#0/1
eeecgeceeffhhihi_fhhiicdgfghiiihiiihiiihVbcdgfhge`ceeAAAAAAAAAAAAA
@FCD17BKACXX:8:1101:6394:2107#0/1
CAGCAGGACTAGGGCCTGCAGACGTACTGAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
+FCD17BKACXX:8:1101:6394:2107#0/1
eeeccggeghhiihiihihihhhhcfghfAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
Answered 2013-02-06T07:53:42.487
0
awk -v FS= '{printf "%s",$0} !(NR%2){for (i=NF+1;i<=66;i++) printf "A"} {print ""}'

Or, if you don't like loops:

awk -v FS= '{sfx=(NR%2 ? "" : sprintf("%*s",66-NF,"")); gsub(/ /,"A",sfx); print $0 sfx}'
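
Splitting a record into single characters with an empty `FS` is a common extension that POSIX does not specify, so `NF` may not equal the character count in every awk. A sketch of the same loop-free idea that relies on `length($0)` instead is shown below. Because `gsub` is applied only to the suffix, any spaces already present in the line are left alone. The name `pad_suffix_only` and the width argument are assumptions for illustration.

```shell
# pad_suffix_only FILE [WIDTH]
# Hypothetical helper: pads every second line of FILE with "A" up to WIDTH
# characters (66 if WIDTH is not given) without an explicit padding loop.
pad_suffix_only() {
    awk -v width="${2:-66}" '
        {
            sfx = ""
            n = width - length($0)      # number of characters still missing
            if (NR % 2 == 0 && n > 0) {
                # "%" n "s" becomes e.g. "%13s", which formats "" as 13 spaces
                sfx = sprintf("%" n "s", "")
                # turn the spaces of the suffix (and only the suffix) into "A"
                gsub(/ /, "A", sfx)
            }
            print $0 sfx
        }
    ' "$1"
}
```

It could be called as `pad_suffix_only file > newfile`.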
Answered 2013-02-05T13:48:17.127