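You cannot assign a value to a field without awk rebuilding the record using the value of OFS as the separator. Instead, use a regexp that describes the whole record and replace only the part of the record where the field you care about sits. For example, with GNU awk (in other awks, use match()/substr() and [[:space:]]):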
$ cat foo
foo bar quux # single space, single tab
foo bar quux # single space, double space, triple space
$ awk '{ print gensub(/^(\s*(\S+\s+){1})\S+(.*)/,"\\1blah\\3","") }' foo
foo blah quux # single space, single tab
foo blah quux # single space, double space, triple space
Change the 1 in {1} to match the number of fields that precede the one you want to replace:
$ awk '{ print gensub(/^(\s*(\S+\s+){2})\S+(.*)/,"\\1blah\\3","") }' foo
foo bar blah # single space, single tab
foo bar blah # single space, double space, triple space
$ awk '{ print gensub(/^(\s*(\S+\s+){3})\S+(.*)/,"\\1blah\\3","") }' foo
foo bar quux blah single space, single tab
foo bar quux blah single space, double space, triple space
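For awks that lack gensub(), \s and \S, the same idea can be sketched with match() and substr(), as hinted at above. This is only a rough sketch, not a tested drop-in: the wrapper name replace_field and the variables n, new, head and tail are made up for illustration, and it assumes fields are separated by runs of whitespace.

```shell
# Hypothetical helper: replace field number $1 with the string $2 on stdin,
# leaving all original whitespace untouched. Uses only POSIX awk features.
replace_field() {
    awk -v n="$1" -v new="$2" '
    {
        head = ""
        tail = $0

        # Move any leading whitespace from tail to head.
        if (match(tail, /^[[:space:]]+/)) {
            head = substr(tail, 1, RLENGTH)
            tail = substr(tail, RLENGTH + 1)
        }

        # Move the first n-1 fields, each with its trailing whitespace.
        for (i = 1; i < n; i++) {
            if (!match(tail, /^[^[:space:]]+[[:space:]]+/))
                break
            head = head substr(tail, 1, RLENGTH)
            tail = substr(tail, RLENGTH + 1)
        }

        # Field n now starts tail; swap it out only if it really exists.
        if (i == n && match(tail, /^[^[:space:]]+/))
            tail = new substr(tail, RLENGTH + 1)

        print head tail
    }'
}
```

It would be called as `replace_field 2 blah < foo`. Records with fewer than n fields are printed unchanged.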
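gawk also has a function named patsplit(), which works like split() but, in addition to storing the fields in the resulting array, also stores the whitespace between the fields in a second array. You can then loop over those arrays to reproduce the original spacing, if that is clearer: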
$ awk '{ nf = patsplit($0,fld,/\S+/,sep); fld[2]="blah"; for (i=1;i<=nf;i++) printf "%s%s", sep[i-1], fld[i]; print "" }' foo
foo blah quux # single space, single tab
foo blah quux # single space, double space, triple space
$ awk '{ nf = patsplit($0,fld,/\S+/,sep); fld[3]="blah"; for (i=1;i<=nf;i++) printf "%s%s", sep[i-1], fld[i]; print "" }' foo
foo bar blah # single space, single tab
foo bar blah # single space, double space, triple space
Here is how patsplit() breaks up each record:
$ awk '{ nf = patsplit($0,fld,/\S+/,sep); print "\n" $0; for (i=0;i<=nf;i++) print "<" i ":" fld[i] ":" sep[i] ">" }' foo
foo bar quux # single space, single tab
<0::>
<1:foo: >
<2:bar: >
<3:quux: >
<4:#: >
<5:single: >
<6:space,: >
<7:single: >
<8:tab:>
foo bar quux # single space, double space, triple space
<0:: >
<1:foo: >
<2:bar: >
<3:quux: >
<4:#: >
<5:single: >
<6:space,: >
<7:double: >
<8:space,: >
<9:triple: >
<10:space:>
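patsplit() is gawk-only, but the breakdown shown above (sep[0] before the first field, sep[i] after field i) can be approximated in other awks with a small match() loop. The following is a sketch under that assumption. The names set_field_keep_ws, wsplit, idx and new are invented here and are not part of any awk.

```shell
# Hypothetical helper: set field number $1 to $2 on stdin, rebuilding each
# record from the fields and the original whitespace around them.
set_field_keep_ws() {
    awk -v idx="$1" -v new="$2" '
    # Fills fld[1..n] with the fields of s and sep[0..n] with the whitespace
    # before the first field (sep[0]) and after field i (sep[i]). Returns n.
    function wsplit(s, fld, sep,    n) {
        n = 0
        sep[0] = ""
        if (match(s, /^[[:space:]]+/)) {
            sep[0] = substr(s, 1, RLENGTH)
            s = substr(s, RLENGTH + 1)
        }
        while (match(s, /^[^[:space:]]+/)) {
            fld[++n] = substr(s, 1, RLENGTH)
            s = substr(s, RLENGTH + 1)
            sep[n] = ""
            if (match(s, /^[[:space:]]+/)) {
                sep[n] = substr(s, 1, RLENGTH)
                s = substr(s, RLENGTH + 1)
            }
        }
        return n
    }
    {
        nf = wsplit($0, fld, sep)
        if (idx >= 1 && idx <= nf)
            fld[idx] = new
        out = sep[0]
        for (i = 1; i <= nf; i++)
            out = out fld[i] sep[i]
        print out
    }'
}
```

Unlike the patsplit() loop shown earlier, which prints sep[i-1] before each field and so drops whatever follows the last field, this sketch also keeps trailing whitespace.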