bash - awk - split only by first occurrence

Question

I have a line like:

one:two:three:four:five:six seven:eight

and I want to use awk to get $1 to be one and $2 to be two:three:four:five:six seven:eight

I know I can get this by running sed first, that is, changing the first occurrence of : with sed and then running awk with the new delimiter.

However, replacing the delimiter with a new one would not help me, since I cannot guarantee that the new delimiter is not already somewhere in the text.

I want to know if there is an option to get awk to behave this way.

So something like:

awk -F: '{print $1,$2}'

will print:

one two:three:four:five:six seven:eight

I will also want to do some manipulations on $1 and $2 so I don't want just to substitute the first occurrence of :.

score 28 · Accepted Answer

Without any substitution:

echo "one:two:three:four:five" | awk -F: '{ st = index($0,":");print $1 "  " substr($0,st+1)}'

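The index function finds the first occurrence of ":" in the whole string, so in this case the variable st is set to 4. I then use the substr function to grab the rest of the string starting at position st+1; if no length is supplied, it goes to the end of the string. The output is: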

one  two:three:four:five

If you want to do further processing, you can always store the string in a variable:

rem = substr($0,st+1)

Note that this was tested on Solaris AWK, but I can't see any reason why it shouldn't work on other versions.
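The same index/substr idea can be wrapped so that both halves end up in ordinary variables and lines without any colon are handled explicitly. This is only a minimal sketch: the function name split_first and the variable names head and rest are made up for illustration, and treating a colon-free line as "all head, empty rest" is an assumption about the behaviour you want.

```shell
# Hypothetical helper: split each input line at the first ":" only.
split_first() {
  awk '{
    st = index($0, ":")              # position of the first ":", 0 if there is none
    if (st == 0) {                   # assumption: no ":" means the whole line is the head
      head = $0; rest = ""
    } else {
      head = substr($0, 1, st - 1)   # text before the first ":"
      rest = substr($0, st + 1)      # everything after it, later ":" left untouched
    }
    print "head=" head
    print "rest=" rest
  }'
}

echo "one:two:three:four:five:six seven:eight" | split_first
```

Because head and rest are plain variables, any further manipulation can be done on them without touching $0 or the field separator.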

score 7 · Accepted Answer

Something like this?

echo "one:two:three:four:five:six" | awk '{sub(/:/," ")}1' 
one two:three:four:five:six

This replaces the first : with a space. You can then split it into $1 and $2 afterwards:

echo "one:two:three:four:five:six" | awk '{sub(/:/," ")}1' | awk '{print $1,$2}'
one two:three:four:five:six

Or do it in the same awk, so that even with the substitution you get $1 and $2 the way you want:

echo "one:two:three:four:five:six" | awk '{sub(/:/," ");$1=$1;print $1,$2}'
one two:three:four:five:six

Edit: using a different separator, you can get one in field $1 and the rest in $2 like this:

echo "one:two:three:four:five:six seven:eight" | awk -F\| '{sub(/:/,"|");$1=$1;print "$1="$1 "\n$2="$2}'
$1=one
$2=two:three:four:five:six seven:eight

With a more unique separator:

echo "one:two:three:four:five:six seven:eight" | awk -F"#;#." '{sub(/:/,"#;#.");$1=$1;print "$1="$1 "\n$2="$2}'
$1=one
$2=two:three:four:five:six seven:eight
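If no printable delimiter feels safe, one option is a control character that is unlikely to appear in text input. The sketch below uses the byte \001 purely as an assumption about the data; it lowers the risk of a clash but does not remove it. The function name split_first_ctrl is hypothetical.

```shell
# Hypothetical helper: use the control byte \001 as a temporary field separator.
split_first_ctrl() {
  awk 'BEGIN { FS = "\001" }         # assumption: \001 never occurs in the input
  {
    sub(/:/, "\001")                 # replace only the first ":"; $0 is then re-split on FS
    print "$1=" $1
    print "$2=" $2
  }'
}

echo "one:two:three:four:five:six seven:eight" | split_first_ctrl
```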

score 2 · Accepted Answer

The closest you can get is with GNU awk's FPAT:

$ awk '{print $1}' FPAT='(^[^:]+)|(:.*)' file
one

$ awk '{print $2}' FPAT='(^[^:]+)|(:.*)' file
:two:three:four:five:six seven:eight

However, $2 will include the leading delimiter; you can use substr to fix that:

$ awk '{print substr($2,2)}' FPAT='(^[^:]+)|(:.*)' file
two:three:four:five:six seven:eight

So, putting it all together:

$ awk '{print $1, substr($2,2)}' FPAT='(^[^:]+)|(:.*)' file
one two:three:four:five:six seven:eight

Storing the result of substr back in $2 allows further processing on $2 without the leading delimiter:

$ awk '{$2=substr($2,2); print $1,$2}' FPAT='(^[^:]+)|(:.*)' file
one two:three:four:five:six seven:eight

A solution that should work with mawk 1.3.3:

awk '{n=index($0,":");s=$0;$1=substr(s,1,n-1);$2=substr(s,n+1);print $1}' FS='\0'
one

awk '{n=index($0,":");s=$0;$1=substr(s,1,n-1);$2=substr(s,n+1);print $2}' FS='\0'
two:three:four five:six:seven

awk '{n=index($0,":");s=$0;$1=substr(s,1,n-1);$2=substr(s,n+1);print $1,$2}' FS='\0'
one two:three:four five:six:seven
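The commands above are shown without their input and without comments. As a hedged illustration, here is the same approach wrapped in a function and fed the line from the question through a pipe. The function name split_into_fields and the toupper step are assumptions added only to show a manipulation of $1; lines with no colon at all are not handled.

```shell
# Hypothetical wrapper around the index/substr field assignment shown above.
split_into_fields() {
  awk '{
    n = index($0, ":")           # position of the first ":"
    s = $0                       # keep a copy, since assigning to $1 rebuilds $0
    $1 = substr(s, 1, n - 1)     # text before the first ":"
    $2 = substr(s, n + 1)        # text after the first ":"
    print toupper($1), $2        # illustrative manipulation of $1
  }' FS='\0'
}

echo "one:two:three:four:five:six seven:eight" | split_into_fields
```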

score 0 · Accepted Answer

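Just putting this here as a solution I came up with for a case where I wanted to split off the first two columns on : but keep the rest of the line intact.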

Comments inline.

echo "a:b:c:d::e" | \
  awk '{
    split($0,f,":");           # split $0 into array of fields `f`
    sub(/^([^:]+:){2}/,"",$0); # remove first two "fields" from `$0` 
    print f[1],f[2],$0         # print first two elements of `f` and edited `$0`
  }'

Returns:

a b c:d::e

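In my input I did not have to worry about the first two fields containing an escaped :; if that is a requirement, this solution will not work as expected.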

Amended to match the original requirements:

echo "a:b:c:d::e" | \
  awk '{
    split($0,f,":");
    sub(/^([^:]+:)/,"",$0);
    print f[1],$0
  }'

Returns:

a b:c:d::e
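One edge case worth noting: the pattern [^:]+ needs at least one character before the colon, so a line whose first field is empty (for example ":b:c") is left unchanged by sub. A possible variation, sketched here under the assumption that empty first fields should be allowed, uses [^:]* instead. The function name first_and_rest is hypothetical.

```shell
# Hypothetical variant that also accepts an empty first field.
first_and_rest() {
  awk '{
    split($0, f, ":")        # f[1] is the text before the first ":"
    sub(/^[^:]*:/, "")       # drop the first field and its ":", even if the field is empty
    print f[1], $0           # first field, then the untouched remainder
  }'
}

echo "a:b:c:d::e" | first_and_rest
echo ":b:c" | first_and_rest
```

The first call prints a b:c:d::e as before; the second prints an empty first field followed by b:c.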
