bash - awk/gsub - print everything between double quotes, multiple times per line

Question

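I am trying to print all the data between double quotes (sampleField="sampleValue"), but I cannot get awk and/or sub/gsub to return every instance of data between double quotes. I then want to print all the instances on the line where they were found, to keep the data together.

Here is a sample of the input.txt file: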

deviceId="1300", deviceName="router 13", deviceLocation="Corp"
deviceId="2000", deviceName="router 20", deviceLocation="DC1"

The output I am looking for is:

"1300", "router 13", "Corp"
"2000", "router 20", "DC1"

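I am using gsub to remove everything between a , and an =. Whatever approach I try, it only ever returns the first field and then moves on to the next line.

Update:

I forgot to mention that I do not know how many double-quoted fields each line will contain. It could be 1, 3, or 5,000. I am not sure whether this affects the solution, but I wanted to make sure it is stated.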

score 5 · Accepted Answer

A sed solution:

sed -r 's/[^\"]*([\"][^\"]*[\"][,]?)[^\"]*/\1 /g' \
    <<< 'deviceId="1300", deviceName="router 13", deviceLocation="Corp"'

Output:

"1300", "router 13", "Corp"

Or, for a file:

sed -r 's/[^\"]*([\"][^\"]*[\"][,]?)[^\"]*/\1 /g' input.txt
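For reference, here is a sketch of the same substitution written with -E (accepted by both GNU sed and BSD/macOS sed, whereas -r is GNU-specific), plus a second expression that strips the trailing space the first one leaves at the end of each line. The function name quoted_fields is only an illustrative assumption, not part of the original answer:

```shell
# Keep only the double-quoted values (and the comma after each) on every line.
# The second expression removes the trailing space left by the first one.
quoted_fields() {
    sed -E 's/[^"]*("[^"]*",?)[^"]*/\1 /g; s/ $//' "$@"
}

printf '%s\n' 'deviceId="1300", deviceName="router 13", deviceLocation="Corp"' | quoted_fields
```

Called without arguments it reads standard input; with a file name (for example quoted_fields input.txt) it reads that file.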

score 2 · Accepted Answer

awk -F '"' '{printf("%c%s%c, %c%s%c, %c%s%c\n", 34, $2, 34, 34, $4, 34, 34, $6, 34) }' \
    input.txt > newfile

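is another, simpler approach: it uses the double quote as the field separator. It assumes exactly three quoted fields per line.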
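With a double quote as the field separator, the text outside the quotes lands in the odd-numbered fields and the quoted values land in the even-numbered ones, which is why the command above reads $2, $4 and $6. A small sketch that makes the numbering visible (the function name show_fields is only for illustration):

```shell
# Print every field awk sees when " is the field separator,
# one per line, as index=[content].
show_fields() {
    awk -F'"' '{ for (i = 1; i <= NF; i++) printf "%d=[%s]\n", i, $i }' "$@"
}

printf '%s\n' 'deviceId="1300", deviceName="router 13"' | show_fields
```

The quoted values show up at indexes 2 and 4; the last field is empty because nothing follows the closing quote.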

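awk 'BEGIN { t = sprintf("%c", 34); FS = t }   # t holds a double quote, also used as separator
     { for (i = 2; i <= NF; i += 2) {          # even-numbered fields are the quoted values
         printf("%s%s%s%s", (i > 2 ? ", " : ""), t, $i, t) }
       printf("\n") }' infile > outfile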

is a more general awk approach.

score 1 · Accepted Answer

awk -F \" '
    {
        sep=""
        for (i=2; i<=NF; i+=2) {
            printf "%s\"%s\"", sep, $i
            sep=", "
        }
        print ""
    }
' << END
deviceId="1300", deviceName="router 13", deviceLocation="Corp", foo="bar"
deviceId="2000", deviceName="router 20", deviceLocation="DC1"
END

Output:

"1300", "router 13", "Corp", "bar"
"2000", "router 20", "DC1"

score 1 · Accepted Answer

awk with sub/gsub is probably neither the most direct nor the simplest approach. I like one-liners when they make sense:

(1) In Perl:

$ perl -pe 's/device.*?=//g' input.txt
"1300", "router 13", "Corp"
"2000", "router 20", "DC1"

where
-p wraps the code in a loop that reads each input line and prints it after the code has run
-e executes the code given between the single quotes
s is the substitution operator
g is the "global" modifier: it applies the substitution to every match on the line, not only the first
device.*?= matches any text that starts with the prefix "device" and runs through the nearest "=" sign (.*? is non-greedy); the empty replacement deletes it. "deviceId=", "deviceName=" and "deviceLocation=" all match, so only the quoted values and the separators between them remain. This relies on every field name starting with "device".

(2) In bash, with sed:

$ sed "s/deviceId=//; s/deviceName=//; s/deviceLocation=//" input.txt
"1300", "router 13", "Corp"
"2000", "router 20", "DC1"

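In this case we tell sed to run three substitutions in sequence, replacing "deviceId=", "deviceName=" and "deviceLocation=" in turn with the empty string ''.

Unfortunately, sed (like sub and gsub) has much weaker regular expression support than Perl, which is the gold standard for full regular expression support. In particular, neither sed nor sub/gsub supports the non-greedy quantifier "?", a shortcoming that has made my life considerably more complicated.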
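A common workaround for the missing non-greedy quantifier is a negated character class: [^=]*= can never run past the first "=" sign, so here it behaves like .*?= does in Perl. A sketch for both sed and awk's gsub, assuming that every field name starts with "device" and that no name contains an "=" sign (the function names are illustrative only):

```shell
# sed: delete "device...=" up to the nearest "=" without a non-greedy quantifier.
strip_names_sed() {
    sed 's/device[^=]*=//g' "$@"
}

# awk: the same substitution with gsub, applied to the whole line ($0).
strip_names_awk() {
    awk '{ gsub(/device[^=]*=/, ""); print }' "$@"
}

printf '%s\n' 'deviceId="1300", deviceName="router 13", deviceLocation="Corp"' | strip_names_sed
printf '%s\n' 'deviceId="2000", deviceName="router 20", deviceLocation="DC1"' | strip_names_awk
```

Both functions read standard input when called without arguments and accept file names otherwise.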

score 0 · Accepted Answer

Try this:

awk -F\" '{ a = ""; sep = ""; for (i = 2; i <= NF; i += 2) { a = a sep "\"" $i "\""; sep = ",\t" } print a }' temp.txt

Output:

"1300",  "router 13",     "Corp"
"2000",  "router 20",     "DC1"

score 0 · Accepted Answer

This is late, but one possible simple solution is:

 $ awk -F"=|," '{print $2,$4,$6}' input.txt
"1300" "router 13" "Corp"
"2000" "router 20" "DC1"
