json - Separating output records in AWK without a trailing separator

Question

I have the following records:

31 Stockholm
42 Talin
34 Helsinki
24 Moscow
15 Tokyo

I want to convert them to JSON with AWK. Using this code:

#!/usr/bin/awk
BEGIN {
    print "{";
    FS=" ";
    ORS=",\n";
    OFS=":";
};

{    
    if ( !a[city]++ && NR > 1 ) {
        key = $2;
        value = $1;
        print "\"" key "\"", value;
    }
};

END {
    ORS="\n";
    OFS=" ";
    print "\b\b}";
};

gives me this:

{
"Stockholm":31,
"Talin":42,
"Helsinki":34,
"Moscow":24,
"Tokyo":15, <--- I don't want this comma
}

The problem is the trailing comma on the last data line. It makes the JSON output invalid. How can I get this output instead:

{
"Stockholm":31,
"Talin":42,
"Helsinki":34,
"Moscow":24,
"Tokyo":15
}

score 10 · Accepted Answer

Mind some feedback on your posted script?

#!/usr/bin/awk        # Just be aware that on Solaris this will be old, broken awk which you must never use
BEGIN {
    print "{";        # On this and every other line, the trailing semi-colon is a pointless null-statement, remove all of these.
    FS=" ";           # This is setting FS to the value it already has so remove it.
    ORS=",\n";
    OFS=":";
};

{
    if ( !a[city]++ && NR > 1 ) {      # awk consists of <condition>{<action>} segments so move this condition out to the condition part
                                       # also, you never populate a variable named "city" so `!a[city]++` won't behave sensibly.
        key = $2;
        value = $1;
        print "\"" key "\"", value;
    }
};

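END {
    ORS="\n";                          # keep this one: print appends ORS, so the closing brace needs a plain newline.
    OFS=" ";                           # no need to reset OFS when the script will no longer use it.
    print "\b\b}";                     # why would you want to print a backspace???
};

So your original script should have been written as:

#!/usr/bin/awk -f
BEGIN {
    print "{"
    ORS=",\n"
    OFS=":"
}

!a[city]++ && (NR > 1) {
    key = $2
    value = $1
    print "\"" key "\"", value
}

END {
    ORS="\n"    # print appends ORS, so reset it or the last line comes out as "},"
    print "}"
}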

Here's how I'd really write a script to convert your posted input to your posted output:

$ cat file
31 Stockholm
42 Talin
34 Helsinki
24 Moscow
15 Tokyo
$
$ awk 'BEGIN{print "{"} {printf "%s\"%s\":%s",sep,$2,$1; sep=",\n"} END{print "\n}"}' file
{
"Stockholm":31,
"Talin":42,
"Helsinki":34,
"Moscow":24,
"Tokyo":15
}
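
If the `!a[city]++` test in the question was meant to skip repeated cities, the same separator idiom can be keyed on the city field itself. This is only a minimal sketch, assuming the city is always `$2` and contains nothing that needs JSON escaping; the function name `cities_to_json` and the array name `seen` are illustrative and do not come from the posts above:

```shell
# Hypothetical helper: reads "value city" lines on stdin, writes a JSON object.
cities_to_json() {
    awk '
        BEGIN { print "{" }
        # !seen[$2]++ is true only the first time a given city appears
        !seen[$2]++ {
            printf "%s\"%s\":%s", sep, $2, $1
            sep = ",\n"          # empty before the first record, ",\n" afterwards
        }
        END { print "\n}" }
    '
}

# The repeated Stockholm line is dropped, and no trailing comma is written.
printf '31 Stockholm\n42 Talin\n31 Stockholm\n' | cities_to_json
```

Because the separator is written before each record rather than after it, the last record never needs special handling.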

score 2 · Accepted Answer

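You have a few choices. An easy one is to add the comma for the previous line when you are about to write out a new line:

Set a variable first = 1 in your BEGIN.
When about to print a line, check first. If it is 1, just set it to 0. If it is 0, print a comma and a newline: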
```
if (first) { first = 0; } else { print ","; }
```
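The point of this is to avoid putting an extra comma at the start of the list.
Use printf("%s", ...) instead of print ... so that no newline is written when printing a record.
Add an extra newline before the closing brace, as in: print "\n}";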
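
Put together, the steps above might look like the following. It is a sketch rather than a definitive version: the wrapper name `records_to_json` is made up for illustration, and it assumes two whitespace-separated fields per line with the value first, as in the question.

```shell
# Hypothetical wrapper; "first" is the flag described in the steps above.
records_to_json() {
    awk '
        BEGIN {
            print "{"
            first = 1
        }
        {
            # emit the comma that belongs to the previous record, if there was one
            if (first) { first = 0 } else { print "," }
            # printf adds no newline, so the line stays open for a later comma
            printf("%s", "\"" $2 "\":" $1)
        }
        END { print "\n}" }
    '
}

printf '31 Stockholm\n42 Talin\n' | records_to_json
```

Here `print ","` relies on the default ORS to supply the newline after the comma.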

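Also, note that if you don't care about aesthetics, JSON doesn't actually require newlines between items, etc. You could just output one big line for the whole enchilada.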
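
A single-line variant could be sketched like this, under the same input assumptions as the question; the function name `records_to_json_line` and the variable `sep` are illustrative:

```shell
# Hypothetical helper: writes the whole object on one line.
records_to_json_line() {
    awk '
        BEGIN { printf "{" }
        # sep is empty for the first record and "," for every later one
        { printf "%s\"%s\":%s", sep, $2, $1; sep = "," }
        END { print "}" }
    '
}

printf '31 Stockholm\n42 Talin\n' | records_to_json_line
```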

score 1 · Accepted Answer

You really should use a json parser, but here is how to do it with awk:

BEGIN {
    print "{"    
}
NR==1{
    s= "\""$2"\":"$1
    next
}
{
    s=s",\n\""$2"\":"$1
}
END {
    printf "%s\n%s",s,"}"
}

Output:

{
"Stockholm":31,
"Talin":42,
"Helsinki":34,
"Moscow":24,
"Tokyo":15
}
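
The script above collects the whole output in `s` before printing. For large inputs, a variation is to hold back only the previous record and print it once it is known whether another one follows. This is a different technique from the answer's, shown only as a sketch; the names `stream_to_json` and `prev` are illustrative:

```shell
# Hypothetical helper: keeps one pending record instead of the whole output.
stream_to_json() {
    awk '
        BEGIN { print "{" }
        # a record is pending and another has arrived, so the pending one gets a comma
        NR > 1 { print prev "," }
        { prev = "\"" $2 "\":" $1 }
        END {
            # the last pending record is printed without a comma
            if (NR > 0) print prev
            print "}"
        }
    '
}

printf '31 Stockholm\n42 Talin\n' | stream_to_json
```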

score 0 · Accepted Answer

Why not use a json parser? Don't force awk to do something it wasn't designed to do. Here is a solution using python:

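import json

d = {}
with open("file") as f:
    for line in f:
        # each line is "<value> <city>"
        val, key = line.split()
        d[key] = int(val)

# Python 3: print is a function; indent=0 only inserts newlines
print(json.dumps(d, indent=0))

This outputs (Python 3.7+, where dicts keep insertion order):

{
"Stockholm": 31,
"Talin": 42,
"Helsinki": 34,
"Moscow": 24,
"Tokyo": 15
}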
