c++ - Modify gettext .pot file output to exclude strings that are empty or contain only whitespace

Question

I have a .pot file generated by xgettext from my C++ source code, in the following format:

#: file1.cpp:line
#: file2.cpp:line
msgid "" - empty string

#: file1.cpp:line
#: file2.cpp:line
msgid " \t\n\r" - string contains only spaces

#: file1.cpp:line
#: file2.cpp:line
msgid "real text"

Then I use a command like this:

grep "#: " "$(POT_FILE)" | sed -e 's/^\(#: \)\(.*\)/\2/'

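to get only the file names and line numbers in the output.

The problem is that I don't want the file references for strings that are empty or contain only whitespace.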
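To make the problem concrete, here is a minimal sketch of what the prefix-stripping step does on a small sample. The helper name strip_refs, the file names, and the line numbers are made up for illustration; the input only mimics the format shown above.

```shell
#!/bin/sh
# strip_refs is a hypothetical helper: it reads .pot text on stdin and
# prints every "#: " reference with the prefix removed.
strip_refs() {
    grep '^#: ' | sed -e 's/^#: //'
}

# Sample input modelled on the entries above (names and numbers are made up).
sample='#: file1.cpp:10
msgid ""

#: file2.cpp:20
msgid " \t\n\r"

#: file3.cpp:30
msgid "real text"'

# All three references are printed, including the two that belong to
# empty or whitespace-only strings.
printf '%s\n' "$sample" | strip_refs
```

Because grep and sed look at each line in isolation, nothing ties a reference to the msgid that follows it, which is why the unwanted references are not filtered out.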

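This is complicated because I have to find the line msgid "" (or a similar one) that follows a sequence of #: blablabla lines, and skip all of the preceding lines depending on the content of the string.

Can anyone help with such a command?

Thanks!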

score 0 · Accepted Answer

If I understand correctly, put the following into an executable file:

#!/usr/bin/awk -f

BEGIN { FS = "\"" } # split on double quotes so that $2 is the msgid text

# strip the "#: " prefix and store the "file:line" reference in an array called "a"
/^#: / { sub(/^#: /, "", $0); a[i++] = $0 }

/^msgid/ {
    # print the stored references in input order, but only for a valid msgid
    if( valid_msgid() ) { for( j = 0; j < i; j++ ) print a[j] }
    reset() # clear array a after every msgid encountered
    }

function reset() {
    for( j in a ) { delete a[j] }
    i = 0
    }

# put your validity tests here.
# $2 won't contain the entire string if the msgid text contains double quotes
function valid_msgid(    s) {
    s = $2
    gsub(/ |\\[tnr]/, "", s) # drop spaces and the escapes \t, \n and \r
    return length(s) > 0     # valid only if something other than whitespace is left
    }

If I put the above into a file named awko, make it executable with chmod +x awko, and run awko data.pot, I get the following:

file1.cpp:line
file2.cpp:line

If you replace the "line" placeholders with actual numbers, this matches the last entry in your example.

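One of the tricks is using " as the field separator. If you need to handle msgid lines whose text itself contains ", you will have to use more sophisticated parsing to identify the complete message text.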
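One possible way to do that parsing, as a rough sketch rather than a complete .pot parser: instead of splitting on ", take everything between the first quote and the last quote on the msgid line. The function name pot_refs is made up for illustration, and the sketch assumes each msgid fits on a single line.

```shell
#!/bin/sh
# pot_refs is a hypothetical helper: it prints the "file:line" references
# of every msgid whose text is neither empty nor whitespace-only.
# Assumption: each msgid is written on a single line.
pot_refs() {
    awk '
    # remember each reference with its "#: " prefix removed
    /^#: / { sub(/^#: /, ""); refs[n++] = $0 }

    /^msgid / {
        text = $0
        sub(/^msgid[ \t]*"/, "", text)  # drop the keyword and the opening quote
        sub(/"[^"]*$/, "", text)        # drop the last quote and anything after it
        probe = text
        gsub(/ |\\[tnr]/, "", probe)    # remove spaces and the escapes \t, \n, \r
        if (length(probe) > 0)
            for (k = 0; k < n; k++) print refs[k]
        for (k in refs) delete refs[k]  # start fresh for the next entry
        n = 0
    }
    ' "$@"
}
```

Called as pot_refs data.pot, this should keep an entry such as msgid " \"quoted\"" while still dropping msgid "" and msgid " \t\n\r". Text after the closing quote, like the annotations in the example, is ignored.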

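I don't have access to xgettext, so I don't know whether the " - ..." comments on the bad example lines come from you or from the program. If the xgettext program does output them, the separator could be changed to " - for the test in valid_msgid().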
