python - subprocess.Popen(..).communicate(..) randomly drops data when used with graphviz!

Question

I'm using graphviz's dot to generate some SVG graphs for a web application. I call dot using Popen:

    p = subprocess.Popen(u'/usr/bin/dot -Kfdp -Tsvg', shell=True,\
    stdin=subprocess.PIPE, stdout=subprocess.PIPE)
    str = u'long-unicode-string-i-want-to-convert'
    (stdout,stderr) = p.communicate(str)

What happens is that the dot program throws errors such as:

    Error: not well-formed (invalid token) in line 1
     ... <tr><td cellpadding="4bgcolor="#EEE8AA"> ...
    in label of node n260

This obvious error is definitely not in the input string. In particular, if I save it to str.txt with UTF-8 encoding and run

    /usr/bin/dot -Kfdp -Tsvg < str.txt > myimg.svg

I get the desired output. The only "special" thing about str is that it contains characters such as the Danish øæå.

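Right now I have no idea what to do. The problem may well be in dot, but it certainly seems to be triggered by Popen behaving differently from using < in the shell, and I don't know where to start. Any help or ideas for alternative ways of calling dot (other than writing all the data to a file and calling it on that!) would be much appreciated!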

score 3 · Accepted Answer

It sounds like you should be doing:

    stdout, stderr = p.communicate(str.encode('utf-8'))

(Except, of course, that you shouldn't shadow the builtin str.) The unicode type in Python holds Unicode data, not UTF-8. If you want UTF-8, you need to encode it explicitly.
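As a small illustration of that difference (a sketch, not part of the original code; the variable names are made up), the three Danish characters from the question are three code points, but six bytes once encoded as UTF-8:

```python
# Three Danish characters as Unicode code points.
# Escapes keep this source file ASCII-only; the string is "øæå".
text = u'\u00f8\u00e6\u00e5'

# Explicitly encode to UTF-8 bytes before handing the data to a pipe.
encoded = text.encode('utf-8')

print(len(text))     # 3 code points
print(len(encoded))  # 6 bytes, two per character
```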

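On top of that, there's no reason to use shell=True in that snippet, nor is passing a unicode literal to subprocess.Popen a particularly good idea (it just gets encoded to ASCII). The backslash at the end of the line is also unnecessary: Python knows the line continues because you have an opening parenthesis that hasn't been closed yet. So, use: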

    p = subprocess.Popen(['/usr/bin/dot', '-Kfdp', '-Tsvg'],
        stdin=subprocess.PIPE, stdout=subprocess.PIPE)
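Putting the pieces together, here is a minimal sketch of the whole call. It assumes Python 3, where communicate() expects bytes when the pipes are opened in the default binary mode. The helper names pipe_utf8 and render_svg are hypothetical, and the dot path and flags are simply the ones from the question.

```python
import subprocess


def pipe_utf8(cmd, text):
    """Send `text` to `cmd` on stdin as UTF-8 bytes.

    `cmd` is an argument list, so no shell is involved.
    Returns (stdout, stderr) as raw bytes.
    """
    p = subprocess.Popen(cmd,
                         stdin=subprocess.PIPE,
                         stdout=subprocess.PIPE,
                         stderr=subprocess.PIPE)
    # Encode explicitly instead of relying on any implicit conversion.
    return p.communicate(text.encode('utf-8'))


def render_svg(dot_source, dot_path='/usr/bin/dot'):
    """Hypothetical wrapper: run dot with the flags used in the question."""
    return pipe_utf8([dot_path, '-Kfdp', '-Tsvg'], dot_source)
```

Calling render_svg(graph_source) with the graph description as a text string should then hand dot the same bytes it would read from a UTF-8 encoded file via <. Capturing stderr as well makes any error message from dot available to the caller instead of only appearing on the console.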
