I am using pydot to draw graphs in Python. I want to represent a decision tree such as the following (a1, a2, a3 are attributes, and the two classes are 0 and 1):

       a1>3
      /    \
  a2>10    a3>-7
   /  \     /  \
  1    0   1    0

However, with pydot only two leaves are created, and the tree looks like this (PNG attached):

       a1>3
      /    \
  a2>10    a3>-7
      |  X  |
      1     0

In this simple case the logic still works out, but in larger trees internal nodes that belong to different branches get merged into one.

The simple code I am using is:

import pydot
graph = pydot.Dot(graph_type='graph')
edge = pydot.Edge("a_1>3", "a_2>10")
graph.add_edge(edge)
edge = pydot.Edge("a_1>3", "a_3>-7")
graph.add_edge(edge)
edge = pydot.Edge("a_2>10", "1")
graph.add_edge(edge)
edge = pydot.Edge("a_2>10", "0")
graph.add_edge(edge)
edge = pydot.Edge("a_3>-7", "1")
graph.add_edge(edge)
edge = pydot.Edge("a_3>-7", "0")
graph.add_edge(edge)
graph.write_png('simpleTree.png')

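I also tried creating separate node objects instead of only creating edges and adding them to the graph, but pydot seems to check its node pool for an existing node with the same name rather than creating a new one.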

Any ideas? Thanks!

[Image: the graph created by the code above]

2 Answers

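Your nodes always need a unique name; otherwise you cannot refer to them unambiguously when attaching edges between them. You can, however, give each node a label, which is what is displayed when the graph is rendered.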

So you need to add nodes with unique IDs:

import pydot

graph = pydot.Dot(graph_type='graph')
# Each leaf gets its own unique name; the label is the text that is rendered.
graph.add_node(pydot.Node('literal_0_0', label='0'))
graph.add_node(pydot.Node('literal_0_1', label='0'))
graph.add_node(pydot.Node('literal_1_0', label='1'))
graph.add_node(pydot.Node('literal_1_1', label='1'))

Then add the graph edges that connect those nodes:

edge = pydot.Edge("a_2>10", "literal_0_0")
graph.add_edge(edge)
edge = pydot.Edge("a_2>10", "literal_1_0")
graph.add_edge(edge)
edge = pydot.Edge("a_3>-7", "literal_0_1")
graph.add_edge(edge)
edge = pydot.Edge("a_3>-7", "literal_1_1")
graph.add_edge(edge)

Together with the rest of the edges you defined, this gives:

[Image: the graph with the correct edges]
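
The same idea can be automated so that leaves do not have to be named by hand. The following is a minimal sketch, not part of the original answer: it assumes the tree is given as nested tuples `(label, left, right)` with plain strings as leaves, and the helper names `flatten_tree` and `to_pydot` are hypothetical. Every node gets a generated unique name (`n0`, `n1`, ...), while the label carries the text to display.

```python
from itertools import count


def flatten_tree(tree):
    """Return (nodes, edges) with a unique name for every node of a nested-tuple tree.

    A tree is either a leaf label (str) or a tuple (label, left, right).
    nodes is a list of (name, label); edges is a list of (parent_name, child_name).
    """
    counter = count()
    nodes, edges = [], []

    def visit(subtree):
        name = f"n{next(counter)}"  # unique name, independent of the label
        if isinstance(subtree, tuple):
            label, *children = subtree
        else:
            label, children = subtree, []
        nodes.append((name, label))
        for child in children:
            edges.append((name, visit(child)))
        return name

    visit(tree)
    return nodes, edges


def to_pydot(nodes, edges):
    """Build a pydot graph; pydot is imported here so flatten_tree works without it."""
    import pydot

    graph = pydot.Dot(graph_type="graph")
    for name, label in nodes:
        graph.add_node(pydot.Node(name, label=label))
    for parent, child in edges:
        graph.add_edge(pydot.Edge(parent, child))
    return graph


# Hypothetical input mirroring the tree from the question.
tree = ("a1>3", ("a2>10", "1", "0"), ("a3>-7", "1", "0"))
nodes, edges = flatten_tree(tree)
# to_pydot(nodes, edges).write_png("simpleTree.png")  # requires pydot and Graphviz
```

Because the names come from a counter rather than from the labels, two leaves that both display `1` remain distinct nodes in the graph.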

Answered 2012-10-22T13:55:34.320
The "canonical" answer is to use the uuid module from the standard library, as networkx does here.

This is more robust than using id to create pydot node names that correspond to the nodes in your original graph. An id is only guaranteed to be unique among objects that are alive at the same time, so if (in theory) a node object gets deleted while you are building your pydot graph, its id may be reused by another object. In contrast, the UUID objects created are unique, persistent and independent of the lifespan of the original nodes.

However, for this to happen, something very weird must be going on while you create the pydot graph, which is rather unlikely. The advantage of using id is that you don't need to build and pass around a mapping from original nodes to UUID objects (which you would otherwise need in order to construct the edges consistently after adding the nodes).

One interesting case is nested graphs: two different graphs may contain the same hashable object in networkx (say a), so id can no longer be applied directly to the node alone. In that case id can still be used by combining the (node, graph) pair as: str(id(node)) + str(id(graph)).
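
The trade-off between the two naming schemes can be sketched as follows. This is an illustration only, not the code networkx uses: the helper names `uuid_names` and `id_name` are hypothetical, the original nodes are assumed to be hashable objects, and the underscore separator in `id_name` is an addition to the plain concatenation described above.

```python
import uuid


def uuid_names(nodes):
    """Map each original node to a fresh UUID string to use as its pydot name."""
    return {node: str(uuid.uuid4()) for node in nodes}


def id_name(node, graph):
    """Name derived from id(); only unique while both objects stay alive."""
    # Assumption: a separator is added so the two ids cannot run together.
    return f"{id(node)}_{id(graph)}"


original_nodes = ["a", "b", "c"]
original_edges = [("a", "b"), ("a", "c")]

# UUID approach: the mapping must be kept around to translate the edges.
names = uuid_names(original_nodes)
named_edges = [(names[u], names[v]) for u, v in original_edges]

# id() approach: no mapping is needed, the name is recomputed on demand.
graph_1, graph_2 = object(), object()
same_node_two_graphs = (id_name("a", graph_1), id_name("a", graph_2))
```

With the UUID mapping, both edges leaving `"a"` resolve to the same generated name. With the id-based scheme, the same node `"a"` gets a different name in each of the two graphs, which is the behaviour the nested-graph case needs.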

Answered 2014-09-10T02:13:46.257