What is the GraphLab equivalent of the following NetworkX code?
for nodeset in nx.connected_components(G):
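In GraphLab, I want to get a set of vertex IDs for each connected component.
The component IDs returned by graphlab.graph_analytics.connected_components
come back as an SFrame, so the easiest way to get the IDs for a given component is to filter that SFrame: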
# Make a graph with two components.
import graphlab
G = graphlab.SGraph().add_edges(
    [graphlab.Edge(i, i+1) for i in range(3)])
G = G.add_edges([graphlab.Edge(i, i+1) for i in range(4, 6)])
# Get the connected components.
cc = graphlab.connected_components.create(G)
# Find the vertices for a given component (0, in this example).
nodes = cc.component_id.filter_by(0, 'component_id')
print nodes
+------+--------------+
| __id | component_id |
+------+--------------+
|  5   |      0       |
|  6   |      0       |
|  4   |      0       |
+------+--------------+
[3 rows x 2 columns]
Here is a first cut at porting from NetworkX to GraphLab. However, the iteration appears to be very slow.
cc_out = cc['component_id']
# Collect the distinct component IDs.
id_set = set(cc_out['component_id'])
for item in id_set:
    # Filter the whole SFrame once per component (this is the slow part).
    nodeset = cc_out[cc_out['component_id'] == item]['__id']
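One likely reason the loop above is slow is that it filters the whole table once per component, so the work grows with (number of components) x (number of vertices). A minimal sketch of a single-pass alternative follows. It is plain Python rather than GraphLab: `group_by_component` is a hypothetical helper, and `rows` is made-up data standing in for the rows of cc['component_id'] (the component ID 1 for vertices 0 to 3 is an assumption, since only component 0 is shown above). It assumes that iterating over the SFrame yields one dict per row, which is worth checking against your GraphLab version.

```python
from collections import defaultdict


def group_by_component(rows):
    """Group vertex IDs by component ID in a single pass.

    `rows` is any iterable of dicts with '__id' and 'component_id' keys
    (hypothetical stand-in for iterating over cc['component_id']).
    """
    components = defaultdict(set)
    for row in rows:
        components[row['component_id']].add(row['__id'])
    return dict(components)


# Made-up rows mirroring the two-component example graph.
rows = [
    {'__id': 0, 'component_id': 1},
    {'__id': 1, 'component_id': 1},
    {'__id': 2, 'component_id': 1},
    {'__id': 3, 'component_id': 1},
    {'__id': 4, 'component_id': 0},
    {'__id': 5, 'component_id': 0},
    {'__id': 6, 'component_id': 0},
]

# One node set per component, like the NetworkX loop in the question.
for component_id, nodeset in sorted(group_by_component(rows).items()):
    print("%s: %s" % (component_id, sorted(nodeset)))
```

Each row is visited exactly once, so the cost no longer depends on the number of components. If the table is too large to iterate in Python, an SFrame groupby on 'component_id' with a concatenating aggregator would be the thing to look for in the GraphLab documentation; that route is not shown here because it has not been verified.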