python - How to read a sparse directed graph in scipy

Question

I have a sparse weighted directed graph stored in a file, where each line has the format

from to weight

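I would like to read it into scipy's compressed sparse format (or really any memory-efficient representation) so that I can run simple traversals and graph algorithms on it. However, given a node, I want to be able to quickly list all of its outgoing edges in order of weight without having to sort them every time. Sorting each node's edges once up front is fine, of course.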

Is it possible to do this in scipy or with any other Python package?

score 2 · Accepted Answer

You can load your data with something like this:

import numpy as np
import scipy.sparse as sps

# Each line of data.txt holds three whitespace-separated fields: from to weight
data = np.genfromtxt('data.txt', dtype=[('from', np.intp),
                                        ('to', np.intp),
                                        ('weight', np.float64)])

If you want to store the weights in a sparse matrix graph, where graph[i, j] is the weight of the edge from node i to node j, you can do:

graph = sps.csr_matrix((data['weight'], (data['from'], data['to'])))
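As a quick check of the two steps above, here is a minimal sketch that reads from an in-memory string instead of data.txt. The three edges, the `text` and `n_nodes` names, and the explicit `shape` argument are assumptions made for illustration; passing `shape` simply keeps the matrix square when the highest-numbered node has no outgoing edges.

```python
import io

import numpy as np
import scipy.sparse as sps

# Hypothetical file contents: one "from to weight" triple per line
text = "0 1 0.5\n0 2 0.1\n2 0 0.9\n"

data = np.genfromtxt(io.StringIO(text), dtype=[('from', np.intp),
                                               ('to', np.intp),
                                               ('weight', np.float64)])

# Number of nodes = largest node id seen on either end of an edge, plus one
n_nodes = int(max(data['from'].max(), data['to'].max())) + 1

graph = sps.csr_matrix((data['weight'], (data['from'], data['to'])),
                       shape=(n_nodes, n_nodes))

print(graph.toarray())
```

Note that the `(data, (row, col))` constructor sums the weights of duplicate (from, to) pairs, so a file that lists the same edge twice ends up with a single entry.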

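To get a sorted list of outgoing nodes, I would use a dictionary sorted_out, where sorted_out[j] is an array of the nodes reachable from node j, sorted by edge weight. It's a little hacky and relies on the internals of the CSR sparse matrix format, but you could do: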

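# Random 10x10 CSR matrix standing in for the graph
graph = sps.rand(10, 10, density=0.1, format='csr')
data, indptr, indices = graph.data, graph.indptr, graph.indices
# Rows with at least one stored entry, i.e. nodes that have outgoing edges
non_empty_rows, = np.nonzero(np.diff(indptr))
sorted_out = {}
for j in non_empty_rows:
    # Row j's weights and target nodes occupy indptr[j]:indptr[j+1]
    weight_slice = data[indptr[j]:indptr[j+1]]
    out_slice = indices[indptr[j]:indptr[j+1]]
    # Target nodes of j, ordered by increasing weight
    sorted_out[j] = out_slice[np.argsort(weight_slice)]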

With a simple example:

>>> graph = sps.rand(5, 5, density=0.2, format='csr')
>>> graph.toarray()
array([[ 0.88968871,  0.        ,  0.        ,  0.80773932,  0.        ],
       [ 0.        ,  0.        ,  0.8921645 ,  0.        ,  0.        ],
       [ 0.        ,  0.        ,  0.        ,  0.        ,  0.        ],
       [ 0.18552664,  0.        ,  0.        ,  0.        ,  0.        ],
       [ 0.        ,  0.        ,  0.        ,  0.        ,  0.22945956]])
>>> non_empty_rows
array([0, 1, 3, 4], dtype=int64)
>>> sorted_out
{0: array([3, 0]), 1: array([2]), 3: array([0]), 4: array([4])}
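The same idea can be wrapped in a function that also keeps the weights next to the target nodes, so each lookup returns complete edges. This is only a sketch: the function name `build_sorted_out`, the small dense matrix, and the choice to store a `(targets, weights)` pair per node are assumptions for illustration, not part of scipy.

```python
import numpy as np
import scipy.sparse as sps


def build_sorted_out(graph):
    """Map each node with outgoing edges to (targets, weights), sorted by weight."""
    graph = sps.csr_matrix(graph)  # make sure the CSR arrays are available
    data, indptr, indices = graph.data, graph.indptr, graph.indices
    sorted_out = {}
    # Only visit rows that hold at least one stored entry
    for j in np.nonzero(np.diff(indptr))[0]:
        start, stop = indptr[j], indptr[j + 1]
        # Stable sort keeps column order among equal weights
        order = np.argsort(data[start:stop], kind="stable")
        sorted_out[int(j)] = (indices[start:stop][order],
                              data[start:stop][order])
    return sorted_out


# Hypothetical 4-node graph: entry [i, j] is the weight of edge i -> j
dense = np.array([[0.9, 0.0, 0.0, 0.8],
                  [0.0, 0.0, 0.7, 0.0],
                  [0.0, 0.0, 0.0, 0.0],
                  [0.2, 0.0, 0.0, 0.0]])

sorted_out = build_sorted_out(dense)
targets, weights = sorted_out[0]
print(targets, weights)
```

The sort happens once per node when the dictionary is built, so later lookups are plain dictionary accesses. Nodes without outgoing edges (node 2 here) have no key, so `sorted_out.get(node)` is a safe way to query them.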
