
networkx: How do I preserve node order/labels when converting a graph from networkx to PyTorch Geometric?

Code (run in Google Colab):

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import networkx as nx

import torch
from torch.nn import Linear
import torch.nn.functional as F
torch.__version__

# install pytorch geometric
!pip install torch-scatter torch-sparse torch-cluster torch-spline-conv torch-geometric -f https://data.pyg.org/whl/torch-1.10.0+cpu.html

from torch_geometric.nn import GCNConv
from torch_geometric.utils.convert import to_networkx, from_networkx

# Make the networkx graph
G = nx.Graph()

# Add some cars 
G.add_nodes_from([
      ('Ford', {'y': 0, 'Name': 'Ford'}),
      ('Lexus', {'y': 1, 'Name': 'Lexus'}),
      ('Peugeot', {'y': 2, 'Name': 'Peugeot'}),
      ('Mitsubishi', {'y': 3, 'Name': 'Mitsubishi'}),
      ('Mazda', {'y': 4, 'Name': 'Mazda'}),
])

# Relabel the nodes
remapping = {x[0]: i for i, x in enumerate(G.nodes(data = True))}

G = nx.relabel_nodes(G, remapping, copy=False)

# Add some edges --> A = [(0, 1, 0, 1, 1), (1, 0, 1, 1, 0), (0, 1, 0, 0, 1), (1, 1, 0, 0, 0), (1, 0, 1, 0, 0)] as the adjacency matrix
G.add_edges_from([
                  (0, 1), (0, 3), (0, 4),
                  (1, 2), (1, 3),
                  (2, 1), (2, 4), 
                  (3, 0), (3, 1),
                  (4, 0), (4, 2)
])

# Convert the graph into PyTorch geometric
pyg_graph = from_networkx(G)

pyg_graph.edge_index

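When I print the edge index on the last line of the code, I get a different answer each time I run it. Above all, I want to get the same (correct) answer every time, with each node number from networkx preserved. A typical output looks like this: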

tensor([[0, 0, 1, 1, 1, 2, 2, 3, 3, 4, 4, 4],
        [4, 2, 4, 2, 3, 0, 1, 1, 4, 0, 1, 3]])

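This edge index tensor has the form:

  • The first list contains the node IDs of the source nodes
  • The second list contains the node IDs of the target nodes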

If the node IDs were preserved, node 0 would appear three times in the first (source) list, not just twice.
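The mismatch can be checked mechanically. The following is a minimal sketch, not part of the original post: it compares how often each node appears in the source row of the tensor shown above with the degree implied by the undirected edge list from the question. The variable names are illustrative.

```python
from collections import Counter

# Undirected edges from the question, each listed once.
undirected_edges = [(0, 1), (0, 3), (0, 4), (1, 2), (1, 3), (2, 4)]

# Source row of the edge_index tensor printed above.
observed_sources = [0, 0, 1, 1, 1, 2, 2, 3, 3, 4, 4, 4]

# In an undirected graph stored in both directions, each node should
# appear as a source once per incident edge, i.e. as often as its degree.
expected = Counter()
for u, v in undirected_edges:
    expected[u] += 1
    expected[v] += 1

observed = Counter(observed_sources)
mismatched = sorted(n for n in expected if expected[n] != observed[n])
print(mismatched)  # nodes whose IDs were not preserved
```

For this particular output, nodes 0 and 4 are the ones whose counts disagree, which is consistent with their labels having been swapped during relabeling.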

Is there any way to force PyTorch Geometric to carry over the node IDs?

Thanks

[Edit] One possible workaround is the following code, which builds the edge index and edge weight tensors for PyTorch Geometric by hand:

# Create a dictionary of the mappings from company --> node id
mapping_dict = {x: i for i, x in enumerate(list(G.nodes()))}

# Get the number of nodes
num_nodes = len(mapping_dict)

# Now create a source, target, and edge list for PyTorch geometric graph
edge_source_list = []
edge_target_list = []
edge_weight_list = []

# iterate through all the edges
for e in G.edges():
  # first element of tuple is appended to source edge list
  edge_source_list.append(mapping_dict[e[0]])

  # last element of tuple is appended to target edge list
  edge_target_list.append(mapping_dict[e[1]])

  # add the edge weight to the edge weight list
  edge_weight_list.append(1) 


# now create full edge lists for pytorch geometric - undirected edges need to be defined in both directions

full_source_list = edge_source_list + edge_target_list      # full source list
full_target_list = edge_target_list + edge_source_list      # full target list
full_weight_list = edge_weight_list + edge_weight_list      # full edge weight list

print(len(edge_source_list), len(edge_target_list), len(full_source_list))

# now convert these to torch tensors
edge_index_tensor = torch.LongTensor( np.concatenate([ [np.array(full_source_list)], [np.array(full_target_list)]] ))
edge_weight_tensor = torch.FloatTensor(np.array(full_weight_list))
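The same idea can be written more compactly. This is a hedged sketch in plain Python rather than the poster's code: `build_edge_index` is a hypothetical helper name, the car names are only illustrative, and it interleaves the two directions of each edge instead of concatenating them, which describes the same graph.

```python
def build_edge_index(nodes, edges):
    """Return ([sources, targets], weights) with every undirected edge
    stored in both directions. Node IDs follow the order of `nodes`."""
    index_of = {node: i for i, node in enumerate(nodes)}  # insertion-ordered
    sources, targets = [], []
    for u, v in edges:
        # forward direction, then reverse direction
        sources += [index_of[u], index_of[v]]
        targets += [index_of[v], index_of[u]]
    weights = [1.0] * len(sources)  # unit weight for every directed edge
    return [sources, targets], weights


nodes = ["Ford", "Lexus", "Peugeot", "Mitsubishi", "Mazda"]
edges = [
    ("Ford", "Lexus"), ("Ford", "Mitsubishi"), ("Ford", "Mazda"),
    ("Lexus", "Peugeot"), ("Lexus", "Mitsubishi"),
    ("Peugeot", "Mazda"),
]
edge_index, edge_weight = build_edge_index(nodes, edges)
print(edge_index)
# The nested lists can then be converted with, for example,
# torch.tensor(edge_index, dtype=torch.long).
```

Because the IDs come from an explicit list rather than from the graph's internal node order, the result does not depend on how networkx iterates over nodes.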

1 Answer


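This seems to have been resolved in the comments (the solution proposed by @Sparky05 is to use copy=True, which is the default for nx.relabel_nodes), but here is an explanation of why the node order changes.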
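As a minimal sketch of that fix (assuming networkx is installed; the graph is a reduced version of the one in the question, and the variable names are illustrative), relabeling with copy=True returns a new graph whose nodes are in the same order as the original:

```python
import networkx as nx

G = nx.Graph()
G.add_nodes_from([
    ("Ford", {"y": 0}),
    ("Lexus", {"y": 1}),
    ("Peugeot", {"y": 2}),
    ("Mitsubishi", {"y": 3}),
    ("Mazda", {"y": 4}),
])

# Map each name to its position in the graph's node order.
remapping = {name: i for i, name in enumerate(G.nodes())}

# copy=True (the default) builds a new graph by walking G's nodes in
# order, so the relabeled nodes keep that order.
H = nx.relabel_nodes(G, remapping, copy=True)
H.add_edges_from([(0, 1), (0, 3), (0, 4), (1, 2), (1, 3), (2, 4)])

node_order = list(H.nodes())
labels = [H.nodes[n]["y"] for n in node_order]
print(node_order, labels)
```

Here each node's integer ID matches its `y` attribute, so the two printed lists should be identical.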

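When copy=False is passed, nx.relabel_nodes re-adds the nodes to the graph in the order in which they appear in the set built from the keys of the remapping dict. The relevant lines in the networkx code are: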

def _relabel_inplace(G, mapping):
    old_labels = set(mapping.keys())
    new_labels = set(mapping.values())
    if len(old_labels & new_labels) > 0:
        # code for overlapping label sets omitted
        ...
    else:
        # non-overlapping label sets
        nodes = old_labels

    # lines omitted
    for old in nodes:  # iterates in set order, not insertion order
        ...

Using a set changes the order of the nodes, so to preserve the order, the branch for non-overlapping label sets should instead read:

    else:
        # non-overlapping label sets
        nodes = mapping.keys()
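The difference between the two iteration orders can be seen without networkx. The sketch below is illustrative only: a dict keeps insertion order (guaranteed since Python 3.7), while the order of a set of strings is arbitrary and can change from one interpreter run to the next, because string hashes are randomized per process unless PYTHONHASHSEED is fixed. That would also explain why the question sees a different edge index on each run.

```python
names = ["Ford", "Lexus", "Peugeot", "Mitsubishi", "Mazda"]
mapping = {name: i for i, name in enumerate(names)}

# Dict keys: always the insertion order.
dict_order = list(mapping.keys())

# Set of the same keys: same elements, but no guaranteed order.
set_order = list(set(mapping.keys()))

print("dict keys:", dict_order)
print("set      :", set_order)
```

Running the script several times should always print the same dict order, while the set order may vary between runs.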

A related PR was submitted here.

answered 2022-01-14T05:05:58.553