I have the following dataset:
firm_id firm_id_
1 2
1 4
1 5
2 1
2 3
3 2
3 6
4 1
4 5
4 6
5 4
5 7
6 3
...
This data says, for example, that firm_id = 1 is directly connected to firm_id = 2, 4, and 5 and indirectly connected (two edges away) to firm_id = 3, 6, and 7. I can use a Python package like networkx to build the network of firm connectivity. Now, I want to use Spectral Clustering (I guess this is the correct methodology) to form clusters based on distance (the number of edges separating each pair of firms) and see how these clusters are connected to each other.
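For reference, a minimal sketch of building the network and checking the direct and indirect connections of firm 1. It assumes the connections are undirected and uses only the rows shown above (the elided rows are left out); the variable names are illustrative.

```python
import networkx as nx

# Edge list copied from the rows shown above; the elided rows ("...") are omitted.
edges = [(1, 2), (1, 4), (1, 5), (2, 1), (2, 3), (3, 2), (3, 6),
         (4, 1), (4, 5), (4, 6), (5, 4), (5, 7), (6, 3)]

# Assumption: connections are symmetric, so an undirected graph is used.
# Duplicate pairs such as (1, 2) and (2, 1) collapse into a single edge.
G = nx.Graph()
G.add_edges_from(edges)

# Number of edges on the shortest path from firm 1 to every reachable firm.
dist_from_1 = nx.single_source_shortest_path_length(G, 1)

direct = sorted(n for n, d in dist_from_1.items() if d == 1)
indirect = sorted(n for n, d in dist_from_1.items() if d == 2)

print("direct:", direct)
print("indirect:", indirect)
```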
I would first define an adjacency matrix W from the data above. I would then apply a weighting function of dist and c to each element of W, where dist is the distance between firm i and firm j and c is a scale parameter, and then compute the Laplacian matrix (see here for example).
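A minimal sketch of these steps, under several assumptions that are not part of the original description: a Gaussian-style weight exp(-dist^2 / c) stands in for the weighting function, c = 2.0 and n_clusters = 2 are arbitrary illustrative values, the graph is undirected, and only the rows shown above are used.

```python
import networkx as nx
import numpy as np
from sklearn.cluster import SpectralClustering

# Edge list copied from the rows shown above; the elided rows are omitted.
edges = [(1, 2), (1, 4), (1, 5), (2, 1), (2, 3), (3, 2), (3, 6),
         (4, 1), (4, 5), (4, 6), (5, 4), (5, 7), (6, 3)]
G = nx.Graph(edges)  # assumption: undirected connections
nodes = sorted(G.nodes())

# Pairwise distance = number of edges on the shortest path between two firms.
# This sample graph is connected, so every pair has a finite distance.
lengths = dict(nx.all_pairs_shortest_path_length(G))
dist = np.array([[lengths[u][v] for v in nodes] for u in nodes], dtype=float)

# Assumed weighting (illustrative only): Gaussian-style kernel with scale c.
c = 2.0
W = np.exp(-dist ** 2 / c)

# Unnormalized graph Laplacian L = D - W, where D is the diagonal degree matrix.
D = np.diag(W.sum(axis=1))
L = D - W

# scikit-learn accepts a ready-made affinity matrix via affinity="precomputed".
model = SpectralClustering(n_clusters=2, affinity="precomputed", random_state=0)
labels = model.fit_predict(W)

for firm, label in zip(nodes, labels):
    print(firm, label)
```

Note that SpectralClustering builds its own Laplacian from W internally; L is computed here only to make the step explicit.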
Now my question is: can Spectral Clustering give me the links between clusters and tell me how far apart the clusters are (how many edges separate them)? I was thinking of using the scikit-learn package in Python, but I have no idea how to generate the links between clusters using sklearn.cluster.