I am having trouble converting a DataFrame into a GraphFrame using the data below. Consider a DataFrame with an Authors column containing an array of strings, like this:
+-----------+------------------------------------+
|ArticlePMID|Authors                             |
+-----------+------------------------------------+
|PMID1      |['Author 1', 'Author 2', 'Author 3']|
|PMID2      |['Author 4', 'Author 5']            |
+-----------+------------------------------------+
In this table, each row lists the authors who collaborated on the same paper. I want to expand the second column into a new DataFrame with the following structure:
+---------------+---------------+
| Collaborator1 | Collaborator2 |
+---------------+---------------+
| 'Author 1'    | 'Author 2'    |
| 'Author 1'    | 'Author 3'    |
| 'Author 2'    | 'Author 3'    |
| 'Author 4'    | 'Author 5'    |
+---------------+---------------+
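To make the desired output precise, here is a minimal plain-Python sketch of the transformation I am after. It is not Spark code: the `rows` list and the `collaborator_pairs` helper are hypothetical names used only for illustration, and I am assuming that each pair should appear once, in the order the authors are listed.

```python
from itertools import combinations

# Illustrative rows mirroring the first table (hypothetical variable name).
rows = [
    ("PMID1", ["Author 1", "Author 2", "Author 3"]),
    ("PMID2", ["Author 4", "Author 5"]),
]


def collaborator_pairs(authors):
    """Return every unordered pair of co-authors, preserving list order."""
    return list(combinations(authors, 2))


# Flatten the per-article pairs into one list of (Collaborator1, Collaborator2).
pairs = [pair for _, authors in rows for pair in collaborator_pairs(authors)]

for collaborator1, collaborator2 in pairs:
    print(collaborator1, "|", collaborator2)
```

What I need is the equivalent of this, applied to the array column of a Spark DataFrame.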
I tried the explode function, but it only expands the array into a single column of authors, so I lose the collaboration pairs.
Can someone please tell me how to work around this?