I'm doing LDA topic modeling in Python. Here is my visualization code:

import pyLDAvis.gensim
pyLDAvis.enable_notebook()
vis = pyLDAvis.gensim.prepare(lda_model, corpus, dictionary=lda_model.id2word)
vis

I'm looking for a way to export the intertopic distance map to PDF, or at least to plot it with matplotlib and save that as a PDF. Any ideas?

1 Answer

You can export the prepared visualization data in JSON format and then plot it with matplotlib:

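# Export results in JSON format

import pyLDAvis
import pyLDAvis.gensim  # named pyLDAvis.gensim_models in newer pyLDAvis releases

# lda_model and corpus are the trained model and corpus from the question
vis = pyLDAvis.gensim.prepare(lda_model, corpus, dictionary=lda_model.id2word)
pyLDAvis.save_json(vis, '/results/lda.json')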

# Read JSON file

import json

with open('/results/lda.json', 'r') as myfile:
    json_data = json.load(myfile)


# Plot with matplotlib

import matplotlib.pyplot as plt

mds = json_data['mdsDat']
x_range = max(mds['x']) - min(mds['x'])
y_range = max(mds['y']) - min(mds['y'])

# Pad each axis by its full range so circles near the edge are not clipped
plt.axis([min(mds['x']) - x_range, max(mds['x']) + x_range,
          min(mds['y']) - y_range, max(mds['y']) + y_range])
# Equal scaling keeps the circles round instead of stretched into ellipses
plt.gca().set_aspect('equal', adjustable='box')

# Depending on the number of topics, you may need to tweak the parameters (e.g. make the circle radius Freq/100 or Freq/200, etc.)

for i in range(len(mds['x'])):
    circle = plt.Circle((mds['x'][i], mds['y'][i]), radius=mds['Freq'][i] / 100)
    plt.gca().add_artist(circle)

# Save before plt.show(); the output path is only an example
plt.savefig('/results/lda.pdf')
plt.show()
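
One possible way to avoid hand-tuning the Freq/100 divisor is to derive the radii from the data. The following is only a minimal sketch, assuming you want circle area (rather than radius) to track topic prevalence, which is how the interactive view appears to size its circles. The helper name `circle_radii`, the `max_radius` parameter, the 0.15 factor, and the sample values are all illustrative and not part of pyLDAvis.

```python
import math


def circle_radii(freqs, max_radius):
    """Return one radius per topic so that circle area is proportional to Freq.

    freqs: topic prevalences, e.g. json_data['mdsDat']['Freq'] (all >= 0).
    max_radius: radius, in data units, given to the most prevalent topic.
    """
    largest = max(freqs)
    if largest <= 0:
        raise ValueError("at least one frequency must be positive")
    # Area grows with r**2, so the radius scales with the square root
    # of each topic's frequency relative to the largest one.
    return [max_radius * math.sqrt(f / largest) for f in freqs]


# Hypothetical values standing in for json_data['mdsDat']
mds = {"x": [-0.2, 0.1, 0.3], "y": [0.05, -0.15, 0.2], "Freq": [50.0, 37.5, 12.5]}

# Tie the largest circle to a fraction of the x-axis span (0.15 is arbitrary)
span = max(mds["x"]) - min(mds["x"])
radii = circle_radii(mds["Freq"], max_radius=0.15 * span)
```

Each value in `radii` could then be passed as `radius` to `plt.Circle` in place of `Freq[i]/100`.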
answered 2020-11-14T19:25:56.333