python-3.x - How to add an entire DataFrame row as a scatter plot annotation

Question

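I have plotted two columns of a Pandas DataFrame on a scatter plot, and I want each point to show all of the values in its DataFrame row. I have looked at this post and tried to do something similar with mplcursors: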

import pandas as pd
from datetime import date, datetime, time, timedelta
import numpy as np
import matplotlib.pyplot as plt
from mplcursors import cursor

df = pd.DataFrame()
df['datetime'] = pd.date_range(start='2016-01-01', end='2016-01-14', freq='30T')
#df = df.set_index('datetime')
df['x1'] = np.random.randint(-30, 30, size=len(df))
df['x2'] = np.random.randint(-30, 20, size=len(df))
df['x3'] = np.random.randint(-20, 30, size=len(df))
df['y1'] = np.random.randint(-100, 100, size=len(df))
df['y2'] = np.random.randint(-300, 200, size=len(df))
df['y3'] = np.random.randint(-200, 300, size=len(df))

def conditions(s):
    if (s['y1'] > 20) or (s['y3'] < 0):
        return 'group1'
    elif (s['x3'] < 20):
        return 'group2'
    elif (s['x2'] == 0):
        return 'group3'
    else:
        return 'group4'

df['category'] = df.apply(conditions, axis=1)

fig = plt.figure(figsize=(12,4))

ax1 = plt.subplot(121)
ax1.scatter(df.x1, df.y1, label='test1')
ax1.scatter(df.x2, df.y2, label='test2')
#cursor(hover=True)
ax1.set_xlabel('test1')
ax1.set_ylabel('test2')
ax1.legend(['test1','test2'])
cr1 = cursor(ax1,hover=True)
#ax1.annotation_names = df.columns.tolist()
cr1.connect("add", lambda x: x.annotation.set_text(df.columns.tolist()[x.target.index]))

ax2 = plt.subplot(122)
ax2.scatter(df.x1, df.y1, label='test1')
ax2.scatter(df.x3, df.y3, label='test3')
ax2.set_xlabel('test1')
ax2.set_ylabel('test3')
ax2.legend(['test1','test3'])
cr2 = cursor(ax2,hover=True)
#ax2.annotation_names = df.columns.tolist()
cr2.connect("add", lambda x: x.annotation.set_text(df.columns.tolist()[x.target.index]))

# save figure
import pickle
pickle.dump(fig, open('FigureObject.fig.pickle', 'wb'))
plt.show()

When I hover over a point, I want to see a label containing (for example):

datetime = 2016-01-01 00:00:00 
x1 = 1 
x2 = -4 
x3 = 22 
y1 = -42 
y2 = -219 
y3 = -158    
category = group1

But I get this type of error:

cr2.connect("add", lambda x: x.annotation.set_text(df.columns.tolist()[x.target.index]))
IndexError: list index out of range

How can I fix it?

score 3 · Accepted Answer

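The IndexError occurs because of df.columns.tolist()[x.target.index].
- df.columns.tolist() is a list of the 8 column names, which is then indexed by [x.target.index]. x.target.index is the position of the hovered point within the scatter, so for any point past the eighth it is out of range for that list.
- df.iloc[x.target.index, :].to_dict() gets the row data for the hovered point as a dict.
- A list comprehension creates one string per key-value pair.
- '\n'.join(...) joins them into a single string, with the columns separated by \n.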

cr1.connect("add", lambda x: x.annotation.set_text('\n'.join([f'{k}: {v}' for k, v in df.iloc[x.target.index, :].to_dict().items()])))
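
The steps above can be checked without opening a figure. The following is a minimal sketch on a small stand-in DataFrame (its values and the name point_index are made up for illustration; point_index plays the role of x.target.index). It shows that indexing the column list with a point index fails, while indexing the rows yields the text for the annotation:

```python
import pandas as pd

# Small stand-in for the question's DataFrame (values are invented).
df = pd.DataFrame({
    "x1": [1, 5, -3, 7, 0, 2, 9, -8, 4, 6],
    "y1": [-42, 10, 33, -7, 0, 15, 8, 21, -5, 3],
    "category": ["group1"] * 10,
})

columns = df.columns.tolist()  # only 3 column names here
point_index = 5                # hypothetical value of x.target.index (6th point)

# Indexing the column list with a point index fails as soon as the
# index reaches the number of columns.
try:
    columns[point_index]
    failed = False
except IndexError:
    failed = True

# Indexing the rows instead returns the full row for that point.
row = df.iloc[point_index, :].to_dict()
label = "\n".join(f"{k}: {v}" for k, v in row.items())

print(failed)
print(label)
```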

Alternatively, use .to_string()

cr1.connect("add", lambda x: x.annotation.set_text(df.iloc[x.target.index, :].to_string()))
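
Since the same callback is connected to both cursors, the formatting could be pulled into a small helper. This is only a sketch: row_label, its parameters, and the demo DataFrame are hypothetical names introduced here, not part of mplcursors or pandas. It uses "name = value" lines to match the label requested in the question, and can optionally fall back to .to_string(), whose column alignment relies on padding with spaces and so looks best in a monospace font.

```python
import pandas as pd


def row_label(frame: pd.DataFrame, position: int, use_to_string: bool = False) -> str:
    """Build annotation text for the row at an integer position (hypothetical helper)."""
    row = frame.iloc[position]
    if use_to_string:
        # pandas pads names and values so they line up in columns.
        return row.to_string()
    # One "name = value" line per column, as in the question's desired label.
    return "\n".join(f"{name} = {value}" for name, value in row.items())


# Tiny demo frame with invented values.
demo = pd.DataFrame({"x1": [1, 2], "y1": [-42, 7], "category": ["group1", "group2"]})

print(row_label(demo, 0))
print(row_label(demo, 0, use_to_string=True))

# Hooking it up (not run here; needs df, cr1 and cr2 from the question):
# cr1.connect("add", lambda sel: sel.annotation.set_text(row_label(df, sel.target.index)))
# cr2.connect("add", lambda sel: sel.annotation.set_text(row_label(df, sel.target.index)))
```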
