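I'm trying to draw several subplots to analyze the average call duration per date, per agent. I read this information from a SQL table and load it into a pandas DataFrame. Not all agents have the same number of days, or even the same dates, so sharex=True doesn't make sense.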

I came up with this:

    import pandas as pd
    import matplotlib.pyplot as plt

    # Average call duration per agent, per day (a Series indexed by agent_id, call_date)
    df2 = df.groupby(['agent_id', 'call_date'])['duration_minutes'].mean()

    # One subplot row per agent
    group_len = len(df2.groupby('agent_id'))

    # Set up subplots; squeeze=False keeps axs two-dimensional even with a single agent
    fig, axs = plt.subplots(group_len, 1, sharex=False, sharey=True, squeeze=False)

    for i, (agent_id, _) in enumerate(df2.groupby('agent_id')):
        ax = axs[i, 0]
        # df2[agent_id] is a Series indexed by call_date, so x and y are implicit
        df2[agent_id].plot(kind='line', legend=False, ax=ax, marker='*')
        ax.tick_params(axis='both', which='both', labelsize=7)
        ax.legend(['Agent Id: ' + str(agent_id)])
        #ax.set_title('Agent Id: ' + str(agent_id), fontsize=8)
        ax.set_xlabel('Day')
        #ax.set_ylabel('Agent Id: ' + str(agent_id), fontsize=8)

    plt.suptitle('Avg call duration per day, per agent', verticalalignment='bottom', fontsize=12)
    plt.tight_layout()
    plt.show()

This produces something like the following: (image: subplots_output)

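I'd like to improve this output. I've tried many things, without much luck. I want to keep working with pandas DataFrames, so I narrowed my search down to Cufflinks, which is what I'm using now. I came up with the solution below, but if possible I'd like each plot to have legend=agent_id and its own single color.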

    import pandas as pd
    # Cufflinks is a third-party wrapper library around Plotly, inspired by the pandas .plot() API.
    import cufflinks as cf
    from plotly.offline import iplot, plot

    # 1) Load the query result into a DataFrame
    df = pd.DataFrame(SQL_Query, columns=['id', 'agent_id', 'duration_minutes', 'call_date', 'inbound'])
    # 2) Get the avg of duration per agent, per day
    df2 = df.groupby(['agent_id', 'call_date'])['duration_minutes'].mean()
    # One subplot row per agent
    group_len = len(df2.groupby('agent_id'))

    fig_array = []
    # groupby yields (agent_id, sub-Series) pairs
    for agent_id, var in df2.groupby('agent_id'):
        fig = var.reset_index().iplot(theme='pearl', asFigure=True,
                                      x='call_date', y='duration_minutes',
                                      kind='line',
                                      xTitle='', yTitle='Duration (min)',
                                      title=str(agent_id),
                                      world_readable=True)
        fig.update_layout(showlegend=False)
        fig.update_traces(texttemplate='%{y:.2f}',
                          hovertemplate='<b>Day: </b>%{x} <br><b>Avg duration(min): </b>%{y}')
        fig_array.append(fig)

    fig = cf.subplots(fig_array, shape=(group_len, 1))
    #iplot(fig)
    plot(fig, filename='avg_duration_per_day_per_agent.html')

The CSV file (I actually read from a SQL table, but the data is the same) looks like this:

    id,agentid,duration,date,inbound
    1,3,10.52,2019/05/01,true
    2,1,12.93,2019/04/06,false
    3,2,10.32,2019/06/14,true
    4,3,8.84,2019/06/13,false
    5,3,13.43,2019/05/06,false
    6,3,4.78,2019/05/04,false
    7,1,9.21,2019/06/21,true
    8,5,9,2019/05/26,true
    9,5,12.49,2019/06/04,true
    10,3,3.68,2019/05/05,false
    11,2,6.06,2019/06/22,false
    12,4,7.66,2019/06/20,false
    13,2,6.17,2019/06/15,true
    14,4,13.6,2019/06/26,true
    ...
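
For anyone who wants to reproduce this without the SQL table, here is a minimal sketch that rebuilds the grouped Series from the 14 sample rows above. It assumes the CSV headers (agentid, duration, date) map onto the DataFrame column names used in my code (agent_id, duration_minutes, call_date); the `SAMPLE_CSV` name is only for illustration, and the real table has more rows than shown.

```python
import io

import pandas as pd

# The 14 sample rows shown above; the elided rows ("...") are not included.
SAMPLE_CSV = """id,agentid,duration,date,inbound
1,3,10.52,2019/05/01,true
2,1,12.93,2019/04/06,false
3,2,10.32,2019/06/14,true
4,3,8.84,2019/06/13,false
5,3,13.43,2019/05/06,false
6,3,4.78,2019/05/04,false
7,1,9.21,2019/06/21,true
8,5,9,2019/05/26,true
9,5,12.49,2019/06/04,true
10,3,3.68,2019/05/05,false
11,2,6.06,2019/06/22,false
12,4,7.66,2019/06/20,false
13,2,6.17,2019/06/15,true
14,4,13.6,2019/06/26,true
"""

# Assumed mapping from the CSV headers to the column names used in the plotting code.
df = pd.read_csv(io.StringIO(SAMPLE_CSV), parse_dates=["date"]).rename(
    columns={"agentid": "agent_id", "duration": "duration_minutes", "date": "call_date"}
)

# Average call duration per agent, per day: a Series with a two-level
# (agent_id, call_date) index.
df2 = df.groupby(["agent_id", "call_date"])["duration_minutes"].mean()

# Number of distinct days per agent; this differs between agents,
# which is why sharing the x axis is not useful here.
days_per_agent = df2.groupby(level="agent_id").size()
print(days_per_agent)
```

With this `df` in place, both plotting snippets can be run as they are, except for the `pd.DataFrame(SQL_Query, ...)` line.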

(image: avg_in_cufflinks)

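I'd like to show a more visual, better-looking chart, but I'm stuck customizing these multi-plot figures, because I can't control the subplots' legends, titles and so on, even though I can do all of that without trouble for a single chart. How can I set legend=agent_id and one overall title, plus a title for each x axis and y axis? Nothing I've tried works.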
