I'm building a simple scatter plot (life expectancy vs. GDP per capita) that reads its data from an xls file. Here's the code:

import pandas as pd
import matplotlib.pyplot as plt
import matplotlib.cm as cm

# read the first sheet of the spreadsheet (sheet_name=0)
data = pd.read_excel('sample.xls', sheet_name=0)
data.head()

plt.scatter(x = data['LifeExpec'],
        y = data['GDPperCapita'],
        s = data['PopX1000'],
        c = data['PopX1000'],
        cmap=cm.viridis,
        edgecolors = 'none',
        alpha = 0.7)

for state in range(len(data['State'])):
    plt.text(x = data['LifeExpec'][state],
         y = data['GDPperCapita'][state],
         s = data['State'][state],
         fontsize = 14)

plt.colorbar()
plt.show()

The xls file: [image]

The plot: [image]

Now I want to add data from other years to this xls file and animate the bubbles so that they move and change size according to the GDP and population numbers of each year. In a naive attempt to do so, I changed the code to this:

import pandas as pd
import matplotlib.pyplot as plt
import matplotlib.cm as cm
import mplcursors
from matplotlib.animation import FuncAnimation

data = pd.read_excel('sample.xls', sheet_name=0)
data.head()
uniqueYears = data['Year'].unique()

fig, ax = plt.subplots()

def animate(i):
    for i in uniqueYears:
        ax.scatter(x = data['lifeExpec'],
            y = data['GDPperCapita'],
            s = data['PopX1000']/4,
            c = data['Region'].astype('category').cat.codes,
            cmap=cm.viridis,
            edgecolors = 'none',
            alpha = 0.7)

anim = FuncAnimation(fig, animate)

for state in range(len(data['State'])):
    plt.text(x = data['lifeExpec'][state],
             y = data['GDPperCapita'][state],
             s = data['State'][state],
             fontsize = 10,
             ha = 'center',
             va = 'center')

mplcursors.cursor(hover=True)
plt.draw()
plt.show()

I thought the way to do this might be to use the animate function to build the chart multiple times, one iteration per year. But I couldn't figure out how to filter the rows that belong to a specific year.

Am I way off? Is this even possible with matplotlib?

2 Answers

1

Using what Stylianos Nikas and ImportanceOfBeingErnest said as a starting point, I made a list of the unique years in the dataframe and used its length as the frames argument of FuncAnimation, like this:

def animate(frames):
    # wipe the previous frame so the bubbles do not pile up
    ax.clear()
    # keep only the rows of the current year ('Ano' is the year column)
    data = df[df['Ano'] == uniqueYears[frames]]
    ax.scatter(y = data['lifeExpec'],
               x = data['GDPperCapita'],
               s = data['PopX1000']/40000,
               c = data['Region'].astype('category').cat.codes,
               cmap = cm.viridis,
               edgecolors = 'none',
               alpha = 0.5)

# one frame per unique year
anim = FuncAnimation(fig, animate, frames = len(uniqueYears), interval = 200, repeat = False)

To keep the frames from overlapping, I simply added ax.clear() at the start of the animate function.
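For readers who want to try the idea without the spreadsheet, here is a minimal self-contained sketch of the same approach. The DataFrame contents are made up, the column names follow the question rather than my real file, and the Agg backend is only there so it runs without a display; treat it as an illustration, not the exact script I used.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs headless
import matplotlib.pyplot as plt
import pandas as pd
from matplotlib.animation import FuncAnimation

# Made-up sample data; the real values would come from the xls file.
df = pd.DataFrame({
    "Year": [2000, 2000, 2001, 2001, 2002, 2002],
    "State": ["A", "B", "A", "B", "A", "B"],
    "LifeExpec": [70.0, 72.0, 70.5, 72.4, 71.0, 73.0],
    "GDPperCapita": [10.0, 20.0, 11.0, 21.5, 12.5, 23.0],
    "PopX1000": [500, 900, 520, 950, 540, 1000],
})
unique_years = sorted(df["Year"].unique())

fig, ax = plt.subplots()

def animate(frame):
    """Redraw the axes using only the rows of one year."""
    ax.clear()  # drop the previous year's bubbles and labels
    year = unique_years[frame]
    data = df[df["Year"] == year]
    ax.scatter(x=data["LifeExpec"], y=data["GDPperCapita"],
               s=data["PopX1000"] / 4, alpha=0.7, edgecolors="none")
    # iterrows avoids label-based lookups like data['State'][0],
    # which fail once the frame has been filtered
    for _, row in data.iterrows():
        ax.text(row["LifeExpec"], row["GDPperCapita"], row["State"],
                ha="center", va="center", fontsize=10)
    # ax.clear() also resets limits and title, so set them on every frame
    ax.set_xlim(df["LifeExpec"].min() - 1, df["LifeExpec"].max() + 1)
    ax.set_ylim(df["GDPperCapita"].min() - 1, df["GDPperCapita"].max() + 1)
    ax.set_title(f"Year {year}")

anim = FuncAnimation(fig, animate, frames=len(unique_years),
                     interval=200, repeat=False)
```

Fixing the axis limits inside animate keeps the bubbles from appearing to jump when the axes would otherwise rescale for each year. With an interactive backend, plt.show() would play the animation.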

Answered 2018-05-30T19:31:13.217
1

You can filter your rows with a simple if statement. Make a list with the years you want to plot, e.g. years = [2000, 2001, 2002], then iterate over the list:

years = [2000, 2001, 2002]
for i in range(len(years)):
    if x == years[i]:
        pass  # do whatever you want

Here x is the value from your F column, which contains the years.
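If the data is already in a pandas DataFrame, the same per-year selection can also be written as a boolean mask instead of an explicit if. A minimal sketch, assuming the year column is called Year (the column names and values below are made up for illustration):

```python
import pandas as pd

# Made-up rows standing in for the spreadsheet; "Year" plays the role of column F.
df = pd.DataFrame({
    "Year": [2000, 2000, 2001, 2002, 2003],
    "State": ["A", "B", "A", "A", "A"],
    "GDPperCapita": [10.0, 20.0, 11.0, 12.0, 13.0],
})

years = [2000, 2001, 2002]

# Boolean mask: keep only the rows whose year equals the requested one.
rows_per_year = {year: df[df["Year"] == year] for year in years}

for year, rows in rows_per_year.items():
    print(year, len(rows))
```

Each entry of rows_per_year is a smaller DataFrame that can be passed straight to the plotting code for that year; rows for years outside the list (2003 here) are simply ignored.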

You can also save one figure per year, named after the loop index i:

plt.savefig("{}.png".format(i))

Then use this command to create the animation:

ffmpeg -framerate 25 -i %d.png -c:v libx264 -profile:v high -crf 20 -pix_fmt yuv420p output.mp4

You can then remove the saved plots with rm *.png.

To run these commands from your script you will need to import os; alternatively, run them manually on the command line after your script has created the plots. I think this is a much easier way to address the problem.
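The script side of this could look roughly like the sketch below. It swaps os for subprocess and pathlib, which is my substitution rather than a requirement, and the helper names (ffmpeg_command, remove_frames, build_video) are hypothetical. It assumes ffmpeg is installed and on the PATH, and that the frames are named 0.png, 1.png, and so on.

```python
import subprocess
from pathlib import Path

def ffmpeg_command(pattern="%d.png", output="output.mp4", framerate=25):
    """Build the argument list for the ffmpeg call quoted above."""
    return ["ffmpeg", "-framerate", str(framerate), "-i", pattern,
            "-c:v", "libx264", "-profile:v", "high", "-crf", "20",
            "-pix_fmt", "yuv420p", output]

def remove_frames(folder="."):
    """Delete only the numbered frame images (0.png, 1.png, ...)."""
    removed = []
    for path in Path(folder).glob("*.png"):
        if path.stem.isdigit():  # leave unrelated PNG files alone
            path.unlink()
            removed.append(path.name)
    return sorted(removed)

def build_video(folder="."):
    """Run ffmpeg in `folder`, then clean up; requires ffmpeg on PATH."""
    subprocess.run(ffmpeg_command(), cwd=folder, check=True)
    return remove_frames(folder)
```

Calling build_video() after the plotting loop would encode the frames and then delete them. Unlike a bare rm *.png, remove_frames only touches files whose names are purely numeric, so other images in the folder survive.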

Cheers

Answered 2018-05-28T13:11:35.887