I'm importing a CSV with pandas in IPython. When I display the DataFrame, it looks like this:

     2013    2012    2011    2010    2009    2008    2007    2006    2005
Jan  11,875  10,989  10,852  11,762  13,850  14,269  14,075  9,222   -
Feb  10,206  10,501  15,713  11,785  13,886  14,289  12,635  13,149  -
Mar  11,235  11,991  14,193  14,239  15,528  14,589  14,519  10,179  -
Apr  NaN     13,617  12,945  14,682  16,953  18,054  14,954  10,549  -
May  NaN     14,645  15,524  15,861  12,357  18,833  16,511  12,889  -
Jun  NaN     14,987  17,740  26,616  13,947  19,580  18,161  13,969  -
Jul  NaN     13,514  19,082  19,880  16,199  20,522  16,537  14,038  -
Aug  NaN     12,830  14,785  16,125  23,438  16,018  16,645  12,430  1,729
Sep  NaN     12,070  13,232  17,081  16,997  16,543  14,372  12,400  5,414
Oct  NaN     11,907  11,027  17,995  12,576  13,535  17,169  14,673  4,920
Nov  NaN     10,623  12,127  12,439  11,926  12,491  13,530  14,313  7,993
Dec  NaN     8,624   8,952   10,498  12,811  14,552  11,573  10,780  6,879
TOTAL    33,316  146,298     166,172     188,963     180,468     193,275     180,681     148,591     26,935

Now I want to plot the data in a graph, but no matter what I try I get "TypeError: Empty 'DataFrame': no numeric data to plot".

The DataFrame clearly isn't empty and is full of numbers. What am I missing? I was under the impression that pandas identified numbers on its own.

2 Answers

Thanks for all the suggestions! They pointed me in the right direction. I managed to solve the problem with:

df = df.replace(',', '', regex=True)
df = df.replace('-', 'NaN', regex=True).astype('float')
df.plot()
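
Why this works can be sketched on a tiny stand-in for the table above. The two-row frame and the name `cleaned` are made up for illustration; the point is that cells such as "9,222" and "-" are read as text, so the columns are not numeric until the commas are stripped and the dashes become NaN. The dash pattern is anchored here (`^-$`), an assumption added so that only cells consisting of a lone dash are replaced.

```python
import pandas as pd

# Hypothetical miniature of the question's table: every cell is a string.
df = pd.DataFrame(
    {"2006": ["9,222", "12,430"], "2005": ["-", "1,729"]},
    index=["Jan", "Aug"],
)
print(df.dtypes)  # non-numeric dtypes, which is why plot() finds nothing to draw

# Strip thousands separators, turn lone dashes into NaN, then convert to float.
cleaned = df.replace(",", "", regex=True)
cleaned = cleaned.replace(r"^-$", "NaN", regex=True).astype("float")
print(cleaned.dtypes)  # float64 for both columns
```

After the conversion, `cleaned.plot()` has numeric columns to work with.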
answered 2013-09-09T20:01:43.120

Take your data, replace "," with "." and "-" with "NaN", and it works:

>>> s="""     2013    2012    2011    2010    2009    2008    2007    2006    2005
Jan  11,875  10,989  10,852  11,762  13,850  14,269  14,075  9,222   -
Feb  10,206  10,501  15,713  11,785  13,886  14,289  12,635  13,149  -
Mar  11,235  11,991  14,193  14,239  15,528  14,589  14,519  10,179  -
Apr  NaN     13,617  12,945  14,682  16,953  18,054  14,954  10,549  -
May  NaN     14,645  15,524  15,861  12,357  18,833  16,511  12,889  -
Jun  NaN     14,987  17,740  26,616  13,947  19,580  18,161  13,969  -
Jul  NaN     13,514  19,082  19,880  16,199  20,522  16,537  14,038  -
Aug  NaN     12,830  14,785  16,125  23,438  16,018  16,645  12,430  1,729
Sep  NaN     12,070  13,232  17,081  16,997  16,543  14,372  12,400  5,414
Oct  NaN     11,907  11,027  17,995  12,576  13,535  17,169  14,673  4,920
Nov  NaN     10,623  12,127  12,439  11,926  12,491  13,530  14,313  7,993
Dec  NaN     8,624   8,952   10,498  12,811  14,552  11,573  10,780  6,879
TOTAL    33,316  146,298     166,172     188,963     180,468     193,275     180,681     148,591     26,935"""

>>> import pandas as pd
>>> from io import StringIO
>>> s=s.replace(',','.')
>>> s=s.replace('-','NaN')
>>> df=pd.read_csv(StringIO(s), sep=r'\s+')
>>> df.plot()
<matplotlib.axes.AxesSubplot at 0x88a4790>

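Interestingly, according to the read_csv docstring there is a parameter for specifying the decimal separator, but it doesn't seem to work in my version (0.11.0).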
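
As an alternative to editing the text first, the parsing can be left to `read_csv` itself. This is only a sketch on a made-up two-row excerpt (`raw` is a hypothetical name), and it assumes a reasonably recent pandas in which the `thousands` and `na_values` parameters behave as documented. Note that it treats the comma as a thousands separator, so "9,222" becomes 9222 rather than 9.222.

```python
from io import StringIO

import pandas as pd

# Hypothetical excerpt of the table, whitespace-delimited as in the question.
raw = """     2006    2005
Jan  9,222   -
Aug  12,430  1,729
"""

# thousands="," strips the separators; na_values="-" marks dashes as missing.
df = pd.read_csv(StringIO(raw), sep=r"\s+", thousands=",", na_values="-")
print(df)
print(df.dtypes)
```

Because the header row has one field fewer than the data rows, pandas uses the first column (the month names) as the index.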

answered 2013-09-09T12:41:20.553