So I have the following data:
Observed WRF
2014-06-28 12:00:00 0.000000 1.823554
2014-06-28 13:00:00 0.000000 1.001567
2014-06-28 14:00:00 0.000000 0.309840
2014-06-28 15:00:00 0.000000 0.889811
2014-06-28 16:00:00 0.000000 0.939780
2014-06-28 17:00:00 1.251794 1.271781
2014-06-28 18:00:00 1.610596 0.935092
2014-06-28 19:00:00 2.129068 0.868775
2014-06-28 20:00:00 2.326501 0.892550
...
2014-08-31 05:00:00 0.365868 2.463277
2014-08-31 06:00:00 0.281729 1.233760
2014-08-31 07:00:00 0.197590 0.427411
2014-08-31 08:00:00 0.127754 0.299558
2014-08-31 09:00:00 0.000000 0.571106
2014-08-31 10:00:00 0.000000 0.449634
2014-08-31 11:00:00 0.000000 0.324269
2014-08-31 12:00:00 0.000000 1.725650
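I would like to produce a single plot with two sets of boxplots in different colors. I'm not very good at plotting boxplots to begin with, so my technique may be what is letting me down. So far I have written the following code: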
df7.boxplot(by='day',whis=[10,90],sym=' ',figsize=(16,8),color=((1,0.502,0),'black'))\
.legend(loc='lower center', bbox_to_anchor=(1.007, -0.06),prop={'size':16})
plt.subplots_adjust(left=.1, right=0.9, top=0.9, bottom=.2)
plt.title('Five Day WRF Model Comparison Near %.2f,%.2f' %(lat,lon),fontsize=24)
plt.ylabel('Hourly Wind Speed [$W/m^2$]',fontsize=18,color='black')
ax7=plt.gca()
ax7.xaxis.set_label_coords(0.5, -0.05)
plt.xlabel('Time',fontsize=18,color='black')
plt.show()
which gives me:
File "<ipython-input-35-9945f2efb84e>", line 1, in <module>
df7.boxplot(by='D',whis=[10,90],sym=' ',figsize=(16,8),color=((1,0.502,0),'black')).legend(loc='lower center', bbox_to_anchor=(1.007, -0.06),prop={'size':16})
File "...\Anaconda2\lib\site-packages\pandas\core\frame.py", line 5581, in boxplot
return_type=return_type, **kwds)
File "...\Anaconda2\lib\site-packages\pandas\tools\plotting.py", line 2747, in boxplot
return_type=return_type)
File "...\Anaconda2\lib\site-packages\pandas\tools\plotting.py", line 3139, in _grouped_plot_by_column
grouped = data.groupby(by)
File "...\Anaconda2\lib\site-packages\pandas\core\generic.py", line 3778, in groupby
**kwargs)
File "...\Anaconda2\lib\site-packages\pandas\core\groupby.py", line 1427, in groupby
return klass(obj, by, **kwds)
File "...\Anaconda2\lib\site-packages\pandas\core\groupby.py", line 354, in __init__
mutated=self.mutated)
File "...\Anaconda2\lib\site-packages\pandas\core\groupby.py", line 2383, in _get_grouper
in_axis, name, gpr = True, gpr, obj[gpr]
File "...\Anaconda2\lib\site-packages\pandas\core\frame.py", line 1997, in __getitem__
return self._getitem_column(key)
File "...\Anaconda2\lib\site-packages\pandas\core\frame.py", line 2004, in _getitem_column
return self._get_item_cache(key)
File "...\Anaconda2\lib\site-packages\pandas\core\generic.py", line 1350, in _get_item_cache
values = self._data.get(item)
File "...\Anaconda2\lib\site-packages\pandas\core\internals.py", line 3290, in get
loc = self.items.get_loc(item)
File "...\Anaconda2\lib\site-packages\pandas\indexes\base.py", line 1947, in get_loc
return self._engine.get_loc(self._maybe_cast_indexer(key))
File "pandas\index.pyx", line 137, in pandas.index.IndexEngine.get_loc (pandas\index.c:4154)
File "pandas\index.pyx", line 159, in pandas.index.IndexEngine.get_loc (pandas\index.c:4018)
File "pandas\hashtable.pyx", line 675, in pandas.hashtable.PyObjectHashTable.get_item (pandas\hashtable.c:12368)
File "pandas\hashtable.pyx", line 683, in pandas.hashtable.PyObjectHashTable.get_item (pandas\hashtable.c:12322)
KeyError: 'D'
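For context, a `KeyError` like the one at the bottom of this traceback is what pandas raises when a string passed to `by=` (or to `groupby`) does not name an existing column or index level, and the frame shown above has only `Observed` and `WRF`. Below is a minimal sketch of deriving such a key from the index first. It runs on made-up random data, and the names `df`, `day` and `per_day` are assumptions for illustration, not part of my real script.

```python
import numpy as np
import pandas as pd

# Made-up stand-in for df7: three days of hourly values on a DatetimeIndex.
idx = pd.date_range("2014-06-28", periods=72, freq=pd.Timedelta(hours=1))
rng = np.random.default_rng(0)
df = pd.DataFrame({"Observed": rng.random(72), "WRF": rng.random(72)}, index=idx)

# A string key must name something that exists, so create the column first.
# "day" is a hypothetical column name derived from the index.
df["day"] = df.index.day  # day of month: 28, 29, 30

# Grouping on the new column now works; the median is just an example statistic.
per_day = df.groupby("day")[["Observed", "WRF"]].median()
print(per_day)
```

With a column like this in place, `by='day'` at least has something to look up; whether the `color=` tuple and the chained `.legend(...)` then behave as intended is a separate matter.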
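I would like the boxplots grouped by day or by month, with both series drawn on the same plot in two different colors, i.e. one orange and one black, essentially overlaid, so that the difference between the two is easy to see. If that turns out to be too messy to read, then two separate plots as subplots of one figure would do (that part I can manage). The grouping, however, is where things seem to go wrong: I can't figure out why my DatetimeIndex won't let me group by day or by 7-day period. I also tried: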
df7.boxplot(by=df07.index.day,whis=[10,90],sym=' ',figsize=(16,8),color=((1,0.502,0),'black'))\
.legend(loc='lower center', bbox_to_anchor=(1.007, -0.06),prop={'size':16})
...
which gives me:
AssertionError: Grouper and axis must be same length
I'm not sure what is going on, but it doesn't seem to recognize the DatetimeIndex, even though when I run df7.info() I get back:
<class 'pandas.core.frame.DataFrame'>
DatetimeIndex: 1542 entries, 2014-06-28 12:00:00 to 2014-08-31 12:00:00
Data columns (total 2 columns):
Observed 1542 non-null float64
WRF 1542 non-null float64
dtypes: float64(2)
memory usage: 36.1 KB
So it does appear to be a DatetimeIndex.
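To make the target concrete, here is a rough sketch of the kind of figure I am after, drawn with plain matplotlib on made-up random data. It places the two series side by side within each day rather than truly overlaying them, which is a substitution on my part. The grouping key comes from the same frame's own index, so its length always matches the number of rows; my guess, and it is only a guess from the names in my snippet, is that `df07` and `df7` differ in length. All names here (`df`, `days`, `artists`, the offsets) are hypothetical.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Made-up stand-in for df7: three days of hourly values on a DatetimeIndex.
idx = pd.date_range("2014-06-28", periods=72, freq=pd.Timedelta(hours=1))
rng = np.random.default_rng(0)
df = pd.DataFrame({"Observed": rng.random(72), "WRF": rng.random(72)}, index=idx)

days = sorted(set(df.index.day))  # group labels taken from this frame's own index
fig, ax = plt.subplots(figsize=(16, 8))
artists = {}

# One pass per series: orange boxes shifted left, black boxes shifted right.
for column, color, shift in [("Observed", (1, 0.502, 0), -0.2), ("WRF", "black", 0.2)]:
    samples = [df.loc[df.index.day == d, column].to_numpy() for d in days]
    positions = [i + shift for i in range(len(days))]
    # whis=[10, 90] puts the whiskers at the 10th/90th percentiles; sym="" hides fliers.
    bp = ax.boxplot(samples, positions=positions, widths=0.35, whis=[10, 90], sym="")
    for part in ("boxes", "whiskers", "caps", "medians"):
        plt.setp(bp[part], color=color)
    artists[column] = bp

# One tick per day, centered between the two boxes of that day.
ax.set_xticks(range(len(days)))
ax.set_xticklabels(days)
ax.set_xlabel("Day of month")
ax.legend([artists["Observed"]["boxes"][0], artists["WRF"]["boxes"][0]],
          ["Observed", "WRF"], loc="upper right")
fig.savefig("boxplots_sketch.png")
```

One caveat with this sketch: `index.day` is the day of the month, so on my full June to August data it would pool the 28th of June, July and August into one box. Grouping on `index.date` or `index.month` instead would avoid that.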
Thanks for any and all help. If further clarification is needed, I'm more than happy to provide additional information.