I have a dataset of forecast data that is updated at irregular intervals over a 42-hour window. Here is a sample:
import pandas as pd

df_old = pd.DataFrame({
    'IssueDatetime': ['2010-01-01 09:00:00', '2010-01-01 09:00:00', '2010-01-01 09:00:00', '2010-01-01 09:00:00', '2010-01-01 09:00:00'],
    'endtime': ['2010-01-03 03:00:00', '2010-01-03 03:00:00', '2010-01-03 03:00:00', '2010-01-03 03:00:00', '2010-01-03 03:00:00'],
    'Regions': ['EAST COAST-CAPE ST FRANCIS AND SOUTH', 'EAST COAST-CAPE ST FRANCIS AND SOUTH', 'EAST COAST-CAPE ST FRANCIS AND SOUTH', 'NORTHEAST COAST', 'NORTHEAST COAST'],
    'forecastTime': ['2010-01-01 09:00:00', '2010-01-01 15:00:00', '2010-01-01 19:00:00', '2010-01-01 09:00:00', '2010-01-01 12:00:00'],
    'forecast_Dir': [150, 180, 45, 45, 45],
    'windSpeed': [20, 90, 35, 45, 15]})
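The problem is the gaps, in hours, between the values of df['forecastTime'] and up to df['endtime']. I tried grouping and resampling the data with my limited pandas knowledge, but because the dates are duplicated I can't get a datetime index.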
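To make the duplicate problem concrete, here is a minimal sketch on a cut-down version of the sample (only two columns; the variable names `unique_overall` and `unique_per_region` are just for illustration). The same forecastTime shows up once per region, so the column is not unique on its own, but it is unique inside each region:

```python
import pandas as pd

# Cut-down version of the sample: only the columns needed to show the issue.
df = pd.DataFrame({
    "Regions": ["EAST COAST-CAPE ST FRANCIS AND SOUTH"] * 3 + ["NORTHEAST COAST"] * 2,
    "forecastTime": pd.to_datetime([
        "2010-01-01 09:00:00", "2010-01-01 15:00:00", "2010-01-01 19:00:00",
        "2010-01-01 09:00:00", "2010-01-01 12:00:00",
    ]),
})

# 09:00 appears once per region, so the column as a whole contains duplicates.
unique_overall = df["forecastTime"].is_unique

# Within a single region every forecastTime occurs only once.
unique_per_region = df.groupby("Regions")["forecastTime"].apply(lambda s: s.is_unique)
```

This suggests that any resampling probably has to happen per region (and, in the full data, per issue time) rather than on the whole frame at once.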
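Ultimately, my goal is to expand the dataframe so that every hour between the original forecast hours gets its own row, all the way up to the end time.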
An example of the desired output:
df_new = pd.DataFrame({'IssueDatetime': [ '2010-01-01 09:00:00', '2010-01-01 09:00:00', '2010-01-01 09:00:00', '2010-01-01 09:00:00', '2010-01-01 09:00:00', '2010-01-01 09:00:00','2010-01-01 09:00:00'],
'endtime':['2010-01-03 03:00:00','2010-01-03 03:00:00','2010-01-03 03:00:00','2010-01-03 03:00:00','2010-01-03 03:00:00','2010-01-03 03:00:00','2010-01-03 03:00:00'],
'Regions': ['EAST COAST-CAPE ST FRANCIS AND SOUTH', 'EAST COAST-CAPE ST FRANCIS AND SOUTH','EAST COAST-CAPE ST FRANCIS AND SOUTH','EAST COAST-CAPE ST FRANCIS AND SOUTH','EAST COAST-CAPE ST FRANCIS AND SOUTH','EAST COAST-CAPE ST FRANCIS AND SOUTH','EAST COAST-CAPE ST FRANCIS AND SOUTH'],
'forecastTime': ['2010-01-01 09:00:00','2010-01-01 10:00:00','2010-01-01 11:00:00','2010-01-01 12:00:00','2010-01-01 13:00:00','2010-01-01 14:00:00','2010-01-01 15:00:00'],
'forecast_Dir':[150,150,150,150,150,150,180],
'windSpeed':[20,20,20,20,20,20,90]})
Note that for the first region, each hour between df['forecastTime'] = '2010-01-01 09:00:00' and df['forecastTime'] = '2010-01-01 15:00:00' should get its own row. Essentially, I want to upsample to fill in the missing hours.
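One way this upsampling might be done is sketched below. It is not a confirmed solution, just an approach under these assumptions: each (IssueDatetime, Regions) pair is treated as one forecast, forecastTime is unique within that pair, and every new hourly row should simply repeat the most recent forecast (forward fill) until endtime. `upsample_hourly` is a hypothetical helper name, not a pandas function.

```python
import pandas as pd

# Trimmed copy of the sample data from the question.
df_old = pd.DataFrame({
    "IssueDatetime": ["2010-01-01 09:00:00"] * 5,
    "endtime": ["2010-01-03 03:00:00"] * 5,
    "Regions": ["EAST COAST-CAPE ST FRANCIS AND SOUTH"] * 3 + ["NORTHEAST COAST"] * 2,
    "forecastTime": [
        "2010-01-01 09:00:00", "2010-01-01 15:00:00", "2010-01-01 19:00:00",
        "2010-01-01 09:00:00", "2010-01-01 12:00:00",
    ],
    "forecast_Dir": [150, 180, 45, 45, 45],
    "windSpeed": [20, 90, 35, 45, 15],
})


def upsample_hourly(df):
    """Hypothetical helper: one row per hour from each group's first
    forecastTime up to its endtime, forward-filling the other columns."""
    df = df.copy()
    for col in ("IssueDatetime", "endtime", "forecastTime"):
        df[col] = pd.to_datetime(df[col])

    pieces = []
    # forecastTime is only unique within one issue/region pair, so work per group.
    for _, grp in df.groupby(["IssueDatetime", "Regions"], sort=False):
        grp = grp.set_index("forecastTime").sort_index()
        hours = pd.date_range(
            start=grp.index.min(),
            end=grp["endtime"].iloc[0],   # assumes one endtime per group
            freq=pd.Timedelta(hours=1),
            name="forecastTime",
        )
        # reindex inserts empty rows for the missing hours;
        # ffill then carries the last known forecast forward.
        pieces.append(grp.reindex(hours).ffill())

    return pd.concat(pieces).reset_index()


df_new = upsample_hourly(df_old)
```

Two caveats with this sketch: the integer columns come back as floats because the reindex step temporarily introduces NaN, and on the full dataset the forward fill would also overwrite values that are genuinely missing in the original rows (for example in forecast_WindSpeed_gust), so the fill may need to be restricted to selected columns.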
Edit: the original dataframe
IssueDatetime endtime \
0 2013-01-01 09:00:00 2013-01-03 03:00:00
1 2013-01-01 09:00:00 2013-01-03 03:00:00
2 2013-01-01 09:00:00 2013-01-03 03:00:00
3 2013-01-01 09:00:00 2013-01-03 03:00:00
4 2013-01-01 09:00:00 2013-01-03 03:00:00
... ... ...
53585 2016-12-30 09:00:00 2017-01-01 03:00:00
53586 2016-12-30 09:00:00 2017-01-01 03:00:00
53587 2016-12-30 09:00:00 2017-01-01 03:00:00
53588 2016-12-30 09:00:00 2017-01-01 03:00:00
53589 2016-12-30 09:00:00 2017-01-01 03:00:00
Regions forecastTime \
0 SOUTH COAST 2013-01-01 09:00:00
1 SOUTH COAST 2013-01-01 18:00:00
2 SOUTH COAST 2013-01-02 06:00:00
3 SOUTH COAST 2013-01-02 13:00:00
4 EAST COAST-CAPE ST FRANCIS AND SOUTH 2013-01-01 09:00:00
... ... ...
53585 SOUTHWESTERN GRAND BANKS 2016-12-30 18:00:00
53586 SOUTHWESTERN GRAND BANKS 2016-12-31 09:00:00
53587 SOUTHWESTERN GRAND BANKS 2016-12-31 15:00:00
53588 SOUTHWESTERN GRAND BANKS 2016-12-31 18:00:00
53589 SOUTHWESTERN GRAND BANKS 2017-01-01 00:00:00
forecastHour forecast_Dir forecast_WindSpeed_low \
0 0.0 270 35
1 9.0 270 25
2 21.0 225 15
3 28.0 270 35
4 0.0 270 35
... ... ... ...
53585 9.0 135 40
53586 24.0 135 40
53587 30.0 135 40
53588 33.0 315 25
53589 39.0 315 25
forecast_WindSpeed_gust forecast_WindSpeed_high \
0 None None
1 None None
2 None None
3 None None
4 None None
... ... ...
53585 None 50
53586 None 50
53587 None 50
53588 None 35
53589 None None
forecast_WindSpeed_exception_1_type forecast_Dir_exception_1 \
0 NaN NaN
1 NaN NaN
2 NaN NaN
3 NaN NaN
4 NaN NaN
... ... ...
53585 NaN NaN
53586 OVER NORTHWESTERN SECTIONS 315
53587 NaN NaN
53588 NaN NaN
53589 NaN NaN
forecast_WindSpeed_low_exception_1 forecast_WindSpeed_high_exception_1
0 NaN NaN
1 NaN NaN
2 NaN NaN
3 NaN NaN
4 NaN NaN
... ... ...
53585 NaN NaN
53586 25 None
53587 NaN NaN
53588 NaN NaN
53589 NaN NaN