I'd like to resample the data column of df at a frequency of 1min, grouped by the id column, using forward fill (ffill).

df:
id timestamp data
1 1 2017-01-02 13:14:53.040 10.0
2 1 2017-01-02 16:04:43.240 11.0
...
4 2 2017-01-02 15:22:06.540 1.0
5 2 2017-01-03 13:55:34.240 2.0
...
Expected output:
id timestamp data
1 1 2017-01-02 13:14:53.040 10.0
2017-01-02 13:14:54.040 10.0
2017-01-02 13:14:55.040 10.0
2017-01-02 13:14:56.040 10.0
...
2 1 2017-01-02 16:04:43.240 11.0
2017-01-02 16:04:44.240 11.0
2017-01-02 16:04:45.240 11.0
2017-01-02 16:04:46.240 11.0
...
4 2 2017-01-02 15:22:06.540 1.0
2017-01-02 15:22:07.540 1.0
2017-01-02 15:22:08.540 1.0
2017-01-02 15:22:09.540 1.0
...
5 2 2017-01-03 13:55:34.240 2.0
2017-01-03 13:55:35.240 2.0
2017-01-03 13:55:36.240 2.0
2017-01-03 13:55:37.240 2.0
...
This is similar to this post, but I tried:
df.set_index('timestamp').groupby('id').resample('1min').asfreq().drop(['id'], 1).reset_index()
and the data column returns only NaN values:
id timestamp data
0 1 2017-01-02 13:14:53.040 NaN
1 1 2017-01-02 13:14:54.040 NaN
2 1 2017-01-02 13:14:55.040 NaN
3 1 2017-01-02 13:14:56.040 NaN
4 1 2017-01-02 13:14:57.040 NaN
... ... ... ...
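Edit:
- The timestamp in the second row of df was changed from 2017-01-02 12:04:43.240 to 2017-01-02 16:04:43.240, i.e. rows belonging to the same id should be sorted.
- I mistakenly used seconds instead of minutes in the expected output, but @jezrael's answer is correct.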
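
For anyone reproducing this, here is a minimal sketch of one way to get a forward-filled result per id. It is not necessarily the approach taken in the accepted answer. The NaN values most likely appear because asfreq() only keeps values whose timestamps fall exactly on the new 1-minute grid, and none of the original timestamps do. The sketch assumes that taking the last observation in each 1-minute bin and then forward filling within each id is acceptable. The sample frame contains only the four rows shown above; the elided rows are not reconstructed.

```python
import pandas as pd

# Sample rows copied from the question; rows hidden behind "..." are left out.
df = pd.DataFrame({
    "id": [1, 1, 2, 2],
    "timestamp": pd.to_datetime([
        "2017-01-02 13:14:53.040",
        "2017-01-02 16:04:43.240",
        "2017-01-02 15:22:06.540",
        "2017-01-03 13:55:34.240",
    ]),
    "data": [10.0, 11.0, 1.0, 2.0],
})

resampled = (
    df.set_index("timestamp")
      .groupby("id")["data"]
      .resample("1min")
      .last()                 # last observation per 1-minute bin; empty bins are NaN
      .groupby(level="id")
      .ffill()                # forward fill within each id only
      .reset_index()
)

print(resampled.head())
```

Note that the resulting timestamps sit on whole-minute boundaries (for example 13:14:00) rather than keeping the original sub-second offsets. Passing "1s" instead of "1min" would give the one-second steps shown in the expected output.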