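I have some Pandas (Python) DataFrames that were created by collecting data roughly every 8 milliseconds. The data is broken up into blocks, and the sequence restarts with each block. Every block has a label, and there is a timestamp column indicating when each sample was collected (measured from the start of the file). To give an idea, the frame looks something like this: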
|        | EXPINDEX | EXPTIMESTAMP | DATA1 | DATA2 |
----------------------------------------------------
| BLOCK  | 0        |              |       |       |
| Block1 | 1        | 0            | .423  | .926  |
|        | 2        | 8.215        | .462  | .919  |
|        | 3        | 17.003       | .472  | .904  |
| Block2 | 4        | 55.821       | .243  | .720  |
|        | 5        | 63.521       | .237  | .794  |
| ...    | ...      | ...          | ...   | ...   |
----------------------------------------------------
The EXPTIMESTAMP column is a DateTimeIndex. What I'd like to do is keep that column for later use, but create a different sub-index using a block-relative DateTimeIndex, like so:
|        |                | EXPTIMESTAMP | DATA1 | DATA2 |
----------------------------------------------------------
| BLOCK  | BLOCKTIMESTAMP |              |       |       |
| Block1 | 0              | 0            | .423  | .926  |
|        | 8.215          | 8.215        | .462  | .919  |
|        | 17.003         | 17.003       | .472  | .904  |
| Block2 | 0              | 55.821       | .243  | .720  |
|        | 7.700          | 63.521       | .237  | .794  |
| ...    | ...            | ...          | ...   | ...   |
----------------------------------------------------------
I've got this working with:
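blockreltimestamp = []
blocks = list(df.index.levels[0])
for block in blocks:
    # Rows of this block only; copy so the helper column is not written to a view
    dfblock = df.xs(block, level='BLOCK').copy()
    # First timestamp in the block, broadcast to every row
    dfblock["InitialVal"] = dfblock.iloc[0]["EXPTIMESTAMP"]
    # Time elapsed since the start of the block
    reltime = dfblock["EXPTIMESTAMP"] - dfblock["InitialVal"]
    blockreltimestamp.extend(list(reltime))
# Relies on the rows of df being ordered block by block, in the same order as levels[0]
df["BLOCKTIMESTAMP"] = blockreltimestamp
# BLOCK must also be present as a column here, since set_index looks its keys up among the columns
df.set_index(["BLOCK","BLOCKTIMESTAMP"], drop=False, inplace=True)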
But I'm wondering whether there is a cleaner / more efficient / more pandas-like way to do this type of transformation.
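To make the goal concrete, here is a rough sketch of the kind of thing I have in mind, using groupby on the BLOCK level instead of an explicit loop. It runs on a toy frame built from the values in the tables above. The frame construction, the use of plain floats in place of the real timestamps, and the names first_in_block and result are assumptions made only for this example, so I can't say yet whether it carries over to my real data:

```python
import pandas as pd

# Toy frame mirroring the first table. Plain floats stand in for the
# real timestamps (an assumption made for this sketch).
index = pd.MultiIndex.from_arrays(
    [["Block1", "Block1", "Block1", "Block2", "Block2"], [1, 2, 3, 4, 5]],
    names=["BLOCK", "EXPINDEX"],
)
df = pd.DataFrame(
    {
        "EXPTIMESTAMP": [0.0, 8.215, 17.003, 55.821, 63.521],
        "DATA1": [0.423, 0.462, 0.472, 0.243, 0.237],
        "DATA2": [0.926, 0.919, 0.904, 0.720, 0.794],
    },
    index=index,
)

# For every row, look up the first EXPTIMESTAMP of the block it belongs to.
first_in_block = df.groupby(level="BLOCK")["EXPTIMESTAMP"].transform("first")

# Block-relative time: elapsed time since the start of the block.
df["BLOCKTIMESTAMP"] = df["EXPTIMESTAMP"] - first_in_block

# Swap the EXPINDEX level for the block-relative timestamp, keeping
# EXPTIMESTAMP as an ordinary column for later use.
result = (
    df.reset_index(level="EXPINDEX", drop=True)
      .set_index("BLOCKTIMESTAMP", append=True)
)

print(result)
```

With real datetime values in EXPTIMESTAMP, the subtraction would produce timedeltas rather than floats, so the second index level would presumably end up timedelta-based rather than a DateTimeIndex.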
Thanks!