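I have a DataFrame in pandas (Python) that holds a measured variable from an experiment, with a time index. I want to extract the times at which the value drops below a certain threshold. However, noise sometimes makes the variable bounce above and below that threshold, so I only want to record a new time point once the variable has first risen above a second, higher threshold. The code I have written so far is: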
def findPriming(df, col, sphigh, splow):
    # Start the counter and the priming flag.
    i = 1  # This ignores the first value but lets us check the one before with no errors.
    currentlyPriming = False
    primeTimes = []
    # Iterate through the series here:
    while i < len(df):
        # If the value is above the high threshold (e.g. 20), everything is fine and it's not priming.
        if df[col].iloc[i] > sphigh:
            currentlyPriming = False
        # If it's below the low threshold (e.g. 16):
        elif df[col].iloc[i] < splow:
            # Check whether we are currently priming:
            if not currentlyPriming:
                # We are now priming and weren't before, so log it.
                primeTimes.append(df.index[i])
                # Now that we are priming, set the flag.
                currentlyPriming = True
        # Now increment the counter.
        i += 1
    return primeTimes
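But I imagine this is very inefficient (and the fact that it takes forever to run tells me the same thing).
I tried to think of a way to get rid of the two ifs on every data point, but couldn't get it to work.
Does anyone have any ideas for improvement? I tried searching for similar code but couldn't find anything.
Edited to include a sample of my DataFrame: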
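For illustration, here is a rough sketch of one direction that might remove the per-row branching: label each row as "above the high threshold", "below the low threshold", or "in between", forward-fill the in-between rows with the last definite label, and report the rows where the label switches to "below". The name `find_priming_vectorized` is hypothetical, and the sketch assumes the intent is to reproduce the loop's behaviour, including skipping the first row.

```python
import numpy as np
import pandas as pd


def find_priming_vectorized(df, col, sphigh, splow):
    """Hypothetical loop-free variant of findPriming (a sketch, not a drop-in guarantee)."""
    if len(df) == 0:
        return []
    values = df[col].to_numpy()
    # 0.0 = above sphigh (re-armed), 1.0 = below splow (priming), NaN = in between.
    state = np.where(values > sphigh, 0.0, np.where(values < splow, 1.0, np.nan))
    # Mirror the loop, which starts at i = 1 and never inspects the first row.
    state[0] = np.nan
    # In-between rows inherit the last definite state; leading unknowns mean "not priming".
    state = pd.Series(state).ffill().fillna(0.0).to_numpy()
    # State of the previous row, with "not priming" assumed before the first row.
    previous = np.concatenate(([0.0], state[:-1]))
    # A priming event starts where the state switches from 0 to 1.
    starts = (state == 1.0) & (previous == 0.0)
    return df.index[starts].tolist()
```

It would be called the same way as the loop version, for example `find_priming_vectorized(df, 'Data', 20, 16)`. The comparisons and the forward fill each run over the whole column at once, which is where any speed-up would come from.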
DateTime Data
2013-08-08 15:46:41 25.203461
2013-08-08 15:46:51 23.241514
2013-08-08 15:47:01 22.256216
2013-08-08 15:47:11 21.256216
2013-08-08 15:47:21 16.261763
2013-08-08 15:47:31 13.249237
2013-08-08 15:47:41 17.249237
2013-08-08 15:47:51 18.238962
2013-08-08 15:48:01 13.207640
2013-08-08 15:48:11 20.207640
And a link to my (badly) drawn example graph [inlined --ed]
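For anyone who wants to reproduce this, the sample above can be rebuilt roughly as follows. This is a minimal sketch: the variable name `df` and the choice to make `DateTime` the index (rather than an ordinary column) are assumptions based on the description of a time-indexed DataFrame.

```python
import pandas as pd

# Timestamps copied from the sample table.
times = pd.to_datetime([
    "2013-08-08 15:46:41", "2013-08-08 15:46:51", "2013-08-08 15:47:01",
    "2013-08-08 15:47:11", "2013-08-08 15:47:21", "2013-08-08 15:47:31",
    "2013-08-08 15:47:41", "2013-08-08 15:47:51", "2013-08-08 15:48:01",
    "2013-08-08 15:48:11",
])

# Measured values copied from the sample table, in the same order.
data = [25.203461, 23.241514, 22.256216, 21.256216, 16.261763,
        13.249237, 17.249237, 18.238962, 13.207640, 20.207640]

# Assumed layout: a single 'Data' column indexed by time.
df = pd.DataFrame({"Data": data}, index=times)
df.index.name = "DateTime"
```

With this layout, the column name to pass as `col` would be `'Data'`.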