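I want to select my raw data based on whether it falls within a valid range. I have an instrument whose most sensitive setting is C, followed by B, then A. So I start with C and check whether all of its values are within the threshold; if they are, perfect: all the data at this sensitivity gets best = 1.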
import pandas as pd
from io import StringIO

a = """category,val,sensitivity_level
x,20,A
x,31,B
x,60,C
x,20,A
x,25,B
x,60,C
y,20,A
y,29,B
y,60,C
y,20,A
y,24,B
y,30,C"""
df = pd.read_csv(StringIO(a))

def grp_1evel_1(x):
    """
    Return whether the elements are within the threshold (<= 30).
    """
    return x <= 30

def grp_1evel_0(x):
    """
    Input: data grouped by category. Here I want to go through the sensitivity levels in descending
    order, that is C, B and then A, and wherever one of these levels has x<=30 valid for all elements,
    select that level as the best one. Think of a device's sensitivity: at the highest sensitivity the
    data may be garbage, so you would move down the sensitivity and check again.
    """
    x['islessthan30'] = x.groupby('sensitivity_level').transform(grp_1evel_1)
    return x

print(df.groupby('category').apply(grp_1evel_0))
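Unfortunately, the code above does not produce the matrix below, for two reasons:
- I cannot go through a groupby in descending order.
- I cannot assign values to a groupby of a groupby.
The desired output (category, val, sensitivity_level, best) is: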
x,20,A,1
x,31,B,0
x,60,C,0
x,20,A,1
x,25,B,0
x,60,C,0
y,20,A,0
y,29,B,1
y,60,C,0
y,20,A,0
y,24,B,1
y,30,C,0
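Any hints?
The algorithm should be as follows:
Within a category, start from the highest sensitivity. If all of its values are within the threshold, set this sensitivity to 1 and skip the remaining, lower sensitivities.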
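One possible way to express the algorithm above is an explicit loop over the sensitivity levels inside each category, which sidesteps both the descending-order problem and the nested groupby assignment. This is only a minimal sketch, not a definitive solution. It assumes the threshold is 30 and inclusive (as in the `x<=30` check), that the level order is fixed as C, B, A, and that the sample data uses `y,29,B` as in the desired output matrix. The names `mark_best`, `THRESHOLD` and `LEVELS_DESC` are hypothetical helpers introduced for illustration.

```python
from io import StringIO

import pandas as pd

CSV = """category,val,sensitivity_level
x,20,A
x,31,B
x,60,C
x,20,A
x,25,B
x,60,C
y,20,A
y,29,B
y,60,C
y,20,A
y,24,B
y,30,C"""

THRESHOLD = 30                  # assumption: inclusive, as in the x<=30 check
LEVELS_DESC = ["C", "B", "A"]   # assumption: most sensitive level first


def mark_best(group, threshold=THRESHOLD, levels=LEVELS_DESC):
    """Return a 0/1 Series flagging the rows of the best level in one category."""
    best = pd.Series(0, index=group.index, name="best")
    for level in levels:
        mask = group["sensitivity_level"] == level
        # A level qualifies only if it has rows and all of them pass the check.
        if mask.any() and (group.loc[mask, "val"] <= threshold).all():
            best.loc[mask] = 1
            break  # skip the remaining, lower sensitivities
    return best


df = pd.read_csv(StringIO(CSV))
# Build the flags per category, then let pandas align them back on the row index.
flags = [mark_best(group) for _, group in df.groupby("category")]
df["best"] = pd.concat(flags)
print(df)
```

Because each per-category Series keeps the original row index, assigning the concatenated result back to `df["best"]` lines up with the original row order. If no level in a category passes the check, every row in that category stays 0.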