I used this code to count the values of the different quality indicators for each user in each cluster:
>>> for name, group in df.groupby(["Cluster_id", "User"]):
...     print 'group name:', name
...     print 'group rows:'
...     print group
...     print 'counts of Quality values:'
...     print group["Quality"].value_counts()
...     raw_input()
...
But now I get this output:
group rows:
                tag                       user                    quality  cluster
676    black fabric  http://steve.nl/user_1002          usefulness-useful        1
708      blond wood  http://steve.nl/user_1002          usefulness-useful        1
709      blond wood  http://steve.nl/user_1002    problematic-misspelling        1
1410         eames?  http://steve.nl/user_1002      usefulness-not_useful        1
1411         eames?  http://steve.nl/user_1002  problematic-misperception        1
3649  rocking chair  http://steve.nl/user_1002          usefulness-useful        1
3650  rocking chair  http://steve.nl/user_1002  problematic-misperception        1
counts of Quality Values:
usefulness-useful            3
problematic-misperception    2
usefulness-not_useful        1
problematic-misspelling      1
What I want to do now is apply a check condition, i.e.:
if quality == "usefulness-useful":
    good = good + 1
else:
    bad = bad + 1
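A minimal sketch of that check applied to a whole group at once, assuming Python 3 and a lowercase `quality` column as in the printed output. The sample frame is hypothetical and only mirrors the rows shown above. Comparing the column to the label gives a boolean mask, and summing the mask counts the matching rows.

```python
import pandas as pd

# Hypothetical sample mirroring the printed group (column names assumed).
group = pd.DataFrame({
    "tag": ["black fabric", "blond wood", "blond wood", "eames?",
            "eames?", "rocking chair", "rocking chair"],
    "quality": ["usefulness-useful", "usefulness-useful",
                "problematic-misspelling", "usefulness-not_useful",
                "problematic-misperception", "usefulness-useful",
                "problematic-misperception"],
})

# Boolean mask: True where the row counts as "good".
is_good = group["quality"] == "usefulness-useful"

good = int(is_good.sum())     # each True counts as 1
bad = int((~is_good).sum())   # every other quality value

print(good, bad)
```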
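I tried to store this output:
counts of Quality Values:
usefulness-useful            3
problematic-misperception    2
usefulness-not_useful        1
problematic-misspelling      1
in a variable and then iterate over that variable line by line, but it did not work. Can anyone suggest how I can run a calculation over particular rows?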
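One possible approach, sketched under assumptions: `value_counts()` returns a pandas Series rather than text, so it can be kept in a variable and walked with `.items()`, which yields (label, count) pairs. The data, the user names, and the `totals` dict below are hypothetical, and the code is written for Python 3.

```python
import pandas as pd

# Hypothetical data; column names follow the printed output, not the groupby call.
df = pd.DataFrame({
    "user": ["user_1002"] * 4 + ["user_1003"] * 2,
    "cluster": [1, 1, 1, 1, 1, 1],
    "quality": ["usefulness-useful", "problematic-misspelling",
                "usefulness-useful", "usefulness-not_useful",
                "usefulness-useful", "problematic-misperception"],
})

totals = {}  # (cluster, user) -> (good, bad)
for name, group in df.groupby(["cluster", "user"]):
    counts = group["quality"].value_counts()  # Series: label -> count
    good = bad = 0
    for label, n in counts.items():
        if label == "usefulness-useful":
            good += int(n)
        else:
            bad += int(n)
    totals[name] = (good, bad)

print(totals)
```

Storing the Series itself, instead of its printed form, avoids having to parse lines of text.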