
I have a column named y_ocsvm, filled with 1 and -1, in a df named step1.

I use step1['y_ocsvm'].value_counts() to get the counts of 1 and -1, and the output is:

step1['y_ocsvm'].value_counts()
Out[11]: 
 1    1622
-1     426
Name: y_ocsvm, dtype: int64

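I want to find the ratio of the number of -1s to the number of 1s. I could simply compute 426/1622, but I have to do this for many DataFrames whose counts will differ, so working it out by hand each time is impractical.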

Since value_counts() can only be applied to a pandas Series, I tried this:

pd.Series([step1['y_ocsvm'] == -1]).value_counts()

But I get the following error:

pd.Series([step1['y_ocsvm'] == -1]).value_counts()
Traceback (most recent call last):

  File "<ipython-input-13-59f772263a54>", line 1, in <module>
    pd.Series([step1['y_ocsvm'] == -1]).value_counts()

  File "C:\Users\kashy\Anaconda3\envs\py36\lib\site-packages\pandas\core\base.py", line 1303, in value_counts
    normalize=normalize, bins=bins, dropna=dropna)

  File "C:\Users\kashy\Anaconda3\envs\py36\lib\site-packages\pandas\core\algorithms.py", line 705, in value_counts
    keys, counts = _value_counts_arraylike(values, dropna)

  File "C:\Users\kashy\Anaconda3\envs\py36\lib\site-packages\pandas\core\algorithms.py", line 750, in _value_counts_arraylike
    keys, counts = f(values, dropna)

  File "pandas/_libs/hashtable_func_helper.pxi", line 348, in pandas._libs.hashtable.value_count_object

  File "pandas/_libs/hashtable_func_helper.pxi", line 359, in pandas._libs.hashtable.value_count_object

  File "C:\Users\kashy\Anaconda3\envs\py36\lib\site-packages\pandas\core\generic.py", line 1816, in __hash__
    ' hashed'.format(self.__class__.__name__))

SystemError: <built-in method format of str object at 0x00000203B7063AC0> returned a result with an error set

How can I do this with pandas?


2 Answers


The Series constructor is not needed here, because step1['y_ocsvm'] == -1 already returns a Series of boolean values:

out = (step1['y_ocsvm'] == -1).value_counts()

The ratio can then be computed with:

print (out[True] / out[False])
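
As a rough sketch of the same idea, the boolean mask can also be summed directly, since True counts as 1. The step1 frame below is a hypothetical stand-in built only to match the counts shown in the question, not the asker's real data:

```python
import pandas as pd

# Hypothetical stand-in for step1: 1622 rows of 1 and 426 rows of -1,
# matching the value_counts() output in the question.
step1 = pd.DataFrame({"y_ocsvm": [1] * 1622 + [-1] * 426})

# Comparing a column to a scalar already yields a boolean Series,
# so no pd.Series(...) wrapper is needed.
mask = step1["y_ocsvm"] == -1

n_neg = mask.sum()        # number of rows equal to -1
n_pos = (~mask).sum()     # number of remaining rows (all 1 in this data)

ratio = n_neg / n_pos     # count of -1 divided by count of 1
print(ratio)
```

This assumes the column holds only 1 and -1, as stated in the question; with other values present, ~mask would count those too.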
answered 2019-05-11T10:19:23.593

You can also do:

step1['y_ocsvm'].value_counts()[-1] / step1['y_ocsvm'].value_counts()[1]
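
Because the question mentions repeating this for many DataFrames, the one-liner could be wrapped in a small helper that calls value_counts() only once. The function name count_ratio and its parameters are hypothetical, chosen for illustration, and the sample frame is made up to match the counts in the question:

```python
import pandas as pd

def count_ratio(df, column="y_ocsvm", numerator=-1, denominator=1):
    """Return count(numerator) / count(denominator) for one column.

    Hypothetical helper: a missing value is treated as a count of 0,
    and NaN is returned when the denominator value never occurs.
    """
    counts = df[column].value_counts()   # computed once, reused below
    num = counts.get(numerator, 0)       # .get avoids a KeyError if absent
    den = counts.get(denominator, 0)
    return num / den if den else float("nan")

# Made-up frame matching the counts shown in the question.
step1 = pd.DataFrame({"y_ocsvm": [1] * 1622 + [-1] * 426})
print(count_ratio(step1))
```

The same function can then be applied to each DataFrame in turn, for example in a loop or a dict comprehension over the frames.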
answered 2019-05-11T10:32:17.460