python-2.7 - Filter rows based on True values in a column - python pandas DataFrame

Question

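I am working with a pandas DataFrame. I want to get a new DataFrame by applying a condition to a column of an existing DataFrame. Here is the DataFrame: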

users_df
Out[30]: 
<class 'pandas.core.frame.DataFrame'>
Index: 3595 entries,
Data columns (total 9 columns):
screen_name        3595  non-null values
User_Desc          3595  non-null values
lang               3595  non-null values
followers_count    3579  non-null values
friends_count      3580  non-null values
listed_count       2665  non-null values
statuses_count     3595  non-null values
stem_key_flag      3595  non-null values
stem_keys          3595  non-null values
dtypes: bool(1), float64(3), int64(1), object(4)

What I am doing is:

en_users_df = users_df[users_df['stem_key_flag']==True]

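But the result I get is exactly the same as the output in the top code block, which means nothing is being filtered. Am I doing something that worked in earlier versions but no longer does? If not, what mistake am I making?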

I also tried a similar approach on another column with an int data type, and it works fine.

fol_cnt_users_df = users_df[users_df['followers_count'] >1000]

In [35]: fol_cnt_users_df
Out[35]: 
<class 'pandas.core.frame.DataFrame'>
Index: 724 entries, 2013-06-20, 12:13:46 to 2013-06-19, 18:26:48
Data columns (total 9 columns):
screen_name        724  non-null values
User_Desc          724  non-null values
lang               724  non-null values
followers_count    724  non-null values
friends_count      722  non-null values
listed_count       714  non-null values
statuses_count     724  non-null values
stem_key_flag      724  non-null values
stem_keys          724  non-null values
dtypes: bool(1), float64(3), int64(1), object(4)

Thanks in advance for your help.

score 1 · Accepted Answer

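Your problem may be a version issue (I assume you are using 0.10 or 0.11). I tested your code, and if the stem_key_flag column contains any False values, it should return a different DataFrame. However, since this thread has become moderately popular, I want to state for future visitors that your filtering line (shown below) is correct: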

en_users_df = users_df[users_df['stem_key_flag']==True]

That said, you get the same result with a simpler line, such as

en_users_df = users_df[users_df.stem_key_flag]
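
For anyone who wants to check this on their own data, here is a minimal sketch. It assumes stem_key_flag has bool dtype, as in the question's dtypes line. The small users_df built here is a made-up stand-in, not the asker's data. It first counts the False values (if there are none, an unfiltered-looking result is expected) and then compares the two filtering forms.

```python
import pandas as pd

# Hypothetical stand-in for users_df; the values are invented for illustration.
users_df = pd.DataFrame(
    {
        "screen_name": ["a", "b", "c", "d"],
        "stem_key_flag": [True, False, True, False],
    }
)

# Count the False values in the flag column. If this is 0, every row
# passes the filter and the result has the same rows as the original.
n_false = int((~users_df["stem_key_flag"]).sum())
print("rows with stem_key_flag == False:", n_false)

# The explicit comparison and the bare boolean mask select the same rows.
explicit = users_df[users_df["stem_key_flag"] == True]  # noqa: E712
short = users_df[users_df["stem_key_flag"]]

print("same result:", explicit.equals(short))
print("rows before:", len(users_df), "rows after:", len(short))
```

The shorter form relies on the column being a real boolean column. If the flags were stored as strings such as "True" in an object column, neither form would behave as intended, so checking users_df.dtypes first is worthwhile.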
