count - pandas/Python: how to count the number of records or rows in a DataFrame

Question

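I'm clearly new to pandas. How can I simply count the number of records in a DataFrame?

I would have thought something as simple as this would do it, and I can't even seem to find the answer in searches... probably because it is too simple.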

cnt = df.count
print cnt

The code above actually just prints the whole df.

score 38 · Accepted Answer

To get the number of rows in a DataFrame, use:

df.shape[0]

(and df.shape[1] to get the number of columns).

As an alternative, you can use

len(df)

or

len(df.index)

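(and len(df.columns) for the columns).

shape is more versatile and more convenient than len(), especially for interactive work (it only needs to be appended at the end), but len is slightly faster (see the timings below).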
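As a minimal sketch of the options above, the following checks that all three row counts agree. The small DataFrame is made up purely for illustration.

```python
import pandas as pd

# Hypothetical example data: 4 rows, 2 columns.
df = pd.DataFrame({"x": [1, 2, 3, 4], "y": ["a", "b", "c", "d"]})

n_rows_shape = df.shape[0]     # rows, from the (rows, columns) tuple
n_rows_len = len(df)           # rows, via len()
n_rows_index = len(df.index)   # rows, via the index
n_cols = len(df.columns)       # columns; same value as df.shape[1]

print(n_rows_shape, n_rows_len, n_rows_index, n_cols)
```

All three row counts should be identical, so the choice between them is mostly about convenience and speed.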

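Avoid count(): it returns the number of non-NA/null observations along the requested axis, not the number of rows.

len(df.index) is the fastest of the three: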

import pandas as pd
import numpy as np

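df = pd.DataFrame(np.arange(24).reshape(8, 3), columns=['A', 'B', 'C'])
df['A'] = df['A'].astype(float)  # NaN needs a float column
df.loc[5, 'A'] = np.nan          # .loc avoids chained assignment
df
# Out:
#       A   B   C
# 0   0.0   1   2
# 1   3.0   4   5
# 2   6.0   7   8
# 3   9.0  10  11
# 4  12.0  13  14
# 5   NaN  16  17
# 6  18.0  19  20
# 7  21.0  22  23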

%timeit df.shape[0]
# 100000 loops, best of 3: 4.22 µs per loop

%timeit len(df)
# 100000 loops, best of 3: 2.26 µs per loop

%timeit len(df.index)
# 1000000 loops, best of 3: 1.46 µs per loop

df.__len__ is just a call to len(df.index):

import inspect 
print(inspect.getsource(pd.DataFrame.__len__))
# Out:
#     def __len__(self):
#         """Returns length of info axis, but here we use the index """
#         return len(self.index)

Why you should not use count():

df.count()
# Out:
# A    7
# B    8
# C    8
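The sketch below ties this back to the question. It assumes a small made-up DataFrame and shows two separate things: df.count without parentheses is only a reference to the method (which is why printing it did not give a number), and df.count() returns one non-NA count per column rather than a single row count.

```python
import numpy as np
import pandas as pd

# Hypothetical data with one missing value in column "A".
df = pd.DataFrame({"A": [1.0, np.nan, 3.0], "B": [4, 5, 6]})

method = df.count        # no parentheses: a bound method, not a number
per_column = df.count()  # Series of non-NA counts, one entry per column
n_rows = len(df)         # number of rows, regardless of NaN

print(callable(method))
print(int(per_column["A"]), int(per_column["B"]))
print(n_rows)
```

Here column "A" reports one fewer than the row count because of the NaN, while len(df) is unaffected.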

score 27 · Accepted Answer

Regarding your question... counting one field? I decided to treat it as a question, but I hope this helps...

Say I have the following DataFrame

import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.normal(0, 1, (5, 2)), columns=["A", "B"])

You can count a single column with

df.A.count()
#or
df['A'].count()

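Both evaluate to 5.

The cool thing (or one of many, where pandas is concerned) is that if you have NA values, count takes them into account.

So if I did

df.loc[1::2, 'A'] = np.nan  # set every second row of column A to NaN
df.count()

The result would be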

 A    3
 B    5
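Because count() skips NA values, the gap between len(df) and count() gives the number of missing entries per column. A minimal sketch, assuming made-up data with two NaN values in column "A":

```python
import numpy as np
import pandas as pd

# Hypothetical data: column "A" has two missing values, "B" has none.
df = pd.DataFrame({
    "A": [0.5, np.nan, 1.5, np.nan, 2.5],
    "B": [1.0, 2.0, 3.0, 4.0, 5.0],
})

non_na = df.count()               # non-NA entries per column
missing = len(df) - non_na        # rows minus non-NA = missing per column
missing_direct = df.isna().sum()  # the same figure, computed directly

print(missing)
print(missing_direct)
```

Both Series should agree; isna().sum() is simply the more direct way to ask for it.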

score 10 · Accepted Answer

Simply put, row_num = df.shape[0] gives the number of rows. Here is an example:

import pandas as pd
import numpy as np

In [322]: df = pd.DataFrame(np.random.randn(5,2), columns=["col_1", "col_2"])

In [323]: df
Out[323]: 
      col_1     col_2
0 -0.894268  1.309041
1 -0.120667 -0.241292
2  0.076168 -1.071099
3  1.387217  0.622877
4 -0.488452  0.317882

In [324]: df.shape
Out[324]: (5, 2)

In [325]: df.shape[0]   ## Gives no. of rows/records
Out[325]: 5

In [326]: df.shape[1]   ## Gives no. of columns
Out[326]: 2

score 2 · Accepted Answer

The NaN example above misses one piece, which makes it less general. To do this more "generally", use df['column_name'].value_counts(). This gives you the count of each value in that column.

import pandas as pd

d = ['A', 'A', 'A', 'B', 'C', 'C', " ", " ", " ", " ", " ", "-1"]  # for simplicity

df = pd.DataFrame(d)
df.columns = ["col1"]
df["col1"].value_counts()
# Out:
#       5
# A     3
# C     2
# -1    1
# B     1
# dtype: int64

len(df) gives 12. value_counts() excludes NaN by default, so any shortfall between its total and len(df) must be NaN values of some form. It also gives you a peek at other invalid entries that you might want to ignore, such as -1, 0 or "".
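To make the NaN bookkeeping explicit, here is a minimal sketch on a made-up Series that does contain missing values. It uses the dropna parameter of value_counts(), which controls whether NaN gets its own row.

```python
import numpy as np
import pandas as pd

# Hypothetical column with two real NaN values plus some "junk" strings.
s = pd.Series(["A", "A", "B", " ", np.nan, np.nan, "-1"], name="col1")

counts = s.value_counts()                  # NaN excluded by default
counts_all = s.value_counts(dropna=False)  # NaN counted as its own entry

# Rows not accounted for by the default value_counts() are the NaN rows.
n_missing = len(s) - int(counts.sum())

print(counts)
print(counts_all)
print(n_missing)
```

With dropna=False the counts add up to len(s), so nothing is silently left out.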

score 0 · Accepted Answer

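A simple way to get the number of records, provided the first column has no missing values (count() only counts non-NA entries):

df.count().iloc[0]

answered 2020-11-02T17:43:30.617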
