Hi, I have a table in pandas (the screenshot shows an extract; the full table has many more rows). I want to pull out the unique 'author_id' values and then run a function that fetches the details associated with each ID.

I extract the list of unique ids by:

unique_ids = df['author_id'].unique()

Then I attempt to run:

df['author_id'].unique().apply(some_function)

Where 'some_function' takes the 'author_id' and returns some info. But I get the error:

AttributeError: 'numpy.ndarray' object has no attribute 'apply'

So I am resorting to:

[some_function(author_id) for author_id in unique_ids]

This works, but it isn't an efficient/vectorised way of doing it.

Is there a vectorised way to do this?

Thanks in advance!

2 Answers

I think you want a groupby:

g = df.groupby('author_id')

g.apply(some_function)
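
Note that apply on a groupby hands some_function the rows belonging to each author, not the bare id. A minimal sketch, assuming a hypothetical 'title' column and a stand-in some_function that simply counts rows per author (neither comes from the question):

```python
import pandas as pd

# Hypothetical sample data; the 'title' column is an assumption for illustration.
df = pd.DataFrame({
    "author_id": [10, 10, 20, 30, 30, 30],
    "title": ["a", "b", "c", "d", "e", "f"],
})

def some_function(group):
    # Hypothetical stand-in: receives the 'title' values for one author as a Series.
    return len(group)

# Selecting a column first means some_function gets one Series per author_id.
per_author = df.groupby("author_id")["title"].apply(some_function)
print(per_author)
```

The result is a Series indexed by author_id, so each id appears exactly once without calling unique() first.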
answered 2013-06-05T10:34:10.770

The output of the unique function is a numpy array, which does not provide an apply method. You can create a Series from that array and then apply your function:

pd.Series(df['author_id'].unique()).apply(some_function)
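
As a rough sketch of how that fits together (the sample data and the body of some_function are made up for illustration), you can also key the results by id and map them back onto the original rows. Keep in mind that Series.apply still calls the Python function once per element, so it is tidier than the list comprehension but not truly vectorised:

```python
import pandas as pd

# Hypothetical sample data standing in for the table in the question.
df = pd.DataFrame({"author_id": [10, 10, 20, 30, 30, 30]})

def some_function(author_id):
    # Hypothetical stand-in for the real per-author lookup.
    return f"author-{author_id}"

# unique() returns a numpy array; wrapping it in a Series gives access to apply.
unique_ids = pd.Series(df["author_id"].unique())
details = unique_ids.apply(some_function)

# Index the results by id so they can be mapped back onto every row.
details.index = unique_ids
df["details"] = df["author_id"].map(details)
print(df)
```

The function runs once per unique id rather than once per row, which is usually where the real saving comes from when ids repeat often.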
answered 2013-06-05T10:47:18.713