Questions tagged [pandas-apply]

149 questions

0 votes

2 answers

2203 views

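python-3.x - Creating columns with strings in Pandas using .apply()

I have a DataFrame df.

One of its columns is named Adress and contains a string.

I wrote a function processing(string) that takes a string as its argument and returns part of that string.

I successfully applied the function to df and created a new column in df:

I then modified processing(string) so that it returns two strings, and I would like the second returned string to be stored in another new column. To do this, I tried to follow the steps given in: Create multiple pandas DataFrame columns from applying a function with multiple returns

Here is an example of my function processing(string):

I also tried returning the two strings in a tuple.

Here are the different ways I tried to apply the function to my df:

I think I am close, but I can't work out how to get there.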
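One possible way to store both return values, sketched under assumptions: the body of processing below is hypothetical (it simply splits the address at the first comma), and the new column names street and city are invented for illustration. Only df, Adress, and processing come from the question.

```python
import pandas as pd


def processing(string):
    # Hypothetical logic: split "street, city" at the first comma
    # and return both parts as a tuple of two strings.
    street, _, city = string.partition(",")
    return street.strip(), city.strip()


df = pd.DataFrame({"Adress": ["1 Main St, Springfield", "22 Oak Ave, Shelbyville"]})

# apply() yields a Series of 2-tuples; zip(*...) transposes it into
# two sequences, one per new column.
df["street"], df["city"] = zip(*df["Adress"].apply(processing))
```

Unpacking with zip keeps a single pass over the column, so processing is called once per row rather than once per output column.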

2020-03-23T13:41:56.093

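0 votes

1 answer

23 views

python - How to access the previous value when using DataFrame.apply in Python?

Here is my code

I want to multiply my roll no by the previous rollno, like this

How can I do that?

I know I can do something like this

But the logic above does not guarantee that the result corresponds to multiplication by the previous row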
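A minimal sketch of one way to reach the previous row without apply, assuming the column is called rollno (the original code is not shown, so the data and the product column are made up for illustration):

```python
import pandas as pd

# Hypothetical data; the column name "rollno" is an assumption.
df = pd.DataFrame({"rollno": [1, 2, 3, 4]})

# shift(1) moves every value down one row, so each row lines up with
# the value from the row directly above it. The first row has no
# predecessor and therefore becomes NaN.
df["product"] = df["rollno"] * df["rollno"].shift(1)
```

Because shift works on row position, the pairing with the previous row is explicit and does not depend on the order in which apply visits rows.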

python python-3.x pandas dataframe pandas-apply

2020-04-02T15:56:19.380

0 votes

2 answers

131 views

python - How to enter parameters into a function when using pandas apply

First time posting here - have decided to try and learn how to use python whilst on Covid-19 forced holidays.

I'm trying to summarise some data from a pretty simple database and have been using the value_counts function.

Rather than running it on every column individually, I'd like to loop it over each one and return a summary table. I can do this using df.apply(pd.value_counts) but can't work out how to pass parameters to value_counts, as I want to have dropna=False.

Basic example of data I have:

How I was doing the value counts for each column:

How can I pass dropna=False when using the apply function? I like the table it outputs below but want the NaN to appear in the list.

Any help would be appreciated!!
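A sketch of two ways to pass the parameter, using made-up data (the column names a and b are assumptions). DataFrame.apply forwards extra keyword arguments to the function it calls, and a lambda works as well:

```python
import numpy as np
import pandas as pd

# Hypothetical data with missing values in both columns.
df = pd.DataFrame({"a": ["x", "y", np.nan], "b": ["x", np.nan, np.nan]})

# Option 1: wrap the call in a lambda and set dropna there.
summary = df.apply(lambda col: col.value_counts(dropna=False))

# Option 2: let apply() forward the keyword argument to the function.
summary_kw = df.apply(pd.Series.value_counts, dropna=False)
```

Both variants should give one row per distinct value, including a NaN row, with NaN in cells where a value never occurs in that column.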

python pandas pandas-apply

2020-04-03T22:59:03.173

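0 votes

1 answer

732 views

pandas - Pandas: merging two time-series DataFrames based on time windows (cut/bin/merge)

I have a df with 750k rows, 15 columns, and a pd.Timestamp index called ts. I process real-time data, down to millisecond resolution, in near real time.

Now I want to apply some statistics derived from a higher time resolution, df_stats, as new columns to the big df. df_stats has a time resolution of 1 minute.

Currently I have the code below, but it is inefficient because it has to loop over the complete data.

I am wondering whether there is an easier solution using pd.cut, bin, or pd.Grouper? Or something else to merge the time buckets on the two indexes?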
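One loop-free approach, sketched with invented data: floor each millisecond timestamp to its minute and look that minute up in df_stats. The column names price and mean_1min are hypothetical, and the sketch assumes df_stats is indexed by the start of each minute with unique labels.

```python
import pandas as pd

# Hypothetical millisecond-resolution data indexed by "ts".
df = pd.DataFrame(
    {"price": [1.0, 2.0, 3.0]},
    index=pd.DatetimeIndex(
        ["2020-04-24 00:00:00.250", "2020-04-24 00:00:59.900", "2020-04-24 00:01:30.000"],
        name="ts",
    ),
)

# Hypothetical 1-minute statistics, indexed by the start of each minute.
df_stats = pd.DataFrame(
    {"mean_1min": [10.0, 20.0]},
    index=pd.DatetimeIndex(["2020-04-24 00:00:00", "2020-04-24 00:01:00"], name="ts"),
)

# Floor every timestamp to its minute bucket, then fetch the matching
# stats row for each bucket in one vectorized reindex.
minute_bucket = df.index.floor("min")
df["mean_1min"] = df_stats["mean_1min"].reindex(minute_bucket).to_numpy()
```

pd.merge_asof may also be worth checking when the statistics rows do not sit exactly on minute boundaries, since it matches each row with the nearest earlier key.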

pandas dataframe merge pandas-groupby pandas-apply

2020-04-24T00:43:45.127

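0 votes

2 answers

610 views

python-3.x - Parallelizing fastText.get_sentence_vector with dask gives a pickling error

I was trying to use dask's parallelization to get fastText sentence embeddings for 80 million English tweets, as described in this answer: How do you parallelize apply() on Pandas Dataframes using all cores on one machine?

Here is my full code:

Here is the get_fasttext_sentence_embedding function:

However, I get a pickling error on this line:

This is the error I get:

Is there a way to parallelize the fastText model's get_sentence_vector with dask (or anything else)? I need to parallelize because getting sentence embeddings for 80 million tweets takes too much time, and each row of my data frame is completely independent of the others.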

python-3.x pickle dask fasttext pandas-apply

2020-04-25T15:03:57.770

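0 votes

2 answers

52 views

python - Concatenating rows in pandas based on a condition

I am trying to concatenate rows that do not start with a specific character ('[') onto the nearest row that does start with it. I have read the txt file as follows:

Starting df:

I would like to get

Ending df: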
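A sketch of one grouping approach, with assumptions stated up front: the sample rows and the column name text are invented, and "nearest" is taken to mean the closest preceding row that starts with '['.

```python
import pandas as pd

# Hypothetical starting data; the real file contents are not shown.
df = pd.DataFrame(
    {"text": ["[10:01] hello", "world", "[10:02] second", "line two", "line three"]}
)

# Every row starting with "[" opens a new group; the running sum of
# that boolean flag gives each following row the same group id.
group_id = df["text"].str.startswith("[").cumsum()

# Join the rows of each group into a single string.
result = df.groupby(group_id)["text"].agg(" ".join).reset_index(drop=True).to_frame()
```

Rows that appear before the first '[' row would land in group 0 on their own, which may or may not be the desired behaviour.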

python pandas pandas-apply

2020-05-06T02:33:54.587

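0 votes

2 answers

43 views

python - Match all words of a string within another string (words can be in different positions)

I have a list of strings that must be matched against a DataFrame column.

The list looks like this:

The column in the DataFrame looks like this:

I want to find every row that contains every word of a list entry, so that I end up with the following DataFrame:

My attempt to find matches was to supply each string in the list as a list of elements (such as ['street', 'view', 'wcdma']) and search:

But it gives me nothing, even though I know there must be at least one match. If I change all() to any(), it returns something, but that is not what I need.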
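A sketch of an order-independent match based on set inclusion. The list contents, the column name name, and the helper find_match are all hypothetical, since the original data and code are not shown; the sketch also assumes words are separated by whitespace and compared case-sensitively.

```python
import pandas as pd

# Hypothetical inputs standing in for the list and the column.
phrases = ["street view wcdma", "indoor lte"]
df = pd.DataFrame({"name": ["wcdma street view site", "view street", "lte indoor cell"]})


def find_match(text):
    # Return the first phrase whose words ALL occur in text,
    # regardless of their order; None when nothing matches.
    words_in_text = set(text.split())
    for phrase in phrases:
        if set(phrase.split()) <= words_in_text:
            return phrase
    return None


df["match"] = df["name"].apply(find_match)

# Keep only the rows where some phrase matched completely.
matched = df[df["match"].notna()]
```

A common reason all() finds nothing is that it ends up testing against the whole column rather than a single cell, so checking one row at a time inside apply can help isolate the problem.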

python pandas dataframe pandas-apply

2020-05-12T16:40:22.717

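0 votes

1 answer

67 views

python - Why does this list comprehension only work inside df.apply?

I am trying to remove stopwords from my data. So it would go from this

to this

I have two options for doing this. I am trying the two options separately, so nothing gets overwritten. First, I apply a function to the data column. This works; it removes what I want removed.

The second option does not use the apply function. It is exactly the same as the function, so why does it return TypeError: unhashable type: 'list'? I checked, and the cause is if row not in stopwords in that line, because when I remove it the code runs, but it does not remove the stopwords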
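A sketch that reproduces the likely cause, assuming each cell holds a list of words and stopwords is a set (both assumptions, as are the column names). Inside apply the function receives one cell, so the comprehension iterates over words; looping over the column directly iterates over cells, so each row is a whole list and the set lookup tries to hash it.

```python
import pandas as pd

# Hypothetical data: each cell is a list of tokens.
stopwords = {"the", "a", "is"}
df = pd.DataFrame({"tokens": [["the", "cat", "is", "here"], ["a", "dog"]]})

# Works: apply passes one cell (a list of words) to the function.
df["clean_apply"] = df["tokens"].apply(
    lambda words: [w for w in words if w not in stopwords]
)

# Fails: here "row" is an entire list, and lists cannot be hashed
# for a set membership test.
try:
    [row for row in df["tokens"] if row not in stopwords]
    error = None
except TypeError as exc:
    error = str(exc)

# Fix: add an inner comprehension so the test runs per word.
df["clean_loop"] = pd.Series(
    [[w for w in row if w not in stopwords] for row in df["tokens"]],
    index=df.index,
)
```

With the inner loop in place, both options filter word by word and should produce the same column.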

python pandas for-loop list-comprehension pandas-apply

2020-05-20T10:50:23.520

0 votes

1 answer

70 views

python - Groupby Apply Quantile Replacement

I'm trying to use pandas groupby, apply, where and quantile to replace values that fall below the 50% quantile with NaN, per 'date' group. However, it seems to be returning lists in the cells. How can I get these results in a new column after the column 'value'?

This is the code I have (any other approaches are welcome). It returns lists in cells:

If I create a new column it returns NaN in new column:

I would like to get to this:
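One alternative approach, sketched with made-up data: transform returns one value per original row, so the per-group quantile lines up with value and can be compared directly. The column name value_filtered is hypothetical; date and value come from the question.

```python
import pandas as pd

# Hypothetical data with two date groups.
df = pd.DataFrame(
    {
        "date": ["2020-05-01"] * 4 + ["2020-05-02"] * 4,
        "value": [1, 2, 3, 4, 10, 20, 30, 40],
    }
)

# transform broadcasts each group's 50% quantile back to every row,
# keeping the original index (unlike apply, which returns per-group objects).
q50 = df.groupby("date")["value"].transform(lambda s: s.quantile(0.5))

# where() keeps values at or above the quantile and writes NaN elsewhere.
df["value_filtered"] = df["value"].where(df["value"] >= q50)
```

Because the result shares the index of df, assigning it as a new column does not produce the all-NaN column described above.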

python pandas pandas-groupby quantile pandas-apply

2020-05-22T17:58:34.533

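0 votes

1 answer

74 views

pandas - Pandas.Series.mode ends up with multiple rows in the result. How to fix it?

I have this df:

I want to get the mode for rows that have the same nome_socio and cnpj_cpf_socio. To do so, I use the following code:

It does find the mode, but because there is a three-way tie among the municipios for the Alexandre+AAA rows, it returns three different rows. I get this result:

I need it to look like this:

Is there a way to do this?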
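The desired output is not shown, so this is only a sketch under the assumption that one row per group is wanted, with tied modes joined into a single string. The data and the column name municipio are invented; nome_socio and cnpj_cpf_socio come from the question.

```python
import pandas as pd

# Hypothetical data: Alexandre+AAA has a three-way tie, Bruno+BBB does not.
df = pd.DataFrame(
    {
        "nome_socio": ["Alexandre"] * 3 + ["Bruno"] * 3,
        "cnpj_cpf_socio": ["AAA"] * 3 + ["BBB"] * 3,
        "municipio": ["Rio", "Recife", "Natal", "Recife", "Recife", "Natal"],
    }
)

# mode() can return several values on a tie; joining them inside agg()
# collapses each group to exactly one row.
result = (
    df.groupby(["nome_socio", "cnpj_cpf_socio"])["municipio"]
    .agg(lambda s: ", ".join(s.mode()))
    .reset_index()
)
```

If a single winner is preferred instead, replacing the join with s.mode().iloc[0] picks the first of the tied values.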

pandas pandas-groupby pandas-apply

2020-05-30T23:05:22.707
