Questions tagged [isin]


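0 votes
2 answers
425 views

python - How do I check whether a dataframe column contains a string from another dataframe's column and return the adjacent cell in pandas?

I have two dataframes: one contains a column of strings that I need to classify (df = Data), and the other contains the possible categories and their search terms (df = Categories). I want to add a column to the "Data" dataframe that returns a category based on the search term. For example:

Data:

Categories:

Desired resulting Data:

I tried the following lambda function with apply. I'm not sure my column references are in the right place:

But I keep getting the error message:

This gives me True/False for whether a category exists based on SearchTerm, but I can't return the category associated with the search term:

Both of these work some of the time, but not always (perhaps because some of my search terms are more than one word?).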
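One point worth keeping in mind for this question: `Series.isin` tests whole-value equality, so it cannot tell whether a search term occurs inside a longer string. A minimal sketch of a substring-based lookup follows. The frames and the column names `Text`, `SearchTerm` and `Category` are assumptions made for illustration, since the original sample data is not shown.

```python
import re

import pandas as pd

# Hypothetical sample frames; the column names are assumptions.
data = pd.DataFrame({"Text": ["red apple pie", "fresh orange juice", "plain water"]})
categories = pd.DataFrame({
    "SearchTerm": ["apple", "orange juice"],
    "Category": ["Fruit", "Drink"],
})

# Build one regex from all search terms. Terms are escaped so multi-word
# terms are matched literally, and longer terms come first so they win
# over shorter overlapping ones.
terms = sorted(categories["SearchTerm"], key=len, reverse=True)
pattern = "(" + "|".join(re.escape(t) for t in terms) + ")"

# Extract the first matching term per row, then map term -> category.
term_to_category = dict(zip(categories["SearchTerm"], categories["Category"]))
data["Category"] = data["Text"].str.extract(pattern, expand=False).map(term_to_category)

print(data)
```

Rows without any matching term end up with a missing value in `Category`, which makes unmatched strings easy to spot.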

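0 votes
2 answers
1273 views

python - pandas isin() method returns rows with NaN values

I have a dataframe df_data and a list l_ids. df_data.head() looks like this:

[image]

and l_ids[:5] is [224960004, 60032008, 26677001, 162213003, 72405004].

I want to get the rows whose l_id is present in the list l_ids.

So I do this: df_temp = df_data[df_data.isin(l_ids)]

However, df_temp has rows containing NaN. In fact, the text field is NaN for every row. df_temp.head() looks like this:

[image]

Cross-check:

As we can see, l_ids[0] is 224960004, which is present in df_temp, but it is now a float and the corresponding text is NaN. The same goes for 79823003 and the other ids.

Why does this happen? I ran into the same error in the past, but I got the rows some other way and ignored it. Now it has happened again in an unrelated project, and I feel I am making some kind of mistake here.

Additional information

df_data.info() returns:

df_temp.info() returns:

So the dtype of the l_id field changed from int64 to float64.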
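The symptoms described here are consistent with how `DataFrame.isin` behaves: it returns a boolean frame of the same shape, and indexing a frame with a boolean frame keeps every row and writes NaN wherever the mask is False (which also upcasts an integer column to float). A small sketch with made-up data, assuming the column names `l_id` and `text` from the question:

```python
import pandas as pd

# Made-up data; only the column names come from the question.
df_data = pd.DataFrame({
    "l_id": [224960004, 60032008, 111],
    "text": ["a", "b", "c"],
})
l_ids = [224960004, 60032008]

# DataFrame.isin -> boolean DataFrame. Indexing with it masks cell by cell:
# non-matching cells become NaN, so "text" is all NaN and "l_id" turns float.
masked = df_data[df_data.isin(l_ids)]

# Series.isin -> one boolean per row, which selects whole rows instead.
df_temp = df_data[df_data["l_id"].isin(l_ids)]

print(masked)
print(df_temp)
```

Selecting on the single column keeps the original dtypes, because no cell has to be replaced by NaN.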

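0 votes
0 answers
95 views

python - Removing a dataframe subset from another dataframe does not work correctly with pandas merge and the isin operator

I want to remove a subset of the data from the original dataframe.

An example of what I want to achieve:

Expected output

I referred to other threads, such as "How to remove a subset of a data frame in Python?", but they did not work for me.

Approach 1

I dropped an extra column from the subset and used pandas merge with indicator=True

The output I get has more rows than expected, which suggests the removal did not happen correctly.

Approach 2: I also tried the 'isin' operator

With this technique the row count stays the same; that is, no rows are removed.

Can someone help me? Much appreciated! Thanks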
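For reference, a minimal sketch of the merge-with-indicator approach (an "anti-join") on made-up frames. The column names `id` and `val` are assumptions. One possible cause of "more rows than expected" is duplicate rows in the subset, which a left merge multiplies, so the sketch drops duplicates first. Whether that applies to the asker's data cannot be told from the excerpt.

```python
import pandas as pd

# Made-up frames; column names are assumptions.
df = pd.DataFrame({"id": [1, 2, 3, 4], "val": ["a", "b", "c", "d"]})
subset = pd.DataFrame({"id": [2, 4, 4], "val": ["b", "d", "d"]})

# Left merge on all shared columns; "_merge" says where each row came from.
merged = df.merge(
    subset.drop_duplicates(),  # duplicates in the subset would multiply rows
    on=["id", "val"],
    how="left",
    indicator=True,
)

# Keep only rows that exist in df but not in the subset.
remaining = merged.loc[merged["_merge"] == "left_only"].drop(columns="_merge")

# Single-key alternative with isin, negated with ~.
remaining_isin = df[~df["id"].isin(subset["id"])]

print(remaining)
```

Both variants assume the key columns have the same dtype in the two frames; a string key on one side and an integer key on the other would match nothing.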

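0 votes
1 answer
30 views

pandas - Looping through two CSVs in Python/pandas: the same code works for only one of the files

I wrote code that picks the company-number field from a "company partners" file, compares it against a file listing the companies of a specific state, and writes the result to a third file (end result: get the partners of all companies in that state). The code is simple:

The problem is that I have two versions of the "empES.csv" file. They have different numbers of columns, but both have the field "cnpj" as the first column, which is the only field I need. When I run the code with the version 1 file, it works perfectly. But when I try to open version 2, my output file fills up with nothing but headers. Lots of header lines!

Here are some snippets of the first lines:

  1. Partners file (socios.csv, the file from which I copy the matching rows):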

''' "cnpj","tipo_socio","nome_socio","cnpj_cpf_socio","cod_qualificacao","perc_capital","data_entrada","cod_pais_ext","nome_pais_ext","cpf_repres","nome_repres","cod_qualif_repres"
"00000000000191","2","MARCIO HAMILTON FERREIRA","* 923641 ","10",0.0,"20101117","","","","","00"
"00000000000191","2","NILSON MARTINIANO MOREIRA","* 491386 ","10",0.0,"20101117","","","","","00"
"00000000002135","2","DEBORA CRISTINA FONSECA","* 314628 ","08",0.0,"20200312","","","","","00"
"00000000002216","2","WALDERY RODRIGUES JUNIOR","* 025913 ","08",0.0,"20200312","","","","","00"
"00000000002216","2","ERIK DA COSTA BREYER","* 093217 ","10",0.0,"20191209","","","","","00"
"00000000002216","2","THOMPSON SOARES PEREIRA CESAR","* 503187 ","10",0.0,"20191209","","","","","00"
"00000000002569","2","WALTER MALIENI JUNIOR","* 718468","10",0.0,"20101117","","","","","00"
"00000000002569","2","NILSON MARTINIANO MOREIRA","* 491386 ","10",0.0,"20101117","","","","","00"
"00000000002640","2","WALDERY RODRIGUES JUNIOR","* 025913 ","08",0.0,"20200312","","","","","00"
'''

  2. Working companies file (empES.csv), from which I read only the "cnpj" field:

''' cnpj,identificador_matriz_filial,razao_social,nome_fantasia,situacao_cadastral,data_situacao_cadastral,motivo_situacao_cadastral,nome_cidade_exterior,codigo_natureza_juridica,data_inicio_atividade,cnae_fiscal,descricao_tipo_logradouro,logradouro,numero,complemento,bairro,cep,uf,codigo_municipio,municipio,ddd_telefone_1,ddd_telefone_2,ddd_fax,qualificacao_do_responsavel,capital_social,porte,opcao_pelo_simples,data_opcao_pelo_simples,data_exclusao_do_simples,opcao_pelo_mei,situacao_especial,data_situacao_especial
2135,2,BANCO DO BRASIL SA,VITORIA - ES,2,2005-11-03,0,,2038,1966-08-01,6421200,PRACA,PIO XII,30,,CENTRO,29010340.0,ES,5705,VITORIA,,,,10,0.0,5,0,,,0,,
8338,2,BANCO DO BRASIL SA,CACHOEIRO DE ITAPEMIRIM-ES-EST UNIF,2,2005-11-03,0,,2038,1966-08-01,6421200,PRACA,JERONIMO MONTEIRO,26,,CENTRO,29300902.0,ES,5623,CACHOEIRO DE ITAPEMIRIM,,,,10,0.0,5,0,,,0,,
11207,2,BANCO DO BRASIL SA,COLATINA-ES-EST.UNIF,2,2005-11-03,0,,2038,1966-08-01,6421200,RUA,EXPED ABILIO DOS SANTOS,124,,CENTRO,29700070.0,ES,5629,COLATINA,,,,10,0.0,5,0,,,0,,
18643,2,BANCO DO BRASIL SA,,2,2005-11-03,0,,2038,1966-08-01,6421200,RUA,PRESIDENTE VARGAS,29,,CENTRO,29400000.0,ES,5667,MIMOSO DO SUL,,,,10,0.0,5,0,,,0,,
19615,2,BANCO DO BRASIL SA,,2,2005-11-03,0,,2038,1982-05-04,6421200,AVENIDA,SENADOR EURICO RESENDE,994,,CENTRO,29845000.0,ES,5619,BOA ESPERANCA,,,,10,0.0,5,0,,,0,,
20974,2,BANCO DO BRASIL SA,SANTA TERESA ES-EST UNIF,2,2005-11-03,0,,2038,1966-08-01,6421200,RUA,JERONIMO VERVLOET,178,,CENTRO,29650000.0,ES,5691,SANTA TERESA,,,,10,0.0,5,0,,,0,,
'''

  3. New companies file (empES.csv), which gives me the strange behavior:

''' cnpj,matriz_filial,razao_social,nome_fantasia,situacao,data_situacao,motivo_situacao,nm_cidade_exterior,cod_pais,nome_pais,cod_nat_juridica,data_inicio_ativ,cnae_fiscal,tipo_logradouro,logradouro,numero,complemento,bairioro,cep_unicipio,cod_m1,cipioro,cep_unicipio,cod_m, ,ddd_2,telefone_2,ddd_fax,num_fax,email,qualif_resp,capital_social,porte,opc_simples,data_opc_simples,data_exc_simples,opc_mei,sit_especial,data_sit_especial
2135,2,BANCO DO BRASIL SA,VITORIA - ES,2,20051103,0,,,,2038,19660801,6421200,PRACA,PIO XII,30,,CENTRO,29010340.0,ES,5705,VITORIA,,,,,,,AGE0021@BB.COM.BR,10,0.0,5,0,,,,,
8338,2,BANCO DO BRASIL SA,CACHOEIRO DE ITAPEMIRIM-ES-EST UNIF,2,20051103,0,,,,2038,19660801,6421200,PRACA,JERONIMO MONTEIRO,26,,CENTRO,29300902.0,ES,5623,CACHOEIRO DE ITAPEMIRIM,,,,,,,,10,0.0,5,0,,,,,
11207,2,BANCO DO BRASIL SA,COLATINA-ES-EST.UNIF,2,20051103,0,,,,2038,19660801,6421200,RUA,EXPED ABILIO DOS SANTOS,124,,CENTRO,29700070.0,ES,5629,COLATINA,,,,,,,,10,0.0,5,0,,,,,
18643,2,BANCO DO BRASIL SA,,2,20051103,0,,,,2038,19660801,6421200,RUA,PRESIDENTE VARGAS,29,,CENTRO,29400000.0,ES,5667,MIMOSO DO SUL,,,,,,,,10,0.0,5,0,,,,,,
19615,2,BANCO DO BRASIL SA,,2,20051103,0,,,,2038,19820504,6421200,AVENIDA,SENADOR EURICO RESENDE,994,,CENTRO,29845000.0,ES,5619,BOA ESPERANCA,,,,,,,,10,0.0,5,0,,,,,
20974,2,BANCO DO BRASIL SA,SANTA TERESA ES-EST UNIF,2,20051103,0,,,,2038,19660801,6421200,RUA,JERONIMO VERVLOET,178,,CENTRO,29650000.0,ES,5691,SANTA TERESA,,,,,,,,10,0.0,5,0,,,,,
22241,2,BANCO DO BRASIL SA,SAO MATEUS ES EST UNIF,2,20051103,0,,,,2038,19660801,6421200,AVENIDA,JONES DOS SANTOS NEVES,324,,CENTRO,29930010.0,ES,5697,SAO MATEUS,,,,,,,,10,0.0,5,0,,,,,
28100,2,BANCO DO BRASIL SA,,2,20051103,0,,,,2038,19660801,6421200,AVENIDA,JERONIMO MONTEIRO,38/46,,CENTRO,29500000.0,ES,5603,ALEGRE,,,,,,,,10,0.0,5,0,,,,,
37001,2,BANCO DO BRASIL SA,,2,20051103,0,,,,2038,19660801,6421200,RUA,DEMERVAL AMARAL,35,,CENTRO,29560000.0,ES,5645,GUACUI,,,,,,,,10,0.0,5,0,,,,,
'''

Here is a sample of the output when I pass the first empES.csv file:

''' cnpj,tipo_socio,nome_socio,cnpj_cpf_socio,cod_qualificacao,perc_capital,data_entrada,cod_pais_ext,nome_pais_ext,cpf_repres,nome_repres,cod_qualif_repres
2135,2,WALDERY RODRIGUES JUNIOR,* 025913 ,8,0.0,20200312,,,,,0
2135,2,ERIK DA COSTA BREYER,* 093217 ,10,0.0,20191209,,,,,0
2135,2,THOMPSON SOARES PEREIRA CESAR,* 503187 ,10,0.0,20191209,,,,,0
2135,2,MAURICIO NOGUEIRA,* 894537 ,10,0.0,20191209,,,,,0
2135,2,DANIEL ANDRE STIELER,* 145110 ,10,0.0,20190910,,,,,0
2135,2,ENIO MATHIAS FERREIRA,* 078106 ,10,0.0,20181107,,,,,0
2135,2,RONALDO SIMON FERREIRA,* 685018 ,10,0.0,20190729,,,,,0
2135,2,IVANDRE MONTIEL DA SILVA,* 975660 ,10,0.0,20190403,,,,,0
2135,2,FABIO AUGUSTO CANTIZANI BARBOSA,* 379967 ,10,0.0,20190403,,,,,0
2135,2,CARLOS MOTTA DOS SANTOS,* 876287 ,10,0.0,20190403,,,,,0
2135,2,CAMILO BUZZI,* 569178 ,10,0.0,20190403,,,,,0
'''

When I try the other "empES.csv" file, this is what happens:

''' cnpj,tipo_socio,nome_socio,cnpj_cpf_socio,cod_qualificacao,perc_capital,data_entrada,cod_pais_ext,nome_pais_ext,cpf_repres,nome_repres,cod_qualif_repres
cnpj,tipo_socio,nome_socio,cnpj_cpf_socio,cod_qualificacao,perc_capital,data_entrada,cod_pais_ext,nome_pais_ext,cpf_repres,nome_repres,cod_qualif_repres
cnpj,tipo_socio,nome_socio,cnpj_cpf_socio,cod_qualificacao,perc_capital,data_entrada,cod_pais_ext,nome_pais_ext,cpf_repres,nome_repres,cod_qualif_repres
cnpj,tipo_socio,nome_socio,cnpj_cpf_socio,cod_qualificacao,perc_capital,data_entrada,cod_pais_ext,nome_pais_ext,cpf_repres,nome_repres,cod_qualif_repres
cnpj,tipo_socio,nome_socio,cnpj_cpf_socio,cod_qualificacao,perc_capital,data_entrada,cod_pais_ext,nome_pais_ext,cpf_repres,nome_repres,cod_qualif_repres
cnpj,tipo_socio,nome_socio,cnpj_cpf_socio,cod_qualificacao,perc_capital,data_entrada,cod_pais_ext,nome_pais_ext,cpf_repres,nome_repres,cod_qualif_repres
cnpj,tipo_socio,nome_socio,cnpj_cpf_socio,cod_qualificacao,perc_capital,data_entrada,cod_pais_ext,nome_pais_ext,cpf_repres,nome_repres'',cod

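...and it goes on like this forever.

I don't know why the first file works fine with the code and why the second one gives that output, as if .isin were not iterating in this case!

Any ideas?

PS: all the data provided here is public-domain data from the Brazilian government.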
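The asker's code is not shown, so the cause of the repeated headers cannot be diagnosed from this excerpt. One thing the snippets do show is that socios.csv stores "cnpj" as quoted, zero-padded strings while empES.csv stores bare numbers, so a membership test is only reliable once both sides are normalised. The sketch below is one possible way to do the lookup in a single pass and write the header once. The in-memory CSV text and the reduced column set are illustrative assumptions.

```python
import io

import pandas as pd

# In-memory stand-ins for socios.csv and empES.csv (reduced columns).
socios_csv = io.StringIO(
    '"cnpj","nome_socio"\n'
    '"00000000000191","MARCIO HAMILTON FERREIRA"\n'
    '"00000000002135","DEBORA CRISTINA FONSECA"\n'
)
emp_csv = io.StringIO(
    "cnpj,razao_social\n"
    "2135,BANCO DO BRASIL SA\n"
    "8338,BANCO DO BRASIL SA\n"
)

# Read only the key column, as text, and pad it to 14 digits.
wanted = pd.read_csv(emp_csv, usecols=["cnpj"], dtype=str)["cnpj"].str.zfill(14)

# Read the partners as text too, so leading zeros survive.
socios = pd.read_csv(socios_csv, dtype=str)

# One vectorised membership test instead of a row-by-row loop.
matches = socios[socios["cnpj"].str.zfill(14).isin(wanted)]

# Writing the whole result once emits the header exactly once.
out = io.StringIO()
matches.to_csv(out, index=False)
print(out.getvalue())
```

With real files, the `io.StringIO` objects would be replaced by file paths. If the partners file is too large for memory, `read_csv(..., chunksize=...)` can be combined with `to_csv(..., mode="a", header=False)` for every chunk after the first.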

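0 votes
3 answers
48 views

python - How to use pandas to get matching words together with their counts

I have two dataframes, like

I want to compare these two dataframes without a for loop and get output like

How can I get this output? Any examples? Here, I only want the df2 items that appear in df1.

Print only the items shown in df3. I also need the counts (if that is not possible in pandas, no problem; a plain list like df3 is fine too). I used the merge function, but it offers inner, outer, left and right joins, so I don't know which method is better.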
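A minimal sketch of one loop-free way to get the matching items with their counts, combining `isin` with `value_counts`. The frames and the column name `word` are assumptions, since the original sample data is not shown.

```python
import pandas as pd

# Made-up frames; the column name "word" is an assumption.
df1 = pd.DataFrame({"word": ["apple", "banana", "apple", "cherry", "apple", "banana"]})
df2 = pd.DataFrame({"word": ["apple", "banana", "mango"]})

# Keep only the df1 entries that also occur in df2 ...
matched = df1.loc[df1["word"].isin(df2["word"]), "word"]

# ... and count how often each one occurs.
df3 = matched.value_counts().rename_axis("word").reset_index(name="count")

print(df3)
```

An inner merge would also find the common items, but it repeats rows for every pairing of duplicates, so counting after an `isin` filter tends to be easier to reason about.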

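0 votes
2 answers
138 views

java - Using the isin() function with Spark / Java

I have the following two dataframes.

I want the following output

I use the following code

Note that the idZones column is an array[int]

I get this error

I need your help

Thanks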

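0 votes
0 answers
288 views

python - Weird behavior of the pandas `isin()` function with pd.date_range and datetime types

I was trying to use isin() to filter a datetime column in my df, and found the following weird behavior:

Let's define a dataframe with unique date values:

We set a date range of 2 days:

The dates are expected to be within the range:

However, if we use isin(), weird things happen!

Investigating further, I found even stranger behavior:

It returns False only when period = 1, 2 and 4 days! I strongly suspect this is a bug in the pandas library. I am using pandas 1.0.5 with numpy 1.19.0.

By the way, we can work around it by using date_range.date

*Related:
isin-function-does-not-work-for-dates.
Issue 5021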
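The question's code is not shown, so the sketch below does not reproduce the reported behavior. It only illustrates the general idea behind the `date_range.date` workaround mentioned above: make both sides of `isin` the same kind of object. The sample dates and the assumption that the column holds `datetime.date` objects are illustrative.

```python
import datetime as dt

import pandas as pd

# Assumed setup: a column of plain datetime.date objects.
df = pd.DataFrame({"date": [dt.date(2020, 7, 1), dt.date(2020, 7, 2), dt.date(2020, 7, 5)]})
rng = pd.date_range("2020-07-01", periods=2, freq="D")

# Option 1: convert the column to datetime64 so both sides are timestamps.
mask_ts = pd.to_datetime(df["date"]).isin(rng)

# Option 2: convert the range to plain dates so both sides are date objects.
mask_date = df["date"].isin(rng.date)

print(df[mask_ts])
print(df[mask_date])
```

For a contiguous range, `Series.between(start, end)` on a datetime64 column is another option that avoids building the full list of dates.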

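0 votes
1 answer
60 views

python - isin function skips correct values in Python

I am working with the JSON response from a GET API request. I have figured out how to flatten the response, and I want to filter the resulting dataframe down to the records that contain a PDF file extension, which I will use to retrieve the files of interest. Here is the code:

Even though there should be positive results, the whole df comes back as NaN. [image]

I tried filtering the index with the same expression, but the result is an empty df.

Why isn't isin working here?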
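Because `isin` checks for exact equality with the listed values, it will not match a file name merely because that name ends in a given extension. A minimal sketch using the string accessor instead; the column name `file_name` and the sample values are assumptions, since the original code is not shown.

```python
import pandas as pd

# Made-up data; the column name "file_name" is an assumption.
df = pd.DataFrame({"file_name": ["report.pdf", "image.png", "notes.PDF", None]})

# Case-insensitive suffix test; missing values count as "not a PDF".
is_pdf = df["file_name"].str.lower().str.endswith(".pdf", na=False)

# A boolean Series selects whole rows, so no NaN-filled frame is produced.
pdf_rows = df[is_pdf]

print(pdf_rows)
```

Note that the mask is a Series. Indexing with a boolean DataFrame (for example the result of `df.isin(...)`) masks individual cells and fills the rest with NaN, which would match the all-NaN result described in the question.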

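0 votes
1 answer
363 views

pandas - pandas isin() fails between dataframes with object-dtype columns

Why can't "a" be found in the DataFrame when it can be found in the list?

PS: using pandas 1.0.5

In [4]: pd.__version__
Out[4]: '1.0.5'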
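The pandas documentation for `DataFrame.isin` states that when the argument is itself a DataFrame, a cell only counts as a match if both the index label and the column label line up. Passing a plain list drops that alignment and tests values only. A small sketch with made-up frames (the column names `x` and `y` are assumptions):

```python
import pandas as pd

df = pd.DataFrame({"x": ["a", "b"]})
other = pd.DataFrame({"y": ["a", "c"]})

# DataFrame argument: labels must match. Column "x" has no counterpart
# in `other`, so nothing matches even though "a" occurs in both frames.
by_frame = df.isin(other)

# List argument: pure value membership, no label alignment.
by_list = df.isin(other["y"].tolist())

print(by_frame)
print(by_list)
```

Converting the lookup frame's column to a list (or passing `other["y"].values`) is a common way to get value-only semantics.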

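0 votes
1 answer
334 views

python - Passing both a string and a list to the pandas .isin method

I am trying to pass both strings and a list to the pandas .isin() method. Here is my code below

The problem is that .isin([]) works fine for each iteration over a string, but when I get to general_months[-1], which is a list, you can't pass a list into the .isin([]) syntax. I tried this, but I can't remove the double quotes, since my understanding is that strings are immutable:

This produces: "'APR', 'JUL', 'NOV', 'MAR', 'FEB', 'AUG', 'SEP', 'OCT', 'JAN', 'DEC', 'MAY', 'JUN'" when it should be: 'APR', 'JUL', 'NOV', 'MAR', 'FEB', 'AUG', 'SEP', 'OCT', 'JAN', 'DEC', 'MAY', 'JUN'

Any help on the best way to accomplish this?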
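Rather than building a quoted string, one option is to normalise every item to a list before calling `.isin`: wrap a lone string in a list and pass an existing list through unchanged. A minimal sketch; the frame, the column name `month` and the shortened `general_months` are illustrative assumptions.

```python
import pandas as pd

# Made-up data; only the name general_months comes from the question.
df = pd.DataFrame({"month": ["JAN", "FEB", "APR", "JUL"]})
general_months = ["JAN", "FEB", ["APR", "JUL"]]

results = []
for item in general_months:
    # A single string becomes a one-element list; a list is used as is.
    values = [item] if isinstance(item, str) else list(item)
    subset = df[df["month"].isin(values)]
    results.append(subset["month"].tolist())

print(results)
```

The `isinstance(item, str)` check matters because a bare string is itself iterable, so `list("JAN")` would split it into single characters.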