Try to find the ID strings from df1 in df2, then merge on that column:
key = df2['ID'].str.extract(fr"({'|'.join(df1['ID'].values)})", expand=False)
df1 = df1.merge(df2['Company'], left_on='ID', right_on=key, how='left').fillna('')
print(df1)
# Output:
    Name    ID    Company
0   John   AAA
1  Peter   BAB  Microsoft
2   Paul  CCHF     Google
3  Rosie   R9D
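For anyone who wants to run this end to end, here is a minimal sketch with sample frames reconstructed from the output shown in this answer. The `Company` values on the two non-matching rows of `df2` ("Apple", "Amazon") are made up for illustration, and `out` is just a name chosen here so that `df1` is not overwritten.

```python
import pandas as pd

# Sample data reconstructed from the output in this answer.
# "Apple" and "Amazon" are hypothetical fillers for the rows that do not match.
df1 = pd.DataFrame({"Name": ["John", "Peter", "Paul", "Rosie"],
                    "ID": ["AAA", "BAB", "CCHF", "R9D"]})
df2 = pd.DataFrame({"ID": ["AEDSV", "123BAB", "CCHF-RB", "YYYY"],
                    "Company": ["Apple", "Microsoft", "Google", "Amazon"]})

# Extract whichever df1 ID appears inside each df2 ID (NaN if none does).
key = df2["ID"].str.extract(fr"({'|'.join(df1['ID'].values)})", expand=False)

# Left-merge df1 with the Company column, using the extracted key as the
# right-hand join key, then blank out the rows that found no match.
out = df1.merge(df2["Company"], left_on="ID", right_on=key, how="left").fillna("")
print(out)
```

Because the join is a left merge, every row of `df1` is kept and the rows without a match end up with an empty `Company`.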
Details: build a regex from df1['ID'] to extract the partial string from df2['ID']:
# Regex pattern: try to extract the following pattern
>>> fr"({'|'.join(df1['ID'].values)})"
'(AAA|BAB|CCHF|R9D)'
# After extraction
>>> pd.concat([df2['ID'], key], axis=1)
        ID    ID
0    AEDSV   NaN  # Nothing was found
1   123BAB   BAB  # Found partial string BAB
2  CCHF-RB  CCHF  # Found partial string CCHF
3     YYYY   NaN  # Nothing was found
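One caveat with the pattern above: the IDs are joined into the regex verbatim, so an ID containing a metacharacter such as `+` or `.` would be interpreted as regex syntax, and a short ID listed before a longer one that starts with it would win the alternation. The following is a hedged sketch of a more defensive pattern, not part of the original solution; the `ids` and `targets` series are hypothetical.

```python
import re
import pandas as pd

# Hypothetical IDs: one contains a regex metacharacter, one is a prefix of another.
ids = pd.Series(["AB", "A+B", "ABC"])
targets = pd.Series(["xxA+Byy", "12ABC", "zzz"])

# Try longer IDs first so "ABC" is preferred over "AB",
# and escape each ID so "+" is matched literally.
ordered = sorted(ids, key=len, reverse=True)
pattern = f"({'|'.join(re.escape(i) for i in ordered)})"

key = targets.str.extract(pattern, expand=False)
print(pattern)
print(key)
```

If the IDs are purely alphanumeric and none is a prefix of another, this produces the same matches as the simpler pattern.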
Update:
"To solve this, I was wondering whether it is possible to merge on 2 columns, for example on Name and ID?"
key = df2['ID'].str.extract(fr"({'|'.join(df1['ID'].values)})", expand=False)
df1 = pd.merge(df1, df2[['Name', 'Company']], left_on=['Name', 'ID'],
               right_on=['Name', key], how='left').drop_duplicates().fillna('')
print(df1)
# Output:
      Name      ID Region    Company
0     John     AAA      A  Microsoft
2     John     AAA      B  Microsoft
4      Pat     CCC      C       Dell
6   Sandra     CCC      D
7     Paul      DD      E
8   Sandra     R9D      F
9      Mia    dfg4      G
10     Kim  asfdh5      H
11  Louise    45gh      I
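The gaps in the index above (0, 2, 4, ...) come from duplicate matches that are dropped after the merge. A possible variant, sketched here under assumed data, is to store the extracted key as a regular column and de-duplicate the lookup table before merging, so the result keeps one row per `df1` row. The frames below are a hypothetical subset, and `lookup` and `out` are names introduced only for this sketch.

```python
import pandas as pd

# Hypothetical subset of the data; df2 deliberately contains duplicate matches.
df1 = pd.DataFrame({"Name": ["John", "John", "Pat", "Sandra"],
                    "ID": ["AAA", "AAA", "CCC", "CCC"],
                    "Region": ["A", "B", "C", "D"]})
df2 = pd.DataFrame({"Name": ["John", "John", "Pat"],
                    "ID": ["AAA-1", "AAA-2", "xCCC"],
                    "Company": ["Microsoft", "Microsoft", "Dell"]})

# Build the pattern from the distinct IDs only.
pattern = fr"({'|'.join(df1['ID'].unique())})"

# Replace df2's ID with the extracted key, drop rows with no match,
# and remove duplicate (Name, ID, Company) rows before the merge.
lookup = (df2.assign(ID=df2["ID"].str.extract(pattern, expand=False))
             .dropna(subset=["ID"])
             .drop_duplicates(subset=["Name", "ID", "Company"]))

# Merge on both columns; unmatched rows get an empty Company.
out = df1.merge(lookup, on=["Name", "ID"], how="left").fillna("")
print(out)
```

Note that if the same (Name, ID) pair maps to two different companies in `df2`, this still yields two rows for that pair, so the lookup would need an explicit tie-breaking rule in that case.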