Here's what I am suggesting. First, do a full Cartesian join of the two DataFrames:
df1.loc[:, 'MergeKey'] = 1  # create a merge key
df2.loc[:, 'MergeKey'] = 1  # same value in both, so the merge yields the Cartesian product
# merge them to get the Cartesian product (all possible combos)
merged = df1.merge(df2, on='MergeKey', suffixes=['_1', '_2'])
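As a side note, pandas 1.2 and later can build the same Cartesian product directly with how="cross", so the helper key column is not needed. A minimal sketch, assuming a recent pandas and using made-up sample frames (the data and the name merged_cross are only for illustration):

```python
import pandas as pd

# Hypothetical sample data, only for illustration.
df1 = pd.DataFrame({"Billing Country": ["France", "Germany"]})
df2 = pd.DataFrame({"Billing Country": ["France", "Spain", "Germny"]})

# how="cross" pairs every row of df1 with every row of df2,
# so no constant MergeKey column is required (pandas >= 1.2).
merged_cross = df1.merge(df2, how="cross", suffixes=("_1", "_2"))

print(merged_cross)
```

The overlapping column still gets the _1 and _2 suffixes, so the rest of the approach works unchanged.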
Then, calculate the fuzz ratio for each combo:
from fuzzywuzzy import fuzz  # assumed source of fuzz; thefuzz exposes the same name

def fuzzratio(row):
    try:  # avoid errors on, for example, NaN values
        return fuzz.ratio(row['Billing Country_1'], row['Billing Country_2'])
    except Exception:
        return 0  # you'll want to experiment without the try/except too

# create the Ratio column by applying the function to every row
merged.loc[:, 'Ratio'] = merged.apply(fuzzratio, axis=1)
Now you should have a df with the ratio between all possible combinations of df1['Billing Country'] and df2['Billing Country']. Once there, simply filter to get the ones where the ratio is 100%:
result = merged[merged.Ratio == 100]  # fuzz.ratio scores run from 0 to 100, not 0 to 1
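To try the whole pipeline without installing a fuzzy-matching package, here is a rough end-to-end sketch. It swaps in the standard library's difflib.SequenceMatcher as a stand-in for fuzz.ratio, so the scores may not be identical to what your fuzz library gives. The sample data and the names similarity, pairs and exact are made up for illustration.

```python
from difflib import SequenceMatcher

import pandas as pd


def similarity(a, b):
    """Return a 0-100 similarity score; non-string input (e.g. NaN) scores 0."""
    if not isinstance(a, str) or not isinstance(b, str):
        return 0
    return round(100 * SequenceMatcher(None, a, b).ratio())


# Hypothetical sample data, only for illustration.
df1 = pd.DataFrame({"Billing Country": ["France", "Germany", None]})
df2 = pd.DataFrame({"Billing Country": ["France", "Spain"]})

# Constant-key trick to build the Cartesian product; assign() returns
# copies, so df1 and df2 themselves are left unmodified.
pairs = df1.assign(MergeKey=1).merge(
    df2.assign(MergeKey=1), on="MergeKey", suffixes=("_1", "_2")
)

# Score every pair of country names.
pairs["Ratio"] = [
    similarity(a, b)
    for a, b in zip(pairs["Billing Country_1"], pairs["Billing Country_2"])
]

# Keep only the perfect matches.
exact = pairs[pairs["Ratio"] == 100]
print(exact)
```

Checking the types up front instead of using try/except makes it explicit which inputs are being scored as 0. If you want near matches as well as exact ones, compare against a lower cutoff (for example pairs["Ratio"] >= 90) and tune it on your own data.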