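I have a data frame with two columns. The first holds the correct strings and the second holds corrupted versions of them. I want to apply the Jaro-Winkler distance to each pair and store the result in a new third column.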

import pandas as pd
from pyjarowinkler.distance import get_jaro_distance

df = pd.DataFrame(
    {"Correct": ['Hello', 'bread', 'situation'],
     "Corrupt": ['Hlloe', 'braed', 'sitatuion']},
    index=[1, 2, 3])

1 Answer

df['res'] = [get_jaro_distance(x, y) for x, y in zip(df['Correct'], df['Corrupt'])]
     Correct    Corrupt   res
1      Hello      Hlloe  0.88
2      bread      braed  0.95
3  situation  sitatuion  0.97
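
If it helps to see what is being computed per row, here is a minimal pure-Python sketch of the textbook Jaro-Winkler similarity. It is not the pyjarowinkler implementation: the function names `jaro` and `jaro_winkler`, the prefix cap of 4 and the scaling factor of 0.1 are assumptions based on the standard definition, and libraries differ in details such as how they halve the transposition count, so a score may differ slightly from the library's.

```python
def jaro(s1, s2):
    """Textbook Jaro similarity in [0, 1] (illustrative, not pyjarowinkler's code)."""
    if s1 == s2:
        return 1.0
    if not s1 or not s2:
        return 0.0
    # Characters count as matching only within this distance of each other.
    window = max(max(len(s1), len(s2)) // 2 - 1, 0)
    matched1 = [False] * len(s1)
    matched2 = [False] * len(s2)
    matches = 0
    for i, ch in enumerate(s1):
        lo = max(0, i - window)
        hi = min(len(s2), i + window + 1)
        for j in range(lo, hi):
            if not matched2[j] and s2[j] == ch:
                matched1[i] = matched2[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0
    # Count matched characters that appear in a different order.
    out_of_order = 0
    k = 0
    for i, ch in enumerate(s1):
        if not matched1[i]:
            continue
        while not matched2[k]:
            k += 1
        if ch != s2[k]:
            out_of_order += 1
        k += 1
    t = out_of_order / 2  # assumption: no flooring of the half-count
    return (matches / len(s1) + matches / len(s2) + (matches - t) / matches) / 3


def jaro_winkler(s1, s2, scaling=0.1, max_prefix=4):
    """Boost the Jaro score for strings that share a common prefix."""
    score = jaro(s1, s2)
    prefix = 0
    for a, b in zip(s1, s2):
        if a != b or prefix == max_prefix:
            break
        prefix += 1
    return score + prefix * scaling * (1 - score)


correct = ['Hello', 'bread', 'situation']
corrupt = ['Hlloe', 'braed', 'sitatuion']
# Same row-wise pattern as the list comprehension above, on plain lists.
scores = [round(jaro_winkler(x, y), 2) for x, y in zip(correct, corrupt)]
print(scores)
```

The resulting list has one score per row, so it can be assigned to a new column in the same way as the comprehension above.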

Answered 2019-08-05T14:14:21.637