I have a pandas DataFrame, df, with the following column names:
columns = ['Baillie Gifford Positive Change Fund B Accumulation',
'Stewart Investors Worldwide Select Fund Class B (accumulation) Gbp',
'Stewart Investors Worldwide Select Fund Class A (accumulation) Gbp',
'Close Ftse Techmark Fund X Acc',
'Stewart Investors Asia Pacific Leaders Fund Class B (accumulation) Gbp',
'Stewart Investors Asia Pacific Leaders Fund Class A (accumulation) Gbp',
'Stewart Investors Worldwide Sustainability Fund Class A (accumulation) Gbp',
'Stewart Investors Worldwide Sustainability Fund Class B (accumulation) Gbp',
'Mi Somerset Emerging Markets Dividend Growth A Accumulation Shares',
'Axa Framlington Biotech Fund Gbp Z Acc',
'Stewart Investors Global Emerging Markets Sustainability Fund Class B (accumulation) Gbp',
'Schroder Asian Income Fund L Accumulation Gbp',
'Fidelity Active Strategy - Fast - Asia Fund Y-acc-gbp',
'Lf Miton Uk Value Opportunities Fund B Institutional Accumulation',
'Liontrust India Fund C Acc Gbp',
'Fidelity Asian Dividend Fund W Acc',
'Stewart Investors Global Emerging Markets Sustainability Fund Class A (accumulation) Gbp',
'Quilter Investors Emerging Markets Equity Growth Fund U2 (gbp) Accumulation',
'Man Glg Continental European Growth Fund Retail Accumulation Shares (class A)',
'Quilter Investors Europe (ex Uk) Equity Growth Fund A (gbp) Accumulation']
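What I want is to filter out similar columns and keep only one of each.
For example, 'Stewart Investors Worldwide Select Fund Class B (accumulation) Gbp' is effectively the same as 'Stewart Investors Worldwide Select Fund Class A (accumulation) Gbp'.
I was thinking that one of the similarity scores used in NLP to identify similar text might help here, but I don't know how to apply it in my case.
The expected result should be a list (which I will use to filter my DataFrame) that contains only one name from each group of similar names. For example: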
columns_filtered = ['Baillie Gifford Positive Change Fund B Accumulation',
'Stewart Investors Worldwide Select Fund Class B (accumulation) Gbp',
'Close Ftse Techmark Fund X Acc',
'Stewart Investors Asia Pacific Leaders Fund Class A (accumulation) Gbp',
'Stewart Investors Worldwide Sustainability Fund Class B (accumulation) Gbp',
'Mi Somerset Emerging Markets Dividend Growth A Accumulation Shares',
'Axa Framlington Biotech Fund Gbp Z Acc',
'Stewart Investors Global Emerging Markets Sustainability Fund Class B (accumulation) Gbp',
'Schroder Asian Income Fund L Accumulation Gbp',
'Fidelity Active Strategy - Fast - Asia Fund Y-acc-gbp',
'Lf Miton Uk Value Opportunities Fund B Institutional Accumulation',
'Liontrust India Fund C Acc Gbp',
'Fidelity Asian Dividend Fund W Acc',
'Stewart Investors Global Emerging Markets Sustainability Fund Class A (accumulation) Gbp',
'Quilter Investors Emerging Markets Equity Growth Fund U2 (gbp) Accumulation',
'Man Glg Continental European Growth Fund Retail Accumulation Shares (class A)',
'Quilter Investors Europe (ex Uk) Equity Growth Fund A (gbp) Accumulation']
Any help?
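For what it's worth, the rough shape of what I have in mind is sketched below, using `difflib.SequenceMatcher` from the standard library to score each pair of names. The function name `drop_similar`, the lowercasing step, and the 0.95 threshold are my own assumptions and not something I have validated against the full list; the threshold in particular would probably need tuning.

```python
from difflib import SequenceMatcher


def drop_similar(names, threshold=0.95):
    """Keep the first name of each group of near-identical names.

    `threshold` is a guessed cut-off on SequenceMatcher.ratio()
    (0.0 = nothing in common, 1.0 = identical strings).
    """
    kept = []
    for name in names:
        # Compare against the names already kept; skip this one if it is
        # at least `threshold` similar to any of them.
        is_duplicate = any(
            SequenceMatcher(None, name.lower(), other.lower()).ratio() >= threshold
            for other in kept
        )
        if not is_duplicate:
            kept.append(name)
    return kept


# Small illustrative subset of the column names above.
example = [
    'Stewart Investors Worldwide Select Fund Class B (accumulation) Gbp',
    'Stewart Investors Worldwide Select Fund Class A (accumulation) Gbp',
    'Close Ftse Techmark Fund X Acc',
]
example_filtered = drop_similar(example)
```

The idea would then be to call `drop_similar(columns)` and select with `df[columns_filtered]`. This is a greedy pass that keeps whichever name appears first, so the result depends on the order of the columns, and it compares every pair, which is fine for a few dozen names but would be slow for thousands.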