python - Replace whole strings in a CSV

Question

When I run this code to edit my CSV file, only part of a string gets replaced, even though the full string is a key in my dictionary.

import re

def replace_all(text, dic):
    for i, j in dic.items():  # items() exists on both Python 2 and 3
        text = text.replace(i, j)
    return text

bottle = "vial jug canteen urn jug33"
transport = "car automobile airplane scooter"

mydict = {}
for word in bottle.split():
    mydict[word] = 'bottle'
for word in transport.split():
    mydict[word] = 'transport'
print(mydict) # test


with open('replacesample.csv','r') as f:
    text=f.read()
    text=replace_all(text,mydict)
    text=re.sub(r'PROD\s(?=[1-9])',r'PROD',text)

with open('file2.csv','w') as w:
    w.write(text)

For example, if my CSV contains these strings:

jug 
canteen 
urn
car
automobile
swag
airplane
jug33

my final result is:

bottle 
bottle 
bottle
transport
transport
swag
transport
bottle33

How can I fix this?

Expected:

bottle 
bottle 
bottle
transport
transport
swag
transport
bottle

score 0 · Accepted Answer

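You are using a dictionary to enumerate the replacement patterns. A dictionary never orders its keys by length: before Python 3.7 the iteration order is arbitrary, and from 3.7 on it is insertion order.

As a result, the jug -> bottle replacement can run before the jug33 -> bottle replacement. str.replace also matches inside longer words, so the jug in jug33 is rewritten to bottle33 and the jug33 key no longer finds anything to replace.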
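To make the ordering problem concrete, here is a minimal sketch that applies the two replacements by hand in the unlucky order. The variable names step1 and step2 are only for illustration and are not part of the original code.

```python
text = 'jug33 jug'

# Shorter key first: 'jug' also matches the start of 'jug33'.
step1 = text.replace('jug', 'bottle')
print(step1)  # bottle33 bottle

# The longer key no longer occurs in the text, so nothing changes.
step2 = step1.replace('jug33', 'bottle')
print(step2)  # bottle33 bottle
```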

The solution is to sort the keys by length in descending order, so that longer matches are replaced first:

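def replace_all(text, dic):
    # Longest keys first, so 'jug33' is replaced before 'jug'.
    # items() works on both Python 2 and 3 (iteritems() is Python 2 only).
    for i, j in sorted(dic.items(), key=lambda item: len(item[0]), reverse=True):
        text = text.replace(i, j)
    return text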

Demo:

>>> def replace_all(text, dic):
...     for i, j in dic.items():
...         text = text.replace(i, j)
...     return text
... 
>>> replace_all('jug33 jug', mydict)
'bottle33 bottle'
>>> def replace_all(text, dic):
...     for i, j in sorted(dic.items(), key=lambda item: len(item[0]), reverse=True):
...         text = text.replace(i, j)
...     return text
... 
>>> replace_all('jug33 jug', mydict)
'bottle bottle'
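Sorting by length still lets str.replace match inside unrelated words (car inside carpet, for instance). If only whole words should be replaced, one possible alternative, which is not the approach above but a sketch using the standard re module, is a single regular expression with word boundaries. The function name replace_whole_words and the sample mapping are hypothetical, and the sketch assumes a non-empty mapping whose keys start and end with word characters.

```python
import re

def replace_whole_words(text, mapping):
    # Longest keys first, so 'jug33' is tried before 'jug' in the alternation.
    keys = sorted(mapping, key=len, reverse=True)
    # re.escape guards against keys that contain regex metacharacters.
    pattern = re.compile(r'\b(?:' + '|'.join(re.escape(k) for k in keys) + r')\b')
    # The text is scanned once; already replaced text is never rescanned.
    return pattern.sub(lambda m: mapping[m.group(0)], text)

sample = {'jug': 'bottle', 'jug33': 'bottle', 'car': 'transport'}
print(replace_whole_words('jug33 jug carpet car', sample))
# bottle bottle carpet transport
```

Because every match is looked up in the mapping exactly once, a replacement value that happens to contain another key is not replaced a second time.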
