Possible Duplicate:
Matching incorrectly spelt words with correct ones in python

I have to interpret an incoming SMS that looks something like these:

SHOP NAME : CITY

Annies pet shop new york

Budds Calerfonia

Kelvins Boat Shop San Fransico

Karel Boom West palm beach

I have a list of cities and a list of shop names to compare the SMS against. If the shop name is there, great; if the city is there, perfect.

Now the thing is, people will misspell these. And because there is no separator such as a comma, how would I know where each field starts and stops?

I have looked at using the Levenshtein function, which returns the closest match in a list. But what if there is no match? Then I have to tell the user: sorry, nothing matches your SMS.

How would you go about doing that? Bear in mind that each SMS campaign might have a different number of parameters.

3 Answers

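1) I don't think there is a way to fix every error. You need to decide which kinds of errors you want to fix and what format the data is allowed to take. Don't make it too fuzzy: with very fuzzy matching you may end up accepting garbage as valid, and it becomes hard to understand the decision path and fix mistakes.

2) There are several approaches to fuzzy matching. I suggest you look at this next: https://stackoverflow.com/questions/682367/good-python-modules-for-fuzzy-string-comparison

3) Replace all whitespace, newlines and extra characters with a single space. That will make it easier to tokenize your text.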
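As a rough illustration of points 2) and 3), here is a minimal sketch using only the standard library's difflib. It normalizes the SMS, tries every possible split between shop and city, and keeps the best-scoring pair. The SHOPS and CITIES lists, the function names and the 0.7 cutoff are assumptions made for this example, not part of the question; the cutoff in particular would need tuning on real messages.

```python
import re
from difflib import SequenceMatcher

# Hypothetical reference lists; in practice these come from the campaign data.
SHOPS = ["Annies Pet Shop", "Budds", "Kelvins Boat Shop", "Karel Boom"]
CITIES = ["New York", "California", "San Francisco", "West Palm Beach"]


def normalize(text):
    """Lowercase and collapse whitespace/punctuation runs into single spaces."""
    return re.sub(r"[^a-z0-9]+", " ", text.lower()).strip()


def best_match(fragment, candidates, cutoff=0.7):
    """Return (candidate, score) for the most similar candidate.

    Returns (None, 0.0) when nothing reaches the cutoff, so the caller
    can tell "no match" apart from a weak closest match.
    """
    scored = [
        (SequenceMatcher(None, normalize(c), fragment).ratio(), c)
        for c in candidates
    ]
    score, candidate = max(scored, default=(0.0, None))
    if candidate is None or score < cutoff:
        return None, 0.0
    return candidate, score


def parse_sms(sms, shops=SHOPS, cities=CITIES, cutoff=0.7):
    """Try every split point; keep the shop/city pair with the best total score."""
    words = normalize(sms).split()
    best = (None, None, 0.0)
    for i in range(1, len(words)):
        shop, shop_score = best_match(" ".join(words[:i]), shops, cutoff)
        city, city_score = best_match(" ".join(words[i:]), cities, cutoff)
        total = shop_score + city_score
        if total > best[2]:
            best = (shop, city, total)
    return best[0], best[1]


print(parse_sms("Kelvins Boat Shop San Fransico"))
print(parse_sms("Hello world"))
```

Either element of the returned pair can be None, which is the signal to reply that the shop or city was not recognized. A lower cutoff tolerates more typos but also accepts more garbage, which is the trade-off described in point 1).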

answered 2012-07-20T12:40:14.063

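If there is no match, you can either check the SMS manually or automatically send back an SMS saying that the shop/city was not recognized. If you recognize one of the two, you can add rules to guess the other parameter. For example, if the city is recognized, check whether there is only one shop in that city and fill it in automatically. I would also suggest adding some kind of separator between the attributes, for example a comma: SHOP, CITY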
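The "guess the other parameter" rule could look something like the sketch below. The SHOPS_BY_CITY mapping, the resolve function and the reply text are all hypothetical names chosen for illustration; the inputs are assumed to be an already matched shop and city, with None standing for "not recognized".

```python
# Hypothetical mapping from each known city to the shops located there.
SHOPS_BY_CITY = {
    "New York": ["Annies Pet Shop"],
    "San Francisco": ["Kelvins Boat Shop", "Budds"],
}

NOT_RECOGNIZED = "Sorry, we could not recognize the shop or city in your SMS."


def resolve(shop, city, shops_by_city=SHOPS_BY_CITY):
    """Return (shop, city, reply); reply is None when the SMS is accepted."""
    if shop is None and city is not None:
        local_shops = shops_by_city.get(city, [])
        # Only guess when the city leaves exactly one possibility.
        if len(local_shops) == 1:
            shop = local_shops[0]
    if shop is None or city is None:
        return shop, city, NOT_RECOGNIZED
    return shop, city, None


print(resolve(None, "New York"))
print(resolve(None, "San Francisco"))
```

With more than one shop in the city the function does not guess and returns the rejection reply instead, so an ambiguous SMS is never silently assigned to the wrong shop.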

answered 2012-07-20T12:37:49.163

If the incoming SMS has a \n after each line, you can split it on that.
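A minimal sketch of that split, assuming each field really does arrive on its own line (split_fields is a hypothetical helper name):

```python
def split_fields(sms):
    """Split an SMS into its non-empty lines, trimming surrounding whitespace."""
    return [line.strip() for line in sms.splitlines() if line.strip()]


print(split_fields("Annies pet shop\nnew york\n"))
```

If the result has only one element, the sender did not use line breaks and the message has to be handled some other way.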

answered 2012-07-20T12:23:41.240