python - Minor problem with whitespace/punctuation in Python?

Question

I have this function that converts text-speak into English:

def translate(string):
    textDict={'y':'why', 'r':'are', "l8":'late', 'u':'you', 'gtg':'got to go',
        'lol': 'laugh out loud', 'ur': 'your',}
    translatestring = ''
    for word in string.split(' '):
        if word in textDict:
            translatestring = translatestring + textDict[word]
        else:
            translatestring = translatestring + word
    return translatestring

However, if I try to translate y u l8? it returns whyyoul8?. How do I separate the words when I return them, and how do I handle punctuation? Any help is appreciated!

score 2 · Accepted Answer

A one-liner comprehension (requires import re):

''.join(textDict.get(word, word) for word in re.findall(r'\w+|\W+', string))

[Edit] Fixed regex.
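
For reference, a self-contained way to run that one-liner might look like the sketch below. The wrapper name translate_tokens and the module-level TEXT_DICT are assumptions made for illustration; the dictionary contents are copied from the question.

```python
import re

# Dictionary copied from the question.
TEXT_DICT = {'y': 'why', 'r': 'are', 'l8': 'late', 'u': 'you',
             'gtg': 'got to go', 'lol': 'laugh out loud', 'ur': 'your'}

def translate_tokens(text):
    # Hypothetical wrapper name. findall returns runs of word characters
    # (\w+) and runs of non-word characters (\W+), so no input is dropped.
    tokens = re.findall(r'\w+|\W+', text)
    # Tokens found in the dictionary are replaced; all others pass through.
    return ''.join(TEXT_DICT.get(token, token) for token in tokens)

print(translate_tokens('y u l8?'))  # why you late?
```

Because the separators are kept as tokens of their own, joining with an empty string restores the original spacing and punctuation.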

score 0 · Accepted Answer

You could just add a + ' ' + to insert a space. However, I think what you are trying to do is this:

import re

def translate_string(text):
    textDict={'y':'why', 'r':'are', "l8":'late', 'u':'you', 'gtg':'got to go',  'lol': 'laugh out loud', 'ur': 'your',}
    translatestring = ''
    # The capturing group makes re.split keep the separators (spaces, punctuation).
    for word in re.split(r'(\W+)', text):
        if word in textDict:
            translatestring += textDict[word]
        else:
            translatestring += word

    return translatestring


print(translate_string('y u l8?'))

This prints:

why you late?

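This code handles things like question marks more gracefully and preserves the spaces and other characters of the input string, while keeping your original intent.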
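
As a possible variation on the same idea (not part of the answer above), re.sub with a callback can replace only the word runs and leave every other character where it is. The names translate_sub, replace and TEXT_DICT are assumptions for illustration.

```python
import re

# Dictionary copied from the question.
TEXT_DICT = {'y': 'why', 'r': 'are', 'l8': 'late', 'u': 'you',
             'gtg': 'got to go', 'lol': 'laugh out loud', 'ur': 'your'}

def translate_sub(text):
    # Hypothetical helper: called once per run of word characters.
    def replace(match):
        word = match.group(0)
        return TEXT_DICT.get(word, word)
    # Text between matches (spaces, punctuation) is left untouched by re.sub.
    return re.sub(r'\w+', replace, text)

print(translate_sub('y u l8?'))  # why you late?
```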

score 0 · Accepted Answer

I would suggest replacing this loop:

for word in string.split(' '):
    if word in textDict:
        translatestring = translatestring + textDict[word]
    else:
        translatestring = translatestring + word

with:

for word in string.split(' '):
    translatestring += textDict.get(word, word)

dict.get(foo, default) looks up foo in the dictionary and uses default if foo is not already defined.
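
A minimal sketch of that lookup behaviour, using a shortened copy of the question's dictionary (the variable names here are illustrative only):

```python
# Shortened copy of the question's dictionary.
text_dict = {'u': 'you', 'l8': 'late'}

known = text_dict.get('u', 'u')            # key exists, so the mapped value is used
unknown = text_dict.get('hello', 'hello')  # key missing, so the default is used

print(known, unknown)  # you hello
```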

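(I'm out of time, so just a brief note for now: when splitting, you can split on punctuation as well as whitespace, save the punctuation or whitespace, and reintroduce it when joining the output string. It's a bit of work, but it will get the job done.)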
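
One way the note above could be sketched without regular expressions is to group consecutive characters by whether they are alphanumeric, translate only the word groups, and keep the separator groups as they are. The function name translate_keep_separators and the use of itertools.groupby are assumptions for illustration, not what the answer's author necessarily had in mind.

```python
from itertools import groupby

# Dictionary copied from the question.
TEXT_DICT = {'y': 'why', 'r': 'are', 'l8': 'late', 'u': 'you',
             'gtg': 'got to go', 'lol': 'laugh out loud', 'ur': 'your'}

def translate_keep_separators(text):
    pieces = []
    # groupby yields consecutive runs of characters that share the same
    # str.isalnum() result: True for word runs, False for separators.
    for is_word, chars in groupby(text, key=str.isalnum):
        piece = ''.join(chars)
        # Only word runs are looked up; separators are saved unchanged.
        pieces.append(TEXT_DICT.get(piece, piece) if is_word else piece)
    # Joining with '' reintroduces the saved separators in their original places.
    return ''.join(pieces)

print(translate_keep_separators('y u l8?'))  # why you late?
```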

score 0 · Accepted Answer

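You are appending the words to the string without spaces. If you intend to do things this way (rather than the way suggested in your previous question on this topic), you will need to add the spaces back manually, because you split on them.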

score 0 · Accepted Answer

"y u l8" split on " " gives ["y", "u", "l8"]. After substitution you get ["why", "you", "late"], and you concatenate them without adding spaces, so you get "whyyoulate". Both branches of the if should insert a space.

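A minimal sketch of putting the spaces back, assuming the input is separated by single spaces: joining with ' ' has the same effect as inserting a space in both branches, without leaving a trailing space. The names translate_words and TEXT_DICT are illustrative. Punctuation attached to a word is still not handled by this approach.

```python
# Dictionary copied from the question.
TEXT_DICT = {'y': 'why', 'r': 'are', 'l8': 'late', 'u': 'you',
             'gtg': 'got to go', 'lol': 'laugh out loud', 'ur': 'your'}

def translate_words(text):
    # Split on single spaces, translate each word, then rejoin with spaces.
    words = text.split(' ')
    return ' '.join(TEXT_DICT.get(word, word) for word in words)

print(translate_words('y u l8'))   # why you late
print(translate_words('y u l8?'))  # why you l8?  ('l8?' is not a dictionary key)
```
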