python - Function not returning same result on each run

Question

I am trying to replace special characters with HTML entities, but the results vary between runs for the same input and I don't understand why.

Here is the code:

def secure(text):
    hsconvert = {"\'": "\\'", "\"": "\\\"", "¢": "&cent;", "©": "&copy;", "÷": "&divide;", ">": "&gt;", "<": "&lt;", "µ": "&micro;", "·": "&middot;", "¶": "&para;", "±": "&plusmn;", "€": "&euro;", "£": "&pound;", "®": "&reg;", "§": "&sect;", "™": "&trade;", "¥": "&yen;", "á": "&aacute;", "Á": "&Aacute;", "à": "&agrave;", "À": "&Agrave;", "â": "&acirc;", "Â": "&Acirc;", "å": "&aring;", "Å": "&Aring;", "ã": "&atilde;", "Ã": "&Atilde;", "ä": "&auml;", "Ä": "&Auml;", "æ": "&aelig;", "Æ": "&AElig;", "ç": "&ccedil;", "Ç": "&Ccedil;", "é": "&eacute;", "É": "&Eacute;", "è": "&egrave;", "È": "&Egrave;", "ê": "&ecirc;", "Ê": "&Ecirc;", "ë": "&euml;", "Ë": "&Euml;", "í": "&iacute;", "Í": "&Iacute;", "ì": "&igrave;", "Ì": "&Igrave;", "î": "&icirc;", "Î": "&Icirc;", "ï": "&iuml;", "Ï": "&Iuml;", "ñ": "&ntilde;", "Ñ": "&Ntilde;", "ó": "&oacute;", "Ó": "&Oacute;", "ò": "&ograve;", "Ò": "&Ograve;", "ô": "&ocirc;", "Ô": "&Ocirc;", "ø": "&oslash;", "Ø": "&Oslash;", "õ": "&otilde;", "Õ": "&Otilde;", "ö": "&ouml;", "Ö": "&Ouml;", "ß": "&szlig;", "ú": "&uacute;", "Ú": "&Uacute;", "ù": "&ugrave;", "Ù": "&Ugrave;", "û": "&ucirc;", "Û": "&Ucirc;", "ü": "&uuml;", "Ü": "&Uuml;", "ÿ": "&yuml;", "\\":"\\\\"};
    for i, j in hsconvert.items():
        text = text.replace(i, j)
    return text

print(secure("La Vie d'Adèle, chapitres 1 & 2"))

Here are the console outputs:

>>> ================================ RESTART ================================
>>> 
La Vie d\'Ad&egrave;le, chapitres 1 & 2
['TV Movie', 'Video Game', 'TV Episode', 'TV Series', 'TV Series ', 'Short', 'TV Mini-Series']
>>> ================================ RESTART ================================
>>> 
La Vie d\\'Ad&egrave;le, chapitres 1 & 2
['TV Movie', 'Video Game', 'TV Episode', 'TV Series', 'TV Series ', 'Short', 'TV Mini-Series']

The problem is with the ' character, which is sometimes returned as \' and sometimes as \\'.

I think it is coming from the last item in the dictionary, "\\":"\\\\", but I don't understand why it is not interpreted the same way on each run.

score 3 · Accepted Answer

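As you guessed in your answer, the problem is that iteration over a dictionary has no defined order.

From the Python 3 documentation:

Performing list(d.keys()) on a dictionary returns a list of all the keys used in the dictionary, in arbitrary order (if you want it sorted, just use sorted(d.keys()) instead).

It doesn't say so explicitly, but the same applies to items().

I'm a little surprised to see the order change between runs in this case, but arbitrary here means undefined: any order is technically valid. If you want consistent results, I suggest redesigning your algorithm so that it is completely insensitive to the order of the items; failing that, sorting the items first or using an OrderedDict would at least resolve the consistency issue.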
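
One possible way to make the function insensitive to item order is to do all replacements in a single pass, for example with str.translate. The sketch below is only an illustration: the mapping is a shortened, hypothetical subset of the table in the question, and it assumes every key is a single character (which is true of the original table). As a side note, since Python 3.7 a plain dict preserves insertion order, so the varying output may not reproduce on newer interpreters, but a single-pass design avoids the double-escaping problem regardless.

```python
def secure(text):
    # Shortened, hypothetical subset of the question's table;
    # the full mapping would be passed to str.maketrans in the same way.
    table = str.maketrans({
        "\\": "\\\\",
        "'": "\\'",
        '"': '\\"',
        "<": "&lt;",
        ">": "&gt;",
        "è": "&egrave;",
    })
    # translate() looks at each input character exactly once and never
    # rescans its own output, so the order of the entries is irrelevant.
    return text.translate(table)


print(secure("La Vie d'Adèle, chapitres 1 & 2"))
# prints: La Vie d\'Ad&egrave;le, chapitres 1 & 2
```

Because a backslash produced by the quote rule is never fed back into the backslash rule, the result is the same no matter how the mapping is ordered.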

score 0 · Accepted Answer

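Sometimes your code first replaces \\ with \\\\ and then \' with \\'. Sometimes it does it the other way around.

Example (using "\'" as input):

If we first do \\ -> \\\\ and then \' -> \\', we get \' after the first replacement attempt (nothing happens, because there is no \\), and then \\' after the second.

But if we do it the other way around, we get \\' after the first replacement, and then the second replaces \\ with \\\\, so we end up with \\\\'!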
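
The two orderings described above can be reproduced deterministically by passing the replacement pairs as an explicit list. This is a minimal sketch; apply_in_order, backslash and quote are hypothetical names introduced only for this illustration.

```python
def apply_in_order(text, pairs):
    # Apply each (old, new) replacement in exactly the order given.
    for old, new in pairs:
        text = text.replace(old, new)
    return text


backslash = ("\\", "\\\\")  # one backslash becomes two
quote = ("'", "\\'")        # a quote gets a backslash in front of it

print(apply_in_order("'", [backslash, quote]))  # prints \'
print(apply_in_order("'", [quote, backslash]))  # prints \\'
```

The second call doubles the backslash that the quote rule has just inserted, which matches the two different outputs shown in the question.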

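This happens because hsconvert is a dictionary, so it is not ordered, and iterating over it (the for loop) does not necessarily happen the same way every time.

The way you solved it is fine, but for future reference, there is an OrderedDict in the collections module.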
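
As a rough sketch of how OrderedDict could be used here, the backslash rule can be placed first so that it always runs before any rule that introduces new backslashes. The table below is a shortened, hypothetical subset of the one in the question.

```python
from collections import OrderedDict


def secure(text):
    # OrderedDict iterates in insertion order, so the backslash rule
    # always runs before the rules that add backslashes of their own.
    hsconvert = OrderedDict([
        ("\\", "\\\\"),
        ("'", "\\'"),
        ('"', '\\"'),
        ("è", "&egrave;"),
    ])
    for old, new in hsconvert.items():
        text = text.replace(old, new)
    return text


print(secure("La Vie d'Adèle, chapitres 1 & 2"))
# prints: La Vie d\'Ad&egrave;le, chapitres 1 & 2
```

A plain list of (old, new) tuples would work just as well, since the mapping is only ever iterated and never looked up by key.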

score 0 · Accepted Answer

I have modified the function as follows and it is working:

def secure(text):
    # Escape backslashes first; str.replace returns a new string, so assign the result back.
    text = text.replace("\\", "\\\\")
    hsconvert = {"\'": "\\'", "\"": "\\\"", "¢": "&cent;", "©": "&copy;", "÷": "&divide;", ">": "&gt;", "<": "&lt;", "µ": "&micro;", "·": "&middot;", "¶": "&para;", "±": "&plusmn;", "€": "&euro;", "£": "&pound;", "®": "&reg;", "§": "&sect;", "™": "&trade;", "¥": "&yen;", "á": "&aacute;", "Á": "&Aacute;", "à": "&agrave;", "À": "&Agrave;", "â": "&acirc;", "Â": "&Acirc;", "å": "&aring;", "Å": "&Aring;", "ã": "&atilde;", "Ã": "&Atilde;", "ä": "&auml;", "Ä": "&Auml;", "æ": "&aelig;", "Æ": "&AElig;", "ç": "&ccedil;", "Ç": "&Ccedil;", "é": "&eacute;", "É": "&Eacute;", "è": "&egrave;", "È": "&Egrave;", "ê": "&ecirc;", "Ê": "&Ecirc;", "ë": "&euml;", "Ë": "&Euml;", "í": "&iacute;", "Í": "&Iacute;", "ì": "&igrave;", "Ì": "&Igrave;", "î": "&icirc;", "Î": "&Icirc;", "ï": "&iuml;", "Ï": "&Iuml;", "ñ": "&ntilde;", "Ñ": "&Ntilde;", "ó": "&oacute;", "Ó": "&Oacute;", "ò": "&ograve;", "Ò": "&Ograve;", "ô": "&ocirc;", "Ô": "&Ocirc;", "ø": "&oslash;", "Ø": "&Oslash;", "õ": "&otilde;", "Õ": "&Otilde;", "ö": "&ouml;", "Ö": "&Ouml;", "ß": "&szlig;", "ú": "&uacute;", "Ú": "&Uacute;", "ù": "&ugrave;", "Ù": "&Ugrave;", "û": "&ucirc;", "Û": "&Ucirc;", "ü": "&uuml;", "Ü": "&Uuml;", "ÿ": "&yuml;"};
    for i, j in hsconvert.items():
        text = text.replace(i, j)
    return text

But I don't understand why the old function didn't work... Doesn't a for x in ... always go in the same order?
