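I have backed up a bunch of Markdown-formatted comments to an XML document. This of course means I needed to HTML-escape them. When I try to use CGI.unescapeHTML, it adds a bunch of strange characters to the markup that do not render well in every browser.
Specifically, it replaces two spaces with "\302\240", but not consistently. How do I make it stop doing this?
For example: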
s = "I am seeing more and more <a href="http://github.com/aslakhellesoy/cucumber /tree/master">Cucumber</a> usage. This is a good thing! But I'm also seeing people who are not using regular expressions to their fullest. Here are some quick regex tips to keep you features readable:

* `(?:a|an)` -- using a this construct you can group things wihout actually matching them. I'm seeing a lot of steps that have unused params because someone needed a group but didn't know how to avoid capturing it
"
CGI.unescapeHTML s
# => "I am seeing more and more <a href=\"http://github.com/aslakhellesoy/cucumber/tree/master\">Cucumber</a> usage.\302\240 This is a good thing!\302\240 But I'm..."
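
For reference, \302\240 is the octal form of the bytes 0xC2 0xA0, which is the UTF-8 encoding of U+00A0 (non-breaking space). The following is only a minimal sketch of one way to get plain spaces back. It assumes the escaped text stored in the XML contains the numeric entity `&#160;`, and that swapping those characters for ordinary spaces after unescaping is acceptable. The sample string and the variable names are hypothetical, not the real data.

```ruby
require "cgi"

# Hypothetical escaped input: assumes the XML holds "&#160;" where the
# second of two spaces used to be. This is not the original data.
escaped = "usage.&#160; This is a good thing!&#160; But I&#39;m also seeing &lt;b&gt;people&lt;/b&gt;"

# CGI.unescapeHTML decodes numeric entities as well as &lt; &gt; &amp; &quot;,
# so "&#160;" becomes the single character U+00A0.
unescaped = CGI.unescapeHTML(escaped)

# U+00A0 is two bytes in UTF-8: 0xC2 0xA0, printed as \302\240 in octal.
nbsp = "\u00A0"

# Replace each non-breaking space with an ordinary space.
cleaned = unescaped.gsub(nbsp, " ")

puts cleaned
```

If the non-breaking spaces should survive in the output, the alternative is to leave the string alone and make sure the page is served and declared as UTF-8, so browsers decode the two bytes as one character.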