asp.net - How do I properly sanitize content with the AntiXss library?

Question

I have a simple forum application. When someone posts any content, I do:

post.Content = Sanitizer.GetSafeHtml(post.Content);

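I'm not sure whether I'm doing something wrong, but it allows almost no HTML at all. Even a simple <b></b> is too much for it, so as far as I can tell the tool is useless for my purpose.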

My question: how should I sanitize user input so that users can still post images (<img> tags), use bold for emphasis, and so on?

score 6 · Accepted Answer

It seems that many people find the Sanitizer of little use. Instead of using it, encode everything, then decode only the safe parts back:

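// Requires: using System.Collections.Generic; using System.Linq; using System.Text;
// Only these exact strings are ever decoded back into live markup.
private static readonly IEnumerable<string> WhitelistedTags =
    new[] { "<b>", "</b>", "<i>", "</i>" };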

private static readonly (string Encoded, string Decoded)[] DecodingPairs =
    WhitelistedTags
    .Select(tag => (Microsoft.Security.Application.Encoder.HtmlEncode(tag), tag))
    .ToArray();

public static string Sanitize(string html)
{
    // Encode the whole thing
    var safeHtml = Microsoft.Security.Application.Encoder.HtmlEncode(html);
    var builder = new StringBuilder(safeHtml);

    // Decode the safe parts
    foreach (var (encodedTag, decodedTag) in DecodingPairs)
    {
        builder.Replace(encodedTag, decodedTag);
    }

    return builder.ToString();
}
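
To see what the encode-then-decode approach above produces, here is a minimal self-contained sketch. It is an assumption-laden variant, not the answer's exact code: it swaps in System.Net.WebUtility.HtmlEncode from the framework for the AntiXss encoder so it runs without the library, and the class name EncodeThenDecodeDemo is made up for illustration.

```csharp
using System;
using System.Net;
using System.Text;

public static class EncodeThenDecodeDemo
{
    // Exact strings that may be turned back into live markup.
    private static readonly string[] AllowedTags = { "<b>", "</b>", "<i>", "</i>" };

    public static string Sanitize(string html)
    {
        // Encode everything (WebUtility stands in for the AntiXss encoder here).
        var builder = new StringBuilder(WebUtility.HtmlEncode(html));

        // Decode only exact matches of the allowed tags.
        foreach (var tag in AllowedTags)
        {
            builder.Replace(WebUtility.HtmlEncode(tag), tag);
        }

        return builder.ToString();
    }
}
```

With this sketch, Sanitize("<b>hi</b><script>alert(1)</script>") returns <b>hi</b>&lt;script&gt;alert(1)&lt;/script&gt;. Because only exact matches are decoded, an opening tag that carries attributes, such as <b onclick="x()">, stays encoded.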

Note that it is almost impossible to safely decode an IMG tag, because there are very simple ways for an attacker to abuse it. Examples:

<IMG SRC="javascript:alert('XSS');">

<IMG SRC=&#106;&#97;&#118;&#97;&#115;&#99;&#114;&#105;&#112;&#116;&#58;&#97;&#108;&#101;&#114;&#116;&#40;&#39;&#88;&#83;&#83;&#39;&#41;>

See the XSS Cheat Sheet for a more thorough list.
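
If images really are needed, one partial mitigation is to check the src value against an allow-list of URL schemes instead of decoding the user's tag verbatim. The sketch below is only an illustration under assumptions: IsSafeImageSource and the ImageSourceCheck class are hypothetical names, not part of AntiXss, and the check covers just the two payloads shown above, not every trick on the cheat sheet.

```csharp
using System;
using System.Net;

public static class ImageSourceCheck
{
    // Hypothetical helper: accepts only absolute http/https URLs.
    public static bool IsSafeImageSource(string src)
    {
        if (string.IsNullOrWhiteSpace(src)) return false;

        // Decode HTML entities first, so an entity-encoded "javascript:" is seen as such.
        var decoded = WebUtility.HtmlDecode(src).Trim();

        // Uri.Scheme is normalized to lowercase, so "HTTP://..." is handled too.
        return Uri.TryCreate(decoded, UriKind.Absolute, out var uri)
            && (uri.Scheme == Uri.UriSchemeHttp || uri.Scheme == Uri.UriSchemeHttps);
    }
}
```

Both examples above are rejected by this check, while a plain https://example.com/cat.png passes. The tag would still have to be rebuilt from the validated URL rather than copied from the input, since other attributes such as onerror can carry script as well.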

score 1 · Answer

This post best describes the problems with the Anti XSS library and offers a good workaround that whitelists a set of tags and attributes.

I use this solution in my projects, and it seems to work just fine.

score -1 · Answer

There is a very simple way to block the threat: just get rid of the "dangerous" tags.

// Requires: using System.Collections.Generic; using System.Text.RegularExpressions;
string SanitizeHtml(string html)
{
    // Decode entities so encoded tags are matched as well
    html = System.Web.HttpUtility.HtmlDecode(html);

    List<string> blackListedTags = new List<string>()
    {
        "body", "script", "iframe", "form", "object", "embed", "link", "head", "meta"
    };

    // Turn every blacklisted opening and closing tag into a <p> tag
    foreach (string tag in blackListedTags)
    {
        html = Regex.Replace(html, "<" + tag, "<p", RegexOptions.IgnoreCase);
        html = Regex.Replace(html, "</" + tag, "</p", RegexOptions.IgnoreCase);
    }

    return html;
}

With this, users can still see the content that was inside the dangerous tags, but it won't harm anything.
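
A quick way to probe a blacklist like the one above is to feed it markup that uses none of the listed tags. The sketch below is a self-contained copy made for testing only, with two assumptions: System.Net.WebUtility.HtmlDecode stands in for System.Web.HttpUtility.HtmlDecode, and the class name BlacklistProbe is invented.

```csharp
using System;
using System.Net;
using System.Text.RegularExpressions;

public static class BlacklistProbe
{
    private static readonly string[] BlackListedTags =
    {
        "body", "script", "iframe", "form", "object", "embed", "link", "head", "meta"
    };

    public static string SanitizeHtml(string html)
    {
        // Same steps as the answer: decode entities, then rename blacklisted tags to <p>.
        html = WebUtility.HtmlDecode(html);

        foreach (string tag in BlackListedTags)
        {
            html = Regex.Replace(html, "<" + tag, "<p", RegexOptions.IgnoreCase);
            html = Regex.Replace(html, "</" + tag, "</p", RegexOptions.IgnoreCase);
        }

        return html;
    }
}
```

In this sketch, <script>alert(1)</script> becomes <p>alert(1)</p>, but <img src=x onerror=alert(1)> comes back unchanged because img is not on the list and attributes are never inspected. The initial decode step also turns an entity-encoded &lt;img src=x onerror=alert(1)&gt; into live markup, so a tag blacklist alone probably should not be relied on.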
