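To prevent HTML code injection and cross-site scripting, I built a filter for service requests that escapes certain characters using StringEscapeUtils.escapeHtml(text).
However, this also escapes some UTF-8 characters such as ä, ö and ü. As a workaround, I use an excludeList: before calling StringEscapeUtils.escapeHtml I convert those values to their hash codes, and after the call I convert the hash values back to strings. That solves the problem, but it is not a very elegant solution!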
String[] excludeList = {"ü", "Ü", "ö", "Ö", "ä", "Ä", "ß"};

private static String escapeHtml(String text, String[] exclusionList) {
    TreeMap<Integer, String> excludeTempMap = new TreeMap<>();

    // Replace characters from exclusionList in the text with their hashCode
    for (String excludePart : exclusionList) {
        // Pattern.quote treats the excluded string as a literal, not as a regex
        Matcher matcher = Pattern.compile(Pattern.quote(excludePart)).matcher(text);
        while (matcher.find()) {
            String match = matcher.group();
            Integer matchHash = match.hashCode();
            text = matcher.replaceFirst(String.valueOf(matchHash));
            excludeTempMap.put(matchHash, match);
            matcher.reset(text);
        }
    }

    // Escape malicious HTML characters
    text = StringEscapeUtils.escapeHtml(text);

    // Convert the hash values back to the original characters
    for (Map.Entry<Integer, String> excludeEntry : excludeTempMap.entrySet()) {
        text = text.replace(
            String.valueOf(excludeEntry.getKey()),
            excludeEntry.getValue()
        );
    }
    return text;
}
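Does anyone have a hint on how to achieve this with a better solution? Is there a better library that can be used to whitelist certain language-specific characters?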
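For reference, one possible direction is to avoid the placeholder round trip entirely and escape only the characters that are significant in HTML markup, letting everything else (including ä, ö, ü, ß) pass through unchanged. The sketch below is not taken from any library: the class name MinimalHtmlEscaper and its escapeHtml method are hypothetical, and it assumes the page is served as UTF-8 and the escaped value ends up in HTML element content or a quoted attribute value. Other contexts (JavaScript, URLs, CSS) need their own encoding.

```java
/**
 * Hypothetical helper (not part of any library): escapes only the five
 * characters that are significant in HTML markup and leaves all other
 * characters, including non-ASCII letters, untouched.
 * Assumption: output is served as UTF-8 and placed in element content
 * or inside a quoted attribute value.
 */
final class MinimalHtmlEscaper {

    private MinimalHtmlEscaper() {
        // static helper, not meant to be instantiated
    }

    public static String escapeHtml(String text) {
        if (text == null) {
            return null;
        }
        StringBuilder out = new StringBuilder(text.length() + 16);
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            switch (c) {
                case '&':
                    out.append("&amp;");
                    break;
                case '<':
                    out.append("&lt;");
                    break;
                case '>':
                    out.append("&gt;");
                    break;
                case '"':
                    out.append("&quot;");
                    break;
                case '\'':
                    out.append("&#39;");
                    break;
                default:
                    // Everything else, umlauts included, is copied as-is
                    out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Unicode escapes keep this source file ASCII-only;
        // the literal below is: <b>Gruesse</b> & "aeoeue" written with real umlauts and sharp s
        String input = "<b>Gr\u00fc\u00dfe</b> & \"\u00e4\u00f6\u00fc\"";
        System.out.println(escapeHtml(input));
    }
}
```

With this sketch no exclusion list is needed, because the umlauts are never touched in the first place. It also sidesteps a fragility of numeric placeholders: "ü".hashCode() is 252, so a text that already contains the digits 252 would be altered when the placeholders are converted back.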