My solution is to subclass TextNode and override the method that performs the escaping.
package org.jsoup.nodes;

public class UnescapedTextNode extends TextNode
{
    public UnescapedTextNode( final String text, final String baseUri )
    {
        super( text, baseUri );
    }

    @Override
    void outerHtmlHead(
        final StringBuilder accum,
        final int depth,
        final Document.OutputSettings out )
    {
        //String html = Entities.escape( getWholeText(), out ); // Don't escape!
        String html = getWholeText();
        if ( out.prettyPrint() &&
             parent() instanceof Element &&
             !Element.preserveWhitespace( parent() ) )
        {
            html = normaliseWhitespace( html );
        }
        if ( out.prettyPrint() &&
             ( ( siblingIndex() == 0 &&
                 parentNode instanceof Element &&
                 ( (Element)parentNode ).tag().formatAsBlock() &&
                 !isBlank() ) ||
               ( out.outline() &&
                 siblingNodes().size() > 0 &&
                 !isBlank() ) ) )
        {
            indent( accum, depth, out );
        }
        accum.append( html );
    }
}
This is almost a verbatim copy of TextNode.outerHtmlHead() (originally written by Jonathan Hedley). I just commented out the escaping part. This is how I use it:
// ... assuming head is of type Element and refers to the <head> of the document.
final String message = "Hello World!";
final String messageScript = "alert( \"" + message + "\" );";
final Element messageScriptEl = head.appendElement( "script" ).
    attr( "type", "text/javascript" );
final TextNode messageScriptTextNode = new UnescapedTextNode(
    messageScript,
    messageScriptEl.baseUri() );
messageScriptEl.appendChild( messageScriptTextNode );
// ... etc
After that, calling Document.toString() or Document.outerHtml() produces output in which the text inside the newly created script tag is left unescaped, i.e.:
<script type="text/javascript">alert( "Hello World!" );</script>
instead of:
<script type="text/javascript">alert( &quot;Hello World!&quot; );</script>
as happened before.
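The difference between the two outputs comes down to whether the text passes through an escaping step before it is appended. Here is a minimal, jsoup-free sketch of that override pattern. The class names (EscapeDemo, EscapedText, RawText) are hypothetical stand-ins, and the escape routine is a simplified assumption, not jsoup's actual Entities.escape:

```java
public class EscapeDemo {
    /** Hypothetical stand-in for a text node that escapes its content on output. */
    public static class EscapedText {
        protected final String text;

        public EscapedText(String text) {
            this.text = text;
        }

        /** Simplified escaping (assumption); a real HTML escaper handles many more cases. */
        public static String escape(String s) {
            return s.replace("&", "&amp;")   // must run first so later entities are not re-escaped
                    .replace("<", "&lt;")
                    .replace(">", "&gt;")
                    .replace("\"", "&quot;");
        }

        public String render() {
            return escape(text);
        }
    }

    /** Subclass that overrides the rendering step and skips the escaping. */
    public static class RawText extends EscapedText {
        public RawText(String text) {
            super(text);
        }

        @Override
        public String render() {
            return text;
        }
    }

    /** Wraps the rendered body in a script element, mirroring the example above. */
    public static String script(EscapedText body) {
        return "<script type=\"text/javascript\">" + body.render() + "</script>";
    }

    public static void main(String[] args) {
        String js = "alert( \"Hello World!\" );";
        System.out.println(script(new RawText(js)));      // quotes left as-is
        System.out.println(script(new EscapedText(js)));  // quotes become &quot;
    }
}
```

Only the rendering method differs between the two classes, which is the same idea as overriding outerHtmlHead() in the subclass shown earlier.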
I found two "gotchas":