0

Relevant code; it barfs on instantiating the SAXSource:

TransformerFactory factory = TransformerFactory.newInstance();
XMLReader xmlReader = XMLReaderFactory.createXMLReader("org.ccil.cowan.tagsoup.Parser");
Source input = new SAXSource(xmlReader, "http://books.toscrape.com/");
Result output = new StreamResult(System.out);
factory.newTransformer().transform(input, output);

JavaDoc

public SAXSource(XMLReader reader,
         InputSource inputSource)

Create a SAXSource, using an XMLReader and a SAX InputSource. The Transformer or SAXTransformerFactory will set itself to be the reader's ContentHandler, and then will call reader.parse(inputSource).

The InputSource documentation shows:

InputSource(InputStream byteStream)
Create a new input source with a byte stream.
InputSource(Reader characterStream)
Create a new input source with a character stream.
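
To illustrate the two constructors quoted above, here is a minimal sketch that feeds the same document to an identity transform once as a character stream and once as a byte stream. The class and method names (InputSourceKinds, identity, fromCharacters, fromBytes) are made up for this example, and it uses the JDK's default XMLReader rather than TagSoup so that it runs without extra dependencies.

```java
import java.io.ByteArrayInputStream;
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.sax.SAXSource;
import javax.xml.transform.stream.StreamResult;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;

public class InputSourceKinds {

    // Hypothetical helper: runs an identity transform over the given
    // InputSource and returns the serialized result as a String.
    public static String identity(InputSource source) throws Exception {
        SAXParserFactory spf = SAXParserFactory.newInstance();
        spf.setNamespaceAware(true);
        // Assumption: the JDK's default reader stands in for TagSoup here.
        XMLReader reader = spf.newSAXParser().getXMLReader();

        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");

        StringWriter out = new StringWriter();
        transformer.transform(new SAXSource(reader, source), new StreamResult(out));
        return out.toString();
    }

    // InputSource(Reader characterStream): the text is already decoded.
    public static String fromCharacters(String xml) throws Exception {
        return identity(new InputSource(new StringReader(xml)));
    }

    // InputSource(InputStream byteStream): the parser decodes the bytes itself.
    // InputSource.setEncoding(...) can be called if the encoding is known up front.
    public static String fromBytes(byte[] xml) throws Exception {
        return identity(new InputSource(new ByteArrayInputStream(xml)));
    }
}
```

Either form is accepted by the SAXSource constructor; the choice mostly depends on whether the content is already decoded into characters.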

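So for HTML, for example, this would need a character stream or an InputStream to read from?

Is TagSoup well suited to this kind of identity transform? And if so, how is it done?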


2 Answers

3

There is a constructor, https://docs.oracle.com/javase/8/docs/api/org/xml/sax/InputSource.html#InputSource-java.lang.String-, that takes a system ID (e.g. a URL), so you can use Source input = new SAXSource(xmlReader, new InputSource("http://books.toscrape.com/"));. The original call fails because the second argument of the SAXSource constructor must be an InputSource, not a String.
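
As a rough sketch of the system ID approach, the class below wraps a system ID in an InputSource and runs the identity transform. The names SystemIdSource, transformFrom and defaultReader are invented for illustration, and main writes a small temporary XML file so the example runs offline with the JDK's own reader. With TagSoup on the classpath, an org.ccil.cowan.tagsoup.Parser instance could presumably be passed as the reader instead, together with the http URL from the question.

```java
import java.io.StringWriter;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.transform.Source;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.sax.SAXSource;
import javax.xml.transform.stream.StreamResult;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;

public class SystemIdSource {

    // Hypothetical helper: identity-transforms whatever the system ID points to.
    public static String transformFrom(XMLReader reader, String systemId) throws Exception {
        // InputSource(String) takes a system ID (a URI); the reader opens it itself.
        Source input = new SAXSource(reader, new InputSource(systemId));
        StringWriter out = new StringWriter();
        TransformerFactory.newInstance().newTransformer()
                .transform(input, new StreamResult(out));
        return out.toString();
    }

    // Assumption: the JDK's default XMLReader stands in for TagSoup here.
    public static XMLReader defaultReader() throws Exception {
        SAXParserFactory spf = SAXParserFactory.newInstance();
        spf.setNamespaceAware(true);
        return spf.newSAXParser().getXMLReader();
    }

    public static void main(String[] args) throws Exception {
        // A local file: URI keeps the demo independent of network access.
        Path tmp = Files.createTempFile("sample", ".xml");
        Files.write(tmp, "<books><book>Example</book></books>"
                .getBytes(StandardCharsets.UTF_8));
        System.out.println(transformFrom(defaultReader(), tmp.toUri().toString()));
    }
}
```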

answered 2019-01-04T15:46:11.623
1

You can get an InputStream that reads from the resource behind a URL like this:

InputStream i = new URL("http://...").openConnection().getInputStream();

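Then you can use i for your SAXSource, wrapped in an InputSource, since the constructor takes an InputSource rather than a raw stream.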
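
A minimal sketch of this stream-based variant might look like the following. The class name UrlStreamSource and its methods are hypothetical, and the JDK's default XMLReader is assumed in place of TagSoup so the code runs without extra libraries. Setting the system ID in addition to the stream is optional, but it lets the parser resolve relative references.

```java
import java.io.InputStream;
import java.io.StringWriter;
import java.net.URL;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.sax.SAXSource;
import javax.xml.transform.stream.StreamResult;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;

public class UrlStreamSource {

    // Hypothetical helper: opens the URL, wraps the stream in an InputSource,
    // and returns the result of an identity transform.
    public static String transformFrom(XMLReader reader, URL url) throws Exception {
        // try-with-resources closes the stream once the transform is done.
        try (InputStream in = url.openConnection().getInputStream()) {
            InputSource source = new InputSource(in);
            // Optional: gives the parser a base URI for relative references.
            source.setSystemId(url.toString());

            StringWriter out = new StringWriter();
            TransformerFactory.newInstance().newTransformer()
                    .transform(new SAXSource(reader, source), new StreamResult(out));
            return out.toString();
        }
    }

    // Assumption: the JDK's default XMLReader stands in for TagSoup here.
    public static XMLReader defaultReader() throws Exception {
        SAXParserFactory spf = SAXParserFactory.newInstance();
        spf.setNamespaceAware(true);
        return spf.newSAXParser().getXMLReader();
    }
}
```

Opening the connection yourself is mainly useful when you need control over it, for example to set request headers or timeouts on the URLConnection before reading.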

answered 2019-01-04T15:43:15.233