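The following code converts a .doc file to an HTML file, but it only seems to work for documents that contain plain text. When I try to convert a .doc whose text has formatting such as bold and underline, it fails with the error below. How can I fix this?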
`import android.os.Environment;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.InputStream;
import java.io.OutputStream;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.apache.poi.hwpf.HWPFDocumentCore;
import org.apache.poi.hwpf.converter.WordToHtmlConverter;
import org.apache.poi.hwpf.converter.WordToHtmlUtils;
import org.w3c.dom.Document;

public class ConvertDoc {
    public void createHTML() {
        File sdcard = Environment.getExternalStorageDirectory();
        File input = new File(sdcard, "MyFolder/myfile.doc");
        File output = new File(sdcard, "MyFolder/myfile.html");
        try {
            // Load the .doc file, closing the input stream once it has been read.
            InputStream in = new FileInputStream(input);
            HWPFDocumentCore document;
            try {
                document = WordToHtmlUtils.loadDoc(in);
            } finally {
                in.close();
            }
            WordToHtmlConverter converter = new WordToHtmlConverter(
                    DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument());
            converter.processDocument(document);
            Document htmlDocument = converter.getDocument();
            // Serialize the DOM straight to the output file as UTF-8 HTML.
            Transformer serializer = TransformerFactory.newInstance().newTransformer();
            serializer.setOutputProperty(OutputKeys.ENCODING, "UTF-8");
            serializer.setOutputProperty(OutputKeys.INDENT, "yes");
            serializer.setOutputProperty(OutputKeys.METHOD, "html");
            OutputStream out = new FileOutputStream(output);
            try {
                serializer.transform(new DOMSource(htmlDocument), new StreamResult(out));
            } finally {
                out.close();
            }
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}`
Error log:
10-24 04:57:32.651: E/dalvikvm(6011): Could not find class 'java.rmi.UnexpectedException', referenced from method org.apache.poi.hpsf.PropertySetFactory.create
10-24 04:57:32.651: W/dalvikvm(6011): VFY: unable to resolve new-instance 384 (Ljava/rmi/UnexpectedException;) in Lorg/apache/poi/hpsf/PropertySetFactory;
10-24 04:57:32.651: D/dalvikvm(6011): VFY: replacing opcode 0x22 at 0x0020
10-24 04:57:32.661: D/dalvikvm(6011): DexOpt: unable to opt direct call 0x0848 at 0x26 in Lorg/apache/poi/hpsf/PropertySetFactory;.create
10-24 04:57:32.801: W/System.err(6011): java.lang.NullPointerException
10-24 04:57:32.811: W/System.err(6011): at org.apache.poi.hwpf.converter.AbstractWordUtils.compactChildNodesR(AbstractWordUtils.java:146)
10-24 04:57:32.811: W/System.err(6011): at org.apache.poi.hwpf.converter.WordToHtmlUtils.compactSpans(WordToHtmlUtils.java:238)
10-24 04:57:32.821: W/System.err(6011): at org.apache.poi.hwpf.converter.WordToHtmlConverter.processParagraph(WordToHtmlConverter.java:596)
10-24 04:57:32.821: W/System.err(6011): at org.apache.poi.hwpf.converter.AbstractWordConverter.processParagraphes(AbstractWordConverter.java:1113)
10-24 04:57:32.832: W/System.err(6011): at org.apache.poi.hwpf.converter.WordToHtmlConverter.processSingleSection(WordToHtmlConverter.java:617)
10-24 04:57:32.832: W/System.err(6011): at org.apache.poi.hwpf.converter.AbstractWordConverter.processDocument(AbstractWordConverter.java:722)
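The trace places the NullPointerException inside `processDocument`, that is, before any of the serialization code runs. One way to double-check that the serialization half is not involved is to run it on a hand-built DOM that already contains bold and underlined text. The sketch below does that with only the JDK's XML classes. The class `SerializeCheck` and its helpers `buildSample` and `toHtml` are hypothetical names made up for this illustration, and the hand-built tree is only a stand-in for what POI would produce, not its actual output.

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

public class SerializeCheck {
    // Builds a small HTML DOM with bold, underlined text.
    // Hypothetical stand-in for the converter's result, not POI's real output.
    public static Document buildSample() throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        Element html = doc.createElement("html");
        Element body = doc.createElement("body");
        Element p = doc.createElement("p");
        Element b = doc.createElement("b");
        Element u = doc.createElement("u");
        u.setTextContent("bold and underlined");
        b.appendChild(u);
        p.appendChild(b);
        body.appendChild(p);
        html.appendChild(body);
        doc.appendChild(html);
        return doc;
    }

    // Serializes a DOM to an HTML string, using the same output method and
    // encoding as the question's code (indentation left off to keep it compact).
    public static String toHtml(Document doc) throws Exception {
        Transformer serializer = TransformerFactory.newInstance().newTransformer();
        serializer.setOutputProperty(OutputKeys.ENCODING, "UTF-8");
        serializer.setOutputProperty(OutputKeys.METHOD, "html");
        StringWriter writer = new StringWriter();
        serializer.transform(new DOMSource(doc), new StreamResult(writer));
        return writer.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(toHtml(buildSample()));
    }
}
```

If this runs cleanly on the device, the problem is confined to the POI conversion step (or to the DOM implementation it runs against), and the file-writing code can be left alone while that is investigated.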