I am using MarkLogic to generate XML files from PDF documents that contain images, formatted text (italic and bold), tables, etc. Can you please provide some guidelines for getting the best conversion? I am using the normal conversion with the following pipelines:

  • Conversion Processing
  • DocBook Conversion
  • HTML Conversion
  • PDF Conversion
  • PDF Conversion (Page Layout, Image Batching)
  • Status Change Handling

The images are not kept with their titles, and the formatting is not preserved either. Tables appear as normal paragraphs in the generated XML.


1 Answer


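Part of the document conversion is building a CSS file to handle the formatting, as well as extracting the images in the document. Both go into the database. When you view the document in a browser, make sure the links to the images and the CSS are valid. You may need to change them, for example from /doc1.css to /get.xqy?uri=doc1.css. Also, other CSS on the page may interfere with the document's CSS.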
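As a rough illustration of the link fix, here is a minimal Python sketch that rewrites local `href` and `src` values in the converted XHTML so they go through a `get.xqy`-style endpoint. The function name `rewrite_links`, the regular expression, and the decision to strip the leading slash (following the /doc1.css example above) are assumptions for illustration only; they are not part of MarkLogic, and your own endpoint and database URIs may look different.

```python
import re
from urllib.parse import quote

# Assumption: documents in the database are served by an endpoint of this
# shape, as in the /get.xqy?uri=doc1.css example. Adjust to your application.
ENDPOINT = "/get.xqy?uri="

# Matches href="..." or src="..." with either quote style.
_LINK_ATTR = re.compile(r"""\b(href|src)=(["'])([^"']+)\2""")

# Link targets that should be left alone.
_SKIP_PREFIXES = ("http://", "https://", "#", "data:", "mailto:")


def rewrite_links(xhtml: str, endpoint: str = ENDPOINT) -> str:
    """Point local CSS and image links at the document-serving endpoint."""

    def repl(match: re.Match) -> str:
        attr, quote_char, target = match.group(1), match.group(2), match.group(3)
        # Keep external links, fragments, and links that were already rewritten.
        if target.startswith(_SKIP_PREFIXES) or target.startswith(endpoint):
            return match.group(0)
        # Assumption: the database URI is the link without its leading slash.
        uri = target.lstrip("/")
        return f"{attr}={quote_char}{endpoint}{quote(uri, safe='/')}{quote_char}"

    return _LINK_ATTR.sub(repl, xhtml)


if __name__ == "__main__":
    page = '<link rel="stylesheet" href="/doc1.css"/><img src="doc1_parts/img1.png"/>'
    print(rewrite_links(page))
```

A regular expression keeps the sketch short; for real documents an XML-aware rewrite (for example inside the XQuery module that serves the page) would be more robust.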

answered 2012-04-27T13:59:59.863