Questions tagged [pdf2htmlex]
cairo - pdf2htmlEX - CairoFontEngine.cc error during ./dobuild
I'm trying to build a Docker image that uses pdf2htmlEX-0.18.7-poppler-0.81.0, but I keep getting CairoFontEngine errors. Here's my Dockerfile:
When I build the docker image I get the following errors:
/tmp/pdf2htmlEX-0.18.7-poppler-0.81.0/3rdparty/poppler/git/CairoFontEngine.cc: In member function 'virtual bool CairoFont::matches(Ref&, bool)':
/tmp/pdf2htmlEX-0.18.7-poppler-0.81.0/3rdparty/poppler/git/CairoFontEngine.cc:83:17: error: no match for 'operator==' (operand types are 'Ref' and 'Ref')
   return (other == ref);
           ~~~~~~^~~~~~
/tmp/pdf2htmlEX-0.18.7-poppler-0.81.0/3rdparty/poppler/git/CairoFontEngine.cc: In static member function 'static CairoFreeTypeFont* CairoFreeTypeFont::create(GfxFont*, XRef*, FT_Library, bool)':
/tmp/pdf2htmlEX-0.18.7-poppler-0.81.0/3rdparty/poppler/git/CairoFontEngine.cc:420:47: error: 'class GooString' has no member named 'c_str'
   gfxFont->getName() ? gfxFont->getName()->c_str()
                                            ^~~~~
/tmp/pdf2htmlEX-0.18.7-poppler-0.81.0/3rdparty/poppler/git/CairoFontEngine.cc:439:27: error: 'class GooString' has no member named 'c_str'
   fileNameC = fileName->c_str();
                         ^~~~~
/tmp/pdf2htmlEX-0.18.7-poppler-0.81.0/3rdparty/poppler/git/CairoFontEngine.cc:488:42: error: invalid conversion from 'const char*' to 'char*' [-fpermissive]
   ff = FoFiTrueType::load(fileNameC);
                                    ^
In file included from /tmp/pdf2htmlEX-0.18.7-poppler-0.81.0/3rdparty/poppler/git/CairoFontEngine.cc:45:0:
/usr/include/poppler/fofi/FoFiTrueType.h:53:24: note: initializing argument 1 of 'static FoFiTrueType* FoFiTrueType::load(char*, int)'
   static FoFiTrueType *load(char *fileName, int faceIndexA=0);
                        ^~~~
/tmp/pdf2htmlEX-0.18.7-poppler-0.81.0/3rdparty/poppler/git/CairoFontEngine.cc:502:40: error: invalid conversion from 'const char*' to 'char*' [-fpermissive]
   ff = FoFiTrueType::load(fileNameC);
                                    ^
In file included from /tmp/pdf2htmlEX-0.18.7-poppler-0.81.0/3rdparty/poppler/git/CairoFontEngine.cc:45:0:
/usr/include/poppler/fofi/FoFiTrueType.h:53:24: note: initializing argument 1 of 'static FoFiTrueType* FoFiTrueType::load(char*, int)'
   static FoFiTrueType *load(char *fileName, int faceIndexA=0);
                        ^~~~
/tmp/pdf2htmlEX-0.18.7-poppler-0.81.0/3rdparty/poppler/git/CairoFontEngine.cc:531:42: error: invalid conversion from 'const char*' to 'char*' [-fpermissive]
   ff1c = FoFiType1C::load(fileNameC);
                                    ^
In file included from /tmp/pdf2htmlEX-0.18.7-poppler-0.81.0/3rdparty/poppler/git/CairoFontEngine.cc:46:0:
/usr/include/poppler/fofi/FoFiType1C.h:154:22: note: initializing argument 1 of 'static FoFiType1C* FoFiType1C::load(char*)'
   static FoFiType1C *load(char *fileName);
                      ^~~~
/tmp/pdf2htmlEX-0.18.7-poppler-0.81.0/3rdparty/poppler/git/CairoFontEngine.cc:563:37: error: invalid conversion from 'const char*' to 'char*' [-fpermissive]
   ff = FoFiTrueType::load(fileNameC);
                                    ^
In file included from /tmp/pdf2htmlEX-0.18.7-poppler-0.81.0/3rdparty/poppler/git/CairoFontEngine.cc:45:0:
/usr/include/poppler/fofi/FoFiTrueType.h:53:24: note: initializing argument 1 of 'static FoFiTrueType* FoFiTrueType::load(char*, int)'
   static FoFiTrueType *load(char *fileName, int faceIndexA=0);
                        ^~~~
/tmp/pdf2htmlEX-0.18.7-poppler-0.81.0/3rdparty/poppler/git/CairoFontEngine.cc: In function 'cairo_status_t _render_type3_glyph(cairo_scaled_font_t*, long unsigned int, cairo_t*, cairo_text_extents_t*)':
/tmp/pdf2htmlEX-0.18.7-poppler-0.81.0/3rdparty/poppler/git/CairoFontEngine.cc:700:37: error: no matching function for call to 'Dict::getVal(long unsigned int&)'
   charProc = charProcs->getVal(glyph);
                                     ^
In file included from /usr/include/poppler/Object.h:314:0,
                 from /usr/include/poppler/GfxFont.h:41,
                 from /tmp/pdf2htmlEX-0.18.7-poppler-0.81.0/3rdparty/poppler/git/CairoFontEngine.h:38,
                 from /tmp/pdf2htmlEX-0.18.7-poppler-0.81.0/3rdparty/poppler/git/CairoFontEngine.cc:42:
/usr/include/poppler/Dict.h:85:11: note: candidate: Object* Dict::getVal(int, Object*)
   Object *getVal(int i, Object *obj);
           ^~~~~~
/usr/include/poppler/Dict.h:85:11: note: candidate expects 2 arguments, 1 provided
/tmp/pdf2htmlEX-0.18.7-poppler-0.81.0/3rdparty/poppler/git/CairoFontEngine.cc: In member function 'virtual bool CairoType3Font::matches(Ref&, bool)':
/tmp/pdf2htmlEX-0.18.7-poppler-0.81.0/3rdparty/poppler/git/CairoFontEngine.cc:786:17: error: no match for 'operator==' (operand types are 'Ref' and 'Ref')
   return (other == ref && printing == printingA);
           ~~~~~~^~~~~~
CMakeFiles/pdf2htmlEX.dir/build.make:62: recipe for target 'CMakeFiles/pdf2htmlEX.dir/3rdparty/poppler/git/CairoFontEngine.cc.o' failed
make[2]: *** [CMakeFiles/pdf2htmlEX.dir/3rdparty/poppler/git/CairoFontEngine.cc.o] Error 1
CMakeFiles/Makefile2:355: recipe for target 'CMakeFiles/pdf2htmlEX.dir/all' failed
make[1]: *** [CMakeFiles/pdf2htmlEX.dir/all] Error 2
Makefile:138: recipe for target 'all' failed
make: *** [all] Error 2
Please advise how I can resolve this.
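pdf2htmlex - pdf2htmlEX error during conversion - CMap is not valid and got dropped for font
I'm using this release: https://github.com/pdf2htmlEX/pdf2htmlEX/releases/tag/v0.18.8.rc1
and this Debian package: pdf2htmlEX-0.18.8.rc1-master-20200630-Ubuntu-focal-x86_64.deb
When I run the conversion, I get a bunch of these errors: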
Working: 97/100ToUnicode CMap is not valid and got dropped for font: b7
This results in an empty file with no text at all.
I'm running it through Docker; here's my Dockerfile:
Please advise how I can resolve this.
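node.js - Recommended alternative to pdf2htmlEX
I've been using pdf2htmlEX for a while now, and after several upgrades I've decided to look for an alternative.
Current tool
https://github.com/pdf2htmlEX/pdf2htmlEX
I think it's worth mentioning that I'm running on Node and spawning pdf2htmlEX as a child process.
Some of the issues we've run into with this tool are:
- Some PDF fonts are missing and [] appears instead, which forces me to use an image of that page as a fallback.
- New PDF files fail to convert with an error from pdftotext, a tool from poppler, which is part of pdf2htmlEX.
- The text sometimes includes additional characters, which get copied in copy-paste use cases.
I did some research online but couldn't determine which tool would give me the same (or even better) results as pdf2htmlEX.
Please advise.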
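pdf - Using coordinates from poppler-generated XML to build an email template
Even though the DPI is 72, converting the coordinates in the XML to pixels required repeatedly adjusting the DPI using this table. A value of 90.5 seems to work well. However, this doesn't look like the right approach.
Command used to generate the XML: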
pdftohtml -xml -zoom 1 -fontfullname -s -c input.pdf output
Command used to generate the image:
pdftoppm -jpeg -r 72 input.pdf output
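Note: 72 dpi was used when generating the image because the image output at 72 dpi has dimensions similar to those of the PDF and the XML output.
This conversion is essential because it will allow the HTML to be built. I know poppler itself can generate HTML; however, since the generated HTML needs to be email-compatible, the XML is being used to build the HTML from scratch.
In what ways can the conversion of coordinates from the XML to the PDF be done more reliably?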
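One possible way to avoid hand-tuning a DPI value is to derive the scale from the page size reported in the XML and the actual pixel size of the rendered image. The sketch below assumes the XML contains `page` elements with `number`, `width` and `height` attributes and `text` elements with `top`, `left`, `width` and `height` attributes; the function name `scale_text_boxes` and the sample document are hypothetical and only for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample in the assumed pdftohtml -xml layout.
SAMPLE_XML = """
<pdf2xml>
  <page number="1" top="0" left="0" height="800" width="600">
    <text top="20" left="10" width="30" height="40" font="0"><b>Hello</b> world</text>
  </page>
</pdf2xml>
"""


def scale_text_boxes(xml_text, image_width_px, image_height_px):
    """Map text boxes from XML page units to pixels of a rendered image.

    The scale is the ratio between the image size and the page size that
    the XML itself reports, so no DPI value has to be guessed.
    Assumes every page was rendered at the same image size.
    """
    root = ET.fromstring(xml_text.strip())
    boxes = []
    for page in root.iter("page"):
        sx = image_width_px / float(page.get("width"))
        sy = image_height_px / float(page.get("height"))
        for text in page.iter("text"):
            boxes.append({
                "page": int(page.get("number")),
                "left": float(text.get("left")) * sx,
                "top": float(text.get("top")) * sy,
                "width": float(text.get("width")) * sx,
                "height": float(text.get("height")) * sy,
                # Join nested runs such as <b>...</b> into plain text.
                "content": "".join(text.itertext()),
            })
    return boxes
```

Calling `scale_text_boxes(xml_text, image_width_px, image_height_px)` with the pixel dimensions of the pdftoppm output gives boxes in image pixels. If the horizontal and vertical ratios differ noticeably, that would suggest the image and the XML describe different page boxes, which is worth checking before trusting the result.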
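pdf2htmlex - pdf2htmlEX - Conversion with the fallback option turned on is not working
I'm using pdf2htmlEX-0.18.8.rc1-master-20200630-Ubuntu-focal-x86_64.deb
and I tried running the tool with the fallback option turned on, but every time it results in blank pages.
I've tried various parameter configurations, but I get the same result each time.
Please advise which parameters to use to run the tool with the fallback option turned on.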