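I need to get thumbnails from arbitrary file types (or as many as possible).
For image-like file types I can use ImageMagick. For document-like files, I think I would use:
Document -> (OpenOffice via PyUNO) PDF -> (ImageMagick) PDF to image -> thumbnail of the first page.
- Is there a better way?
- Is there a web service that can do this?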
Yes, I think you got it right. There are web services for this, but I have no experience with them, so I won't list any here.
Creating a thumbnail of a document requires rendering it, and office documents like docx are so complex that only very few libraries/applications can render them. LibreOffice seems to be the best bet in that area.
Thankfully there is already a Python script out there which provides a command-line front-end for conversion using LibreOffice/OpenOffice: unoconv. It should be able to use all the export filters present in the office suite (including png and pdf).
I noticed some problems exporting directly to PNG, but PDF exports were mostly fine.
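To make the document -> PDF -> thumbnail route concrete, here is a rough sketch that shells out to unoconv and ImageMagick. It assumes both `unoconv` and ImageMagick's `convert` are on your PATH (newer ImageMagick releases ship the command as `magick` instead). The function names, the 200x200 default size, and the choice to write the intermediate PDF next to the source file are my own, picked only for illustration.

```python
import subprocess
from pathlib import Path


def pdf_export_command(document, pdf_path):
    # unoconv: -f selects the export format, -o names the output file.
    return ["unoconv", "-f", "pdf", "-o", str(pdf_path), str(document)]


def thumbnail_command(pdf_path, png_path, size="200x200"):
    # The "[0]" suffix asks ImageMagick to read only the first page.
    # -thumbnail scales the page to fit within `size`, keeping aspect ratio.
    return ["convert", f"{pdf_path}[0]", "-thumbnail", size, str(png_path)]


def make_thumbnail(document, png_path, size="200x200"):
    # Hypothetical helper: run both steps and return the thumbnail path.
    document = Path(document)
    pdf_path = document.with_suffix(".pdf")  # intermediate PDF beside the source
    subprocess.run(pdf_export_command(document, pdf_path), check=True)
    subprocess.run(thumbnail_command(pdf_path, png_path, size), check=True)
    return Path(png_path)
```

Calling `make_thumbnail("report.docx", "report_thumb.png")` would run the two commands in sequence and raise `subprocess.CalledProcessError` if either tool fails, so you can fall back to another strategy for file types LibreOffice cannot open.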
Btw: If you have problems with ImageMagick, you might want to try Ghostscript.
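If you go the Ghostscript route, something like the following sketch renders only the first page of a PDF to PNG. It assumes the `gs` binary is on your PATH; the function names and the 72 dpi default are mine, chosen just for the example. Note that this gives you a low-resolution render of the full page rather than an image scaled to a fixed box, so you may still want to downscale the result afterwards.

```python
import subprocess


def ghostscript_command(pdf_path, png_path, dpi=72):
    # -dFirstPage/-dLastPage restrict rendering to page 1,
    # -sDEVICE=png16m writes a 24-bit RGB PNG, -r sets the resolution in dpi.
    return [
        "gs",
        "-dBATCH",
        "-dNOPAUSE",
        "-dSAFER",
        "-sDEVICE=png16m",
        "-dFirstPage=1",
        "-dLastPage=1",
        f"-r{dpi}",
        f"-sOutputFile={png_path}",
        str(pdf_path),
    ]


def render_first_page(pdf_path, png_path, dpi=72):
    # Hypothetical helper: run Ghostscript and fail loudly on a non-zero exit.
    subprocess.run(ghostscript_command(pdf_path, png_path, dpi), check=True)
    return png_path
```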