
I'm trying to get Elasticsearch running and indexing PDFs. I'm not familiar with Java. What is the window server it is complaining about, and how do I fix it?

Jun 13 15:57:23 server.mydomain.com java[22345] <Error>: kCGErrorFailure: Set a breakpoint @ CGErrorBreakpoint() to catch errors as they are logged.
Exception in thread "elasticsearch[index]-pool-2-thread-1" java.lang.InternalError: Can't connect to window server - not enough permissions.
    at java.lang.ClassLoader$NativeLibrary.load(Native Method)
    at java.lang.ClassLoader.loadLibrary0(ClassLoader.java:1827)
    at java.lang.ClassLoader.loadLibrary(ClassLoader.java:1724)
    at java.lang.Runtime.loadLibrary0(Runtime.java:823)
    at java.lang.System.loadLibrary(System.java:1045)
    at sun.security.action.LoadLibraryAction.run(LoadLibraryAction.java:50)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.awt.Toolkit.loadLibraries(Toolkit.java:1605)
    at java.awt.Toolkit.<clinit>(Toolkit.java:1627)
    at java.awt.Color.<clinit>(Color.java:263)
    at org.apache.pdfbox.pdmodel.PDPage.<clinit>(PDPage.java:80)
    at org.apache.pdfbox.pdmodel.PDPageNode.getAllKids(PDPageNode.java:212)
    at org.apache.pdfbox.pdmodel.PDPageNode.getAllKids(PDPageNode.java:218)
    at org.apache.pdfbox.pdmodel.PDPageNode.getAllKids(PDPageNode.java:184)
    at org.apache.pdfbox.pdmodel.PDDocumentCatalog.getAllPages(PDDocumentCatalog.java:211)
    at org.apache.pdfbox.util.PDFTextStripper.writeText(PDFTextStripper.java:322)
    at org.apache.tika.parser.pdf.PDF2XHTML.process(PDF2XHTML.java:63)
    at org.apache.tika.parser.pdf.PDFParser.parse(PDFParser.java:140)
    at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:242)
    at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:242)
    at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:120)
    at org.elasticsearch.plugin.mapper.attachments.tika.TikaExtended.parseToString(TikaExtended.java:48)
    at org.elasticsearch.index.mapper.attachment.AttachmentMapper.parse(AttachmentMapper.java:309)
    at org.elasticsearch.index.mapper.object.ObjectMapper.serializeValue(ObjectMapper.java:585)
    at org.elasticsearch.index.mapper.object.ObjectMapper.parse(ObjectMapper.java:449)
    at org.elasticsearch.index.mapper.DocumentMapper.parse(DocumentMapper.java:493)
    at org.elasticsearch.index.mapper.DocumentMapper.parse(DocumentMapper.java:437)
    at org.elasticsearch.index.shard.service.InternalIndexShard.prepareCreate(InternalIndexShard.java:290)
    at org.elasticsearch.action.index.TransportIndexAction.shardOperationOnPrimary(TransportIndexAction.java:210)
    at org.elasticsearch.action.support.replication.TransportShardReplicationOperationAction$AsyncShardOperationAction.performOnPrimary(TransportShardReplicationOperationAction.java:532)
    at org.elasticsearch.action.support.replication.TransportShardReplicationOperationAction$AsyncShardOperationAction$1.run(TransportShardReplicationOperationAction.java:430)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:680)

2 Answers


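To index PDFs, Elasticsearch relies on Apache Tika to extract the text. Apache Tika in turn relies on Apache PDFBox to parse the file. Because of how PDFs work, PDFBox needs to make Java AWT calls for things like fonts and colours. You can see this in your stack trace: the static initialiser of PDPage touches java.awt.Color, which loads the AWT Toolkit.

Your machine isn't currently set up for Java to perform graphics operations, so when PDFBox tries to use AWT while processing the PDF, it fails.

You have two options. One is to complete the graphics setup; the other is to tell Java to run in headless mode.

If you google your error message, you'll find plenty of helpful answers on how to do the appropriate OS X setup for either option. This one looks like a good example. As with most unix variants of Java, if you use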

java -Djava.awt.headless=true

then it will run in headless mode and won't hit permission problems with the real graphics system.
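
If it helps to see what the flag does from inside the JVM, here is a minimal sketch. The class name HeadlessCheck and the method enableHeadless are made up for illustration, and it assumes the property is set before any AWT class has been loaded, since AWT reads it only once. For Elasticsearch itself you would normally pass the flag on the command line rather than change any code.

```java
import java.awt.GraphicsEnvironment;

// Hypothetical helper class, not part of Elasticsearch, Tika or PDFBox.
class HeadlessCheck {

    // Sets the same property that -Djava.awt.headless=true sets,
    // then asks AWT which mode it is actually in.
    // Assumption: no AWT class has been initialised before this call.
    static boolean enableHeadless() {
        System.setProperty("java.awt.headless", "true");
        return GraphicsEnvironment.isHeadless();
    }

    public static void main(String[] args) {
        System.out.println("headless = " + enableHeadless());
    }
}
```

Running this should report that AWT is headless, which is the state PDFBox needs in order to load java.awt.Color without contacting the window server.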

answered 2012-06-13T23:09:55.663

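This is the result of your application trying to use AWT in a headless environment.

To fix it, start your application with the option that tells AWT to use headless mode: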

-Djava.awt.headless=true
answered 2012-06-13T23:08:17.377