
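When uploading any Office 2007 document (e.g. pptx, docx, xlsx) to Sling, I get the following errors (see below, one per file). I am using the Sling 6 stable standalone release.

Has anyone else run into this? Are there any known issues with the tika bundle?

Thanks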

23.01.2013 14:32:27.248 *WARN* [jackrabbit-pool-1] org.apache.jackrabbit.core.query.lucene.LazyTextExtractorField Failed to extract text from a binary property
org.apache.tika.exception.TikaException: Unexpected RuntimeException from org.apache.tika.parser.microsoft.ooxml.OOXMLParser@5217e8de
                at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:122)
                at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:101)
                at org.apache.jackrabbit.core.query.lucene.JackrabbitParser.parse(JackrabbitParser.java:189)
                at org.apache.jackrabbit.core.query.lucene.LazyTextExtractorField$ParsingTask.run(LazyTextExtractorField.java:174)
                at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:439)
                at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
                at java.util.concurrent.FutureTask.run(FutureTask.java:138)
                at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:98)
                at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:206)
                at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
                at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
                at java.lang.Thread.run(Thread.java:662)
Caused by: org.apache.poi.POIXMLException: java.lang.reflect.InvocationTargetException
                at org.apache.poi.xwpf.usermodel.XWPFFactory.createDocumentPart(XWPFFactory.java:60)
                at org.apache.poi.POIXMLDocumentPart.read(POIXMLDocumentPart.java:256)
                at org.apache.poi.POIXMLDocument.load(POIXMLDocument.java:196)
                at org.apache.poi.xwpf.usermodel.XWPFDocument.<init>(XWPFDocument.java:94)
                at org.apache.poi.xwpf.extractor.XWPFWordExtractor.<init>(XWPFWordExtractor.java:45)
                at org.apache.poi.extractor.ExtractorFactory.createExtractor(ExtractorFactory.java:111)
                at org.apache.poi.extractor.ExtractorFactory.createExtractor(ExtractorFactory.java:86)
                at org.apache.tika.parser.microsoft.ooxml.OOXMLParser.parse(OOXMLParser.java:47)
                at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:120)
                ... 11 more
Caused by: java.lang.reflect.InvocationTargetException
                at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
                at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
                at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
                at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
                at org.apache.poi.xwpf.usermodel.XWPFFactory.createDocumentPart(XWPFFactory.java:58)
                ... 19 more
Caused by: java.lang.NoClassDefFoundError: org/openxmlformats/schemas/wordprocessingml/x2006/main/SettingsDocument$Factory
                at org.apache.poi.xwpf.usermodel.XWPFSettings.readFrom(XWPFSettings.java:129)
                at org.apache.poi.xwpf.usermodel.XWPFSettings.<init>(XWPFSettings.java:43)
                ... 24 more
Caused by: java.lang.ClassNotFoundException: org.openxmlformats.schemas.wordprocessingml.x2006.main.SettingsDocument$Factory not found by org.apache.tika.bundle [63]
                at org.apache.felix.framework.ModuleImpl.findClassOrResourceByDelegation(ModuleImpl.java:787)
                at org.apache.felix.framework.ModuleImpl.access$400(ModuleImpl.java:71)
                at org.apache.felix.framework.ModuleImpl$ModuleClassLoader.loadClass(ModuleImpl.java:1768)
                at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
                ... 26 more

1 Answer


This is caused by missing/incorrect dependencies in the tika 0.6 bundle.
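One way to confirm this, sketched here under assumptions, is to look inside the bundle jar for the class named in the ClassNotFoundException. The class name `BundleClassCheck` and its two helper methods are hypothetical names made up for this illustration, and the path to the tika bundle jar is whatever you pass on the command line. The check only looks at classes stored directly in the jar; if a bundle keeps its dependencies as nested jars, it would not see into them.

```java
import java.io.IOException;
import java.util.jar.JarFile;

// Hypothetical helper: reports whether a jar directly contains a given class.
public class BundleClassCheck {

    // Turn a binary class name (as printed in the stack trace) into a jar entry path.
    public static String entryName(String className) {
        return className.replace('.', '/') + ".class";
    }

    // True if the jar at jarPath has an entry for className.
    public static boolean containsClass(String jarPath, String className) throws IOException {
        try (JarFile jar = new JarFile(jarPath)) {
            return jar.getEntry(entryName(className)) != null;
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java BundleClassCheck <path-to-bundle-jar>");
            return;
        }
        // Class reported as missing in the stack trace from the question.
        String missing =
            "org.openxmlformats.schemas.wordprocessingml.x2006.main.SettingsDocument$Factory";
        boolean found = containsClass(args[0], missing);
        System.out.println(missing + (found ? " found" : " NOT found"));
    }
}
```

Running it against the original bundle and again against a rebuilt one should show whether the rebuild actually pulled the schema classes in.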

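I had to recompile tika 0.6 with the following changes to get it working. I then replaced the tika bundle inside the Sling standalone jar file. Please let me know if there is a better way to do this, as I am a Java beginner. Thanks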

Changes made to tika-0.6.tika-parsers.pom.xml:

Add:

    <dependency>
      <groupId>org.apache.poi</groupId>
      <artifactId>ooxml-schemas</artifactId>
      <version>1.1</version>
    </dependency>
    <dependency>
      <groupId>org.apache.poi</groupId>
      <artifactId>poi-ooxml-schemas</artifactId>
      <version>${poi.version}</version>
    </dependency>

Remove:

    <dependency>
      <groupId>org.apache.geronimo.specs</groupId>
      <artifactId>geronimo-stax-api_1.0_spec</artifactId>
      <version>1.0.1</version>
    </dependency>
answered 2013-01-27T11:39:37.830