在学习 Gate 的过程中,遇到了如下问题:
Minipar 在看到 Ö、Ü、Ä 等非注释字符时抛出异常。
例如,在句子“Batten 病(也称为 Spielmeyer-Vogt-Sjögren-Batten 病)是一种罕见的、致命的常染色体隐性遗传神经退行性疾病,始于儿童期。” (来自 wiki 文章) Minipar 在停止工作之前得到的注释是“Batten disease(也称为 Spielmeyer-Vogt-Sj”,正好在字符ö之前,所以这让我猜测这是一个在使用时值得关注的案例Gate. 因为同一个流水线轻而易举地处理了其他几篇文章。
在消息选项卡中,它报告:
gate.util.InvalidOffsetException
at gate.annotation.AnnotationSetImpl.getNodes(AnnotationSetImpl.java:773)
at gate.annotation.AnnotationSetImpl.add(AnnotationSetImpl.java:802)
at minipar.Minipar.runMinipar(Minipar.java:419)
at minipar.Minipar.execute(Minipar.java:527)
at gate.util.Benchmark.executeWithBenchmarking(Benchmark.java:291)
at gate.creole.ConditionalSerialController.runComponent(ConditionalSerialController.java:154)
at gate.creole.SerialController.executeImpl(SerialController.java:153)
at gate.creole.ConditionalSerialAnalyserController.executeImpl(ConditionalSerialAnalyserController.java:129)
at gate.creole.AbstractController.execute(AbstractController.java:75)
at gate.util.Benchmark.executeWithBenchmarking(Benchmark.java:291)
at gate.gui.SerialControllerEditor$RunAction$1.run(SerialControllerEditor.java:1619)
at java.lang.Thread.run(Unknown Source)
gate.creole.ExecutionException: gate.util.InvalidOffsetException
at minipar.Minipar.runMinipar(Minipar.java:491)
at minipar.Minipar.execute(Minipar.java:527)
at gate.util.Benchmark.executeWithBenchmarking(Benchmark.java:291)
at gate.creole.ConditionalSerialController.runComponent(ConditionalSerialController.java:154)
at gate.creole.SerialController.executeImpl(SerialController.java:153)
at gate.creole.ConditionalSerialAnalyserController.executeImpl(ConditionalSerialAnalyserController.java:129)
at gate.creole.AbstractController.execute(AbstractController.java:75)
at gate.util.Benchmark.executeWithBenchmarking(Benchmark.java:291)
at gate.gui.SerialControllerEditor$RunAction$1.run(SerialControllerEditor.java:1619)
at java.lang.Thread.run(Unknown Source)
Caused by: gate.util.InvalidOffsetException
at gate.annotation.AnnotationSetImpl.getNodes(AnnotationSetImpl.java:773)
at gate.annotation.AnnotationSetImpl.add(AnnotationSetImpl.java:802)
at minipar.Minipar.runMinipar(Minipar.java:419)
... 9 more
gate.creole.ExecutionException: Document doesn't have sentence annotations. please run tokenizer, sentence splitter and then Minipar
at minipar.Minipar.saveGateSentences(Minipar.java:194)
at minipar.Minipar.execute(Minipar.java:525)
at gate.util.Benchmark.executeWithBenchmarking(Benchmark.java:291)
at gate.creole.ConditionalSerialController.runComponent(ConditionalSerialController.java:154)
at gate.creole.SerialController.executeImpl(SerialController.java:153)
at gate.creole.ConditionalSerialAnalyserController.executeImpl(ConditionalSerialAnalyserController.java:129)
at gate.creole.AbstractController.execute(AbstractController.java:75)
at gate.util.Benchmark.executeWithBenchmarking(Benchmark.java:291)
at gate.gui.SerialControllerEditor$RunAction$1.run(SerialControllerEditor.java:1619)
at java.lang.Thread.run(Unknown Source)
我要再次感谢 Ian 的热情支持。
马特