We are indexing millions of documents. We use Solr 3.1 with Jetty. I enabled logging for Jetty as described here: http://wiki.apache.org/solr/LoggingInDefaultJettySetup
For some full texts we get an exception, which is logged like this:
<record>
<date>2012-09-04T15:55:16</date>
<millis>1346766916578</millis>
<sequence>0</sequence>
<logger>org.apache.solr.core.SolrCore</logger>
<level>SEVERE</level>
<class>org.apache.solr.common.SolrException</class>
<method>log</method>
<thread>10</thread>
<message>java.lang.RuntimeException: [was class java.io.CharConversionException] Invalid UTF-8 character 0xd835(a surrogate character) at char #1144, byte #127)
at com.ctc.wstx.util.ExceptionUtil.throwRuntimeException(ExceptionUtil.java:18)
at com.ctc.wstx.sr.StreamScanner.throwLazyError(StreamScanner.java:731)
at com.ctc.wstx.sr.BasicStreamReader.safeFinishToken(BasicStreamReader.java:3657)
at com.ctc.wstx.sr.BasicStreamReader.getText(BasicStreamReader.java:809)
at org.apache.solr.handler.XMLLoader.readDoc(XMLLoader.java:287)
at org.apache.solr.handler.XMLLoader.processUpdate(XMLLoader.java:146)
at org.apache.solr.handler.XMLLoader.load(XMLLoader.java:77)
at org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:55)
at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:129)
at org.apache.solr.core.SolrCore.execute(SolrCore.java:1360)
at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:356)
at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:252)
at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:399)
at org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:182)
at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:766)
at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:450)
at org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:230)
at org.mortbay.jetty.handler.HandlerCollection.handle(HandlerCollection.java:114)
at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152)
at org.mortbay.jetty.Server.handle(Server.java:326)
</message>
</record>
It would be helpful if the ID of the submitted document were logged as well. How can we do that?
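The only workaround I can think of so far is on the client side: check each update body for well-formed UTF-8 before posting it, and log the document ID there. Below is a minimal sketch of that idea, not something we have running. The class `Utf8PreCheck`, the method `checkBeforeSend`, and the IDs `doc-1` and `doc-2` are hypothetical names for illustration. It uses only the JDK's `CharsetDecoder`, and it does not make Solr itself log the ID. The bad sample is the three-byte encoding of the surrogate 0xd835 from the message above.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
import java.util.logging.Logger;

public class Utf8PreCheck {
    private static final Logger LOG = Logger.getLogger(Utf8PreCheck.class.getName());
    private static final Charset UTF8 = Charset.forName("UTF-8");

    /** Returns true if the bytes decode as well-formed UTF-8. */
    public static boolean isValidUtf8(byte[] bytes) {
        // REPORT makes the decoder throw instead of silently substituting U+FFFD.
        CharsetDecoder decoder = UTF8.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT);
        try {
            decoder.decode(ByteBuffer.wrap(bytes));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    /**
     * Hypothetical client-side hook: call before posting a document to Solr.
     * Logs the document ID when the body would be rejected as invalid UTF-8.
     */
    public static boolean checkBeforeSend(String docId, byte[] xmlBody) {
        if (!isValidUtf8(xmlBody)) {
            LOG.warning("Not sending document id=" + docId
                    + ": body is not well-formed UTF-8");
            return false;
        }
        return true;
    }

    public static void main(String[] args) {
        byte[] good = "<doc>ok</doc>".getBytes(UTF8);
        // A lone surrogate (0xD835) written as three bytes, which is invalid UTF-8.
        byte[] bad = {(byte) 0xED, (byte) 0xA0, (byte) 0xB5};
        System.out.println(checkBeforeSend("doc-1", good));
        System.out.println(checkBeforeSend("doc-2", bad));
    }
}
```

This would at least tell us which documents are affected, but I would still prefer to have the ID in the Solr log itself.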
Thanks!