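I get an exception when running a UIMA application on Tomcat 7.0.
Steps:
1) Created a Dynamic Web Project
2) Added the UIMA nature to the project ("Add UIMA Nature")
3) Created primitive analysis engines (AEs). Each primitive AE has an annotation type and an annotator (following the example here)
Snippets from the link:
ZipCode.xml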
<?xml version="1.0" encoding="UTF-8"?>
<typeSystemDescription xmlns="http://uima.apache.org/resourceSpecifier">
<name>ZipCode</name>
<description>Defines the zipcode type</description>
<version>1.0</version>
<vendor>MyCompany, Inc.</vendor>
<types>
<typeDescription>
<name>com.mycompany.myapp.uima.annotators.zipcode.ZipCodeAnnotation</name>
<description>ZipCode</description>
<supertypeName>uima.tcas.Annotation</supertypeName>
</typeDescription>
</types>
</typeSystemDescription>
ZipCodeAE.xml
<?xml version="1.0" encoding="UTF-8" ?>
<analysisEngineDescription xmlns="http://uima.apache.org/resourceSpecifier">
<frameworkImplementation>org.apache.uima.java</frameworkImplementation>
<primitive>true</primitive>
<annotatorImplementationName>
com.mycompany.myapp.uima.annotators.zipcode.ZipCodeAnnotator
</annotatorImplementationName>
<analysisEngineMetaData>
<name>Zip Code Annotator</name>
<description>Recognize and annotate zip code in text</description>
<version>1.0</version>
<vendor>MyCompany, Inc.</vendor>
<configurationParameters></configurationParameters>
<configurationParameterSettings></configurationParameterSettings>
<typeSystemDescription>
<imports>
<import location="ZipCode.xml"/>
</imports>
</typeSystemDescription>
<typePriorities></typePriorities>
<fsIndexCollection></fsIndexCollection>
<capabilities>
<capability>
<inputs></inputs>
<outputs>
<type>com.mycompany.myapp.uima.annotators.zipcode.ZipCodeAnnotation</type>
</outputs>
<languagesSupported></languagesSupported>
</capability>
</capabilities>
<operationalProperties>
<modifiesCas>true</modifiesCas>
<multipleDeploymentAllowed>true</multipleDeploymentAllowed>
<outputsNewCASes>false</outputsNewCASes>
</operationalProperties>
</analysisEngineMetaData>
<externalResourceDependencies></externalResourceDependencies>
<resourceManagerConfiguration></resourceManagerConfiguration>
</analysisEngineDescription>
ZipCodeAnnotator.java
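import java.util.regex.Matcher;
import java.util.regex.Pattern;
import org.apache.uima.analysis_component.JCasAnnotator_ImplBase;
import org.apache.uima.analysis_engine.AnalysisEngineProcessException;
import org.apache.uima.jcas.JCas;

// Annotates every zip code (5 digits, optionally followed by -4 digits) in the document text.
public class ZipCodeAnnotator extends JCasAnnotator_ImplBase {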
private Pattern zipCodePattern = Pattern.compile("\\d{5}(-\\d{4})*");
@Override
public void process(JCas jCAS) throws AnalysisEngineProcessException {
String text = jCAS.getDocumentText();
Matcher matcher = zipCodePattern.matcher(text);
int pos = 0;
while (matcher.find(pos)) {
ZipCodeAnnotation annotation = new ZipCodeAnnotation(jCAS);
annotation.setBegin(matcher.start());
annotation.setEnd(matcher.end());
annotation.addToIndexes();
pos = matcher.end();
}
}
}
Test case
public class AddressAnnotatorTest {
private final String[] TEST_STRINGS = new String[] {
"Dr Goldwater, University of Michigan, Ann Arbor, MI 01234",
"Microsoft, 1 Microsoft Way, Redmond, WA",
"Apple, 1 Infinite Loop, Cupertino, CA 95014",
"IBM, 1 New Orchard Road, Armonk, NY 10504",
"Google, 1600 Amphitheater Parkway, Mountain View, CA 94043",
"Healthline, 600 3rd Street, San Francisco, CA 94107",
"Jane Doe, Lake Tahoe, California",
"Miss Liberty, Empire State Building, New York, NY"
};
@Test
public void testAddressAE() throws Exception {
AnalysisEngine ae = TestUtils.getAE(
"src/main/java/com/mycompany/myapp/uima/annotators/aggregates/AddressAE.xml",
null);
for (String text : TEST_STRINGS) {
JCas jcas = TestUtils.runAE(ae, text);
TestUtils.printResults(jcas);
}
}
}
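If I test this application through the JUnit test case (as in the code above), it runs successfully without any exception. But when I run it on Tomcat, through a simple XHTML page with a button, clicking the button throws an exception.
Here are the code snippets: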
XHTML
<p:commandButton id="create" value="Print Result"
actionListener="#{demoClass.testZipAE}" ajax="true">
</p:commandButton>
DemoClass.java
@ManagedBean(name = "demoClass")
@SessionScoped()
public class DemoClass implements Serializable {
public AnalysisEngine getAE(
String descriptor, Map<String,Object> params)
throws IOException, InvalidXMLException,
ResourceInitializationException {
AnalysisEngine ae = null;
try {
XMLInputSource in = new XMLInputSource(ClassLoader.getSystemResourceAsStream(descriptor), null);
AnalysisEngineDescription desc =
UIMAFramework.getXMLParser().
parseAnalysisEngineDescription(in);
if (params != null) {
for (String key : params.keySet()) {
desc.getAnalysisEngineMetaData().
getConfigurationParameterSettings().
setParameterValue(key, params.get(key));
}
}
ae = UIMAFramework.produceAnalysisEngine(desc);
} catch (Exception e) {
throw new ResourceInitializationException(e);
}
return ae;
}
public JCas runAE(AnalysisEngine ae, String text)
throws AnalysisEngineProcessException,
ResourceInitializationException {
JCas jcas = ae.newJCas();
jcas.setDocumentText(text);
ProcessTrace trace = ae.process(jcas);
for (ProcessTraceEvent evt : trace.getEvents()) {
if (evt != null && evt.getResultMessage() != null &&
evt.getResultMessage().contains("error")) {
throw new AnalysisEngineProcessException(
new Exception(evt.getResultMessage()));
}
}
return jcas;
}
public void printResults(JCas jcas) {
FSIndex index = jcas.getAnnotationIndex();
for (Iterator<Annotation> it = index.iterator(); it.hasNext(); ) {
Annotation annotation = it.next();
List<Feature> features = new ArrayList<Feature>();
if (annotation.getType().getName().contains("com.mycompany")) {
features = annotation.getType().getFeatures();
}
List<String> fasl = new ArrayList<String>();
for (Feature feature : features) {
if (feature.getName().contains("com.mycompany")) {
String name = feature.getShortName();
String value = annotation.getStringValue(feature);
fasl.add(name + "=\"" + value + "\"");
}
}
System.out.println(
annotation.getType().getShortName() + ": " +
annotation.getCoveredText() + " " +
(fasl.size() > 0 ? StringUtils.join(fasl.iterator(), ",") : "") + " " +
annotation.getBegin() + ":" + annotation.getEnd());
}
System.out.println("==");
}
public void testZipAE(ActionEvent event) throws AnalysisEngineProcessException, ResourceInitializationException {
AnalysisEngine ae;
try {
ae = getAE(
"TestAE.xml",
null);
for (String text : TEST_STRINGS) {
JCas jcas = runAE(ae, text);
printResults(jcas);
}
} catch (InvalidXMLException e) {
// TODO Auto-generated catch block
e.printStackTrace();
} catch (ResourceInitializationException e) {
// TODO Auto-generated catch block
e.printStackTrace();
} catch (IOException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
}
}
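One thing that differs between the JUnit run and the Tomcat run is the class loader. getAE above reads the descriptor through ClassLoader.getSystemResourceAsStream, and under Tomcat the web application's classes and resources are loaded by a separate webapp class loader, so the two lookups may not see the same files. The following is only a minimal diagnostic sketch, assuming TestAE.xml is on the web application's classpath; the class DescriptorLookupCheck and its methods are hypothetical names, not part of the project:

```java
import java.io.IOException;
import java.io.InputStream;

// Hypothetical helper (not part of the original project): reports which
// class loaders can resolve a resource name such as "TestAE.xml".
public class DescriptorLookupCheck {

    // Closes the stream if it was found and reports whether it existed.
    private static boolean exists(InputStream in) {
        if (in == null) {
            return false;
        }
        try {
            in.close();
        } catch (IOException ignored) {
            // Nothing useful to do if close fails in a diagnostic.
        }
        return true;
    }

    // True if the system (application) class loader finds the resource.
    public static boolean visibleToSystemLoader(String name) {
        return exists(ClassLoader.getSystemResourceAsStream(name));
    }

    // True if the current thread's context class loader finds the resource.
    public static boolean visibleToContextLoader(String name) {
        ClassLoader cl = Thread.currentThread().getContextClassLoader();
        return cl != null && exists(cl.getResourceAsStream(name));
    }

    public static void main(String[] args) {
        // Assumption: the descriptor name used in getAE above.
        String name = args.length > 0 ? args[0] : "TestAE.xml";
        System.out.println("system loader:  " + visibleToSystemLoader(name));
        System.out.println("context loader: " + visibleToContextLoader(name));
    }
}
```

Calling the two methods from the managed bean and logging the results would show whether the descriptor is actually found when running inside Tomcat.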
TestAE.xml
<?xml version="1.0" encoding="UTF-8"?>
<analysisEngineDescription xmlns="http://uima.apache.org/resourceSpecifier">
<frameworkImplementation>org.apache.uima.java</frameworkImplementation>
<primitive>false</primitive>
<delegateAnalysisEngineSpecifiers>
<delegateAnalysisEngine key="ZipCodeAE">
<import location="ZipCodeAE.xml"/>
</delegateAnalysisEngine>
</delegateAnalysisEngineSpecifiers>
<analysisEngineMetaData>
<name>TestAE</name>
<description>Runs the delegate AEs together</description>
<version>1.0</version>
<configurationParameters searchStrategy="language_fallback"/>
<configurationParameterSettings/>
<flowConstraints>
<fixedFlow>
<node>ZipCodeAE</node>
</fixedFlow>
</flowConstraints>
<typePriorities/>
<fsIndexCollection/>
<capabilities>
<capability>
<inputs/>
<outputs/>
<languagesSupported/>
</capability>
</capabilities>
<operationalProperties>
<modifiesCas>true</modifiesCas>
<multipleDeploymentAllowed>true</multipleDeploymentAllowed>
<outputsNewCASes>false</outputsNewCASes>
</operationalProperties>
</analysisEngineMetaData>
<resourceManagerConfiguration/>
</analysisEngineDescription>
Exception
org.apache.uima.resource.ResourceInitializationException: Invalid descriptor at <unknown source>.
at com.acn.hps.alpes.gi.cip.jsf.AnalyzeFile.getAE(AnalyzeFile.java:208)
at com.acn.hps.alpes.gi.cip.jsf.AnalyzeFile.testAddressAE(AnalyzeFile.java:145)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.el.parser.AstValue.invoke(AstValue.java:278)
at org.apache.el.MethodExpressionImpl.invoke(MethodExpressionImpl.java:274)
at com.sun.faces.facelets.el.TagMethodExpression.invoke(TagMethodExpression.java:105)
at javax.faces.component.MethodBindingMethodExpressionAdapter.invoke(MethodBindingMethodExpressionAdapter.java:88)
at com.sun.faces.application.ActionListenerImpl.processAction(ActionListenerImpl.java:101)
at javax.faces.component.UICommand.broadcast(UICommand.java:315)
at javax.faces.component.UIViewRoot.broadcastEvents(UIViewRoot.java:794)
at javax.faces.component.UIViewRoot.processApplication(UIViewRoot.java:1259)
at com.sun.faces.lifecycle.InvokeApplicationPhase.execute(InvokeApplicationPhase.java:81)
at com.sun.faces.lifecycle.Phase.doPhase(Phase.java:101)
at com.sun.faces.lifecycle.LifecycleImpl.execute(LifecycleImpl.java:118)
at javax.faces.webapp.FacesServlet.service(FacesServlet.java:593)
at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:305)
at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:210)
at org.primefaces.webapp.filter.FileUploadFilter.doFilter(FileUploadFilter.java:77)
at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:243)
at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:210)
at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:222)
at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:123)
at org.apache.catalina.authenticator.AuthenticatorBase.invoke(AuthenticatorBase.java:472)
at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:168)
at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:99)
at org.apache.catalina.valves.AccessLogValve.invoke(AccessLogValve.java:929)
at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:118)
at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:407)
at org.apache.coyote.http11.AbstractHttp11Processor.process(AbstractHttp11Processor.java:1002)
at org.apache.coyote.AbstractProtocol$AbstractConnectionHandler.process(AbstractProtocol.java:585)
at org.apache.tomcat.util.net.JIoEndpoint$SocketProcessor.run(JIoEndpoint.java:312)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:662)
Caused by: org.apache.uima.util.InvalidXMLException: Invalid descriptor at <unknown source>.
at org.apache.uima.util.impl.XMLParser_impl.parse(XMLParser_impl.java:194)
at org.apache.uima.util.impl.XMLParser_impl.parseAnalysisEngineDescription(XMLParser_impl.java:492)
at org.apache.uima.util.impl.XMLParser_impl.parseAnalysisEngineDescription(XMLParser_impl.java:473)
at com.acn.hps.alpes.gi.cip.jsf.AnalyzeFile.getAE(AnalyzeFile.java:198)
... 36 more
Caused by: org.xml.sax.SAXParseException: Content is not allowed in prolog.
at org.apache.xerces.parsers.AbstractSAXParser.parse(Unknown Source)
at org.apache.xerces.jaxp.SAXParserImpl$JAXPSAXParser.parse(Unknown Source)
at org.apache.uima.util.impl.XMLParser_impl.parse(XMLParser_impl.java:177)
... 39 more
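I initially thought the exception was thrown because of some invalid character in the XML, so I followed this link (XML - Data At Root Level is Invalid) and made sure the XML files are encoded without a BOM (byte order mark). Unfortunately, that did not help.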
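For reference, the BOM check can be sketched roughly as below. "Content is not allowed in prolog" is reported when the parser sees anything other than whitespace before the XML declaration, and a UTF-8 BOM (bytes EF BB BF) is one common cause. This is only an illustrative sketch: the class name BomCheck is hypothetical, and it checks a file on disk, which is an assumption about how the descriptor would be inspected.

```java
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical helper (not part of the original project): detects a UTF-8
// byte order mark at the start of a file such as TestAE.xml.
public class BomCheck {

    // True if the given bytes start with the UTF-8 BOM EF BB BF.
    public static boolean startsWithUtf8Bom(byte[] data) {
        return data != null
                && data.length >= 3
                && (data[0] & 0xFF) == 0xEF
                && (data[1] & 0xFF) == 0xBB
                && (data[2] & 0xFF) == 0xBF;
    }

    public static void main(String[] args) throws IOException {
        if (args.length == 0) {
            System.out.println("usage: java BomCheck <path-to-xml>");
            return;
        }
        InputStream in = new FileInputStream(args[0]);
        try {
            // read() returns -1 at end of file; the cast turns that into
            // 0xFF, which never matches any BOM byte.
            byte[] head = { (byte) in.read(), (byte) in.read(), (byte) in.read() };
            System.out.println(args[0] + " starts with UTF-8 BOM: "
                    + startsWithUtf8Bom(head));
        } finally {
            in.close();
        }
    }
}
```

Running it against each descriptor (ZipCode.xml, ZipCodeAE.xml, TestAE.xml) reports whether a BOM is present in that file.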
Any ideas on how to get rid of this exception?