
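Can anyone guide me on how to create a custom JAPE file and configure it using the GATE source code? I tried the following code and got exceptions such as "Error while parsing the grammar :" and "Neither grammarURL or binaryGrammarURL parameters are set!".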

    try{
        Document doc = new DocumentImpl();
        String str = "This is test.";
        DocumentContentImpl impl = new DocumentContentImpl(str);
        doc.setContent(impl);
        System.setProperty("gate.home", "C:\\Program Files\\GATE_Developer_7.1");
        Gate.init();
        gate.Corpus corpus = (Corpus) Factory
            .createResource("gate.corpora.CorpusImpl");
        File gateHome = Gate.getGateHome();
        File pluginsHome = new File(gateHome, "plugins");
        Gate.getCreoleRegister().registerDirectories(new File(pluginsHome, "ANNIE").toURI().toURL());

        Transducer transducer = new Transducer();
        transducer.setDocument(doc);
        transducer.setGrammarURL(new URL("file:///D:/misc_workspace/gate-7.1-build4485-SRC/plugins/ANNIE/resources/NE/SportsCategory.jape"));
        transducer.setBinaryGrammarURL(new URL("file:///D:/misc_workspace/gate-7.1-build4485-SRC/plugins/ANNIE/resources/NE/SportsCategory.jape"));

        LanguageAnalyser jape = (LanguageAnalyser)Factory.createResource(
            "gate.creole.Transducer", gate.Utils.featureMap(
                "grammarURL", "D:/misc_workspace/gate-7.1-build4485-SRC/plugins/ANNIE/resources/NE/SportsCategory.jape",
                "encoding", "UTF-8"));

3 Answers


You need to load the ANNIE plugin:

Gate.getCreoleRegister().registerDirectories(
  new File(Gate.getPluginsHome(), "ANNIE").toURI().toURL());

and then create an instance of gate.creole.Transducer with the appropriate parameters:

LanguageAnalyser jape = (LanguageAnalyser)Factory.createResource(
  "gate.creole.Transducer", gate.Utils.featureMap(
      "grammarURL", new URL("file:///D:/path/to/my-grammar.jape"),
      "encoding", "UTF-8")); // ensure this matches the file

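However, the approach we normally advocate is to assemble and configure your whole pipeline in GATE Developer the way you want it, using whatever standard components you need along with your own grammars, and then save the application state to a file. You can then reload the whole application from your code with a single line: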

CorpusController app = (CorpusController) PersistenceManager.loadObjectFromFile(savedAppFile);

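Edit: The code you added to the question has several fundamental problems. First, you must call Gate.init() before you do anything else with GATE; in particular, it has to come before you create a Document. Second, you must never call the constructor of a Resource class directly; always use Factory. Likewise, you never need to call init() directly, because that is done as part of Factory.createResource. For example: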

// initialise GATE
Gate.setGateHome(new File("C:\\Program Files\\GATE_Developer_7.1"));
Gate.init();

// load ANNIE plugin - you must do this before you can create tokeniser
// or JAPE transducer resources.
Gate.getCreoleRegister().registerDirectories(
   new File(Gate.getPluginsHome(), "ANNIE").toURI().toURL());

// Build the pipeline
SerialAnalyserController pipeline =
  (SerialAnalyserController)Factory.createResource(
     "gate.creole.SerialAnalyserController");
LanguageAnalyser tokeniser = (LanguageAnalyser)Factory.createResource(
     "gate.creole.tokeniser.DefaultTokeniser");
LanguageAnalyser jape = (LanguageAnalyser)Factory.createResource(
  "gate.creole.Transducer", gate.Utils.featureMap(
      "grammarURL", new File("D:\\path\\to\\my-grammar.jape").toURI().toURL(),
      "encoding", "UTF-8")); // ensure this matches the file
pipeline.add(tokeniser);
pipeline.add(jape);

// create document and corpus
Corpus corpus = Factory.newCorpus(null);
Document doc = Factory.newDocument("This is test.");
corpus.add(doc);
pipeline.setCorpus(corpus);

// run it
pipeline.execute();

// extract results
System.out.println("Found annotations of the following types: " +
  doc.getAnnotations().getAllTypes());
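
One detail in the example above is worth spelling out: grammarURL is built with new File(...).toURI().toURL() rather than from a bare path like the one passed as grammarURL in the question. The following is a minimal, JDK-only sketch of the difference (no GATE classes are involved). The class name GrammarUrlSketch, its two helper methods, and the relative path "grammars/my grammar.jape" are hypothetical and chosen purely for illustration.

```java
import java.io.File;
import java.net.MalformedURLException;
import java.net.URL;

// Hypothetical helper class, not part of GATE.
class GrammarUrlSketch {

    /** Converts a local file path to a file: URL; File.toURI() escapes spaces. */
    public static URL toGrammarUrl(String path) {
        try {
            return new File(path).toURI().toURL();
        } catch (MalformedURLException e) {
            throw new IllegalArgumentException("Not a usable path: " + path, e);
        }
    }

    /** Returns true if java.net.URL accepts the string as a URL. */
    public static boolean isParsableUrl(String spec) {
        try {
            new URL(spec);
            return true;
        } catch (MalformedURLException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // The bare path from the question: "D:" is read as a URL protocol.
        String bare = "D:/misc_workspace/gate-7.1-build4485-SRC/plugins/ANNIE/resources/NE/SportsCategory.jape";
        System.out.println(isParsableUrl(bare)); // prints false

        // An explicit file: URL parses without error.
        System.out.println(isParsableUrl("file:///D:/path/to/my-grammar.jape")); // prints true

        // A (hypothetical) relative path becomes an absolute file: URL,
        // with the space escaped as %20.
        System.out.println(toGrammarUrl("grammars/my grammar.jape"));
    }
}
```

Going through File first is probably the safer habit when the grammar lives on the local disk, since it sidesteps both the drive-letter problem and manual escaping. On Java 20 and later the URL(String) constructor is deprecated, so the compiler may print a deprecation note for this sketch.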

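If you haven't already, I strongly recommend working through the training course materials, at least up to module 5, which will show you the correct way to load documents and run processing resources over them.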

answered 2013-02-23T18:10:23.153

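Thanks, Ian. The training course materials are helpful. But my problem was different, and I have solved it. The following code snippet shows how to use a custom JAPE file in GATE. My custom JAPE file is now able to generate new annotations.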

// initialise GATE
System.setProperty("gate.home", "C:\\Program Files\\GATE_Developer_7.1");
Gate.init();

// load the ANNIE plugin before creating the tokeniser or the transducer
File gateHome = Gate.getGateHome();
File pluginsHome = new File(gateHome, "plugins");
Gate.getCreoleRegister().registerDirectories(new File(pluginsHome, "ANNIE").toURI().toURL());

ProcessingResource token = (ProcessingResource) Factory.createResource(
    "gate.creole.tokeniser.DefaultTokeniser", Factory.newFeatureMap());

// create document and corpus
String str = "This is a test. Myself Abhijit Nag sport";
Document doc = Factory.newDocument(str);
gate.Corpus corpus = (Corpus) Factory.createResource("gate.corpora.CorpusImpl");
corpus.add(doc);

// create the transducer for the custom grammar
LanguageAnalyser jape = (LanguageAnalyser) Factory.createResource(
    "gate.creole.Transducer", gate.Utils.featureMap(
        "grammarURL", new URL("file:///D:/misc_workspace/gate-7.1-build4485-SRC/plugins/ANNIE/resources/NE/SportsCategory.jape"),
        "encoding", "UTF-8"));

// build and run the pipeline: tokeniser first, then the custom grammar
SerialAnalyserController pipeline = (SerialAnalyserController) Factory.createResource(
    "gate.creole.SerialAnalyserController",
    Factory.newFeatureMap(), Factory.newFeatureMap(), "ANNIE");
initAnnie(); // helper from the poster's own code, not shown in this answer
pipeline.setCorpus(corpus);
pipeline.add(token);
pipeline.add(jape); // already initialised by Factory.createResource
pipeline.execute();

AnnotationSet ann = doc.getAnnotations();
System.out.println(" ...Total annotation " + ann.getAllTypes());
answered 2013-02-25T20:29:10.797

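If you want to update the ANNIE pipeline, here is another option.

  1. First get the list of default/existing processing resources in the pipeline.
  2. Create an instance of your JAPE rule.
  3. Iterate over the list of existing processing resources, adding each one to a new collection. Add your own custom JAPE rule to this collection.
  4. When you execute the ANNIE pipeline, your JAPE rule will be picked up automatically, so there is no need to specify a document path or execute it separately.

Sample code: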

// load the saved ANNIE application
File pluginsHome = Gate.getPluginsHome();
File anniePlugin = new File(pluginsHome, "ANNIE");
File annieGapp = new File(anniePlugin, "ANNIE_with_defaults.gapp");
CorpusController annieController = (CorpusController) PersistenceManager.loadObjectFromFile(annieGapp);

// create the custom JAPE transducer; Factory.createResource already initialises it
LanguageAnalyser jape = (LanguageAnalyser) Factory.createResource(
    "gate.creole.Transducer", gate.Utils.featureMap(
        "grammarURL", new File("C:\\Program Files\\gate-7.1\\plugins\\ANNIE\\resources\\NE\\opensource.jape").toURI().toURL(),
        "encoding", "UTF-8"));

// copy the existing processing resources and append the custom transducer
Collection<ProcessingResource> newPRS = new ArrayList<ProcessingResource>();
Collection<ProcessingResource> prs = annieController.getPRs();
for (ProcessingResource resource : prs) {
    newPRS.add(resource);
}
newPRS.add(jape);
annieController.setPRs(newPRS);
answered 2014-02-15T14:56:13.520