
I am getting this error: edu.stanford.nlp.ling.tokensregex.parser.TokenMgrError: Encountered: "\'", after : ""

I am using the latest Stanford CoreNLP release (2016-10-31). Here is my code:

  static MaxentTagger tagger = new MaxentTagger("C:/Users/Sam/Desktop/stanford-corenlp-full-2016-10-31/stanford-corenlp-full-2016-10-31/edu/stanford/nlp/models/pos-tagger/english-left3words/english-left3words-distsim.tagger"); 

  tokens = new ArrayList<CoreLabel>();
  properties = new Properties();
  properties.put("annotators", "tokenize, ssplit, pos, lemma, ner, parse, dcoref");
  pipeline = new StanfordCoreNLP(properties);

  this.sentenceFeatures = Main.pipeline.process(textFeatures)
            .get(CoreAnnotations.SentencesAnnotation.class);
  for(CoreMap sentence: this.sentenceFeatures) {
        // **using TokensRegex**
        for (CoreLabel token: sentence.get(TokensAnnotation.class)){ 
            Main.tokens.add(token);            
        }           
        this.p1 = TokenSequencePattern.compile(this.textFeatures);
        this.matcher = p1.getMatcher(Main.tokens);
        for (CoreLabel token: sentence.get(TokensAnnotation.class)) {
            String words = token.get(TextAnnotation.class);
            this.wordsList.add(words);
            String tagged = Main.tagger.tagString(words);
            this.wordsList.add(tagged);
            this.no_of_words++;
            this.no_of_chars += words.length();
        }
  }

where

  textFeatures = "At the western corner of the northern peninsula of Michigan you'll find Porcupine Mountains Wilderness State Park that is a true gem for a great family vacation , and the largest State Park in Michigan too.It has many things to offer to families looking for adventure in the great outdoors.And it's accessible during all the seasons of the year.The Park got its name from the Ojibwa Indians, who named the 'small mountains' thus, because, with their tall trees, they looked like crouching porcupines.Visit in the spring as the leaves begin to bud.Call on it in the summer lambing season.Greet it in the Autumn as the leaves begin to fall.Love it in the winter skying season.Its terrain is wild and rugged where it sits at the shoreline of Lake Superior, and inland where it embraces sixty-thousand acres of towering pines and sturdy birches and natural hemlock forests.The rivers teem with various species of trout as they cut through the forests."

The exception I get is:

Exception in thread "main" java.lang.RuntimeException: When parsing Here are some of Michigan's premier attractions (although of course it's by no means complete) - or intended to be, for that matter.Porcupine Mountains 
At the western corner of the northern peninsula of Michigan you'll find Porcupine Mountains Wilderness State Park that is a true gem for a great family vacation , and the largest State Park in Michigan too.It has many things to offer to families looking for adventure in the great outdoors.And it's accessible during all the seasons of the year.The Park got its name from the Ojibwa Indians, who named the 'small mountains' thus, because, with their tall trees, they looked like crouching porcupines.Visit in the spring as the leaves begin to bud.Call on it in the summer lambing season.Greet it in the Autumn as the leaves begin to fall.Love it in the winter skying season.Its terrain is wild and rugged where it sits at the shoreline of Lake Superior, and inland where it embraces sixty-thousand acres of towering pines and sturdy birches and natural hemlock forests.The rivers teem with various species of trout as they cut through the forests.       edu.stanford.nlp.ling.tokensregex.parser.TokenSequenceParseException: Parsing failed. Error: edu.stanford.nlp.ling.tokensregex.parser.TokenMgrError: Lexical error at line 1, column 26.  Encountered: "\'" (39), after : ""
at edu.stanford.nlp.ling.tokensregex.TokenSequencePattern.compile(TokenSequencePattern.java:192)
at edu.stanford.nlp.ling.tokensregex.TokenSequencePattern.compile(TokenSequencePattern.java:171)

1 Answer


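The string passed to TokenSequencePattern.compile should be the rule you want to compile, not the text in which you want to find the pattern. In your code, this.textFeatures (the paragraph itself) is being parsed as a TokensRegex rule, which is why the parser fails at the first apostrophe.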
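As an analogy only, the same compile-the-rule, match-the-text split can be sketched with the standard java.util.regex classes, which need no CoreNLP jars to run. This is not TokensRegex code: the class name, the helper methods, the sample sentence, and the rule `[A-Z][a-z]+` are all made up for illustration. In the question's code, the corresponding change would be to pass a TokensRegex rule string to TokenSequencePattern.compile and keep handing the token list to getMatcher.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

// Hypothetical demo class; it uses java.util.regex as a stand-in for TokensRegex.
public class CompileRuleNotText {

    // Returns true if Pattern.compile accepts the string as pattern syntax.
    static boolean compiles(String candidate) {
        try {
            Pattern.compile(candidate);
            return true;
        } catch (PatternSyntaxException e) {
            return false;
        }
    }

    // Compiles the rule, then scans the text and collects every match.
    static List<String> findAll(String rule, String text) {
        Matcher matcher = Pattern.compile(rule).matcher(text);
        List<String> hits = new ArrayList<>();
        while (matcher.find()) {
            hits.add(matcher.group());
        }
        return hits;
    }

    public static void main(String[] args) {
        // Made-up prose containing an unbalanced "(", so it is not a valid pattern.
        String text = "Visit Porcupine Mountains (in Michigan.";
        // Made-up rule: a capital letter followed by lowercase letters.
        String rule = "[A-Z][a-z]+";

        // Wrong way round: treating the prose as the pattern fails to compile.
        System.out.println("text compiles as a pattern: " + compiles(text));

        // Right way round: compile the rule, then match it against the text.
        System.out.println("matches: " + findAll(rule, text));
    }
}
```

Compiling the prose fails for the same kind of reason as in the question: the pattern parser meets a character that is not valid at that position in its pattern syntax.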

Here is a general reference for TokensRegex as well:

https://nlp.stanford.edu/software/tokensregex.html

Answered 2017-04-02T00:50:17.533