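- the end tag of one element is already the start tag of the next element
- or: the start tag is not a syntactic element but the current parsing state (so it depends on what you have already "seen" in the input stream)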
Example 1, an entry with a single definition:
wordWithoutSpace [phonetic information]
definition as everything until colon: example sentence until EOF
Example 2, an entry with multiple definitions:
wordWithoutSpace [phonetic information]
1. first definition until colon: example sentence until second definition
2. second definition until colon: example sentence until EOF
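To make the intended structure of these entries concrete, here is a minimal plain-Java sketch (no ANTLR involved) that splits an entry into word, phonetic information, and definition/example pairs. It assumes that a numbered definition always starts at the beginning of a line and that every definition contains a colon; the class name `EntryFormatSketch` and the method `parse` are made up for this illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class EntryFormatSketch {
    // Head of an entry: word up to the first space, then "[...]", then the rest.
    private static final Pattern HEAD =
            Pattern.compile("^(\\S+) \\[([^\\]]*)\\]\\s*(.*)$", Pattern.DOTALL);

    // Assumption: a numbered definition starts at the beginning of a line, e.g. "1. ".
    private static final Pattern SENSE_START = Pattern.compile("(?m)^\\s*\\d+\\.\\s*");

    /** Returns [word, phonetic, definition1, example1, definition2, example2, ...]. */
    public static List<String> parse(String entry) {
        Matcher head = HEAD.matcher(entry);
        if (!head.matches()) {
            throw new IllegalArgumentException("not a dictionary entry: " + entry);
        }
        List<String> out = new ArrayList<>();
        out.add(head.group(1)); // word
        out.add(head.group(2)); // phonetic information without the brackets
        for (String sense : SENSE_START.split(head.group(3))) {
            if (sense.trim().isEmpty()) {
                continue; // split() yields an empty piece before the first "1. "
            }
            int colon = sense.indexOf(':');
            if (colon < 0) {
                throw new IllegalArgumentException("definition without colon: " + sense);
            }
            out.add(sense.substring(0, colon).trim());  // definition: up to the colon
            out.add(sense.substring(colon + 1).trim()); // example: after the colon
        }
        return out;
    }

    public static void main(String[] args) {
        String entry = "wordWithoutSpace [phonetic information]\n"
                + "1. first definition until colon: example sentence until second definition\n"
                + "2. second definition until colon: example sentence until EOF";
        System.out.println(parse(entry));
    }
}
```

The point of the sketch is only that the meaning of each piece depends on what came before it: the same kind of text is a word, phonetic information, a definition, or an example depending on position.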
dictionary-entry :
word = .+ ' ' // catch everything as word until you see a space
phon = '[' .+ ']' // then follows phonetic, which is everything in brackets
(MultipleMeaning | UniqueMeaning)
MultipleMeaning : Int '.' definition= .+ ':' // a MultipleMeaning has a number
// before the definition
UniqueMeaning : definition= .+ ':'
I tried a lexer with gated semantic predicates (ANTLR version: 3.2):
@members {
int cs = 0; // current state
}
@lexer::header {
package main;
}
Word :
{cs==0}?=> .+ ' ' {cs=1;} // in this state everything until
; // Space belongs to the Word, now go to Phon-mode
Phon :
{cs==1}?=> '[' .+ ']' {cs=2;} // everything in brackets is phonetic-information
; // after you have seen this go to next state
MultiDef :
{cs==2}?=> Int '.' .+ ':' {cs=3;}
;
Def :
{cs==2}?=> .+ ':' {cs=3;}
;
Digit :
'0'..'9' // assumed: the rule body is missing from the post
;
Int :
Digit Digit*;
import org.antlr.runtime.ANTLRStringStream;
import org.antlr.runtime.CharStream;
import org.antlr.runtime.Token;
public class TestLexer {
    public static void main(String[] args) {
        String str = "Word [phon]1.definition:";
        CharStream input = new ANTLRStringStream(str);
        DudenLexer lexer = new DudenLexer(input);
        Token token;
        // print every token until the lexer reports end of input
        while ((token = lexer.nextToken()) != Token.EOF_TOKEN) {
            System.out.println("Token: " + token);
        }
    }
}
- I get the error message: line 1:0 rule Def failed predicate: {cs==2}?
- Is this the right approach at all?
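For comparison, the state idea behind `cs` can be written as a small hand-rolled scanner in plain Java. This is only a sketch of the token boundaries I am after, not a fix for the grammar and not how ANTLR works internally; the class name `StatefulScannerSketch`, the method `scan`, and the `Rest` label for leftover input are made up for this illustration.

```java
import java.util.ArrayList;
import java.util.List;

public class StatefulScannerSketch {
    /** Scans input such as "Word [phon]1.definition:" into labelled tokens. */
    public static List<String> scan(String input) {
        List<String> tokens = new ArrayList<>();
        int pos = 0;
        int cs = 0; // current state, same meaning as in the grammar
        while (pos < input.length() && cs < 3) {
            if (cs == 0) {
                // state 0: everything up to the first space is the word
                int end = input.indexOf(' ', pos);
                if (end < 0) throw new IllegalStateException("no space after word");
                tokens.add("Word(" + input.substring(pos, end) + ")");
                pos = end + 1;
                cs = 1;
            } else if (cs == 1) {
                // state 1: everything in brackets is phonetic information
                if (input.charAt(pos) != '[') throw new IllegalStateException("expected '['");
                int end = input.indexOf(']', pos);
                if (end < 0) throw new IllegalStateException("missing ']'");
                tokens.add("Phon(" + input.substring(pos + 1, end) + ")");
                pos = end + 1;
                cs = 2;
            } else {
                // state 2: text up to the colon; a leading "<digits>." marks a MultiDef
                int end = input.indexOf(':', pos);
                if (end < 0) throw new IllegalStateException("missing ':'");
                String text = input.substring(pos, end).trim();
                boolean numbered = text.matches("(?s)\\d+\\..*");
                tokens.add((numbered ? "MultiDef(" : "Def(") + text + ")");
                pos = end + 1;
                cs = 3;
            }
        }
        if (pos < input.length()) {
            // hypothetical label: whatever follows the first definition
            tokens.add("Rest(" + input.substring(pos).trim() + ")");
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(scan("Word [phon]1.definition:"));
    }
}
```

Here the choice between `MultiDef` and `Def` is made after reading up to the colon, whereas in the grammar both rules are guarded by the same predicate `{cs==2}?=>` and both contain `.+`.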