I'm trying to write a grammar for the SRT subtitle format.
Here is a sample .srt file:
1
00:00:02,218 --> 00:00:04,209
[SHELDON SPEAKING IN MANDARIN]

2
00:00:04,721 --> 00:00:05,745
No, it's:

3
00:00:05,922 --> 00:00:07,913
[SPEAKING IN MANDARIN]

4
00:00:09,392 --> 00:00:11,383
[SPEAKING IN MANDARIN]

5
00:00:13,430 --> 00:00:15,193
What's this?

6
00:00:16,266 --> 00:00:18,029
That's what you did.

7
00:00:18,201 --> 00:00:22,467
I assumed, as in a number of languages,
that the gesture was part of the phrase.

8
00:00:22,639 --> 00:00:25,233
- Well, it's not.
- Why am I supposed to know that?

9
00:00:25,408 --> 00:00:28,900
As teacher, it's your obligation
to separate your personal idiosyncrasies...

10
00:00:29,079 --> 00:00:30,512
...from the subject matter.

11
00:00:31,081 --> 00:00:33,845
- I'm glad you decided to learn Mandarin.
- Why?

326
00:18:56,818 --> 00:19:00,720
Actually, I've heard
far too much about Schrödinger's cat.

327
00:19:01,623 --> 00:19:03,022
Good.

328
00:19:09,131 --> 00:19:11,895
All right, the cat's alive.
Let's go to dinner.

329
00:19:12,000 --> 00:19:15,072
Download Movie Subtitles Searcher from www.OpenSubtitles.org
Here is my ANTLR grammar (v3.4):
grammar Exp;

parse
  : (SUBTITLE)+
  ;

SUBTITLE
  : i=ID NL
    t1=Timestamp SPACE ARROW SPACE t2=Timestamp NL
    txt1 = TEXT
    {
      System.out.println("id="+$i);
      System.out.println("t1= "+$t1);
      System.out.println("t2= "+$t2);
      System.out.println("txt1= "+$txt1);
    }
  ;

TEXT
  : ((TextLine NL NL)|(TextLine NL TextLine NL NL))
  ;

ID
  : DIG+
  ;

ARROW
  : '-->'
  ;

Timestamp
  : DIG DIG ':' DIG DIG ':' DIG DIG ',' DIG DIG DIG
  ;

TextLine
  : ~('\r' | '\n')*
  ;

NL
  : '\r'? '\n'
  | '\r'
  ;

fragment
DIG
  : '0'..'9'
  ;

fragment
SPACE
  : ' ' | '\t'
  ;
My simple test code:
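// Imports this snippet relies on (IOUtils is assumed to be the Apache Commons IO class):
import org.antlr.runtime.ANTLRStringStream;
import org.antlr.runtime.CommonTokenStream;
import org.apache.commons.io.IOUtils;

String input = IOUtils.toString(Test.class.getResourceAsStream("/subtitles.srt"));
ExpLexer lexer = new ExpLexer(new ANTLRStringStream(input));
CommonTokenStream stream = new CommonTokenStream(lexer);
ExpParser parser = new ExpParser(stream);
parser.parse();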
If the file ends with two newlines, almost everything works perfectly. If it doesn't, I get this error:
line 1484:0 no viable alternative at character '<EOF>'
Any suggestions for making my grammar more flexible, so that it accepts one, two, or more newlines at the end of the file?
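The only stopgap I can think of is to pad the input before lexing, so that it always ends with the blank line that the TEXT rule expects. This is just a minimal sketch of that idea, not a fix to the grammar: SrtInput and ensureTrailingBlankLine are hypothetical names of my own, not part of ANTLR, and it assumes that stripping every trailing '\r' and '\n' and appending exactly "\n\n" is acceptable.

```java
// Hypothetical helper (not part of ANTLR): normalizes the end of the input
// so that it always finishes with exactly two '\n' characters.
final class SrtInput {
    private SrtInput() {}

    /**
     * Removes all trailing '\r' and '\n' characters from the input and
     * appends "\n\n", so the last subtitle is followed by a blank line.
     */
    static String ensureTrailingBlankLine(String input) {
        int end = input.length();
        // Walk backwards over any trailing line-break characters.
        while (end > 0) {
            char c = input.charAt(end - 1);
            if (c != '\r' && c != '\n') {
                break;
            }
            end--;
        }
        return input.substring(0, end) + "\n\n";
    }
}
```

With this in place, the lexer would be built from new ANTLRStringStream(SrtInput.ensureTrailingBlankLine(input)) instead of the raw input. I'd still prefer a solution in the grammar itself.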