8

I'm wondering if Scalas/Haskells parser combinators are sufficient for parsing a programming language. More specifically the language MiniJava. I'm currently reading compiller construction and jflex and java cup is quite painful to work with so I'm wondering if I could/should use parser combinators instead. The MiniJava syntax is very small. MiniJavas BNF: http://www.cambridge.org/us/features/052182060X/grammar.html

4

6 回答 6

11

I've never used Scala, but the existence of a definitive BNF makes this easy.

Trivially translated into Haskell's Text.ParserCombinators.Parsec:

goal = do c <- mainClass
          cs <- many classDeclaration
          eof
          return $ c:cs
mainClass = do token "class"
               name <- identifier
               ...

etc. The PArrows translation is pretty trivial too. You'll probably find it easier to have a distinct lexing phase before the parser, but you can do without too.

于 2009-01-29T02:06:58.570 回答
6

I'm using Scala's parser combinators to parse PL/SQL code, it works like a charm.

于 2009-01-29T20:52:52.067 回答
5

At least Parsec has built-in lexer for Java-like languages:

lexer = makeTokenParser javaStyle

You have to define the reserved words yourself.

于 2009-01-29T08:50:27.850 回答
4

Scala's parser is a backtracking parser, so it can deal with pretty much any BNF or EBNF. It also means, though, that there are edge cases where input can be painfully slow to be read.

If the grammar can be changed into an LL(1) grammar, you can use the ~! operator to keep backtracking to a minimum.

The grammar probably CAN be turned into LL(1), but, as written, it is not. See, for instance, that Expression and Statement have First/First conflicts (look this up at the end of the linked article).

Anyway, for an academic project, it is enough. For real life compiler stuff, you'll need faster parsers.

于 2009-07-08T21:58:58.487 回答
2

Programming in Scala (p. 647) says:

It [Scala's parser combinator framework] is much easier to understand and to adapt than a parser generator, and the difference in speed would often not matter in practice, unless you want to parse very large inputs.

As I would not classify source code as very large input (ideally), it should be sufficient.

于 2009-01-28T23:07:31.397 回答
0

I haven't dealt with the Scala or Haskell parser combinator libraries, but it looks like the grammar should be fine.

于 2009-01-28T23:05:07.470 回答