I have this parser for string literals, written with Haskell's Parsec library.

myStringLiteral = lexeme (
        do str <- between (char '\'')
                          (char '\'' <?> "end of string")
                          (many stringChar)
           return (U.replace "''" "'" (foldr (maybe id (:)) "" str))
        <?> "literal string"
        )

Strings in my language are defined as alphanumeric characters enclosed in single quotes (example: 'this is my string'), but a string can also contain a ' itself. In that case the ' must be escaped by another ' (example: 'this is my string with '' inside of it').

What I need to do is look ahead when a ' appears while parsing a string and decide whether another ' follows it (if not, that is the end of the string). But I don't know how to do it. Any ideas? Thanks!

3 Answers

5

If the syntax is as simple as it seems, you can make a special case for the escaped single quote,

escapeOrStringChar :: Parser Char
escapeOrStringChar = try (string "''" >> return '\'') <|> stringChar

and use that in

myStringLiteral = lexeme $ do
    char '\''
    str <- many escapeOrStringChar
    char '\'' <?> "end of string"
    return str
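
To try the idea in isolation, here is a minimal self-contained sketch. It makes two assumptions that are not in the question: stringChar is taken to be any character other than a single quote (the question's own stringChar is not shown), and the lexeme wrapper is left out so that no token parser has to be set up.

```haskell
import Text.Parsec
import Text.Parsec.String (Parser)

-- Assumption: an ordinary string character is anything except a single quote.
stringChar :: Parser Char
stringChar = noneOf "'"

-- Two consecutive quotes stand for one literal quote. `try` rewinds the
-- input when only a single quote is present, so the closing quote is
-- left for the caller to consume.
escapeOrStringChar :: Parser Char
escapeOrStringChar = try (string "''" >> return '\'') <|> stringChar

-- Assumption: no lexeme wrapper, this parses just the bare literal.
myStringLiteral :: Parser String
myStringLiteral = do
    _   <- char '\''
    str <- many escapeOrStringChar
    _   <- char '\'' <?> "end of string"
    return str
```

Loaded into GHCi, parse myStringLiteral "" "'it''s'" should give Right "it's". The try is what provides the extra lookahead: string "''" consumes the first quote before it notices the second one is missing, and try undoes that consumption.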
answered 2012-03-02T16:50:40.510
0

You can use stringLiteral for this.

answered 2012-11-23T01:51:05.903
-1

Parsec is predictive, that is LL(1), by default: unless you wrap a parser in try, it looks only one symbol ahead at a time. Your language is LL(2). You can write your own FSM for parsing your language, or you can transform the text before parsing to make it LL(1).

In fact, Parsec is designed for syntactic analysis, not lexical analysis. A good approach is to do the lexical analysis with another tool and then use Parsec to parse the sequence of lexemes instead of the sequence of characters.
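
As a rough illustration of the hand-written FSM route, a scanner for a single quoted literal could look like the sketch below. The name scanStringLiteral and its result type are made up for this example, and it assumes that any character other than a single quote is allowed inside the literal.

```haskell
-- Hypothetical hand-written scanner for one quoted literal, using no
-- parser library. On success it returns the decoded string together
-- with the remaining input; on failure it returns Nothing.
scanStringLiteral :: String -> Maybe (String, String)
scanStringLiteral ('\'' : rest) = inside rest
  where
    -- State "inside the literal": a doubled quote is an escaped quote,
    -- a lone quote ends the literal, anything else is copied through.
    inside ('\'' : '\'' : cs) = prepend '\'' (inside cs)
    inside ('\'' : cs)        = Just ("", cs)
    inside (c : cs)           = prepend c (inside cs)
    inside []                 = Nothing  -- unterminated literal

    -- Put one decoded character in front of a successful result.
    prepend c = fmap (\(s, remaining) -> (c : s, remaining))
scanStringLiteral _ = Nothing
```

The two-character pattern ('\'' : '\'' : cs) is where the second symbol of lookahead happens. For example, scanStringLiteral "'a''b' rest" evaluates to Just ("a'b", " rest").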

answered 2012-03-02T16:50:35.750