
I've got a silly situation in my parsec parsers that I would like your help on.

I need to parse a sequence of strings / chars that are separated by | characters. So, we could have a|b|'c'|'abcd'

which should be turned into

[a,b,c,abcd]

Spaces are not allowed unless they appear inside a ' ' string. With my naïve attempt, I can parse input like a'a|'bb' to [a'a,bb], but not aa|'b'b' to [aa,b'b].

singleQuotedChar :: Parser Char
singleQuotedChar = noneOf "'" <|> try (string "''" >> return '\'')

simpleLabel = do
    whiteSpace haskelldef
    lab <- many1 (noneOf "|")
    return $ lab

quotedLabel = do
    whiteSpace haskelldef
    char '\''
    lab <- many singleQuotedChar
    char '\''
    return $ lab

Now, how do I tell the parser to treat a ' as the closing ' only if it is followed by a | or white space? (Or otherwise bring some kind of ' counting into this.) The input is user generated, so I cannot rely on users escaping quotes as \'.


1 Answer


Note that allowing quotes in the middle of a quote-delimited string can be very confusing, but I believe this should let you parse it.

quotedLabel = do -- reads the first quote.
    whiteSpace haskelldef -- same token parser as in the question
    char '\''
    quotedLabel2

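-- lttrace is a debug-tracing helper that is not shown here; presumably it
-- logs its first argument and returns the second unchanged.
quotedLabel2 = do -- reads the string and the finishing quote.
    lab <- many singleQuotedChar
    try  (do more <- quotedLabel3
             return $ lttrace "quotedLabel2" (lab ++ more))
     <|> (do char '\''
             return $ lttrace "quotedLabel2" lab)
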
quotedLabel3 = do -- handle middle quotes
    char '\''
    lookAhead $ noneOf ['|']
    ret <- quotedLabel2
    return $ lttrace "quotedLabel3" $ "'" ++ ret
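
For comparison, here is a minimal self-contained sketch of the rule asked for in the question: a ' only closes the string when it is followed by |, white space, or the end of the input. This is not the code above but an alternative built on manyTill and lookAhead. The names closingQuote, label, labels and parseLabels are made up for illustration, and plain spaces stands in for whiteSpace haskelldef, so adapt it to your own token parser.

```haskell
import Data.Char (isSpace)
import Text.Parsec
import Text.Parsec.String (Parser)

-- A quote counts as closing only if '|', white space, or end of input follows.
-- 'try' undoes the consumed quote when the lookahead fails.
closingQuote :: Parser ()
closingQuote = try (char '\'' *> lookAhead (eof <|> () <$ (char '|' <|> space)))

-- Opening quote, then any characters (including inner quotes) up to the
-- first quote that qualifies as closing.
quotedLabel :: Parser String
quotedLabel = char '\'' *> manyTill anyChar closingQuote

-- Unquoted label: one or more characters that are neither '|' nor white space.
simpleLabel :: Parser String
simpleLabel = many1 (satisfy (\c -> c /= '|' && not (isSpace c)))

-- A label with optional surrounding white space.
label :: Parser String
label = spaces *> (quotedLabel <|> simpleLabel) <* spaces

-- One or more labels separated by '|', covering the whole input.
labels :: Parser [String]
labels = (label `sepBy1` char '|') <* eof

-- Hypothetical convenience wrapper for trying it out in GHCi.
parseLabels :: String -> Either ParseError [String]
parseLabels = parse labels ""
```

With this sketch, parseLabels "aa|'b'b'" gives Right ["aa","b'b"] and parseLabels "a'a|'bb'" gives Right ["a'a","bb"]. The ambiguity mentioned above remains: a quoted string containing a ' directly followed by a space or | (for example 'it' s') is cut off at that quote, which is inherent in the rule itself.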
answered 2014-07-27T14:32:46.863