I am trying to parse a string that can contain escaped characters. Here is an example:
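
    import Control.Applicative (many, (<|>))
    import Data.Attoparsec.Text (Parser, anyChar, char, satisfy)
    import qualified Data.Text as T

    exampleParser :: Parser T.Text
    exampleParser = T.pack <$> many (char '\\' *> escaped <|> anyChar)
      where
        -- Characters that may follow a backslash.
        escaped = satisfy (\c -> c `elem` ['\\', '"', '[', ']'])
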
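The parser above builds a `String` and then packs it into `Text`. Is there a way to parse a string with escapes like the above using the efficient string-handling functions that attoparsec provides, such as `string`, `scan`, `runScanner`, `takeWhile`, ...?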
Parsing something like `"one \"two\" \[three\]"` would produce `one "two" [three]`.
Update:
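Thanks to @epsilonhalbe, I was able to come up with a generalized solution that fits my needs. Note that the following function does not look for matching delimiters such as `[..]`, `".."`, `(..)`, etc. Also, if it finds an invalid escape character, it treats the `\` as a literal character.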
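
    import Control.Applicative (many, (<|>))
    import Data.Attoparsec.Text (Parser, char, satisfy)
    import qualified Data.Attoparsec.Text as Atto
    import Data.Text (Text)
    import qualified Data.Text as T

    takeEscapedWhile :: (Char -> Bool) -> (Char -> Bool) -> Parser Text
    takeEscapedWhile isEscapable while = do
        x <- normal
        xs <- many escaped
        return $ T.concat (x:xs)
      where
        -- A run of unescaped characters, taken in one chunk.
        normal = Atto.takeWhile (\c -> c /= '\\' && while c)
        -- An escaped character (or a literal backslash), then another run.
        escaped = do
          x <- (char '\\' *> satisfy isEscapable) <|> char '\\'
          xs <- normal
          return $ T.cons x xs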
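
To make the behavior described above concrete, here is a small sketch that runs the function on the example from the question. The function body is repeated so the snippet compiles on its own; `escapable` and `main` are hypothetical names introduced only for this illustration, and the snippet assumes the `attoparsec` and `text` packages are available.

```haskell
{-# LANGUAGE OverloadedStrings #-}
module Main (main) where

import Control.Applicative (many, (<|>))
import Data.Attoparsec.Text (Parser, char, parseOnly, satisfy)
import qualified Data.Attoparsec.Text as Atto
import Data.Text (Text)
import qualified Data.Text as T

-- Same definition as in the update above, repeated to keep this self-contained.
takeEscapedWhile :: (Char -> Bool) -> (Char -> Bool) -> Parser Text
takeEscapedWhile isEscapable while = do
    x <- normal
    xs <- many escaped
    return $ T.concat (x:xs)
  where
    normal = Atto.takeWhile (\c -> c /= '\\' && while c)
    escaped = do
      x <- (char '\\' *> satisfy isEscapable) <|> char '\\'
      xs <- normal
      return $ T.cons x xs

-- Hypothetical predicate: the escapable characters from the question.
escapable :: Char -> Bool
escapable c = c `elem` ['\\', '"', '[', ']']

main :: IO ()
main = do
  -- Valid escapes are unescaped.
  print (parseOnly (takeEscapedWhile escapable (const True)) "one \\\"two\\\" \\[three\\]")
  -- Right "one \"two\" [three]"

  -- 'n' is not escapable here, so the backslash is kept as a literal.
  print (parseOnly (takeEscapedWhile escapable (const True)) "a\\nb")
  -- Right "a\\nb"

  -- The second predicate stops at an unescaped quote; escaped quotes pass through.
  print (parseOnly (takeEscapedWhile escapable (/= '"')) "say \\\"hi\\\" now\" rest")
  -- Right "say \"hi\" now"
```

The last call shows why the two predicates are kept separate: `while` decides where an unescaped run ends, whereas `isEscapable` only decides which characters may follow a backslash, so an escaped `"` does not terminate the parse.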