对于这个项目,我分两个阶段进行解析。第一阶段处理 include/ifdef/define 指令并将输入分成[Span]
项目,这些项目在原始输入中定义它们的起点/终点以及正文。然后第二阶段将此流解析为我的 AST 以进行后续处理。
AST 的每个元素都带有它的源位置,并且在解析后捕获的任何语义错误都会打印正确的错误位置,而不管包含深度如何。这部分至关重要,因为它出现在出现问题的阶段之后。
该问题在第二阶段从包含的文件中给出了一个解析错误,它报告了一个虚假错误,其位置位于输入的顶级规则中。初始文件中的解析错误可以正常工作。任何指令的存在甚至会将初始文件分成多个块,因此这不是“单个块”与“多个块”的问题。
鉴于 AST 正在获取正确的位置这一事实,我很难理解 Megaparsec 在遇到解析错误时如何报告错误信息。
我包含了我的流实例和 (set|get)(Position|Input) 代码,因为这些看起来像是相关的位。我觉得一定有一些我没有做的兆秒内务处理,或者我的 Stream 实例由于某种原因无效。
data Span = Span
{ spanStart :: SourcePos
, spanEnd :: SourcePos
, spanBody :: T.Text
} deriving (Eq, Ord, Show)
instance Stream [Span] where
type Token [Span] = Span
type Tokens [Span] = [Span]
tokenToChunk Proxy = pure
tokensToChunk Proxy = id
chunkToTokens Proxy = id
chunkLength Proxy = foldl1 (+) . map (T.length . spanBody)
chunkEmpty Proxy = all ((== 0) . T.length . spanBody)
positionAt1 Proxy pos (Span start _ _) = trace ("pos1" ++ show start) start
positionAtN Proxy pos [] = pos
positionAtN Proxy _ (Span start _ _:_) = trace ("posN" ++ show start) start
advance1 Proxy _ _ (Span _ end _) = end
advanceN Proxy _ pos [] = pos
advanceN Proxy _ _ ts = let Span _ end _ = last ts in end
take1_ [] = Nothing
take1_ s = case takeN_ 1 s of
Nothing -> Nothing
Just (sp, s') -> Just (head sp, s')
takeN_ _ [] = Nothing
takeN_ n s@(t:ts)
| s == [] = Nothing
| n <= 0 = Just ([t {spanEnd = spanStart t, spanBody = ""}], s)
| n < (T.length . spanBody) t = let (l, r) = T.splitAt n (spanBody t)
sL = spanStart t
eL = foldl (defaultAdvance1 (mkPos 3)) sL (T.unpack (T.tail l))
sR = defaultAdvance1 (mkPos 3) eL (T.last l)
eR = spanEnd t
l' = [Span sL eL l]
r' = (Span sR eR r):ts
in Just (trace (show n) l', r')
| n == (T.length . spanBody) t = Just ([t], ts)
| otherwise = case takeN_ (n - T.length (spanBody t)) ts of
Nothing -> Just ([t], [])
Just (t', ts') -> Just (t:t', ts')
takeWhile_ p s = fromJust $ takeN_ (go 0 s) s
where go n s = case take1_ s of
Nothing -> n
Just (c, s') -> if p c
then go (n + 1) s'
else n
找到包含并交换它:
"include" -> do
file <- between dquote dquote (many (alphaNumChar <|> char '.' <|> char '/' <|> char '_'))
s <- liftIO (Data.Text.IO.readFile file)
p <- getPosition
i <- getInput
pushPosition p
stack %= (:) (p, i)
setPosition (initialPos file)
setInput s
如果我们到达输入弹出堆栈的末尾并继续:
parseStream' :: StreamParser [Span]
parseStream' = concat <$> many p
where p = do
b <- tick <|> block
end <- option False (True <$ hidden eof)
h <- use stack
when (end && (h /= [])) $ do
popPosition
setInput (h ^?! ix 0 . _2)
stack %= tail
return b