4

作为这个问题的后续,我现在正在尝试解析具有变量和case ... of ...表达式的表达式语言。语法应该是基于缩进的:

  • 表达式可以跨越多行,只要每一行都相对于第一行缩进即可;即这应该被解析为单个应用程序:

    f x y
      z
     q
    
  • 表达式的每个替代项都case需要在自己的行上,相对于case关键字缩进。右侧可以跨越多条线。

    case E of
      C -> x
      D -> f x
       y
    

    应该被解析成一个case有两种选择的单一的,用xf x y作为右手边

我已将代码简化为以下内容:

import qualified Text.Megaparsec.Lexer as L
import Text.Megaparsec hiding (space)
import Text.Megaparsec.Char hiding (space)
import Text.Megaparsec.String
import Control.Monad (void)
import Control.Applicative

data Term = Var String
          | App [Term]
          | Case Term [(String, Term)]
          deriving Show

space :: Parser ()
space = L.space (void spaceChar) empty empty

name :: Parser String
name = try $ do
    s <- some letterChar
    if s `elem` ["case", "of"]
      then fail $ unwords ["Unexpected: reserved word", show s]
      else return s

term :: Parser () -> Parser Term
term sp = App <$> atom `sepBy1` try sp
  where
    atom = choice [ caseBlock
                  , Var <$> L.lexeme sp name
                  ]

    caseBlock = L.lineFold sp $ \sp' ->
        Case <$>
        (L.symbol sp "case" *> L.lexeme sp (term sp) <* L.symbol sp' "of") <*>
        alt sp' `sepBy` try sp' <* sp

    alt sp' = L.lineFold sp' $ \sp'' ->
        (,) <$> L.lexeme sp' name <* L.symbol sp' "->" <*> term sp'' 

如您所见,我正在尝试使用此答案中的技术来将具有比关键字缩进更多alt的 ace 的替代词分开。sp'case

问题

这似乎适用于仅由应用程序组成的单个表达式:

λ» parseTest (L.lineFold space term) "x y\n z"
App [Var "x",Var "y",Var "z"]

它不适用于使用链接答案中的技术列出此类表达式:

λ» parseTest (L.lineFold space $ \sp -> (term sp `sepBy` try sp)) "x\n y\nz"
3:1:
incorrect indentation (got 1, should be greater than 1)

case尝试使用行折叠时,表达式失败了:

λ» parseTest (L.lineFold space term) "case x of\n C -> y\n D -> z"
1:5:
Unexpected: reserved word "case"

case最外层表达式无需折线即可工作,仅适用于一种选择:

λ» parseTest (term space) "case x of\n C -> y\n  z"
App [Case (App [Var "x"]) [("C",App [Var "y",Var "z"])]]

但是case一旦我有多个alt替代品就会失败:

λ» parseTest (term space) "case x of\n C -> y\n D -> z"
3:2:
incorrect indentation (got 2, should be greater than 2)

我究竟做错了什么?

4

1 回答 1

1

我正在回答,因为我答应看看这个。对于当前状态下的类 Parsec 解析器来说,这个问题是一个相当困难的问题。在花费更多可用时间后,我可能可以使它工作,但是在我可以花时间回答这个问题的时间里,我只做到了这一点:

module Main (main) where

import Control.Applicative
import Control.Monad (void)
import Text.Megaparsec
import Text.Megaparsec.String
import qualified Data.List.NonEmpty    as NE
import qualified Text.Megaparsec.Lexer as L

data Term = Var String
          | App [Term]
          | Case Term [(String, Term)]
          deriving Show

scn :: Parser ()
scn = L.space (void spaceChar) empty empty

sc :: Parser ()
sc = L.space (void $ oneOf " \t") empty empty

name :: Parser String
name = try $ do
  s <- some letterChar
  if s `elem` ["case", "of"]
    then (unexpected . Label . NE.fromList) ("reserved word \"" ++ s ++ "\"")
    else return s

manyTerms :: Parser [Term]
manyTerms = many pTerm

pTerm :: Parser Term
pTerm = caseBlock <|> app -- parse a term first

caseBlock :: Parser Term
caseBlock = L.indentBlock scn $ do
  void (L.symbol sc "case")
  t <- Var <$> L.lexeme sc name -- not sure what sort of syntax case of
       -- case expressions should have, so simplified to vars for now
  void (L.symbol sc "of")
  return (L.IndentSome Nothing (return . Case t) alt)

alt :: Parser (String, Term)
alt = L.lineFold scn $ \sc' ->
  (,) <$> L.lexeme sc' name <* L.symbol sc' "->" <*> pTerm -- (1)

app :: Parser Term
app = L.lineFold scn $ \sc' ->
  App <$> ((Var <$> name) `sepBy1` try sc' <* scn)
  -- simplified here, with some effort should be possible to go from Var to
  -- more general Term in applications

您的原始语法是左递归的,因为每个术语都可以是案例表达式或应用程序,如果它是应用程序,那么它的第一部分又可以是案例表达式或应用程序等。您需要处理不知何故。

这是一个会话:

λ> parseTest pTerm "x y\n z"
App [Var "x",Var "y",Var "z"]
λ> parseTest pTerm "x\n y\nz"
App [Var "x",Var "y"]
λ> parseTest manyTerms "x\n y\nz"
[App [Var "x",Var "y"],App [Var "z"]]
λ> parseTest pTerm "case x of\n C -> y\n D -> z"
Case (Var "x") [("C",App [Var "y"]),("D",App [Var "z"])]
λ> parseTest pTerm "case x of\n C -> y\n  z"
3:3:
incorrect indentation (got 3, should be equal to 2)

最后一个结果是因为(1)在代码中。引入参数 toapp使得在不考虑上下文的​​情况下无法使用它(它将不再是独立的表达式,而是某些东西的分解部分)。我们可以看到,如果您缩进应用程序z的开始y,而不是整个替代方案,它会起作用:

λ> parseTest pTerm "case x of\n C -> y\n           z"
Case (Var "x") [("C",App [Var "y",Var "z"])]

最后,案例表达有效:

λ> parseTest pTerm "case x of\n C -> y\n D -> z"
Case (Var "x") [("C",App [Var "y"]),("D",App [Var "z"])]

我的建议是查看一些预处理器并在此基础上使用 Megaparsec。在这种情况下,其中的工具Text.Megaparsec.Lexer并不容易应用,但它们是我们能想到的最好的工具,它们适用于简单的缩进敏感语法。

于 2016-12-04T12:40:48.677 回答