8

我很难为树结构实现Read 。我想采用左关联字符串(带括号)ABC(DE)F并将其转换为树。该特定示例对应于树

树.

这是我正在使用的数据类型(尽管我愿意接受建议):

data Tree = Branch Tree Tree | Leaf Char deriving (Eq)

在 Haskell 中,那棵特定的树是:

example = Branch (Branch (Branch (Branch (Leaf 'A')
                                         (Leaf 'B'))
                                 (Leaf 'C'))
                         (Branch (Leaf 'D')
                                 (Leaf 'E')))
                 (Leaf 'F')

我的show功能如下:

instance Show Tree where
    show (Branch l r@(Branch _ _)) = show l ++ "(" ++ show r ++ ")"
    show (Branch l r) = show l ++ show r
    show (Leaf x) = [x]

我想做一个read函数,以便

read "ABC(DE)F" == example
4

3 回答 3

13

在这种情况下,使用解析库会使代码非常短且极具表现力。(当我尝试回答这个问题时,我很惊讶它是如此整洁!)

我将使用Parsec(该文章提供了一些链接以获取更多信息),并以“应用模式”(而不是单子模式)使用它,因为我们不需要单子的额外功率/射脚能力。

代码

首先是各种导入和定义:

import Text.Parsec

import Control.Applicative ((<*), (<$>))

data Tree = Branch Tree Tree | Leaf Char deriving (Eq, Show)

paren, tree, unit :: Parsec String st Tree

现在,树的基本单位是单个字符(不是括号)或带括号的树。带括号的树只是介于(和之间的普通树)。正常的树只是左关联地放入分支的单元(它非常自递归)。在使用 Parsec 的 Haskell 中:

-- parenthesised tree or `Leaf <character>`
unit = paren <|> (Leaf <$> noneOf "()") <?> "group or literal"

-- normal tree between ( and )
paren = between (char '(') (char ')') tree  

-- all the units connected up left-associatedly
tree = foldl1 Branch <$> many1 unit

-- attempt to parse the whole input (don't short-circuit on the first error)
onlyTree = tree <* eof

(是的,这就是整个解析器!)

如果我们愿意,我们可以不这样做parenunit但上面的代码非常有表现力,所以我们可以保持原样。

作为简要说明(我提供了文档的链接):

  • (<|>)基本上是指“左解析器或右解析器”;
  • (<?>)允许您制作更好的错误消息;
  • noneOf将解析不在给定字符列表中的任何内容;
  • between接受三个解析器,并返回第三个解析器的值,只要它由第一个和第二个分隔即可;
  • char从字面上解析它的论点。
  • many1将其一个或多个参数解析为一个列表(因此空字符串似乎无效many1,而不是many解析零个或多个);
  • eof匹配输入的结尾。

我们可以使用该parse函数来运行解析器(它返回Either ParseError Tree,Left是一个错误并且Right是一个正确的解析)。

作为read

将其用作read类似功能可能类似于:

read' str = case parse onlyTree "" str of
   Right tr -> tr
   Left er -> error (show er)

(我曾经read'避免与Prelude.read; 如果你想要一个Read实例,你将不得不做更多的工作来实现readPrec(或任何需要的),但它不应该太难与实际解析已经完成。)

例子

一些基本的例子:

*Tree> read' "A"
Leaf 'A'

*Tree> read' "AB"
Branch (Leaf 'A') (Leaf 'B')

*Tree> read' "ABC"
Branch (Branch (Leaf 'A') (Leaf 'B')) (Leaf 'C')

*Tree> read' "A(BC)"
Branch (Leaf 'A') (Branch (Leaf 'B') (Leaf 'C'))

*Tree> read' "ABC(DE)F" == example
True

*Tree> read' "ABC(DEF)" == example
False

*Tree> read' "ABCDEF" == example
False

展示错误:

*Tree> read' ""
***Exception: (line 1, column 1):
unexpected end of input
expecting group or literal

*Tree> read' "A(B"
***Exception: (line 1, column 4):
unexpected end of input
expecting group or literal or ")"

tree最后,和之间的区别onlyTree

*Tree> parse tree "" "AB)CD"     -- success: ignores ")CD"
Right (Branch (Leaf 'A') (Leaf 'B'))

*Tree> parse onlyTree "" "AB)CD" -- fail: can't parse the ")"
Left (line 1, column 3):
unexpected ')'
expecting group or literal or end of input

结论

秒差距是惊人的!这个答案可能很长,但它的核心只是完成所有工作的 5 或 6 行代码。

于 2012-05-02T08:28:25.190 回答
6

这非常像一个堆栈结构。当您遇到输入字符串"ABC(DE)F"时,您Leaf将找到任何原子(非括号)并将其放入累加器列表中。当列表中有 2 个项目时,将Branch它们放在一起。这可以通过以下方式完成(注意,未经测试,仅包括给出一个想法):

read' [r,l] str  = read' [Branch l r] str
read' acc (c:cs) 
   -- read the inner parenthesis
   | c == '('  = let (result, rest) = read' [] cs 
                 in read' (result : acc) rest
   -- close parenthesis, return result, should be singleton
   | c == ')'  = (acc, cs) 
   -- otherwise, add a leaf
   | otherwise = read' (Leaf c : acc) cs
read' [result] [] = (result, [])
read' _ _  = error "invalid input"

这可能需要一些修改,但我认为它足以让你走上正轨。

于 2012-05-02T05:48:30.570 回答
3

dbaupp 的解析秒回答很容易理解。作为“低级”方法的一个示例,这里是一个手写解析器,它使用成功延续来处理左关联树的构建:

instance Read Tree where readsPrec _prec s = maybeToList (readTree s)

type TreeCont = (Tree,String) -> Maybe (Tree,String)

readTree :: String -> Maybe (Tree,String)
readTree = read'top Just where
  valid ')' = False
  valid '(' = False
  valid _ = True

  read'top :: TreeCont -> String -> Maybe (Tree,String)
  read'top acc s@(x:ys) | valid x =
    case ys of
      [] -> acc (Leaf x,[])
      (y:zs) -> read'branch acc s
  read'top _ _ = Nothing

  -- The next three are mutually recursive

  read'branch :: TreeCont -> String -> Maybe (Tree,String)
  read'branch acc (x:y:zs) | valid x = read'right (combine (Leaf x) >=> acc) y zs
  read'branch _ _ = Nothing

  read'right :: TreeCont -> Char -> String -> Maybe (Tree,String)
  read'right acc y ys | valid y = acc (Leaf y,ys)
  read'right acc '(' ys = read'branch (drop'close >=> acc) ys
     where drop'close (b,')':zs) = Just (b,zs)
           drop'close _ = Nothing
  read'right _ _ _ = Nothing  -- assert y==')' here

  combine :: Tree -> TreeCont
  combine build (t, []) = Just (Branch build t,"")
  combine build (t, ys@(')':_)) = Just (Branch build t,ys)  -- stop when lookahead shows ')'
  combine build (t, y:zs) = read'right (combine (Branch build t)) y zs
于 2012-05-02T13:34:07.810 回答