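I'm trying to get my head around HXT, a Haskell library for parsing XML that uses arrows. For my specific use case I'd rather not use deep, because there are cases where <outer_tag><payload_tag>value</payload_tag></outer_tag> is distinct from <outer_tag><inner_tag><payload_tag>value</payload_tag></inner_tag></outer_tag>. However, I ran into some odd behavior: a transformation that feels like it should work, but doesn't.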

I've managed to come up with a test case based on this example from the docs:

{-# LANGUAGE Arrows, NoMonomorphismRestriction #-}
module Main where

import Text.XML.HXT.Core

data Guest = Guest { firstName, lastName :: String }
  deriving (Show, Eq)


getGuest = deep (isElem >>> hasName "guest") >>> 
  proc x -> do
    fname <- getText <<< getChildren <<< deep (hasName "fname") -< x
    lname <- getText <<< getChildren <<< deep (hasName "lname") -< x
    returnA -< Guest { firstName = fname, lastName = lname }

getGuest' = deep (isElem >>> hasName "guest") >>> 
  proc x -> do
    fname <- getText <<< getChildren <<< (hasName "fname") <<< getChildren -< x
    lname <- getText <<< getChildren <<< (hasName "lname") <<< getChildren -< x
    returnA -< Guest { firstName = fname, lastName = lname }

getGuest'' = deep (isElem >>> hasName "guest") >>> getChildren >>>
  proc x -> do
    fname <- getText <<< getChildren <<< (hasName "fname") -< x
    lname <- getText <<< getChildren <<< (hasName "lname") -< x
    returnA -< Guest { firstName = fname, lastName = lname }


driver finalArrow = runX (readDocument [withValidate no] "guestbook.xml" >>> finalArrow)

main = do 
  guests <- driver getGuest
  print "getGuest"
  print guests

  guests' <- driver getGuest'
  print "getGuest'"
  print guests'

  guests'' <- driver getGuest''
  print "getGuest''"
  print guests''

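Between getGuest and getGuest' I expand deep into the correct number of getChildren calls. The resulting function still works. I then factor the getChildren out of the do block, but this causes the resulting function (getGuest'') to fail. The output is: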

"getGuest"
[Guest {firstName = "John", lastName = "Steinbeck"},Guest {firstName = "Henry", lastName = "Ford"},Guest {firstName = "Andrew", lastName = "Carnegie"},Guest {firstName = "Anton", lastName = "Chekhov"},Guest {firstName = "George", lastName = "Washington"},Guest {firstName = "William", lastName = "Shakespeare"},Guest {firstName = "Nathaniel", lastName = "Hawthorne"}]
"getGuest'"
[Guest {firstName = "John", lastName = "Steinbeck"},Guest {firstName = "Henry", lastName = "Ford"},Guest {firstName = "Andrew", lastName = "Carnegie"},Guest {firstName = "Anton", lastName = "Chekhov"},Guest {firstName = "George", lastName = "Washington"},Guest {firstName = "William", lastName = "Shakespeare"},Guest {firstName = "Nathaniel", lastName = "Hawthorne"}]
"getGuest''"
[]

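I feel like this should be a valid transformation to perform, but my understanding of arrows is a little shaky. Am I doing something wrong? Is this a bug that I should report?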

I'm using HXT version 9.3.1.3 (the latest at the time of writing). ghc --version prints "The Glorious Glasgow Haskell Compilation System, version 7.4.1". I've also tested on a box with ghc 7.6.3 and got the same result.

The XML file has the following repeating structure (the full file can be found here):

<guestbook>
  <guest>
    <fname>John</fname>
    <lname>Steinbeck</lname>
  </guest>
  <guest>
    <fname>Henry</fname>
    <lname>Ford</lname>
  </guest>
  <guest>
    <fname>Andrew</fname>
    <lname>Carnegie</lname>
  </guest>
</guestbook>
2 Answers

In getGuest'' you have

... (hasName "fname") -< x
... (hasName "lname") -< x

That is, you restrict to the case where x is "fname" and x is also "lname", which no single x can satisfy.
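
To make this concrete without depending on HXT, here is a minimal sketch that models the filters with `Kleisli []` from base. The `Node` type and the local `hasName`, `getChildren`, and `getText` are hypothetical stand-ins that only imitate the HXT functions of the same name; they are assumptions for illustration, not HXT's implementation.

```haskell
{-# LANGUAGE Arrows #-}
module Main where

import Control.Arrow

-- Hypothetical stand-in for an XML tree: an element with children, or text.
data Node = Elem String [Node] | Text String
  deriving (Show, Eq)

-- A list arrow: one input may yield zero, one, or many outputs.
type Filter a b = Kleisli [] a b

-- Pass an element through only if its tag name matches.
hasName :: String -> Filter Node Node
hasName n = Kleisli $ \node -> case node of
  Elem m _ | m == n -> [node]
  _                 -> []

-- Emit each direct child of an element.
getChildren :: Filter Node Node
getChildren = Kleisli $ \node -> case node of
  Elem _ cs -> cs
  Text _    -> []

-- Emit the content of a text node; drop everything else.
getText :: Filter Node String
getText = Kleisli $ \node -> case node of
  Text t -> [t]
  _      -> []

guest :: Node
guest = Elem "guest"
  [ Elem "fname" [Text "John"]
  , Elem "lname" [Text "Steinbeck"] ]

-- x is the <guest> element: each lookup searches its children separately.
namesInside :: Filter Node (String, String)
namesInside = proc x -> do
  fname <- getText <<< getChildren <<< hasName "fname" <<< getChildren -< x
  lname <- getText <<< getChildren <<< hasName "lname" <<< getChildren -< x
  returnA -< (fname, lname)

-- x is one child at a time: it would have to be <fname> and <lname> at once.
namesOutside :: Filter Node (String, String)
namesOutside = getChildren >>> proc x -> do
  fname <- getText <<< getChildren <<< hasName "fname" -< x
  lname <- getText <<< getChildren <<< hasName "lname" -< x
  returnA -< (fname, lname)

main :: IO ()
main = do
  print (runKleisli namesInside guest)   -- [("John","Steinbeck")]
  print (runKleisli namesOutside guest)  -- []
```

Moving `getChildren` in front of the `proc` changes what `x` is bound to, so the two versions are not equivalent.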

answered 2014-02-24T21:43:12.417

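I've managed to work out the concrete reason why the structure is interpreted this way. The following translation of arrow notation, found here, provides the basis for working it out: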

addA :: Arrow a => a b Int -> a b Int -> a b Int
addA f g = proc x -> do
                y <- f -< x
                z <- g -< x
                returnA -< y + z

becomes:

addA :: Arrow a => a b Int -> a b Int -> a b Int
addA f g = arr (\ x -> (x, x)) >>>
           first f >>> arr (\ (y, x) -> (x, y)) >>>
           first g >>> arr (\ (z, y) -> y + z)

By analogy, we can derive:

getGuest''' = preproc >>>
           arr (\ x -> (x, x)) >>>
           first f >>> arr (\ (y, x) -> (x, y)) >>>
           -- y comes from f (fname), z comes from g (lname)
           first g >>> arr (\ (z, y) -> Guest {firstName = y, lastName = z})

    where preproc = deep (isElem >>> hasName "guest") >>> getChildren
          f = getText <<< getChildren <<< (hasName "fname")
          g = getText <<< getChildren <<< (hasName "lname")

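In HXT, arrows can be pictured as a stream of values flowing through filters. arr (\x -> (x, x)) does not "split the stream" as I had hoped. Instead, it creates a stream of tuples; the tuples are filtered by f, and the survivors are then filtered by g. Since f and g are mutually exclusive on a single child node, there are no survivors.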
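
The stage-by-stage behavior described above can be sketched with `Kleisli []` from base standing in for HXT's list arrows. Everything here is a simplified assumption for illustration: a child element is reduced to a hypothetical `(tag name, text)` pair, and `textOf` and `runOn` are made-up helpers, not HXT functions.

```haskell
module Main where

import Control.Arrow

-- Hypothetical model: a child element is just a (tag name, text) pair.
type Child = (String, String)

-- Keep the text of a child only when its tag name matches.
textOf :: String -> Kleisli [] Child String
textOf tag = Kleisli $ \(name, text) -> [text | name == tag]

f, g :: Kleisli [] Child String
f = textOf "fname"
g = textOf "lname"

-- The stream produced by getChildren on one <guest>.
children :: [Child]
children = [("fname", "John"), ("lname", "Steinbeck")]

-- Feed every child through an arrow and collect all outputs.
runOn :: Kleisli [] Child b -> [b]
runOn arrow = concatMap (runKleisli arrow) children

-- Stage 1: duplicate each value into a tuple (the stream is not split).
dup :: Kleisli [] Child (Child, Child)
dup = arr (\x -> (x, x))

-- Stage 2: f filters the tuples; only the <fname> child survives.
afterF :: Kleisli [] Child (String, Child)
afterF = dup >>> first f

-- Stage 3: g runs on the survivor, which is still the <fname> child.
afterG :: Kleisli [] Child (String, String)
afterG = afterF >>> arr (\(y, x) -> (x, y)) >>> first g

main :: IO ()
main = do
  print (length (runOn dup))  -- 2
  print (runOn afterF)        -- [("John",("fname","John"))]
  print (runOn afterG)        -- []
```

The second component of each tuple carries the original child along, so `g` is applied to the same node that `f` already accepted.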

The example with getChildren on the inside works because the stream of tuples holds values from one level further up the XML document, which look like

<guest>
    <fname>John</fname>
    <lname>Steinbeck</lname>
</guest>

so f and g are no longer mutually exclusive: both can succeed on the same <guest> element.
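
Building on this, one way to avoid deep while keeping the <guest> element as the proc input is to spell out the path with HXT's `/>` operator (`f /> g` is `f >>> getChildren >>> g`). This is only a sketch: the name `getGuestNoDeep` and the inline `sample` string are assumptions for illustration, and it uses `xread` with `runLA` on a string rather than the `readDocument` driver from the question.

```haskell
{-# LANGUAGE Arrows #-}
module Main where

import Text.XML.HXT.Core

data Guest = Guest { firstName, lastName :: String }
  deriving (Show, Eq)

-- Select <guest> elements by an explicit path instead of deep,
-- then bind the <guest> node itself as the proc input.
getGuestNoDeep :: ArrowXml a => a XmlTree Guest
getGuestNoDeep = hasName "guestbook" /> hasName "guest" >>>
  proc guest -> do
    -- Each lookup descends into the children of the same <guest> node.
    fname <- getText <<< getChildren <<< hasName "fname" <<< getChildren -< guest
    lname <- getText <<< getChildren <<< hasName "lname" <<< getChildren -< guest
    returnA -< Guest { firstName = fname, lastName = lname }

-- Hypothetical one-guest sample in the shape of guestbook.xml.
sample :: String
sample = "<guestbook><guest><fname>John</fname><lname>Steinbeck</lname></guest></guestbook>"

main :: IO ()
main = print (runLA (xread >>> getGuestNoDeep) sample)
```

With `readDocument`, the result is wrapped in a document root node, so the path would presumably need a leading `getChildren` before `hasName "guestbook"`.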

answered 2014-02-25T00:31:32.353