
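I am trying to fix a bug in the rdf4h library, which I currently maintain. Its XmlParser module parses XML/RDF documents into RDF graphs, but it fails on XML/RDF documents that start with an XML specification header (the XML declaration), for example: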

<?xml version="1.0" encoding="ISO-8859-1"?>

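The parser uses the HXT arrow interface, i.e. the Text.XML.HXT.Core module. I have narrowed the problem down to the use of xread in the functions testSuccess and testFailure. Both use runSLA. The author of hxt told me that the problem lies in the use of xread, and that I should first extract the XML document from the string before calling xread. (Unfortunately, he has not responded to the GitHub issue I raised about this.)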
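As I understand that suggestion, it would amount to removing the leading declaration from the string by hand before xread sees it. A minimal sketch of that idea, using only base: the helper name stripXmlDecl is hypothetical and not part of HXT, it only looks for a declaration at the very start of the input, and it ignores the declared encoding because the input is already a decoded String.

```haskell
import Data.Char (isSpace)
import Data.List (stripPrefix)

-- | Hypothetical helper (not part of HXT): drop a leading XML declaration
-- such as <?xml version="1.0" encoding="ISO-8859-1"?> and return the rest.
-- Input without a leading declaration is returned unchanged. If the
-- declaration is never closed by "?>", the result is the empty string.
stripXmlDecl :: String -> String
stripXmlDecl s =
  case stripPrefix "<?xml" body of
    -- require whitespace after "<?xml" so that processing instructions
    -- like <?xml-stylesheet ...?> are left alone
    Just (c:rest) | isSpace c -> dropWhile isSpace (afterDecl rest)
    _                         -> s
  where
    body = dropWhile isSpace s
    -- skip everything up to and including the closing "?>"
    afterDecl ('?':'>':rest) = rest
    afterDecl (_:rest)       = afterDecl rest
    afterDecl []             = []
```

With the definitions from this question in scope, the failing call would become runSLA xread initState (stripXmlDecl xmlDoc1). This is a workaround on the caller's side rather than a fix in the parser, which is why I would still prefer a solution inside HXT.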

Below are two strings, both containing the same XML document. The xmlDoc1 string includes a specification header, which causes xread to fail in testFailure.

module HXTProblem where

import Text.XML.HXT.Core

data GParseState = GParseState { stateGenId :: Int } deriving(Show)

-- this document has an XML specification included
xmlDoc1 :: String
xmlDoc1 = "<?xml version=\"1.0\" encoding=\"ISO-8859-1\"?>" ++
          "<shiporder orderid=\"889923\" " ++
          "xmlns:xsi=\"http://www.w3.org/2001/XMLSchema-instance\" " ++
          "xsi:noNamespaceSchemaLocation=\"shiporder.xsd\">" ++
          "<orderperson>John Smith</orderperson>" ++
             "<shipto>" ++
               "<name>Ola Nordmann</name>" ++
             "</shipto>" ++
          "</shiporder>"

-- this document does not include the XML specification
xmlDoc2 :: String
xmlDoc2 = "<shiporder orderid=\"889923\" " ++
          "xmlns:xsi=\"http://www.w3.org/2001/XMLSchema-instance\" " ++
          "xsi:noNamespaceSchemaLocation=\"shiporder.xsd\">" ++
          "<orderperson>John Smith</orderperson>" ++
             "<shipto>" ++
               "<name>Ola Nordmann</name>" ++
             "</shipto>" ++
          "</shiporder>"

initState :: GParseState
initState = GParseState { stateGenId = 0 }

-- | Works
testSuccess :: (GParseState,[XmlTree])
testSuccess = runSLA xread initState xmlDoc2

{- output of running testSuccess
(GParseState {stateGenId = 0},[NTree (XTag "shiporder" [NTree (XAttr "orderid") [NTree (XText "889923") []],NTree (XAttr "xmlns:xsi") [NTree (XText "http://www.w3.org/2001/XMLSchema-instance") []],NTree (XAttr "xsi:noNamespaceSchemaLocation") [NTree (XText "shiporder.xsd") []]]) [NTree (XTag "orderperson" []) [NTree (XText "John Smith") []],NTree (XTag "shipto" []) [NTree (XTag "name" []) [NTree (XText "Ola Nordmann") []]]]]
-}

-- | Does not work
testFailure :: (GParseState,[XmlTree])
testFailure = runSLA xread initState xmlDoc1

{- ERROR running testFailure
(GParseState {stateGenId = 0},[NTree (XError 2 "\"string: \"<?xml version=\\\"1.0\\\" encoding=\\\"ISO-8859-1...\"\" (line 1, column 6):\nunexpected xml\nexpecting legal XML name character\n") []])
-}

I should add that I am looking for a solution that uses runSLA and produces the same XmlTree when parsing xmlDoc1 or xmlDoc2.


1 Answer


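Hurrah, this issue is resolved. The author of the HXT library has addressed the GitHub issue by adding a new parser, xreadDoc, in a commit to HXT. I have fixed the rdf4h library, as of version 1.2.2 and later, in a commit that uses this new parser, so XML/RDF documents (with specification and encoding headers) can now be parsed with XmlParser.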

Note the new arrow combination (xreadDoc >>> isElem) in testFailure.

module HXTProblem where

import Text.XML.HXT.Core

data GParseState = GParseState { stateGenId :: Int } deriving(Show)

-- this document has an XML specification included
xmlDoc1 :: String
xmlDoc1 = "<?xml version=\"1.0\" encoding=\"ISO-8859-1\"?>" ++
          "<shiporder orderid=\"889923\" " ++
          "xmlns:xsi=\"http://www.w3.org/2001/XMLSchema-instance\" " ++
          "xsi:noNamespaceSchemaLocation=\"shiporder.xsd\">" ++
          "<orderperson>John Smith</orderperson>" ++
             "<shipto>" ++
               "<name>Ola Nordmann</name>" ++
             "</shipto>" ++
          "</shiporder>"

-- this document does not include the XML specification
xmlDoc2 :: String
xmlDoc2 = "<shiporder orderid=\"889923\" " ++
          "xmlns:xsi=\"http://www.w3.org/2001/XMLSchema-instance\" " ++
          "xsi:noNamespaceSchemaLocation=\"shiporder.xsd\">" ++
          "<orderperson>John Smith</orderperson>" ++
             "<shipto>" ++
               "<name>Ola Nordmann</name>" ++
             "</shipto>" ++
          "</shiporder>"

initState :: GParseState
initState = GParseState { stateGenId = 0 }

-- | Works
testSuccess :: (GParseState,[XmlTree])
testSuccess = runSLA xread initState xmlDoc2

-- | Now works as well
testFailure :: (GParseState,[XmlTree])
testFailure = runSLA (xreadDoc >>> isElem) initState xmlDoc1

testEquality :: Bool
testEquality =
    let (_,x) = testSuccess
        (_,y) = testFailure
    in x == y
Answered 2013-11-06T17:02:23.013