haskell - Skipping the first line in pipes-attoparsec

Question

My type:

data Test = Test {
 a :: Int,
 b :: Int
} deriving (Show)

My parser:

testParser :: Parser Test
testParser = do
  a <- decimal
  tab
  b <- decimal
  return $ Test a b

tab = char '\t'

Now, to skip the first line, I did something like this:

import qualified System.IO as IO    

parser :: Parser Test
parser = manyTill anyChar endOfLine *> testParser

main = IO.withFile testFile IO.ReadMode $ \testHandle -> runEffect $
         for (parsed (parser <* endOfLine) (fromHandle testHandle)) (lift . print)

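But the parser function above skips every other line, which is to be expected: parsed runs the parser repeatedly, and each run discards one line before reading a Test. How can I skip only the first line in a way that works with the Pipes ecosystem? (The Producer should yield individual Test values.) Here is one obvious solution that I don't want, because it returns the entire [Test] instead of individual values (the code below only works if I modify testParser to consume the newline):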

tests :: Parser [Test]
tests = manyTill anyChar endOfLine *>
        many1 testParser

Any ideas on how to tackle this problem?

score 5 · Accepted Answer

If the first line does not contain a valid Test, you can use Either () Test to handle it:

parserEither :: Parser (Either () Test)
parserEither = Right <$> testParser <* endOfLine 
           <|> Left <$> (manyTill anyChar endOfLine *> pure ())

After this, you can use the functions provided by Pipes.Prelude to get rid of the first result (and, additionally, of all unparseable lines):

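import qualified Pipes.Prelude as P

producer p = parsed parserEither p
         >-> P.drop 1                                      -- discard the first result (the header line)
         >-> P.filter (either (const False) (const True))  -- keep only successfully parsed lines
         >-> P.map    (\(Right x) -> x)                    -- safe: only Right values pass the filter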

main = IO.withFile testFile IO.ReadMode $ \testHandle -> runEffect $
         for (producer (fromHandle testHandle)) (lift . print)
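The pipeline above can also be written without the partial lambda. The following is a minimal sketch, not the answerer's code: it assumes the pipes, pipes-attoparsec, attoparsec, and bytestring packages, and the names lineParser and validTests as well as the sample input are made up for illustration. It relies on P.concat, which forwards the elements of any Foldable value, so for Either it passes Right payloads downstream and silently drops Left values.

```haskell
{-# LANGUAGE OverloadedStrings #-}
import Control.Applicative ((<|>))
import Control.Monad (void)
import Data.Attoparsec.ByteString.Char8
  (Parser, anyChar, char, decimal, endOfLine, manyTill)
import Data.ByteString (ByteString)
import Pipes
import Pipes.Attoparsec (parsed)
import qualified Pipes.Prelude as P

data Test = Test { a :: Int, b :: Int } deriving (Show, Eq)

-- Same grammar as the question: two tab-separated decimals.
testParser :: Parser Test
testParser = Test <$> decimal <* char '\t' <*> decimal

-- Hypothetical name: parse one line as a Test, or consume it as junk.
lineParser :: Parser (Either () Test)
lineParser = Right <$> testParser <* endOfLine
         <|> Left () <$ manyTill anyChar endOfLine

-- Hypothetical name: drop the first result, then keep only the Right values.
-- 'void' discards the parsing-error information that 'parsed' returns.
validTests :: Monad m => Producer ByteString m r -> Producer Test m ()
validTests p = void (parsed lineParser p) >-> P.drop 1 >-> P.concat
```

With an in-memory input, `P.toList (validTests (yield "id\tvalue\n1\t2\n3\t4\n"))` should evaluate to `[Test {a = 1, b = 2},Test {a = 3, b = 4}]`. Swapping `yield ...` for `fromHandle testHandle` gives the file-based version.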

score 5 · Accepted Answer

You can drop the first line efficiently, in constant space, like this:

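import Data.ByteString (ByteString)
import Lens.Family (over)
import Pipes (Producer)
import Pipes.Group (drops)
import Pipes.ByteString (lines)
import Prelude hiding (lines)

-- View the byte stream as a stream of lines, drop the first one, and rejoin.
dropLine :: Monad m => Producer ByteString m r -> Producer ByteString m r
dropLine = over lines (drops 1)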

You can apply dropLine to your Producer before you parse it, like this:

main = IO.withFile testFile IO.ReadMode $ \testHandle -> runEffect $
    let p = dropLine (fromHandle testHandle)
    -- use testParser here, not the line-skipping parser from the question
    in  for (parsed (testParser <* endOfLine) p) (lift . print)
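For reference, here is a self-contained sketch of the dropLine approach that can be tried without a file. It assumes the pipes, pipes-bytestring, pipes-group, pipes-attoparsec, lens-family-core, and attoparsec packages. The helper name parseTests and the sample input are hypothetical, introduced only for this illustration.

```haskell
{-# LANGUAGE OverloadedStrings #-}
import Prelude hiding (lines)
import Control.Monad (void)
import Data.Attoparsec.ByteString.Char8 (Parser, char, decimal, endOfLine)
import Data.ByteString (ByteString)
import Lens.Family (over)
import Pipes
import Pipes.Attoparsec (parsed)
import Pipes.ByteString (lines)
import Pipes.Group (drops)
import qualified Pipes.Prelude as P

data Test = Test { a :: Int, b :: Int } deriving (Show, Eq)

-- Same grammar as the question: two tab-separated decimals.
testParser :: Parser Test
testParser = Test <$> decimal <* char '\t' <*> decimal

-- Drop the first line of the raw byte stream before any parsing happens.
dropLine :: Monad m => Producer ByteString m r -> Producer ByteString m r
dropLine = over lines (drops 1)

-- Hypothetical helper: parse one Test per remaining line.
-- 'void' discards the parsing-error information that 'parsed' returns.
parseTests :: Monad m => Producer ByteString m r -> Producer Test m ()
parseTests p = void (parsed (testParser <* endOfLine) (dropLine p))
```

Under these assumptions, `P.toList (parseTests (yield "id\tvalue\n1\t2\n3\t4\n"))` should evaluate to `[Test {a = 1, b = 2},Test {a = 3, b = 4}]`. Because the header is removed at the byte level, the parser never sees it, and no Either wrapper is needed.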
