I like to do ad hoc string parsing in Python by pasting the text into the interpreter.

>>> s = """Adams, John
... Washington,George
... Lincoln,Abraham
... Jefferson, Thomas
... """
>>> print "\n".join(x.split(",")[1].replace(" ", "")
                    for x in s.strip().split("\n"))
John
George
Abraham
Thomas

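This works well in the Python interpreter, but I'd like to do the same thing with Haskell/GHCi. The problem is that I can't paste multi-line strings. I can use getContents with an EOF character, but only once, because the EOF character closes stdin.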

Prelude> s <- getContents
Prelude> s
"Adams, John
Adams, John\nWashington,George
Washington,George\nLincoln,Abraham
Lincoln,Abraham\nJefferson, Thomas
Jefferson, Thomas\n^Z
"
Prelude> :{
Prelude| putStr $ unlines $ map ((filter (`notElem` ", "))
Prelude|                         . snd . (break (==','))) $ lines s
Prelude| :}
John
George
Abraham
Thomas
Prelude> x <- getContents
*** Exception: <stdin>: hGetContents: illegal operation (handle is closed)

Is there a better way to do this in GHCi? Note: my understanding of getContents (and Haskell IO in general) may be badly broken.

Update

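I'm going to play around with the answers I've received. Here are some helper functions I made (plagiarized from ehemient's answer) that simulate Python's """ quoting (terminated by """ rather than started by it).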

getLinesWhile :: (String -> Bool) -> IO String
getLinesWhile p = liftM unlines $ takeWhileM p (repeat getLine)

getLines :: IO String
getLines = getLinesWhile (/="\"\"\"")

To use AndrewC's answer in GHCi:

C:\...\code\haskell> ghci HereDoc.hs -XQuasiQuotes
ghci> :{
*HereDoc| let s = [heredoc|
*HereDoc| Adams, John
*HereDoc| Washington,George
*HereDoc| Lincoln,Abraham
*HereDoc| Jefferson, Thomas
*HereDoc| |]
*HereDoc| :}
ghci> putStrLn s
Adams, John
Washington,George
Lincoln,Abraham
Jefferson, Thomas
ghci> :{
*HereDoc| putStr $ unlines $ map ((filter (`notElem` ", "))
*HereDoc|                         . snd . (break (==','))) $ lines s
*HereDoc| :}
John
George
Abraham
Thomas

2 Answers

getContents == hGetContents stdin. Unfortunately, hGetContents marks its handle as (semi-)closed, which means anything attempting to read from stdin ever again will fail.

Does it suffice to simply read up to an empty line or some other marker, never closing stdin?

import Control.Monad (liftM)

takeWhileM :: Monad m => (a -> Bool) -> [m a] -> m [a]
takeWhileM p (ma : mas) = do
    a <- ma
    if p a
      then liftM (a :) $ takeWhileM p mas
      else return []
takeWhileM _ _ = return []

ghci> liftM unlines $ takeWhileM (not . null) (repeat getLine)
Adams, John
Washington, George
Lincoln, Abraham
Jefferson, Thomas

"Adams, John\nWashington, George\nLincoln, Abraham\nJefferson, Thomas\n"
ghci>
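
Because takeWhileM is polymorphic in the monad, its stopping behaviour can be checked without touching stdin at all. The sketch below repeats the definition above and runs it over a scripted list of actions in the Maybe monad. The names `scripted` and `readBlock` are hypothetical, introduced only for this illustration.

```haskell
import Control.Monad (liftM)

-- Same definition as above: run the actions in order and keep their
-- results until one fails the predicate. The failing result is consumed
-- and dropped, and no later action is run.
takeWhileM :: Monad m => (a -> Bool) -> [m a] -> m [a]
takeWhileM p (ma : mas) = do
    a <- ma
    if p a
      then liftM (a :) $ takeWhileM p mas
      else return []
takeWhileM _ _ = return []

-- Hypothetical stand-in for `repeat getLine`: every "read" is a pure Just.
scripted :: [String] -> [Maybe String]
scripted = map Just

-- Hypothetical helper: collect lines up to (not including) the first
-- blank line and join them with `unlines`.
readBlock :: [String] -> Maybe String
readBlock = liftM unlines . takeWhileM (not . null) . scripted
```

With `repeat getLine` in place of the scripted list, the same loop stops at the first blank line and never asks for end of input, which is why stdin should remain usable for the next read.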
answered 2012-08-25T06:33:00.513

If you do this a lot, and you're writing helper functions in some module anyway, why not go the whole hog and use your editor for the raw data too:

{-# LANGUAGE TemplateHaskell, QuasiQuotes #-}
module ParseAdHoc where
import HereDoc
import Data.Char (isSpace)
import Data.List (intercalate,intersperse)  -- other handy helpers

-- ------------------------------------------------------
-- edit this bit every time you do your ad-hoc parsing

adhoc :: String -> String
adhoc = head . splitOn ',' . rmspace

input = [heredoc|
Adams, John
Washington,George
Lincoln,Abraham
Jefferson, Thomas
|]

-- ------------------------------------------------------
-- add other helpers you'll reuse here

main = mapM_ putStrLn.map adhoc.lines $ input

rmspace = filter (not.isSpace)

splitWith :: (a -> Bool) -> [a] -> [[a]]   -- splits using a function that tells you when
splitWith isSplitter list =  case dropWhile isSplitter list of
  [] -> []
  thisbit -> firstchunk : splitWith isSplitter therest
    where (firstchunk, therest) = break isSplitter thisbit

splitOn :: Eq a => a -> [a] -> [[a]]       -- splits on the given item
splitOn c = splitWith (== c)

splitsOn :: Eq a => [a] -> [a] -> [[a]]    -- splits on any of the given items
splitsOn chars = splitWith (`elem` chars)

It would be simpler to use takeWhile (/=',') instead of head . splitOn ',', but I thought splitOn would be more useful to you in the future.
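
As a rough comparison, the two spellings agree on non-empty input that does not begin with a comma, and both keep the text before the comma. The question's expected output is the text after the comma, so the sketch also includes `afterComma`, a hypothetical helper that is not part of this answer, built from `break` as in the question's own GHCi session. The names `beforeComma` and `beforeComma'` are likewise made up for the comparison.

```haskell
import Data.Char (isSpace)

-- Definitions copied from the module above so this block stands alone.
rmspace :: String -> String
rmspace = filter (not . isSpace)

splitWith :: (a -> Bool) -> [a] -> [[a]]
splitWith isSplitter list = case dropWhile isSplitter list of
  [] -> []
  thisbit -> firstchunk : splitWith isSplitter therest
    where (firstchunk, therest) = break isSplitter thisbit

splitOn :: Eq a => a -> [a] -> [[a]]
splitOn c = splitWith (== c)

-- The spelling used in adhoc: text before the first comma.
-- Partial: `head` fails on input with no non-comma characters.
beforeComma :: String -> String
beforeComma = head . splitOn ',' . rmspace

-- The simpler spelling: also text before the first comma, but total.
beforeComma' :: String -> String
beforeComma' = takeWhile (/= ',') . rmspace

-- Hypothetical helper: text after the first comma, as the question wants.
afterComma :: String -> String
afterComma = drop 1 . snd . break (== ',') . rmspace
```

Swapping `afterComma` in for `adhoc` would print the given names rather than the surnames.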

This uses a helper module, HereDoc, that lets you paste multi-line string literals into your code (like Perl's <<"EOF" or Python's """). I can't remember where I found out how to do this, but I've tweaked it to drop a whitespace-only first line and a whitespace-only last line, so I can start and end my data with a newline.

module HereDoc where
import Language.Haskell.TH
import Language.Haskell.TH.Quote
import Data.Char (isSpace)

{-
example1 = [heredoc|Hi.
This is a multi-line string.
It should appear as an ordinary string literal.

Remember you can only use a QuasiQuoter
in a different module, so import this HereDoc module 
into something else and don't forget the
{-# LANGUAGE TemplateHaskell, QuasiQuotes #-}|]

example2 = [heredoc|         
This heredoc has no newline characters in it because empty or whitespace-only first and last lines are ignored
                   |]
-}


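-- Only expression and pattern quotes are supported; using heredoc in a
-- type or declaration context hits `undefined`.
heredoc :: QuasiQuoter
heredoc = QuasiQuoter {quoteExp = stringE.topAndTail,
                       quotePat = litP . stringL,
                       quoteType = undefined,
                       quoteDec = undefined}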

topAndTail = myunlines.tidyend.tidyfront.lines

tidyfront :: [String] -> [String]
tidyfront [] = []
tidyfront (xs:xss) | all isSpace xs = xss
                   | otherwise      = xs:xss

tidyend :: [String] -> [String]
tidyend [] = []
tidyend [xs]     | all isSpace xs = []
                 | otherwise = [xs]
tidyend (xs:xss) = xs:tidyend xss

myunlines :: [String] -> String
myunlines [] = ""
myunlines (l:ls) = l ++ concatMap ('\n':) ls
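
The trimming step is ordinary list code, so it can be tried out on plain strings without Template Haskell. The sketch below copies the four helpers from the module above unchanged (only a type signature for topAndTail is added); the inputs in the usage note are made-up examples.

```haskell
import Data.Char (isSpace)

-- Drop the first line if it is empty or whitespace-only.
tidyfront :: [String] -> [String]
tidyfront [] = []
tidyfront (xs:xss) | all isSpace xs = xss
                   | otherwise      = xs:xss

-- Drop the last line if it is empty or whitespace-only.
tidyend :: [String] -> [String]
tidyend [] = []
tidyend [xs]     | all isSpace xs = []
                 | otherwise = [xs]
tidyend (xs:xss) = xs:tidyend xss

-- Like unlines, but without a trailing newline.
myunlines :: [String] -> String
myunlines [] = ""
myunlines (l:ls) = l ++ concatMap ('\n':) ls

-- What the quasiquoter applies to the quoted text before turning it
-- into a string literal.
topAndTail :: String -> String
topAndTail = myunlines.tidyend.tidyfront.lines
```

For instance, `topAndTail "\nAdams, John\nWashington,George\n"` yields `"Adams, John\nWashington,George"`, and `topAndTail "   \n data \n    "` yields `" data "`: only whole whitespace-only first and last lines go, while spaces inside the remaining lines are kept.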

You might find Data.Text a good source of (inspiration for) helper functions: http://hackage.haskell.org/packages/archive/text/latest/doc/html/Data-Text.html

answered 2012-08-25T17:44:53.067