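I wrote a Haskell function that groups words by anagram. I'm trying to learn OCaml, but I'm a bit confused about how to use pattern matching in OCaml. Could someone help me translate this to OCaml? Thanks!

This function takes a list of strings and partitions it into a list of lists of strings, grouped by anagram.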

import Data.List

groupByAnagrams :: [String] -> [[String]]
groupByAnagrams []     = []
groupByAnagrams (x:xs) = let (listOfAnagrams, listOfNonAnagrams) = (partitionByAnagrams (sort x) xs)
                         in 
                         (x:listOfAnagrams):(groupByAnagrams listOfNonAnagrams)

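This helper function takes a sorted string sortedStr and a list of strings (the string is passed in already sorted so that I don't have to call sort on it at every iteration). The list of strings is split into two lists: one made up of the strings that are anagrams of sortedStr, and one made up of the strings that are not. The function returns a tuple of these two lists.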

partitionByAnagrams :: String -> [String] -> ([String], [String])
partitionByAnagrams sortedStr []     = ([], [])
partitionByAnagrams sortedStr (x:xs) 
         | (sortedStr == (sort x))   = let (listOfAnagrams, listOfNonAnagrams) = (partitionByAnagrams sortedStr xs)
                                       in
                                       (x:listOfAnagrams, listOfNonAnagrams)
         | otherwise                 = let (listOfAnagrams, listOfNonAnagrams) = (partitionByAnagrams sortedStr xs)
                                       in
                                       (listOfAnagrams, x:listOfNonAnagrams)

This is just a test case:

test1 = mapM_ print (groupByAnagrams ["opts", "alerting", "arrest", "bares", "drapes", "drawer", "emits", "least", "mate", "mates", "merit", "notes", "palest", "parses", "pores", "pots", "altering", "rarest", "baser", "parsed", "redraw", "items", "slate", "meat", "meats", "miter", "onset", "pastel", "passer", "poser", "spot", "integral", "raster", "bears", "rasped", "reward", "mites", "stale", "meta", "steam", "mitre", "steno", "petals", "spares", "prose", "stop", "relating", "raters", "braes", "spared", "warder", "smite", "steal", "tame", "tames", "remit", "stone", "plates", "sparse", "ropes", "tops", "triangle", "starer", "saber", "spread", "warred", "times", "tales", "team", "teams", "timer", "tones", "staple", "spears", "spore"])

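**EDIT!!! This is a rewritten version of my function. Thanks to jrouquie for pointing out the inefficiency!** Edited again 10/7: for clarity, used pattern matching on the tuples, so there is no need for all those fsts and snds.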

groupByAnagrams2 :: [String] -> [[String]]
groupByAnagrams2 str = groupBySnd $ map (\s -> (s, (sort s))) str

groupBySnd :: [(String, String)] -> [[String]]
groupBySnd []           = []
groupBySnd ((s1,s2):xs) = let (listOfAnagrams, listOfNonAnagramPairs) = (partitionBySnd s2 xs)
                          in
                          (s1:listOfAnagrams):(groupBySnd listOfNonAnagramPairs)


partitionBySnd :: String -> [(String, String)] -> ([String], [(String, String)])
partitionBySnd sortedStr []                = ([], [])
partitionBySnd sortedStr ((s, sSorted):ss)
              | (sortedStr == sSorted)     = let (listOfAnagrams, listOfNonAnagramPairs) = (partitionBySnd sortedStr ss)
                                             in
                                             (s:listOfAnagrams, listOfNonAnagramPairs)
              | otherwise                  = let (listOfAnagrams, listOfNonAnagramPairs) = (partitionBySnd sortedStr ss)
                                             in
                                             (listOfAnagrams, (s, sSorted):listOfNonAnagramPairs)

2 Answers

I must say I find your Haskell code a bit clumsy. That is, your original function could be written much more concisely; for example:

import Control.Arrow ((&&&))
import Data.Function (on)
import Data.List (groupBy, sortBy)

anagrams :: Ord a => [[a]] -> [[[a]]]
anagrams =
  map (map fst) .
  groupBy ((==) `on` snd) .
  sortBy (compare `on` snd) .
  map (id &&& sortBy compare)

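That is:

  • map (id &&& sortBy compare) pairs each string in the list with the sorted list of its characters;
  • sortBy (on compare snd) sorts the list of pairs you now have on their second component, i.e., the sorted list of characters;
  • groupBy (on (==) snd) groups all consecutive items in the sorted list that have the same sorted list of characters;
  • finally, map (map fst) drops the sorted lists of characters, leaving you with only the original strings.

For example: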

Prelude> :m + Control.Arrow Data.Function Data.List

Prelude Control.Arrow Data.Function Data.List> ["foo", "bar", "rab", "ofo"]
["foo","bar","rab","ofo"]

Prelude Control.Arrow Data.Function Data.List> map (id &&& sortBy compare) it
[("foo","foo"),("bar","abr"),("rab","abr"),("ofo","foo")]

Prelude Control.Arrow Data.Function Data.List> sortBy (compare `on` snd) it
[("bar","abr"),("rab","abr"),("foo","foo"),("ofo","foo")]

Prelude Control.Arrow Data.Function Data.List> groupBy ((==) `on` snd) it
[[("bar","abr"),("rab","abr")],[("foo","foo"),("ofo","foo")]]

Prelude Control.Arrow Data.Function Data.List> map (map fst) it
[["bar","rab"],["foo","ofo"]]

"Translating" this to Caml then leaves you with something like

let chars xs =
  let n = String.length xs in
  let rec chars_aux i =
    if i = n then [] else String.get xs i :: chars_aux (i + 1)
  in
  List.sort compare (chars_aux 0)

let group eq xs =
  (* Group consecutive elements that are equal according to eq. *)
  let rec group_aux = function
    | [] -> []
    | x :: xs ->
        (match group_aux xs with
         | ((y :: _) as ys) :: yss when eq x y -> (x :: ys) :: yss
         | yss -> [x] :: yss)
  in
  group_aux xs

let anagrams xs =
  let ys = List.map chars xs in
  let zs = List.sort (fun (_,y1) (_,y2) -> compare y1 y2) (List.combine xs ys) in
  let zs = group (fun (_,y1) (_,y2) -> y1 = y2) zs in
  List.map (List.map fst) zs

Here, the helper function chars takes a string to its sorted list of characters, while group should give you an idea of how to pattern match on lists in Caml.
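
As a side note, the character-list helper could also be sketched with standard library conversions instead of explicit index recursion. This assumes OCaml 4.07 or later (for String.to_seq and List.of_seq); the name sorted_chars is made up for illustration and is not part of the code above.

```ocaml
(* Hypothetical alternative to chars; assumes OCaml >= 4.07. *)
let sorted_chars (s : string) : char list =
  (* Turn the string into a list of characters, then sort it. *)
  List.sort compare (List.of_seq (String.to_seq s))

let () =
  (* Two anagrams share the same sorted character list. *)
  assert (sorted_chars "rab" = ['a'; 'b'; 'r']);
  assert (sorted_chars "bar" = sorted_chars "rab")
```

Because the result is an ordinary char list, it can be compared with = or compare in the same way as the output of chars.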

answered 2012-08-13T14:14:45.810

The most general form of pattern matching is the match expression, which is the same as the case expression in Haskell.

let rec groupByAnagrams lst =
  match lst with [] -> ...
               | x::xs -> ...

However, when you only need to pattern match on the last argument of a function (as is the case here), there is a shortcut using the function syntax:

let rec groupByAnagrams = function
    [] -> ...
  | x::xs -> ...
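
To make the equivalence concrete, here is a small sketch that defines the same list-length function in both styles. The names length_match and length_function are invented for this illustration only.

```ocaml
(* Explicit match on a named argument. *)
let rec length_match lst =
  match lst with
  | [] -> 0
  | _ :: xs -> 1 + length_match xs

(* Same function using the function shortcut: the last argument
   is anonymous and is matched on directly. *)
let rec length_function = function
  | [] -> 0
  | _ :: xs -> 1 + length_function xs

let () =
  assert (length_match ["a"; "b"; "c"] = 3);
  assert (length_function ["a"; "b"; "c"] = 3)
```

Both definitions have the same type, 'a list -> int, so the choice between them is purely stylistic.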

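As for guards, there is no exact equivalent. You can use when in a pattern match, but it only applies to one particular pattern, and you have to repeat the pattern for every case you need. You could also use if ... then ... else if ... then ... else ..., but that is not as pretty.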

let rec partitionByAnagrams sortedStr = function
    [] -> ...
  | x::xs when ...(some condition here)... -> ...
  | x::xs -> ...
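
Filling in that skeleton, one possible translation of the two functions from the question might look like the sketch below. It is only an illustration: the snake_case names and the sort_string helper are assumptions introduced here, and String.to_seq / List.of_seq require OCaml 4.07 or later.

```ocaml
(* Hypothetical helper: sorted characters of a string, used as the anagram key. *)
let sort_string s =
  List.sort compare (List.of_seq (String.to_seq s))

(* Split a list into (anagrams of sorted_str, everything else).
   The when clause plays the role of the Haskell guard. *)
let rec partition_by_anagrams sorted_str = function
  | [] -> ([], [])
  | x :: xs when sort_string x = sorted_str ->
      let (anagrams, others) = partition_by_anagrams sorted_str xs in
      (x :: anagrams, others)
  | x :: xs ->
      let (anagrams, others) = partition_by_anagrams sorted_str xs in
      (anagrams, x :: others)

(* Take the first word, collect its anagrams, and recurse on the rest. *)
let rec group_by_anagrams = function
  | [] -> []
  | x :: xs ->
      let (anagrams, others) = partition_by_anagrams (sort_string x) xs in
      (x :: anagrams) :: group_by_anagrams others

let () =
  assert (group_by_anagrams ["foo"; "bar"; "rab"; "ofo"]
          = [["foo"; "ofo"]; ["bar"; "rab"]])
```

Note that the second and third cases repeat the x :: xs pattern, which is the repetition mentioned above; an if ... then ... else inside a single x :: xs case would avoid it.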
answered 2012-08-09T04:59:34.073