string - Show a list of repeated words in Haskell

Question

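I need to write a function that finds the repeated words in a string and returns them as a list of strings, in order of occurrence, ignoring non-letters.

For example, at the Hugs prompt: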

repetitions :: String -> [String]

repetitions > "My bag is is action packed packed."
output> ["is","packed"]
repetitions > "My name  name name is Sean ."
output> ["name","name"]
repetitions > "Ade is into into technical drawing drawing ."
output> ["into","drawing"]

score 8 · Accepted Answer

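To split the string into words, use the words function (in the Prelude). To remove non-word characters, filter with Data.Char.isAlphaNum. Zip the list with its tail to get adjacent pairs (x, y). Fold the list, building a new list that contains every x where x == y.

Something like: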

import Data.Char (isAlphaNum)

repetitions s = map fst . filter (uncurry (==)) . zip l $ tail l
  where l = map (filter isAlphaNum) (words s)

I'm not sure this works as written, but it should give you a rough idea.
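A minimal runnable sketch of the zip-with-tail idea follows. It rests on two assumptions that are not in the answer: it keeps letters only (isAlpha instead of isAlphaNum, since the question says to ignore non-letters), and it drops tokens that become empty after filtering, such as a stand-alone ".". The name repetitionsZip is made up for illustration.

```haskell
import Data.Char (isAlpha)

-- Hypothetical name; a sketch of the zip-with-tail approach.
repetitionsZip :: String -> [String]
repetitionsZip s = [x | (x, y) <- zip ws (drop 1 ws), x == y]
  where
    -- Keep only the letters of each token, then discard tokens that
    -- end up empty (assumption: a lone "." should not count as a word).
    ws = filter (not . null) (map (filter isAlpha) (words s))
```

Using drop 1 instead of tail keeps the sketch total on empty input.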

score 2 · Accepted Answer

I'm new to this language, so my solution may look a bit ugly to Haskell veterans, but anyway:

let repetitions x = concat (map tail (filter (\x -> (length x) > 1) (List.group (words (filter (\c -> (c >= 'a' && c <= 'z') || (c>='A' && c <= 'Z') ||  c==' ') x)))))

This part removes every character that is neither a letter nor a space from the string s:

filter (\c -> (c >= 'a' && c <= 'z') || (c>='A' && c <= 'Z') ||  c==' ') s

This splits the string s into words and groups identical adjacent words together, returning a list of lists:

List.group (words s)

And this part removes all lists with fewer than two elements:

filter (\x -> (length x) > 1) s

After that, we drop one element from each list and concatenate what remains into a single list:

concat (map tail s)
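To make the four steps concrete, here is a sketch that names each intermediate value for one of the question's inputs. The names sample and step1 to step4 are illustrative only, and the qualified import of Data.List as List is an assumption about how List.group was brought into scope.

```haskell
import qualified Data.List as List

-- One of the inputs from the question.
sample :: String
sample = "My name  name name is Sean ."

-- Step 1: keep letters and spaces only.
step1 :: String
step1 = filter (\c -> (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') || c == ' ') sample

-- Step 2: split into words and group adjacent equal words.
step2 :: [[String]]
step2 = List.group (words step1)

-- Step 3: keep only the groups with at least two elements.
step3 :: [[String]]
step3 = filter (\g -> length g > 1) step2

-- Step 4: drop one element from each group and flatten.
step4 :: [String]
step4 = concat (map tail step3)
```

Here step2 is [["My"],["name","name","name"],["is"],["Sean"]] and step4 is ["name","name"], which matches the expected output in the question.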

score 1 · Accepted Answer

This may be inelegant, but it is conceptually very simple. I assume it is looking for consecutive repeated words, as in the examples.

-- a wrapper that allows you to give the input as a String
repetitions :: String -> [String]
repetitions s = repetitionsLogic (words s)
-- does the real work
repetitionsLogic :: [String] -> [String]
repetitionsLogic [] = []
repetitionsLogic [_] = []
repetitionsLogic (a:as)
    | a == head as = a : repetitionsLogic as
    | otherwise = repetitionsLogic as
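One caveat: the recursion compares raw tokens, so "packed" and "packed." would not count as a repeat. The sketch below shows one possible way to handle that, by removing everything except letters and whitespace before splitting. The names adjacentRepeats and repetitionsRec are hypothetical, and the filtering step is an addition, not part of the answer above.

```haskell
import Data.Char (isAlpha, isSpace)

-- Hypothetical helper: emit a word whenever it equals its successor.
adjacentRepeats :: [String] -> [String]
adjacentRepeats (a : rest@(b : _))
  | a == b    = a : adjacentRepeats rest
  | otherwise = adjacentRepeats rest
adjacentRepeats _ = []  -- zero or one word left: nothing to compare

-- Assumption: punctuation is stripped before the string is split into words.
repetitionsRec :: String -> [String]
repetitionsRec = adjacentRepeats . words . filter (\c -> isAlpha c || isSpace c)
```

Matching on (a : rest@(b : _)) avoids the call to head and covers the empty and single-word cases with one fallback clause.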

score 0 · Accepted Answer

Based on Alexander Prokofyev's answer:

repetitions x = concat (map tail (filter (\x -> (length x) > 1) (List.group (words (filter (\c -> (c >= 'a' && c <= 'z') || (c>='A' && c <= 'Z') || c==' ') x)))))

Remove the unnecessary parentheses:

repetitions x = concat (map tail (filter (\x -> length x > 1) (List.group (words (filter (\c -> c >= 'a' && c <= 'z' || c>='A' && c <= 'Z' || c==' ') x)))))

Use $ to remove more parentheses (each $ can replace an opening parenthesis whose closing parenthesis sits at the end of the expression):

repetitions x = concat $ map tail $ filter (\x -> length x > 1) $ List.group $ words $ filter (\c -> c >= 'a' && c <= 'z' || c>='A' && c <= 'Z' || c==' ') x
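As a smaller illustration of the same rewrite, the two definitions below are equivalent. The function names and the numeric example are made up for this sketch and are unrelated to the word problem.

```haskell
-- Fully parenthesised version.
withParens :: [Int] -> Int
withParens xs = sum (map (* 2) (filter even xs))

-- Same expression with ($): it is right-associative and has the lowest
-- precedence, so each $ stands in for a parenthesis that would close at
-- the end of the expression.
withDollar :: [Int] -> Int
withDollar xs = sum $ map (* 2) $ filter even xs
```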

Replace the character ranges with functions from Data.Char, and merge concat and map:

repetitions x = concatMap tail $ filter (\x -> length x > 1) $ List.group $ words $ filter (\c -> isAlpha c || isSeparator c) x

Use a section and point-free style to simplify (\x -> length x > 1) to ((>1) . length). This composes length with (>1) (a partially applied operator, or section) in a right-to-left pipeline.

repetitions x = concatMap tail $ filter ((>1) . length) $ List.group $ words $ filter (\c -> isAlpha c || isSeparator c) x
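The lambda and the section-based form can be compared side by side. The names below are hypothetical and exist only for this comparison.

```haskell
-- Explicit lambda: name the argument, take its length, compare with 1.
longerThanOneLambda :: [a] -> Bool
longerThanOneLambda = \x -> length x > 1

-- Point-free: length runs first, then the section (> 1) tests its result.
longerThanOneSection :: [a] -> Bool
longerThanOneSection = (> 1) . length
```

Both return False for an empty or one-element list and True for anything longer.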

Eliminate the explicit "x" variable to make the whole expression point-free:

repetitions = concatMap tail . filter ((>1) . length) . List.group . words . filter (\c -> isAlpha c || isSeparator c)

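Now the entire function, read from right to left, is a pipeline: it keeps only letters and separators, splits the result into words, groups adjacent equal words, keeps the groups with more than one element, and then drops the first element of each remaining group and concatenates what is left.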
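For completeness, here is a sketch of the point-free pipeline as a compilable module. The import lines are an assumption about how the names were brought into scope: isAlpha and isSeparator come from Data.Char, and Data.List is imported qualified as List so that List.group resolves.

```haskell
import Data.Char (isAlpha, isSeparator)
import qualified Data.List as List

repetitions :: String -> [String]
repetitions =
  concatMap tail                                -- drop one copy from each group
    . filter ((> 1) . length)                   -- keep groups of two or more
    . List.group                                -- group adjacent equal words
    . words                                     -- split on whitespace
    . filter (\c -> isAlpha c || isSeparator c) -- letters and separators only
```

Note that isSeparator accepts Unicode space and separator characters but not tabs or newlines, so on multi-line input the last word of one line would be joined to the first word of the next. Swapping in isSpace would be one way to avoid that.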
